COMP1004 W2

Module Information

Learning Objectives

Understanding how Entity-Relationship (ER) diagrams map to relational databases.
Mastering normalization techniques.
Exploring functional dependencies and their properties.
Learning the process of normalization from un-normalized data to:
- First Normal Form (1NF)
- Second Normal Form (2NF)
- Third Normal Form (3NF)

Transforming ER Diagrams into Relations

The process of transforming ER diagrams into relations in relational databases is straightforward and systematic.

Steps in Transformation

Mapping Regular Entities
- Every regular entity in an ER diagram is transformed into a relation.
- The relation's name is typically the same as the entity name.
- Each simple attribute of the entity becomes an attribute in the new relation.
- The identifier (primary key) of the entity becomes the primary key of that relation.
Mapping Composite Attributes
- For entities with composite attributes, only the simple components are included as attributes in the new relation.
Mapping Multivalued Attributes
- For an entity with a multivalued attribute, create two new relations:
  - The first relation contains all attributes of the entity except the multivalued attribute.
  - The second relation includes:
  - The primary key from the first relation
  - The multivalued attribute.
Mapping Weak Entities
- Create a new relation for each weak entity.
- Include all simple attributes of the weak entity in this new relation.
- Include the primary key of the identifying relation as a foreign key in this new relation.
Mapping Binary Relationships
- For each binary one-to-many (1:M) relationship:
  - Create a relation for each entity involved.
  - Include the primary key of the one-side entity as a foreign key in the relation on the many-side.
  - Note: The primary key migrates to the many-side of the relationship.
Mapping Binary Many-to-Many Relationships
- For a binary many-to-many (M:N) relationship between entities A and B:
  - Create a new relation C (associative entity).
  - The primary keys of A and B will become foreign keys in relation C.
Mapping Binary One-to-One Relationships
- Create two relations, one for each participating entity.
- Include the primary key of one relation as a foreign key in the other relation.
Mapping Unary Relationships
- An entity in a unary relationship is mapped to a relation.
- Add a foreign key attribute in the same relation to reference primary key values within it.

Normalization

Definition:
- A fundamental aspect of relational database theory and practice; it is the process of decomposing relations with anomalies to produce smaller, well-structured relations.
- It encompasses a set of rules to reorganize relations to reduce redundancy and improve data integrity.
- Principle: Each table should serve a unique purpose.

Purpose of Normalization

To identify relations that fulfill data requirements for an organization.
Characteristics of a well-normalized database:
- Only necessary attributes are present to store data.
- Data attributes closely related in nature are stored in the same table.
- Data redundancy is minimized to ensure each piece of data is stored once.

Why Normalize?

Data Anomalies: Defined as complications due to excessive redundancy, resulting in inconsistencies.
- Anomalies arise when a particular record is stored in multiple locations with not all copies being updated consistently.
- Common operations (insert, update, delete) can introduce these inconsistencies.

Types of Anomalies

Insert Anomalies
Delete Anomalies
Update Anomalies

Example of Data Redundancy and Anomalies

Consider the relation StaffBranch(staffNo, sName, position, salary, branchNo, bAddress) with redundant data, where branch details are repeated for each staff member located at that branch.

Problematic Scenarios

Insertion Anomalies
- Inserting new member details requires branch details, which must be correct, leading to potential inconsistencies.
- Cannot insert a new branch without a member; if inserted, the member record must make staffNo nullable (which violates entity integrity constraints).
Deletion Anomalies
- Deletion of a tuple (e.g., staff number SA9) results in loss of branch details.
Modification Anomalies
- Updating branch address for a branch number may require multiple updates at different locations, risking mistakes and inconsistent data.

Functional Dependencies - Examples

The concept of functional dependencies illustrated with relationships such as father-child mappings:
- Example: John -> Susan; Hamid -> Nomi; Max -> Susan.

Definitions of Functional Dependencies

A functional dependency defines a constraint between two attributes, where the value of one attribute is determined by another.
If we denote A and B as attributes of relation R, then B is functionally dependent on A (denoted as A -> B), meaning each value of A is linked to exactly one value of B.
Here, A is termed as the determinant; this means for each value of A, there exists a single corresponding value of B, but the converse does not hold true.

Examples of Functional Dependency

Example 1: staffNo -> sName (staffNo determines the staff name)
Example 2: sName -> staffNo (only true for the given dataset).

Additional Types of Functional Dependencies

Full Functional Dependency

A functional dependency A -> B is a full functional dependency if the removal of any attribute from A results in the dependency collapsing.
- Example: (StudentID, CourseID) -> Grade.

Partial Dependency

A dependency A -> B is considered partial if removal of an attribute from A still keeps the dependency.
- Example: (StudentID, CourseID) -> StudentName;
- Example: (StaffNo, sName) -> branchNo.

Transitive Dependency

A dependency where a functional relationship exists between the primary key and one or more non-key attributes via another non-key attribute.
- Example: staffNo -> sName, position, salary, branchNo, bAddress; branchNo -> bAddress
- Here, branchNo is transitively dependent on staffNo via branchNo, and neither branchNo nor bAddress determines staffNo.

Dependency Evaluation Examples

Example 1

Evaluating the dependency: Convenerid, Module -> Convenerage.
- Determine whether it is a full functional dependency or partial dependency.

Example 2

Evaluating the dependency: Convenerid -> Convenername, Convenerpostcode, Convenercity.
- Analyze if the dependency is full, partial, or transitive.

Steps in Normalization

First Normal Form (1NF):
- All multivalued attributes or repeating groups must be eliminated, ensuring a single value sits at each row and column intersection of the table.
Second Normal Form (2NF):
- Removal of partial functional dependencies:
  - Non-key attributes depend entirely on the primary key.
Third Normal Form (3NF):
- Elimination of transitive dependencies:
  - Non-key attributes depend solely on the primary key.

Characteristics of Normalization

Lossless: Ensuring that no data is lost or fabricated during the splitting of relations.
Dependency Preservation: Aiming to ensure that functional dependencies are preserved when decomposing relations into smaller units.
Normalization up to 3NF is noted for always being lossless and preserving dependencies.

Practical Normalization Examples

Example of ClientRental Relation

Initial (Un-Normalized) relation:
- ClientRental(clientNo, propertyNo, cName, pAddress, rentStart, rentFinish, rent, ownerNo, oName)
Initial Functional Dependencies:
- fd1: clientNo, propertyNo -> rentStart, rentFinish (Primary key)
- fd2: clientNo -> cName (Partial)
- fd3: propertyNo -> pAddress, rent, ownerNo, oName (Partial)
- fd4: ownerNo -> oName (Transitive)

Transitioning to Normal Forms

First Normal Form (1NF):
- Remove repeating groups as appropriate, forming a new structure.
Second Normal Form (2NF):
- Eliminate partial dependencies, constructing new relations.
- Derived relations could be:
  - Client(clientNo, cName), Rental(clientNo, propertyNo, rentStart, rentFinish), PropertyForRent(propertyNo, pAddress, rent, ownerNo), Owner(ownerNo, oName).
Third Normal Form (3NF):
- Ensure all transitive dependencies are eliminated, ensuring that all attributes are solely dependent on the primary key.

Example of Invoice Structure

Unnormalized form:
- InvoiceNo, CustNo, CustAddress, InvoiceDate, ItemCode, ItemDesc, ItemPrice, Quantity, Price
Normalization Process:
- Analysis ongoing; primary key established for dependencies.
Results:
- INVOICE (InvoiceNo, ItemCode, Quantity, Price)
- INVOICEDETAIL (InvoiceNo, CustNo, InvoiceDate, InvoiceTotal)
- ITEM(ItemCode, ItemDesc, ItemPrice)
- CUSTOMER(CustNo, CustAddress)

Home Exercise

Conceptual exercise to transform a provided dataset into Third Normal Form (3NF).