COMP1004 W2

Module Information

Learning Objectives

  • Understanding how Entity-Relationship (ER) diagrams map to relational databases.

  • Mastering normalization techniques.

  • Exploring functional dependencies and their properties.

  • Learning the process of normalization from un-normalized data to:

    • First Normal Form (1NF)

    • Second Normal Form (2NF)

    • Third Normal Form (3NF)

Transforming ER Diagrams into Relations

  • The process of transforming ER diagrams into relations in relational databases is straightforward and systematic.

Steps in Transformation

  1. Mapping Regular Entities

    • Every regular entity in an ER diagram is transformed into a relation.

    • The relation's name is typically the same as the entity name.

    • Each simple attribute of the entity becomes an attribute in the new relation.

    • The identifier (primary key) of the entity becomes the primary key of that relation.

  2. Mapping Composite Attributes

    • For entities with composite attributes, only the simple components are included as attributes in the new relation.

  3. Mapping Multivalued Attributes

    • For an entity with a multivalued attribute, create two new relations:

      • The first relation contains all attributes of the entity except the multivalued attribute.

      • The second relation includes:

      • The primary key from the first relation

      • The multivalued attribute.

  4. Mapping Weak Entities

    • Create a new relation for each weak entity.

    • Include all simple attributes of the weak entity in this new relation.

    • Include the primary key of the identifying relation as a foreign key in this new relation.

  5. Mapping Binary Relationships

    • For each binary one-to-many (1:M) relationship:

      • Create a relation for each entity involved.

      • Include the primary key of the one-side entity as a foreign key in the relation on the many-side.

      • Note: The primary key migrates to the many-side of the relationship.

  6. Mapping Binary Many-to-Many Relationships

    • For a binary many-to-many (M:N) relationship between entities A and B:

      • Create a new relation C (associative entity).

      • The primary keys of A and B will become foreign keys in relation C.

  7. Mapping Binary One-to-One Relationships

    • Create two relations, one for each participating entity.

    • Include the primary key of one relation as a foreign key in the other relation.

  8. Mapping Unary Relationships

    • An entity in a unary relationship is mapped to a relation.

    • Add a foreign key attribute in the same relation to reference primary key values within it.

Normalization

  • Definition:

    • A fundamental aspect of relational database theory and practice; it is the process of decomposing relations with anomalies to produce smaller, well-structured relations.

    • It encompasses a set of rules to reorganize relations to reduce redundancy and improve data integrity.

    • Principle: Each table should serve a unique purpose.

Purpose of Normalization

  • To identify relations that fulfill data requirements for an organization.

  • Characteristics of a well-normalized database:

    • Only necessary attributes are present to store data.

    • Data attributes closely related in nature are stored in the same table.

    • Data redundancy is minimized to ensure each piece of data is stored once.

Why Normalize?

  • Data Anomalies: Defined as complications due to excessive redundancy, resulting in inconsistencies.

    • Anomalies arise when a particular record is stored in multiple locations with not all copies being updated consistently.

    • Common operations (insert, update, delete) can introduce these inconsistencies.

Types of Anomalies

  1. Insert Anomalies

  2. Delete Anomalies

  3. Update Anomalies

Example of Data Redundancy and Anomalies

  • Consider the relation StaffBranch(staffNo, sName, position, salary, branchNo, bAddress) with redundant data, where branch details are repeated for each staff member located at that branch.

Problematic Scenarios
  1. Insertion Anomalies

    • Inserting new member details requires branch details, which must be correct, leading to potential inconsistencies.

    • Cannot insert a new branch without a member; if inserted, the member record must make staffNo nullable (which violates entity integrity constraints).

  2. Deletion Anomalies

    • Deletion of a tuple (e.g., staff number SA9) results in loss of branch details.

  3. Modification Anomalies

    • Updating branch address for a branch number may require multiple updates at different locations, risking mistakes and inconsistent data.

Functional Dependencies - Examples

  • The concept of functional dependencies illustrated with relationships such as father-child mappings:

    • Example: John -> Susan; Hamid -> Nomi; Max -> Susan.

Definitions of Functional Dependencies

  • A functional dependency defines a constraint between two attributes, where the value of one attribute is determined by another.

  • If we denote A and B as attributes of relation R, then B is functionally dependent on A (denoted as A -> B), meaning each value of A is linked to exactly one value of B.

  • Here, A is termed as the determinant; this means for each value of A, there exists a single corresponding value of B, but the converse does not hold true.

Examples of Functional Dependency

  • Example 1: staffNo -> sName (staffNo determines the staff name)

  • Example 2: sName -> staffNo (only true for the given dataset).

Additional Types of Functional Dependencies

Full Functional Dependency

  • A functional dependency A -> B is a full functional dependency if the removal of any attribute from A results in the dependency collapsing.

    • Example: (StudentID, CourseID) -> Grade.

Partial Dependency

  • A dependency A -> B is considered partial if removal of an attribute from A still keeps the dependency.

    • Example: (StudentID, CourseID) -> StudentName;

    • Example: (StaffNo, sName) -> branchNo.

Transitive Dependency

  • A dependency where a functional relationship exists between the primary key and one or more non-key attributes via another non-key attribute.

    • Example: staffNo -> sName, position, salary, branchNo, bAddress; branchNo -> bAddress

    • Here, branchNo is transitively dependent on staffNo via branchNo, and neither branchNo nor bAddress determines staffNo.

Dependency Evaluation Examples

Example 1

  • Evaluating the dependency: Convenerid, Module -> Convenerage.

    • Determine whether it is a full functional dependency or partial dependency.

Example 2

  • Evaluating the dependency: Convenerid -> Convenername, Convenerpostcode, Convenercity.

    • Analyze if the dependency is full, partial, or transitive.

Steps in Normalization

  1. First Normal Form (1NF):

    • All multivalued attributes or repeating groups must be eliminated, ensuring a single value sits at each row and column intersection of the table.

  2. Second Normal Form (2NF):

    • Removal of partial functional dependencies:

      • Non-key attributes depend entirely on the primary key.

  3. Third Normal Form (3NF):

    • Elimination of transitive dependencies:

      • Non-key attributes depend solely on the primary key.

Characteristics of Normalization

  • Lossless: Ensuring that no data is lost or fabricated during the splitting of relations.

  • Dependency Preservation: Aiming to ensure that functional dependencies are preserved when decomposing relations into smaller units.

  • Normalization up to 3NF is noted for always being lossless and preserving dependencies.

Practical Normalization Examples

Example of ClientRental Relation
  • Initial (Un-Normalized) relation:

    • ClientRental(clientNo, propertyNo, cName, pAddress, rentStart, rentFinish, rent, ownerNo, oName)

  • Initial Functional Dependencies:

    • fd1: clientNo, propertyNo -> rentStart, rentFinish (Primary key)

    • fd2: clientNo -> cName (Partial)

    • fd3: propertyNo -> pAddress, rent, ownerNo, oName (Partial)

    • fd4: ownerNo -> oName (Transitive)

Transitioning to Normal Forms
  1. First Normal Form (1NF):

    • Remove repeating groups as appropriate, forming a new structure.

  2. Second Normal Form (2NF):

    • Eliminate partial dependencies, constructing new relations.

    • Derived relations could be:

      • Client(clientNo, cName), Rental(clientNo, propertyNo, rentStart, rentFinish), PropertyForRent(propertyNo, pAddress, rent, ownerNo), Owner(ownerNo, oName).

  3. Third Normal Form (3NF):

    • Ensure all transitive dependencies are eliminated, ensuring that all attributes are solely dependent on the primary key.

Example of Invoice Structure

  1. Unnormalized form:

    • InvoiceNo, CustNo, CustAddress, InvoiceDate, ItemCode, ItemDesc, ItemPrice, Quantity, Price

  2. Normalization Process:

    • Analysis ongoing; primary key established for dependencies.

  3. Results:

    • INVOICE (InvoiceNo, ItemCode, Quantity, Price)

    • INVOICEDETAIL (InvoiceNo, CustNo, InvoiceDate, InvoiceTotal)

    • ITEM(ItemCode, ItemDesc, ItemPrice)

    • CUSTOMER(CustNo, CustAddress)

Home Exercise

  • Conceptual exercise to transform a provided dataset into Third Normal Form (3NF).