Normalization in Database Design Study Notes
Introduction to Normalization
- Introduction to the concept of normalization in database logical design.
- Link to the previous week's content.
- Review of the seven steps to convert ER diagrams to relational tables.
Review of Last Week's Content
Conversion of Entity-Relationship (ER) Diagrams
Regular Entity:
- Create a table; add all attributes as columns.
- Entity identifier becomes the primary key.
Weak Entity:
- Create a table with the weak entity.
- Use identifier of the strong entity along with the weak entity’s partial identifier as a composite primary key.
Relationships based on Cardinality:
- One-to-One Relationship: Include primary key from one side as a foreign key in the other side’s table.
- Many-to-Many Relationship: Create a new table for this relationship; include primary IDs from both sides as foreign keys.
- Associative Entity:
- Create a table; if it has an identifier, use it as a natural primary key.
- If no identifier, use a combination of primary keys from connected entities as a composite primary key.
- Ternary/Ennary Relationships: Create a new table for the relationships.
- Subtype and Supertype: Create separate tables for each subtype and supertype, incorporating the supertype’s primary key as a foreign key in the subtype table.
Structure of Well-Structured Relations
- Emphasis on starting the database design with an ER diagram prior to table creation.
- If tables are created without careful design, normalization is necessary to resolve potential redundancies.
Introduction to Data Normalization
Definition of Normalization
- Normalization is the process of deciding which attributes to group together in relations.
- Objective: Minimize data redundancy and avoid anomalies.
Understanding Well-Structured Relations
- A well-structured relation minimizes data redundancy and avoids anomalies, including:
- Insertion anomaly
- Deletion anomaly
- Modification anomaly
Types of Anomalies
1. Insertion Anomaly
- Occurs when adding a new row leads to data duplication.
2. Deletion Anomaly
- Deleting a row leads to unintended loss of data.
3. Modification Anomaly
- Modifying a row necessitates updates to other rows to avoid inconsistency.
Design Principles of Well-Structured Relations
- General rule of thumb:
- A table should correspond to at most one entity.
Examples and Case Studies
Employee Course Table Example
- Description: Details of employees and courses they completed.
- Question about its relational integrity based on uniqueness of rows.
- Composite Primary Key: Employee ID and Course Title ensure unique records.
- Highlighted the need for normalization due to the anomalies presented by this table structure.
Detail on Anomaly Examples
- Insertion Anomaly:
- New employee with no completed courses cannot be added.
- Deletion Anomaly:
- Deleting an employee removes course information when no other entries would retain it.
- Modification Anomaly:
- Updating an employee’s data requires consistency across multiple entries.
Process of Normalization
- Overview of normalization steps and requirements for each normal form.
First Normal Form (1NF)
- Achieved by removing multivalued attributes.
- Every attribute must be atomic; must include a defined primary key.
Second Normal Form (2NF)
- After achieving 1NF:
- Remove partial dependencies.
- A non-key attribute should depend on the entire primary key, not just part of it.
Third Normal Form (3NF)
- After achieving 2NF:
- Remove transitive dependencies (where a non-key attribute depends on another non-key attribute).
- A well-structured database should ideally reach 3NF.
Higher Normal Forms
- Boyce-Codd Normal Form (BCNF), Fourth (4NF), and Fifth Normal Form (5NF) are discussed but not often applied for the well-structured relations in this course.
Functional Dependencies
Definition
- A constraint between two attributes where one attribute's value is dependent on another's.
- Example: Employee ID determines salary.
Candidate Key
- A potential unique identifier for each row; may have multiple candidate keys but one primary key.
Steps to Achieve Normal Forms
Example Case Study of a Relation into Normal Forms
Initial Table State:
- Identified primary keys: Composite of Student ID and Course ID.
- Functional dependencies identified (partial, transitive, full).
Transformation Steps:
- Steps to break tables into smaller relations to meet 2NF and 3NF requirements.
- 2NF involves removing partial dependencies.
- 3NF involves addressing transitive dependencies.
Referencing and Referential Integrity
- Overview of defining relationships among tables through foreign keys and maintaining data integrity with primary keys.
Issues During Relation Merging
Potential Problems
- Synonyms: Same concept described by different names in different tables.
- Homonyms: Same name for different concepts across tables.
- Transitive Dependencies: Dependencies that must be resolved during merging.
- Implicit subtype/supertype relationships: Devise structures that recognize such relationships during the merging process.
Types of Keys in Database Design
- Primary Key: Uniquely identifies a row in a table.
- Foreign Key: Establishes relationships between tables.
- Candidate Key: Potential keys for the primary key.
- Enterprise Key: Unique across the entire database, used to represent all objects of interest.
Conclusion
- Summary of the day's lessons on normalization:
- Importance of using normalization to derive well-structured relations.
- Understanding and applying functional dependencies is critical to reaching normal forms.
- Recognition of key types and relationship integrity is vital for effective database design.
Assignment and Course Guidance
- Reminder regarding upcoming assignments and group formation tasks.
- Installation guide for MySQL server and workbench will be distributed shortly.