Normalization

Normalization of Database Tables

Overview of Normalization

  • Normalization: A technique to design database tables minimizing data redundancy.

  • Importance: Essential for producing good database structures that are efficient and less prone to anomalies.

Learning Objectives

  • Understanding normalization's role in database design.

  • Learning about normal forms: 1NF, 2NF, 3NF, BCNF, and 4NF.

  • Transformation of lower normal forms to higher ones.

  • Concurrent use of normalization and ER modeling for optimal design.

  • Situations requiring denormalization for efficient information retrieval.


Database Design Principles

Key Concepts

  • Good design uses properly structured tables that minimize data redundancy and avoid anomalies.

  • The normalization process evaluates and corrects table structures to ensure efficiency.

Characteristics of Poor Table Structures

  • A poor structure can lead to data inconsistencies and inefficiencies.


Steps in Normalization Process

Recognizing a Good Table Structure

  1. Identify and Analyze Table Definitions: Evaluate the structure for efficiency.

  2. Recognize Anomalies in Poor Structures: Determine potential issues such as:

    • Insertion anomalies

    • Deletion anomalies

    • Update anomalies
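The three anomaly types can be illustrated with a small, hypothetical unnormalized table held as a list of rows. This is only a sketch; the column names and sample values are invented for illustration:

```python
# Hypothetical unnormalized table: each row repeats the project name
# alongside every employee assigned to it.
assignments = [
    {"proj_num": 15, "proj_name": "Evergreen",  "emp_num": 103, "emp_name": "June Arbough"},
    {"proj_num": 15, "proj_name": "Evergreen",  "emp_num": 101, "emp_name": "John News"},
    {"proj_num": 18, "proj_name": "Amber Wave", "emp_num": 103, "emp_name": "June Arbough"},
]

# Update anomaly: renaming project 15 requires touching every matching
# row; missing even one leaves the data inconsistent.
for row in assignments:
    if row["proj_num"] == 15:
        row["proj_name"] = "Evergreen II"

# Deletion anomaly: removing the last assignment for project 18 also
# destroys the only record that project 18 is named "Amber Wave".
assignments = [r for r in assignments if r["proj_num"] != 18]

# Insertion anomaly: a new project with no assigned employees cannot be
# recorded in this structure without inventing a null employee.
```

Because every fact about a project is scattered across assignment rows, each kind of change risks leaving the table in an inconsistent state.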

Overview of Normal Forms

  • 1NF (First Normal Form): No repeating groups; all key attributes defined.

  • 2NF (Second Normal Form): Meets 1NF requirements and has no partial dependencies.

  • 3NF (Third Normal Form): Meets 2NF requirements and has no transitive dependencies.

  • BCNF (Boyce-Codd Normal Form): Every determinant is a candidate key.

  • 4NF (Fourth Normal Form): Meets 3NF requirements and has no multiple sets of independent multivalued dependencies.


Identifying Functional Dependencies

Defining Dependencies

  • Understand Functional Dependence: A relationship in which the value of one attribute (or set of attributes) uniquely determines the value of another.

    • Partial Dependency: Exists when an attribute is functionally dependent on only part of a composite key.

    • Transitive Dependency: A condition in which one attribute depends on another indirectly, through a third nonprime attribute.
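A functional dependency X → Y holds when no two rows agree on X but differ on Y, and that check is easy to sketch in Python. The table layout and column names below are hypothetical:

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False  # same determinant, different dependent values
    return True

rows = [
    {"emp_num": 103, "emp_name": "June Arbough", "proj_num": 15, "hours": 23.8},
    {"emp_num": 103, "emp_name": "June Arbough", "proj_num": 18, "hours": 12.5},
    {"emp_num": 101, "emp_name": "John News",    "proj_num": 15, "hours": 19.4},
]

# emp_num alone determines emp_name: a partial dependency on the
# composite key (emp_num, proj_num).
print(fd_holds(rows, ["emp_num"], ["emp_name"]))           # True
# emp_num does not determine hours; hours needs the full composite key.
print(fd_holds(rows, ["emp_num"], ["hours"]))              # False
print(fd_holds(rows, ["emp_num", "proj_num"], ["hours"]))  # True
```

Running checks like these over sample data is one way to confirm the dependency diagram drawn during analysis.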


Normalizing Example: Construction Company Database

  • Scenario: Managing building projects with employee assignments and billing hours.

  • Table Example: Initial reports generated showcasing employee data against projects.

Issues Identified in the Table

  1. Null Keys: Project number cannot serve as a primary key due to expected nulls.

  2. Data Inconsistencies: Various naming conventions lead to redundancy and update issues.

  3. Redundancy: Repeated data entries increase the likelihood of anomalies.


The Conversion Process

1NF Conversion Steps

  1. Eliminate Repeating Groups: Restructure the data so that each cell contains a single, atomic value.

  2. Identify Primary Key: Establish a composite key combining necessary identifiers.

  3. Identify All Dependencies: Map out relationships defining functional dependencies.
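The steps above can be sketched in SQL using Python's built-in sqlite3 module. The table and column names are illustrative, not taken from the source scenario:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Steps 1 and 2: one atomic value per cell, with a composite primary key
# (proj_num, emp_num) identifying each assignment row.
conn.execute("""
    CREATE TABLE assignment (
        proj_num  INTEGER NOT NULL,
        proj_name TEXT    NOT NULL,
        emp_num   INTEGER NOT NULL,
        emp_name  TEXT    NOT NULL,
        hours     REAL    NOT NULL,
        PRIMARY KEY (proj_num, emp_num)
    )
""")
conn.execute("INSERT INTO assignment VALUES (15, 'Evergreen', 103, 'June Arbough', 23.8)")

# Step 3 (mapping dependencies) is analytical work, but the key
# constraint is now enforced: a duplicate (proj_num, emp_num) pair
# is rejected by the database.
try:
    conn.execute("INSERT INTO assignment VALUES (15, 'Evergreen', 103, 'June Arbough', 5.0)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

Note that this table is only in 1NF: proj_name and emp_name still depend on parts of the composite key, which the later steps remove.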

Moving to 2NF and 3NF

  • 2NF: Remove partial dependencies by restructuring the table into new formats.

    • Create separate tables for each component that was part of the initial composite key.

  • 3NF: Eliminate transitive dependencies by moving each determinant and its dependent attributes into a separate table.
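The 2NF and 3NF decompositions can be sketched with sqlite3: partially dependent attributes (e.g. proj_num → proj_name, emp_num → emp_name) move into tables keyed by their determinant, and a transitive chain such as emp_num → job_class → chg_hour moves the rate into its own table. All names and values here are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 2NF: attributes that depend on only part of the old composite key
    -- move to tables keyed by that part.
    CREATE TABLE project  (proj_num INTEGER PRIMARY KEY, proj_name TEXT NOT NULL);
    CREATE TABLE employee (emp_num  INTEGER PRIMARY KEY, emp_name  TEXT NOT NULL,
                           job_class TEXT NOT NULL);
    -- Only hours still depends on the full (proj_num, emp_num) key.
    CREATE TABLE assignment (
        proj_num INTEGER NOT NULL REFERENCES project(proj_num),
        emp_num  INTEGER NOT NULL REFERENCES employee(emp_num),
        hours    REAL    NOT NULL,
        PRIMARY KEY (proj_num, emp_num)
    );
    -- 3NF: chg_hour depends on job_class, not on emp_num directly, so it
    -- moves into a table keyed by that determinant.
    CREATE TABLE job (job_class TEXT PRIMARY KEY, chg_hour REAL NOT NULL);
""")
conn.execute("INSERT INTO job VALUES ('Elect. Engineer', 84.50)")
conn.execute("INSERT INTO project VALUES (15, 'Evergreen')")
conn.execute("INSERT INTO employee VALUES (103, 'June Arbough', 'Elect. Engineer')")
conn.execute("INSERT INTO assignment VALUES (15, 103, 23.8)")

# Each fact (a project's name, an employee's job, a job's rate) is now
# stored exactly once; reports reassemble them with joins.
row = conn.execute("""
    SELECT e.emp_name, j.chg_hour * a.hours
    FROM assignment a
    JOIN employee e ON a.emp_num = e.emp_num
    JOIN job j ON e.job_class = j.job_class
    WHERE a.proj_num = 15
""").fetchone()
print(row)
```

Changing a job's hourly rate now means updating one row in one table, rather than every assignment row that mentions it.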


Importance of Good Data Design

  • Eliminating Anomalies: Proper structure ensures that updates, deletions, or insertions do not lead to inconsistent data states.

  • Referential Integrity: Setting foreign keys establishes reliable relationships across tables.
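Foreign-key enforcement can be demonstrated with sqlite3. One caveat: SQLite disables foreign-key checking by default, so the PRAGMA below is required. Table names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores FK constraints without this
conn.executescript("""
    CREATE TABLE project (proj_num INTEGER PRIMARY KEY, proj_name TEXT NOT NULL);
    CREATE TABLE assignment (
        assign_num INTEGER PRIMARY KEY,
        proj_num   INTEGER NOT NULL REFERENCES project(proj_num),
        hours      REAL NOT NULL
    );
""")
conn.execute("INSERT INTO project VALUES (15, 'Evergreen')")
conn.execute("INSERT INTO assignment VALUES (1, 15, 23.8)")  # valid reference

# An assignment pointing at a nonexistent project is rejected, which is
# exactly the "reliable relationship" referential integrity provides.
rejected = False
try:
    conn.execute("INSERT INTO assignment VALUES (2, 99, 4.0)")
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```
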


Denormalization Explanation

  • When Required: Tables may be intentionally denormalized when joins across fully normalized tables are too slow for frequently run queries.

  • Trade-offs: Increases in redundancy and potential for anomalies must be carefully measured against the need for speed.


Conclusion

  • Final Thoughts on Database Design: Emphasize the need for normalization while allowing for specific denormalization where justified.

  • Summary of Key Terms: Understand essential terminology, such as atomic attributes and partial and transitive dependencies, as foundational concepts.


Data Modeling Checklist

  • Business Rules: Document, verify, and sign off on all business rules with stakeholders.

  • Naming Conventions: Ensure clarity and uniqueness across entities, attributes, and relationships.

  • Normalization Validation: Perform checks against design guidelines to maintain high-quality structures and minimize redundancy.