Normalization of Database Tables

Normalization

  • Technique for creating relations with desirable properties, based on data requirements.
  • Objective: Identify suitable relations that support enterprise data needs.
  • Characteristics of suitable relations:
    • Minimal attributes to support data requirements.
    • Attributes with close logical relationship (functional dependency) in the same relation.
    • Minimal redundancy, with attributes represented only once (except for foreign keys).

Anomalies

  • Insert Anomaly: Adding extra, unnecessary data.
  • Update Anomaly: Updating some but not all duplicated data instances.
  • Delete Anomaly: Losing attributes due to the deletion of other attributes.

Normal Forms

  • 1NF (First Normal Form):
    • Eliminate repeating groups.
    • Identify the primary key.
    • Identify all dependencies.
  • 2NF (Second Normal Form):
    • Is in 1NF.
    • Includes no partial dependencies.
  • 3NF (Third Normal Form):
    • Is in 2NF.
    • Contains no transitive dependencies.

Functional Dependencies

  • Describes relationships between attributes.
  • Semantics indicate how attributes relate, specifying functional dependencies.
  • Determinant: Attribute(s) on the left-hand side of the functional dependency arrow.
  • Full Functional Dependency: Attribute B is fully functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A.
  • Partial Dependency: Exists if removing an attribute from A still maintains the dependency.
  • Transitive Dependency: If A \rightarrow B and B \rightarrow C, then C is transitively dependent on A via B (A is not functionally dependent on B or C).

Conversion to First Normal Form (1NF)

  • Eliminate repeating groups.
  • Identify the primary key.
  • Identify all dependencies.

Conversion to Second Normal Form (2NF)

  • Make new tables to eliminate partial dependencies.
  • Reassign corresponding dependent attributes.

Conversion to Third Normal Form (3NF)

  • Identify each new determinant.
  • Identify the dependent attributes.
  • Remove the dependent attributes from transitive dependencies.

Requirements for Good Normalized Tables

  • Evaluate PK assignments and naming conventions.
  • Refine attribute atomicity.
  • Identify new attributes and relationships.
  • Refine primary keys for data granularity.
  • Maintain historical accuracy; evaluate derived attributes.

Boyce-Codd Normal Form (BCNF)

  • Every determinant in the table should be a candidate key.
  • Violated only when the table contains more than one candidate key.

Fourth Normal Form (4NF)

  • Is in 3NF.
  • Has no multivalued dependencies.
  • All attributes must be dependent on the primary key and independent of each other.

Normalization and Database Design

  • Normalization should be part of the design process.
  • Proposed entities must meet the normal form before table structures are created.

Denormalization

  • Produces a lower normal form.
  • Results in increased performance and greater data redundancy.