Normalization of Database Tables
Normalization
- Technique for creating relations with desirable properties, based on data requirements.
- Objective: Identify suitable relations that support enterprise data needs.
- Characteristics of suitable relations:
- Minimal attributes to support data requirements.
- Attributes with close logical relationship (functional dependency) in the same relation.
- Minimal redundancy, with attributes represented only once (except for foreign keys).
Anomalies
- Insert Anomaly: Adding extra, unnecessary data.
- Update Anomaly: Updating some but not all duplicated data instances.
- Delete Anomaly: Losing attributes due to the deletion of other attributes.
- 1NF (First Normal Form):
- Eliminate repeating groups.
- Identify the primary key.
- Identify all dependencies.
- 2NF (Second Normal Form):
- Is in 1NF.
- Includes no partial dependencies.
- 3NF (Third Normal Form):
- Is in 2NF.
- Contains no transitive dependencies.
Functional Dependencies
- Describes relationships between attributes.
- Semantics indicate how attributes relate, specifying functional dependencies.
- Determinant: Attribute(s) on the left-hand side of the functional dependency arrow.
- Full Functional Dependency: Attribute B is fully functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A.
- Partial Dependency: Exists if removing an attribute from A still maintains the dependency.
- Transitive Dependency: If A \rightarrow B and B \rightarrow C, then C is transitively dependent on A via B (A is not functionally dependent on B or C).
- Eliminate repeating groups.
- Identify the primary key.
- Identify all dependencies.
- Make new tables to eliminate partial dependencies.
- Reassign corresponding dependent attributes.
- Identify each new determinant.
- Identify the dependent attributes.
- Remove the dependent attributes from transitive dependencies.
Requirements for Good Normalized Tables
- Evaluate PK assignments and naming conventions.
- Refine attribute atomicity.
- Identify new attributes and relationships.
- Refine primary keys for data granularity.
- Maintain historical accuracy; evaluate derived attributes.
- Every determinant in the table should be a candidate key.
- Violated only when the table contains more than one candidate key.
- Is in 3NF.
- Has no multivalued dependencies.
- All attributes must be dependent on the primary key and independent of each other.
Normalization and Database Design
- Normalization should be part of the design process.
- Proposed entities must meet the normal form before table structures are created.
Denormalization
- Produces a lower normal form.
- Results in increased performance and greater data redundancy.