Database Principles - Chapter 7: Normalising Database Designs

Database Principles Overview

Chapter 7: Normalising Database Designs

  • Introduction to the chapter topics:

    • Explanation of normalisation and its role in database design.

    • Overview of normal forms including 1NF, 2NF, 3NF, BCNF, and 4NF.

    • Techniques for transforming lower normal forms to higher normal forms.

    • Discussion on the concurrent use of normalisation and ER (Entity-Relationship) modeling for effective database design.

    • Situations necessitating denormalisation for improved information retrieval efficiency.

Normalisation

  • Definition: Normalisation refers to the process of evaluating and correcting table structures to minimize data redundancies, which subsequently reduces data anomalies in databases.

  • The stages of normalisation are categorized into several normal forms:

    • First Normal Form (1NF)

    • Second Normal Form (2NF)

    • Third Normal Form (3NF)

  • Implications of normalisation levels:

    • 2NF improves upon 1NF, while 3NF is an improvement over 2NF.

    • For most business database designs, 3NF is typically sufficient.

    • Achieving the highest level of normalisation is not always the most desirable approach for every situation.

The Need for Normalisation

  • Example Scenario: A construction management company bills clients based on hours worked on projects, with billing rates depending on employee position.

    • The report generated from their data must accurately reflect this billing system, as shown in Table 7.1, which details project and employee assignments along with hours charged.

Table 7.1: Sample Report Layout

  • Fields:

    • Proj. Num: Unique project number

    • Proj. Name: Name of the project

    • Employee Num: Unique employee identifier

    • Employee Name: Name of the employee

    • Job Class: Classification of the employee’s job

    • Chg Hour: Charge per hour

    • Hours: Number of hours worked

    • Total: Total charge calculated as (Charge per Hour * Hours Worked)

  • Example Entries from Report:

    • Project 15, Employee Mzwandile E. Baloyi — Charges €67.55 for 23.8 hours → Total €1,607.69.

    • Project 15, Employee William Smithfield — Charges €26.66 for 12.6 hours → Total €335.92.

  • Note: The subtotals and Total of all entries yield €38,942.09 in billing charges.

Issues in Data Structure

  • The table structure as shown may appear functional, but it fails to adequately manage data integrity, leading to discrepancies in report generation based on underlying data anomalies.

The Normalization Process

  • Each table must represent a single subject or type of data;

    • No data item should be unnecessarily stored in multiple tables;

    • All attributes in a table must directly depend on the primary key.

Normal Forms

Table 7.2: Normal Forms Characteristics

Normal Form

Characteristic

Sections

1NF

Table format; no repeating groups and PK identified

7.1

2NF

1NF with no partial dependencies

7.3.1

3NF

2NF with no transitive dependencies

7.3.2

BCNF

Every determinant is a candidate key

7.3.3

4NF

3NF and no independent multivalued dependencies

7.6.1

Conversion to First Normal Form (1NF)

  • Repeating Group: Refers to multiple entries of the same type that can exist for a single key attribute occurrence.

    • 1NF Requirement: A relational table must not contain repeating groups and must present data with unique values in each cell.

Steps for Achieving 1NF

  1. Eliminate Repeating Groups: Ensure data is arranged in tabular format without repeating groups and nulls.

  2. Identify Primary Key: The primary key must uniquely identify each attribute value.

  3. Identify Dependencies: Visualize relationships among attributes, ensuring all are dependent on the primary key.

First Normal Form (1NF) Essentials
  • A table in 1NF:

    • Defines all key attributes clearly.

    • Contains no repeating groups.

    • Ensures all attributes depend on the primary key.

Conversion to Second Normal Form (2NF)

  • Enhancing relational database design by transforming tables into 2NF requires elimination of partial dependencies.

Steps for Achieving 2NF

  1. Write Each Key Component on Separate Lines: Distribute composite key components to new lines.

  2. Assign Dependent Attributes: Map dependent attributes accurately to avoid any form of partial dependencies.

Second Normal Form (2NF) Criteria

  • A table is considered in 2NF if:

    • It is in 1NF

    • It excludes partial dependencies

Conversion to Third Normal Form (3NF)

  • Eliminating Transitive Dependencies: 3NF is usually achieved through three distinct steps:

  1. Identify Each New Determinant: Each transitive dependency’s determinant becomes the primary key of a new table.

  2. Identify Dependent Attributes: Attributes dependent on the newly identified determinant are recognized.

  3. Remove Dependent Attributes: Eliminate attributes involved in transitive dependencies from original tables, thereby restructuring the database schema.