Normalization is used in relational database systems to design the table structure.
It involves applying certain concepts and principles to data storage.
The lesson covers the first three forms of normalization.
Normalization organizes data in a database, following specific rules.
It results in the creation of multiple tables and the establishment of relationships between them.
Driving Factors Behind Normalization
Reduction in Data Redundancy: Avoid storing the same information in multiple locations within the database (even the same table).
Wasted Space: Redundancy wastes disk space and degrades disk performance.
Maintenance Problems: When data changes (e.g., an employee's address), you need to update every instance of that data, which is time-consuming and error-prone.
Disk reads become slower if you don't do regular disc maintenance for defragmentation, which impacts the performance of your database queries.
Normal Forms
Database administrators typically aim for the first three normal forms.
Fourth and fifth forms exist but are less commonly implemented because they require extra work and don't significantly impact database functionality.
Each form has rules to validate its implementation.
First Normal Form (1NF)
Rule: No repeating groups in an individual table.
Have a separate table for each set of related data.
Each set of related data is identified with a key.
Example: Instead of including sales items (item name, price, quantity) in a customer table, create a separate sales table and establish a relationship between the customer and sales tables.
This avoids repeated data across multiple sales for the same customer.
Second Normal Form (2NF)
An entity is in 2NF if all of its attributes depend on the whole primary key.
The values in the columns have a dependency on the other columns.
Example: In a customer table, store customer-related data such as first name, last name, address, city, state, and zip code.
All of these attributes depend on the primary key (e.g., customer number).
Third Normal Form (3NF)
Rule: Eliminate fields in a table that do not depend on the key.
Example: The state or country of a person does not depend on the key that uniquely identifies the person.
Store states in a state table and countries in a country table, then relate these tables together.
Avoids duplicate entries for states and countries in the customer table for customers in the same locations.
Considerations with Normalization
Normalization can make querying more difficult because it involves joining multiple tables.
Need to focus on how to join information from one table to another to retrieve the desired data.
Data warehousing scenarios may involve de-normalizing data for more efficient querying.
De-normalized data is often more efficient to query than normalized data.
Conclusion
Understanding normalization is essential for grasping the concepts behind multiple tables and the need to query them effectively.
Normalization principles will be further illustrated through examples in subsequent lessons on querying multiple tables.