AA

Module 7 - JOINS and Normalization

MIST 7510E - Database Management

  • Course: Database Management

  • Topic: JOINS and Normalization

  • Instructor: Nikhil Srinivasan

  • Institution: The University of Georgia, TERRY COLLEGE OF BUSINESS

SQL JOIN

  • Definition:

    • A SQL join clause combines rows from two or more database tables into a single table.

    • It locates related column values in the two tables.

  • Characteristics:

    • A query can contain zero, one, or multiple JOIN operations.

Different Types of JOINs

  • (INNER) JOIN:

    • Selects records that have matching values in both tables.

  • LEFT (OUTER) JOIN:

    • Selects records from the first (left) table with matching records from the right table.

  • RIGHT (OUTER) JOIN:

    • Selects records from the second (right) table with matching records from the left table.

  • FULL (OUTER) JOIN:

    • Selects all records that match either left or right table records.

Illustrating Joins

  • Example tables: left_table and right_table to understand JOINs better.

INNER JOIN

  • Definition:

    • An Inner Join selects all rows from both tables as long as there is a match between the columns.

    • Looks for matches in the right table corresponding to all entries in the key field of the left table.

  • Syntax:

    SELECT id, l.val AS l_val, r.val AS r_val 
    FROM left_table l INNER JOIN right_table r ON l.id= r.id;

Example from Database: INNER JOIN

  • Aim: Retrieve Order number corresponding to each customer.

  • Query:

    SELECT C.customerNumber, C.customerName, O.orderNumber 
    FROM Customers C INNER JOIN Orders O ON C.customerNumber=O.customerNumber;

LEFT JOIN

  • Definition:

    • Returns all records from the left table and the matched records from the right table.

    • If no match exists, the result is NULL from the right side.

  • Syntax:

    SELECT id, l.val AS l_val, r.val AS r_val 
    FROM left_table l LEFT JOIN right_table r ON l.id= r.id;

Example from Database (LEFT JOIN)

  • Aim: Retrieve Order numbers corresponding to each customer.

  • Query:

    SELECT C.customerNumber, C.customerName, O.orderNumber 
    FROM Customers C LEFT JOIN Orders O ON C.customerNumber=O.customerNumber;

RIGHT JOIN

  • Definition:

    • Returns all records from the right table and the matched records from the left table.

    • If no match exists, the result is NULL from the left side.

  • Syntax:

    SELECT id, l.val AS l_val, r.val AS r_val 
    FROM left_table l RIGHT JOIN right_table r ON l.id= r.id;

Example from Database (RIGHT JOIN)

  • Aim: Retrieve Total quantity ordered for each product.

  • Query:

    SELECT SUM(O.quantityOrdered) Total_Quantity_Ordered, P.productCode 
    FROM OrderDetails O RIGHT JOIN Products P ON P.productCode=O.productCode 
    GROUP BY P.productCode;

OUTER JOIN

  • Definition:

    • Returns all rows from both left and right tables.

    • Unmatched rows from either table will return as NULL.

  • Syntax:

    SELECT id, l.val AS l_val, r.val AS r_val 
    FROM left_table l OUTER JOIN right_table r ON l.id= r.id;

Example from Database (OUTER JOIN)

  • Aim: Retrieve Order numbers corresponding to each customer.

  • Note: MySQL does not support Full Join directly, but it can be simulated using a UNION of Left and Right joins.

  • Query:

    SELECT C.customerNumber, C.customerName, O.orderNumber 
    FROM Customers C LEFT JOIN Orders O ON C.customerNumber=O.customerNumber 
    UNION 
    SELECT C.customerNumber, C.customerName, O.orderNumber 
    FROM Customers C RIGHT JOIN Orders O ON C.customerNumber=O.customerNumber 
    WHERE C.customerNumber IS NULL;

Normalization

  • Definition:

    • An alternative database design tool to data modeling.

    • The theoretical foundation for the relational model.

    • Application of a series of rules that gradually improve the design.

Functional Dependency

  • Definition:

    • A relationship between attributes in an entity where one or more attributes determine the value of another attribute.

  • Example:

    • stock code → firm name, stock price, stock quantity, stock dividend

    • Knowing the stock code provides knowledge of the firm name, etc.

  • Multivalued dependency Formula:

    • (stock dividend, stock price) → yield

Full Functional Dependency

  • Example:

    • Yield is fully functionally dependent on stock dividend and stock price as both are needed to determine the yield.

  • Determinants:

    • An attribute that fully functionally determines another attribute.

  • Example:

    • stock code determines stock PE.

Multidetermination

  • Definition:

    • A given value can determine multiple values, denoted as A → → B.

  • Example:

    • Department multidetermines course.

Attribute Relationships

  • One-to-One:

    • A value of an attribute determines the value of another attribute and vice-versa.

    • Example: CH → Switzerland and Switzerland → CH.

  • One-to-Many:

    • A value of one attribute determines the value of another attribute but not vice versa.

    • Example: country name → currency unit but currency unit does not determine country name.

  • Many-to-Many:

    • Neither attribute determines the other.

    • Example: country name not → language and language not → country name.

Normal Forms

  • Definition:

    • A classification of relations, often depicted like Russian dolls (nested forms).

First Normal Form (1NF)

  • All rows must have the same number of columns.

  • Single-valued attributes only.

Second Normal Form (2NF)

  • Violated when a nonkey column is a fact about part of the primary key.

  • Example: customer-credit in an ordering context.

Third Normal Form (3NF)

  • Violated when a nonkey column is a fact about another nonkey column.

  • Example: exchange rate dependent on stock code.

Other Normal Forms

  • Bryce-Codd Normal Form (BCNF)

  • Fourth Normal Form

  • Fifth Normal Form

  • Domain-Key Normal Form (DKNF)

Data Modelling and Normalization

  • Data modeling is often an easier path to good database design.

  • A high-fidelity data model will be of high normal form.

  • 5NF can create significant complexities; check for special rules.

Data Modeling Methods

  • Chen's entity-relationship (E-R) approach is widely recognized.

  • No standard for E-R method; models share common concepts making learning transferable.

Representing Relationships

  • Various dialects exist for representing relationships in data modeling.

Goal

  • Objective: Learn to think like a data modeler; precision in representation comes after mastering basics.

Key Points

  • Normalization is one approach to data modeling.

  • Multiple representations for data models exist.

  • Learning to model is difficult but representing a model is easier.