Chapter 8 – Relational Database Model

Learning Objectives

  • By the end of Chapter 8 you should be able to:
    • Describe the logical structure of the relational model and explain how it differs from physical storage structures.
    • List and discuss the basic components of the model (tables, rows, columns, keys, relationships, constraints).
    • Manipulate table contents through relational operators (select, project, join, union, etc.).
    • Explain the roles of the data dictionary and system catalog in metadata management.
    • Identify entities, establish relationships, and translate business rules into the relational schema.
    • Evaluate and control data redundancy through proper key usage and table design.
    • Explain indexing, including index keys, unique indexes, and their impact on performance and integrity.

A Logical View of Data

  • The relational model presents data logically rather than physically.
  • Core logical construct is the relation (i.e., table).
  • Logical simplicity supports disciplined design methodologies and conceptual independence from storage concerns.

Relational Tables: Fundamental Characteristics

  • A table is perceived as a two-dimensional grid (rows × columns).
  • Each row (tuple) = one occurrence of an entity.
  • Each column = one attribute; each column name is unique within its table.
  • Intersection of row & column = one single data value (atomicity).
  • All values in the same column must share a common data format / domain.
  • Each column is defined over an attribute domain (allowable value range).
  • Ordering of rows or columns is immaterial to the DBMS.
  • Every table has at least one attribute (or set) that uniquely identifies each row (primary key).

Keys & Dependencies

  • Key = attribute(s) that determine other attribute values and ensure unique identification.
  • Determination: Knowing value of one attribute lets you determine another.
    • Attribute providing the value = determinant.
    • Attribute whose value is obtained = dependent.
  • Functional dependence: Attribute XX functionally determines YY if each XX value is associated with precisely one YY value.
  • Full functional dependence: The entire composite determinant is necessary; no subset of the determinant attributes suffices.

Types of Keys

  • Composite key: two or more attributes combined to serve as a key.
  • Key attribute: any attribute that forms part of a key.
  • Superkey: any attribute set that uniquely identifies rows (may contain unnecessary attributes).
  • Candidate key: minimal (irreducible) superkey—no proper subset is a superkey.
  • Primary key (PK): candidate key selected to identify rows; cannot contain nulls.
  • Entity integrity rule: every table row must have a non-null and unique PK.
  • Foreign key (FK): PK from one table placed in another to represent a relationship.
  • Referential integrity rule: every FK is either null (when allowed) or matches an existing PK in the referenced table.
  • Secondary key: attribute(s) used strictly for data retrieval (e.g., search, grouping) rather than row identity.
  • Null represents the absence of a value; can mean unknown, not yet available, or inapplicable.

Handling Nulls & Column Constraints

  • NOT NULL constraint: enforces that each row supplies a value for the column.
  • UNIQUE constraint: prevents duplicate values within the column.
  • Flags: special codes stored in a column to explicitly denote missing/exception conditions.

Integrity Rules in Practice

  • RDBMSs enforce entity & referential integrity automatically; nonetheless, design must honor them to avoid run-time rejection or anomalies.
  • Deleting a parent row whose PK is referenced by a child FK is disallowed unless cascading actions or nullification have been predefined.

Relational Algebra & Closure

  • Relvar = variable holding a relation; consists of a heading (attribute names) and a body (tuples).
  • Closure property: any relational algebra operation produces a new relation that can itself be queried.
Core Unary Operators
  • Select (Restrict): horizontal subset—filters rows satisfying a predicate.
  • Project: vertical subset—chooses specific columns.
Core Binary Set Operators (require union-compatibility)
  • Union: combines rows of two tables, eliminating duplicates.
  • Intersect: returns only rows common to both tables.
  • Difference: rows in table A that are not in table B.
  • Product (Cartesian product): every row of A paired with every row of B—basis for joins.
Join Variants
  • Natural join: equates common-named columns, eliminates duplicate column in result.
  • Equijoin: explicit equality condition between specified columns; duplicate columns retained.
  • Theta join: comparison operator other than equality ( >, <, ≠, etc.).
  • Inner join: keeps only matched rows.
  • Outer join: keeps matched rows plus unmatched rows padded with nulls.
    • Left outer join: all rows from left table retained.
    • Right outer join: all rows from right table retained.
Divide Operator
  • Dividend: two-column table R(A,B)R(A,B); Divisor: one-column table S(A)S(A).
  • Result = single-column table containing every BB value that pairs with all AA values in SS.

Data Dictionary & System Catalog

  • Data dictionary: designer-maintained metadata repository describing user-defined tables (names, columns, types, validation rules, etc.).
  • System catalog: system-maintained data dictionary that stores metadata for all database objects (tables, views, indexes, constraints, triggers, etc.).
  • Naming issues:
    • Homonym: identical name used for different attributes—creates ambiguity.
    • Synonym: different names for the same attribute—causes redundancy/confusion.

Relationships Among Entities

  • One-to-Many (1:M1{:}M): most common; PK of parent appears as FK in child table.
  • One-to-One (1:11{:}1): rare; often implemented for security or optional attributes.
  • Many-to-Many (M:NM{:}N): resolved via a composite (bridge / associative) entity possessing FKs referencing each parent and usually its own PK.

Controlling Data Redundancy

  • Proper PK–FK design minimizes unnecessary duplicate data.
  • Acceptable redundancy cases:
    • Performance optimization (denormalization for faster queries).
    • Historical tracking (storing now-static past values even if duplicated).
  • Always weigh redundancy benefits against update anomalies and storage overhead.

Indexes

  • An index is an ordered structure that accelerates row access based on key values.
  • Index key = attribute list whose values are mapped to physical row pointers.
  • Unique index: enforces uniqueness of key values (often auto-created for PKs).
  • Each index belongs to one table, but a table can possess multiple indexes (single-column or composite).

Codd’s 12 (+1) Relational Rules (Concise)

  1. Information: all data represented as values in tables.
  2. Guaranteed access: every value accessible via combination .
  3. Systematic treatment of nulls: uniform, type-independent handling.
  4. Dynamic online catalog: metadata stored as relational tables.
  5. Comprehensive data sublanguage: at least one complete declarative language (e.g., SQL).
  6. View updating: theoretically updatable views must be updatable in practice.
  7. High-level set operations: support bulk insert, update, delete.
  8. Physical data independence: apps unaffected by storage changes.
  9. Logical data independence: apps unaffected by table structure tweaks (e.g., column order).
  10. Integrity independence: constraints definable/stored in catalog, not app code.
  11. Distribution independence: transparency of data location (local vs. distributed).
  12. Nonsubversion: no bypass of relational integrity via low-level operations.
  13. Rule Zero: a DBMS is relational only if it uses relational principles exclusively for data management.

Key Takeaways & Practical Implications

  • Tables, keys, and constraints form the foundation of the relational paradigm.
  • Design goals: maintain integrity, minimize redundancy, optimize access.
  • Relational algebra provides the theoretical basis for SQL operations; understanding it aids in query optimization and correctness.
  • A robust data dictionary / catalog is indispensable for governance, security, and developer productivity.
  • Indexes and thoughtful enforcement of Codd’s rules influence scalability and evolvability of real-world systems.