Design Theory for Relational Databases

Design Theory for Relational Databases

Functional Dependencies (FDs)

  • Definition: X -> Y asserts that for any two tuples in relation R, if they agree on all attributes in X, they must also agree on all attributes in Y.
  • Notation: X, Y, Z represent sets of attributes; A, B, C represent single attributes.
  • Important Notes:
  • FDs are usually expressed with singleton right sides.
  • Example: If A -> BC, it implies A -> B and A -> C.
  • There's no splitting rule for left-hand side attributes.

Keys of Relations

  • A superkey (K) for relation R functionally determines all attributes in R if knowing K allows us to uniquely identify tuples.
  • A key is a minimal superkey; no subset of K can determine all attributes of R.
  • Example: In Drinkers(name, addr, beersLiked, manf, favBeer), {name, beersLiked} is a superkey, but also a key since no proper subset determines all attributes.

Inferring Functional Dependencies

  • Given FDs, one can deduce new FDs through logical implications.
  • Example: A -> B and B -> C imply A -> C.
  • Important for the design of good relation schemas.

Normal Forms

  • Goal: Avoid anomalies and redundancy in database design.
  • Update Anomaly: Occurs when a change in one tuple does not propagate to others.
  • Deletion Anomaly: Occurs when deleting a tuple results in losing valid data.
Boyce-Codd Normal Form (BCNF)
  • A relation R is in BCNF if for every nontrivial FD X -> Y, X is a superkey.
  • Violations of BCNF indicate potential redundancy or anomalies.
  • Example: In Drinkers, the FDs name -> addr and beersLiked -> manf show it is not in BCNF since {name} and {beersLiked} are not superkeys.
Decomposition into BCNF
  • To decompose R, identify a BCNF violation (X -> Y) and decompose using the attribute closure X+.
  • Example: Given FDs in Drinkers, if name -> addr violates BCNF, decompose into two relations preserving attributes and FDs corresponding to each.
Third Normal Form (3NF)
  • A hybrid approach to avoid issues present in BCNF while enforcing desirable constraints.
  • A non-trivial FD X -> A violates 3NF if X is not a superkey and A is not a prime attribute.
  • Example: C -> B may not violate 3NF if both are keys, allowing flexibility in schema design.

Testing for Lossless Join

  • When decomposing relations, ensure the original relation can be reconstructed from the decomposed fragments without loss.
  • Chase Test: Validates that each decomposed component can derive back to yield original tuples, ensuring no unintended data loss.

3NF Synthesis Algorithm

  • Construct a minimal basis for FDs:
  • Each FD to have a single attribute on the right.
  • Ensure no FDs can be removed while preserving dependencies.
  • Form relations from FDs in the minimal basis to guarantee lossless join and dependency preservation.