Design Theory for Relational Databases
Design Theory for Relational Databases
Functional Dependencies (FDs)
- Definition: X -> Y asserts that for any two tuples in relation R, if they agree on all attributes in X, they must also agree on all attributes in Y.
- Notation: X, Y, Z represent sets of attributes; A, B, C represent single attributes.
- Important Notes:
- FDs are usually expressed with singleton right sides.
- Example: If A -> BC, it implies A -> B and A -> C.
- There's no splitting rule for left-hand side attributes.
Keys of Relations
- A superkey (K) for relation R functionally determines all attributes in R if knowing K allows us to uniquely identify tuples.
- A key is a minimal superkey; no subset of K can determine all attributes of R.
- Example: In Drinkers(name, addr, beersLiked, manf, favBeer), {name, beersLiked} is a superkey, but also a key since no proper subset determines all attributes.
Inferring Functional Dependencies
- Given FDs, one can deduce new FDs through logical implications.
- Example: A -> B and B -> C imply A -> C.
- Important for the design of good relation schemas.
- Goal: Avoid anomalies and redundancy in database design.
- Update Anomaly: Occurs when a change in one tuple does not propagate to others.
- Deletion Anomaly: Occurs when deleting a tuple results in losing valid data.
- A relation R is in BCNF if for every nontrivial FD X -> Y, X is a superkey.
- Violations of BCNF indicate potential redundancy or anomalies.
- Example: In Drinkers, the FDs name -> addr and beersLiked -> manf show it is not in BCNF since {name} and {beersLiked} are not superkeys.
Decomposition into BCNF
- To decompose R, identify a BCNF violation (X -> Y) and decompose using the attribute closure X+.
- Example: Given FDs in Drinkers, if name -> addr violates BCNF, decompose into two relations preserving attributes and FDs corresponding to each.
- A hybrid approach to avoid issues present in BCNF while enforcing desirable constraints.
- A non-trivial FD X -> A violates 3NF if X is not a superkey and A is not a prime attribute.
- Example: C -> B may not violate 3NF if both are keys, allowing flexibility in schema design.
Testing for Lossless Join
- When decomposing relations, ensure the original relation can be reconstructed from the decomposed fragments without loss.
- Chase Test: Validates that each decomposed component can derive back to yield original tuples, ensuring no unintended data loss.
3NF Synthesis Algorithm
- Construct a minimal basis for FDs:
- Each FD to have a single attribute on the right.
- Ensure no FDs can be removed while preserving dependencies.
- Form relations from FDs in the minimal basis to guarantee lossless join and dependency preservation.