BCNF (Part 2)
Tricky Case of BCNF
Context: Week 4, Topic 3: Boyce-Codd Normal Form (BCNF)
Goal: Understand tricky BCNF cases, how to decompose, and how to check BCNF via closures
Base example for BCNF violation and decomposition:
- Start with relation R(A, B, C, D, E) with FDs: A → B, BC → D
- The key of R is ACE (a single composite key)
- FD A → B is a BCNF violation because A is not a superkey of R
- Decompose R into:
- R1(A, B)
- R2(A, C, D, E)
- Evaluate BCNF status after decomposition:
- R1 is in BCNF
- Key of R2 is ACE; need to check for BCNF violations in R2
- A is in R2 (A ∈ attributes of R2) but B is not in R2, so A → B does not apply directly to R2
- The potential concern is AC → D (derived from the FDs and closures) and whether AC is a superkey of R2
- Closures must be computed to determine BCNF status for R2
Closure-based check on R2(A, C, D, E):
- Given FDs: A → B, BC → D
- Compute closures considering only attributes in R2
- {A}^+ = {A} (since A → B is not applicable in R2 because B ∉ R2)
- {C}^+ = {C}
- {D}^+ = {D}
- {E}^+ = {E}
- {AC}^+ = {A, C, D} (because AC → D holds; A→B is not usable here due to B not in R2)
- {AC}^+ does not include all attributes of R2 (ACE would be the full key for R2, but {AC} does not contain E)
- Therefore, R2 is not in BCNF and must be decomposed further
Further decomposition of R2 using {AC}^+ = {A, C, D}:
- Decompose R2 into:
- R3(A, C, D)
- R4(A, C, E)
- R4: Attributes {A, C, E}; ACE is a key for the original R2, and in R4, it serves as a key; thus, R4 is in BCNF
- R3: Attributes {A, C, D}; need to check BCNF with the given FDs again
Tricky case within R3(A, C, D):
- FDs for the check remain A → B, BC → D; but B ∉ R3, so A → B does not apply directly to R3
- Evaluate closures within R3:
- {A}^+ = {A} (A → B not applicable here since B ∉ R3)
- {C}^+ = {C}
- {D}^+ = {D}
- {AC}^+ = {A, C, D} (AC → D, already within R3)
- {AD}^+ = {A, D}
- {CD}^+ = {C, D}
- No nontrivial FD seen that violates BCNF in R3
- Therefore, R3 is in BCNF
Final BCNF decomposition achieved:
- R1(A, B)
- R3(A, C, D)
- R4(A, C, E)
Key takeaways from the tricky BCNF case:
- When checking BCNF, consider FDs and their applicability within the current relation (only attributes present in the relation)
- Use closures to probe whether a determinant X is a superkey of the relation; if not, BCNF violation exists and decomposition is needed
- It can be necessary to decompose multiple times as new relations may reintroduce tricky constraints via closures
Summary of the step-by-step approach:
- Identify a potential BCNF violation X → Y where X is not a superkey of the relation
- Compute closures {X}^+ restricting to the current relation's attributes
- If {X}^+ does not include all attributes and X is not a superkey, decompose on {X}^+
- Recursively apply to resulting relations until all are BCNF
General BCNF Concepts and Methods
BCNF definition
- A relation R with FDs is in BCNF if for every nontrivial FD X → Y that holds on R, X is a superkey of R
- A superkey is a set of attributes that functionally determines all attributes of the relation
Use of closures in BCNF checks
- Closures help determine whether a given determinant X is a superkey within the relation being considered
- Notation:
- In tricky cases, you compute {X}^+ using only FDs whose attributes are contained in the relation
Example of closure methodology (from the tricky case)
- For R2(A, C, D, E) with FDs A → B and BC → D:
- {A}^+ = {A} (B not in R2)
- {C}^+ = {C}
- {D}^+ = {D}
- {E}^+ = {E}
- {AC}^+ = {A, C, D}
- {AC}^+ does not contain all attributes of R2 and AC is not a superkey
- Hence R2 is not BCNF and requires decomposition
Exercise: BCNF Decomposition Example
- Given R(A, B, C, D, E, F) with FDs: B → D, C → E, DE → A
- Keys: BCF (i.e., BCF functionally determines all attributes)
- Step 1: B → D violates BCNF (B is not a superkey for R)
- Decompose R into:
- R1(B, D)
- R2(A, B, C, E, F)
- R1 is in BCNF
- Step 2: Check R2 for BCNF
- Noting C → E also violates BCNF in R2 (C is not a superkey for R2)
- Decompose R2 using {C}^+ = {C, E}:
- R3(C, E)
- R4(A, B, C, F)
- Check R3: BCNF (C is a key for R3)
- Check R4: BCNF analysis with the FDs that apply within R4
- In R4, FDs B → D and C → E do not apply directly since D and E are not attributes of R4
- However, there exists a derived FD BC → A that holds in the original relation and also within R4 (because from B we get D, from C we get E, and DE → A gives A)
- This BC → A is a nontrivial FD on R4, and BC is not a superkey of R4 (Keys of R4 are BCF)
- Therefore R4 is not BCNF
- Decompose R4 using BC → A:
- R5(B, C, A)
- R6(B, C, F)
- Final BCNF decomposition:
- R1(B, D)
- R3(C, E)
- R5(B, C, A)
- R6(B, C, F)
- Key takeaways from this exercise:
- Dependency-derived FDs can cause BCNF violations even if the primary FDs do not appear to violate BCNF in a subrelation
- It is possible to have BCNF violations emerge in intermediate decompositions requiring further decomposition
Properties of BCNF Decomposition
Good properties
- No update or deletion anomalies
- Very small redundancy
- The original table can always be reconstructed from the decomposed tables if FDs are preserved (lossless join property)
- Reconstruction is at the schema level only if some FDs are not preserved
Bad property: Dependency preservation is not guaranteed
- A BCNF decomposition may not preserve all functional dependencies
- This can make certain integrity checks harder to enforce using only the decomposed relations
Dependency Preservation in BCNF
Example illustrating non-preservation in BCNF
- Original R(A, B, C) with FDs: AB → C, C → B
- Keys: AB, AC
- BCNF decomposition: R1(B, C) and R2(A, C)
- On R1, a nontrivial FD C → B holds
- On R2, there are no nontrivial FDs within the relation
- The remaining FD AB → C is not preserved by the decomposition (i.e., it does not hold on either R1 or R2 alone)
- Thus BCNF decomposition can fail to preserve all FDs
Why dependency preservation matters
- We want to detect and prevent inappropriate updates (e.g., inconsistent insertions) using local checks on decomposed relations
- If FDs are not preserved, some constraints may require joins to enforce, which defeats some benefits of normalization
Re-emphasis: Dependency preservation is not guaranteed in BCNF, unlike some normal forms that aim to preserve dependencies by design (e.g., 3NF preserves dependencies)
Dependency Preservation and the 3NF Hint
Why we care about preserving FDs
- Facilitates constraint enforcement without performing costly joins
- Helps avoid anomalies during updates by enabling local checks
Third Normal Form (3NF)
- A relaxation of BCNF that allows certain non-superkey FDs while still constraining redundancy
- 3NF is typically capable of preserving functional dependencies in a wider range of decompositions
- Will be discussed in the next lecture
Lossless Join Property (LJP)
- Definition
- A decomposition of a relation R into R1 and R2 is lossless if the natural join of R1 and R2 yields exactly the original relation R (no spurious tuples)
- Why BCNF guarantees LJP (brief intuition)
- If the common attributes of R1 and R2 form a superkey of either R1 or R2, then the join is lossless
- Example demonstrating LJP
- NRIC, Phone → Name, Address is preserved in a decomposition into:
- R1(Name, NRIC, Address)
- R2(NRIC, Phone)
- Common attribute: NRIC
- NRIC is a key of R1, hence a superkey of R1; thus, the join is lossless
- The formal criterion in BCNF terms
- When decompose R into R1 and R2 by a violation X → Y, if X is a superkey of R1 or R2, the decomposition is lossless
- Summary formula form
- If R1 ∩ R2 is a superkey of R1 or R2, then the decomposition {R1, R2} is lossless
Why BCNF Guarantees Lossless Join (More Detail)
- Restatement of the key principle
- A BCNF violation X → Y is used to split R into R1 and R2
- If the common attributes between the two resulting relations form a superkey of one of the relations, the decomposition is lossless
- Illustrative rule from the slides
- For a BCNF violation X → Y in R, compute {X}^+; if R1 contains {X}^+ and R2 contains all attributes in {X}^+, or the other way around, then lossless join holds
- Concrete example variants
- R(A, B, C) decomposed into R1(A, B) and R2(B, C) with B as the key of R2
- R(A, B, C, D) decomposed into R1(A, B, C) and R2(B, C, D) with BC as the key of R1
Summary: Practical Takeaways
- BCNF is a strong normal form aimed at eliminating all nontrivial FD determinants that are not superkeys
- Tricky BCNF checks require careful use of closures and sometimes multiple decompositions
- BCNF decomposition guarantees lossless join if the shared attributes form a superkey of one of the decomposed relations
- BCNF does not always preserve all functional dependencies; dependency preservation is not guaranteed in BCNF
- 3NF provides a more permissive framework that often preserves dependencies while still reducing redundancy; covered in the next lecture
Next Lecture Preview
- Topic 4: Third Normal Form (1)
- Focus: 3NF, how it differs from BCNF, and how it often preserves dependencies while offering a different trade-off in normalization