BCNF (Part 2)

Tricky Case of BCNF

  • Context: Week 4, Topic 3: Boyce-Codd Normal Form (BCNF)

  • Goal: Understand tricky BCNF cases, how to decompose, and how to check BCNF via closures

  • Base example for BCNF violation and decomposition:

    • Start with relation R(A, B, C, D, E) with FDs: A → B, BC → D
    • The key of R is ACE (a single composite key)
    • FD A → B is a BCNF violation because A is not a superkey of R
    • Decompose R into:
    • R1(A, B)
    • R2(A, C, D, E)
    • Evaluate BCNF status after decomposition:
    • R1 is in BCNF
    • Key of R2 is ACE; need to check for BCNF violations in R2
    • A is in R2 (A ∈ attributes of R2) but B is not in R2, so A → B does not apply directly to R2
    • The potential concern is AC → D (derived from the FDs and closures) and whether AC is a superkey of R2
    • Closures must be computed to determine BCNF status for R2
  • Closure-based check on R2(A, C, D, E):

    • Given FDs: A → B, BC → D
    • Compute closures considering only attributes in R2
    • {A}^+ = {A} (since A → B is not applicable in R2 because B ∉ R2)
    • {C}^+ = {C}
    • {D}^+ = {D}
    • {E}^+ = {E}
    • {AC}^+ = {A, C, D} (because AC → D holds; A→B is not usable here due to B not in R2)
    • {AC}^+ does not include all attributes of R2 (ACE would be the full key for R2, but {AC} does not contain E)
    • Therefore, R2 is not in BCNF and must be decomposed further
  • Further decomposition of R2 using {AC}^+ = {A, C, D}:

    • Decompose R2 into:
    • R3(A, C, D)
    • R4(A, C, E)
    • R4: Attributes {A, C, E}; ACE is a key for the original R2, and in R4, it serves as a key; thus, R4 is in BCNF
    • R3: Attributes {A, C, D}; need to check BCNF with the given FDs again
  • Tricky case within R3(A, C, D):

    • FDs for the check remain A → B, BC → D; but B ∉ R3, so A → B does not apply directly to R3
    • Evaluate closures within R3:
    • {A}^+ = {A} (A → B not applicable here since B ∉ R3)
    • {C}^+ = {C}
    • {D}^+ = {D}
    • {AC}^+ = {A, C, D} (AC → D, already within R3)
    • {AD}^+ = {A, D}
    • {CD}^+ = {C, D}
    • No nontrivial FD seen that violates BCNF in R3
    • Therefore, R3 is in BCNF
  • Final BCNF decomposition achieved:

    • R1(A, B)
    • R3(A, C, D)
    • R4(A, C, E)
  • Key takeaways from the tricky BCNF case:

    • When checking BCNF, consider FDs and their applicability within the current relation (only attributes present in the relation)
    • Use closures to probe whether a determinant X is a superkey of the relation; if not, BCNF violation exists and decomposition is needed
    • It can be necessary to decompose multiple times as new relations may reintroduce tricky constraints via closures
  • Summary of the step-by-step approach:

    • Identify a potential BCNF violation X → Y where X is not a superkey of the relation
    • Compute closures {X}^+ restricting to the current relation's attributes
    • If {X}^+ does not include all attributes and X is not a superkey, decompose on {X}^+
    • Recursively apply to resulting relations until all are BCNF

General BCNF Concepts and Methods

  • BCNF definition

    • A relation R with FDs is in BCNF if for every nontrivial FD X → Y that holds on R, X is a superkey of R
    • A superkey is a set of attributes that functionally determines all attributes of the relation
  • Use of closures in BCNF checks

    • Closures help determine whether a given determinant X is a superkey within the relation being considered
    • Notation: extIfX+=extallattributesofRextthenXextisasuperkeyinRext{If } \frac{X}{+} = ext{all attributes of } R ext{ then } X ext{ is a superkey in } R
    • In tricky cases, you compute {X}^+ using only FDs whose attributes are contained in the relation
  • Example of closure methodology (from the tricky case)

    • For R2(A, C, D, E) with FDs A → B and BC → D:
    • {A}^+ = {A} (B not in R2)
    • {C}^+ = {C}
    • {D}^+ = {D}
    • {E}^+ = {E}
    • {AC}^+ = {A, C, D}
    • {AC}^+ does not contain all attributes of R2 and AC is not a superkey
    • Hence R2 is not BCNF and requires decomposition

Exercise: BCNF Decomposition Example

  • Given R(A, B, C, D, E, F) with FDs: B → D, C → E, DE → A
  • Keys: BCF (i.e., BCF functionally determines all attributes)
  • Step 1: B → D violates BCNF (B is not a superkey for R)
  • Decompose R into:
    • R1(B, D)
    • R2(A, B, C, E, F)
  • R1 is in BCNF
  • Step 2: Check R2 for BCNF
    • Noting C → E also violates BCNF in R2 (C is not a superkey for R2)
    • Decompose R2 using {C}^+ = {C, E}:
    • R3(C, E)
    • R4(A, B, C, F)
  • Check R3: BCNF (C is a key for R3)
  • Check R4: BCNF analysis with the FDs that apply within R4
    • In R4, FDs B → D and C → E do not apply directly since D and E are not attributes of R4
    • However, there exists a derived FD BC → A that holds in the original relation and also within R4 (because from B we get D, from C we get E, and DE → A gives A)
    • This BC → A is a nontrivial FD on R4, and BC is not a superkey of R4 (Keys of R4 are BCF)
    • Therefore R4 is not BCNF
  • Decompose R4 using BC → A:
    • R5(B, C, A)
    • R6(B, C, F)
  • Final BCNF decomposition:
    • R1(B, D)
    • R3(C, E)
    • R5(B, C, A)
    • R6(B, C, F)
  • Key takeaways from this exercise:
    • Dependency-derived FDs can cause BCNF violations even if the primary FDs do not appear to violate BCNF in a subrelation
    • It is possible to have BCNF violations emerge in intermediate decompositions requiring further decomposition

Properties of BCNF Decomposition

  • Good properties

    • No update or deletion anomalies
    • Very small redundancy
    • The original table can always be reconstructed from the decomposed tables if FDs are preserved (lossless join property)
    • Reconstruction is at the schema level only if some FDs are not preserved
  • Bad property: Dependency preservation is not guaranteed

    • A BCNF decomposition may not preserve all functional dependencies
    • This can make certain integrity checks harder to enforce using only the decomposed relations

Dependency Preservation in BCNF

  • Example illustrating non-preservation in BCNF

    • Original R(A, B, C) with FDs: AB → C, C → B
    • Keys: AB, AC
    • BCNF decomposition: R1(B, C) and R2(A, C)
    • On R1, a nontrivial FD C → B holds
    • On R2, there are no nontrivial FDs within the relation
    • The remaining FD AB → C is not preserved by the decomposition (i.e., it does not hold on either R1 or R2 alone)
    • Thus BCNF decomposition can fail to preserve all FDs
  • Why dependency preservation matters

    • We want to detect and prevent inappropriate updates (e.g., inconsistent insertions) using local checks on decomposed relations
    • If FDs are not preserved, some constraints may require joins to enforce, which defeats some benefits of normalization
  • Re-emphasis: Dependency preservation is not guaranteed in BCNF, unlike some normal forms that aim to preserve dependencies by design (e.g., 3NF preserves dependencies)

Dependency Preservation and the 3NF Hint

  • Why we care about preserving FDs

    • Facilitates constraint enforcement without performing costly joins
    • Helps avoid anomalies during updates by enabling local checks
  • Third Normal Form (3NF)

    • A relaxation of BCNF that allows certain non-superkey FDs while still constraining redundancy
    • 3NF is typically capable of preserving functional dependencies in a wider range of decompositions
    • Will be discussed in the next lecture

Lossless Join Property (LJP)

  • Definition
    • A decomposition of a relation R into R1 and R2 is lossless if the natural join of R1 and R2 yields exactly the original relation R (no spurious tuples)
  • Why BCNF guarantees LJP (brief intuition)
    • If the common attributes of R1 and R2 form a superkey of either R1 or R2, then the join is lossless
  • Example demonstrating LJP
    • NRIC, Phone → Name, Address is preserved in a decomposition into:
    • R1(Name, NRIC, Address)
    • R2(NRIC, Phone)
    • Common attribute: NRIC
    • NRIC is a key of R1, hence a superkey of R1; thus, the join is lossless
  • The formal criterion in BCNF terms
    • When decompose R into R1 and R2 by a violation X → Y, if X is a superkey of R1 or R2, the decomposition is lossless
  • Summary formula form
    • If R1 ∩ R2 is a superkey of R1 or R2, then the decomposition {R1, R2} is lossless

Why BCNF Guarantees Lossless Join (More Detail)

  • Restatement of the key principle
    • A BCNF violation X → Y is used to split R into R1 and R2
    • If the common attributes between the two resulting relations form a superkey of one of the relations, the decomposition is lossless
  • Illustrative rule from the slides
    • For a BCNF violation X → Y in R, compute {X}^+; if R1 contains {X}^+ and R2 contains all attributes in {X}^+, or the other way around, then lossless join holds
  • Concrete example variants
    • R(A, B, C) decomposed into R1(A, B) and R2(B, C) with B as the key of R2
    • R(A, B, C, D) decomposed into R1(A, B, C) and R2(B, C, D) with BC as the key of R1

Summary: Practical Takeaways

  • BCNF is a strong normal form aimed at eliminating all nontrivial FD determinants that are not superkeys
  • Tricky BCNF checks require careful use of closures and sometimes multiple decompositions
  • BCNF decomposition guarantees lossless join if the shared attributes form a superkey of one of the decomposed relations
  • BCNF does not always preserve all functional dependencies; dependency preservation is not guaranteed in BCNF
  • 3NF provides a more permissive framework that often preserves dependencies while still reducing redundancy; covered in the next lecture

Next Lecture Preview

  • Topic 4: Third Normal Form (1)
  • Focus: 3NF, how it differs from BCNF, and how it often preserves dependencies while offering a different trade-off in normalization