Loading learning content...
You've learned to identify functional dependencies (Page 1) and diagnose normal form violations (Page 2). Now comes the intervention: how do you actually fix a poorly normalized relation? The answer is decomposition—the systematic splitting of a relation into smaller, cleaner relations that satisfy higher normal forms.
But decomposition isn't simply about splitting tables. Done carelessly, it can destroy information or lose constraints. The challenge is to decompose in a way that:
This is where theory meets practice. Understanding decomposition algorithms is essential for interviews and for designing maintainable production databases.
By the end of this page, you will be able to execute both 3NF and BCNF decomposition algorithms correctly, verify losslessness and dependency preservation, understand when decomposition cannot achieve both properties, and make informed trade-off decisions—exactly what interviewers expect from strong candidates.
Before any decomposition algorithm, you must understand the two properties every good decomposition should have.
Property 1: Lossless-Join (Lossless Decomposition)
A decomposition of R into R₁ and R₂ is lossless iff:
R = R₁ ⋈ R₂ (the natural join of R₁ and R₂ exactly reconstructs R)
Equivalently: No spurious tuples are generated when joining the decomposed relations.
Why It Matters: If decomposition is lossy, joining the tables back produces rows that didn't exist in the original—phantom data that corrupts query results.
The Lossless Test:
Decomposition of R into {R₁, R₂} is lossless iff:
The common attributes must be a superkey of at least one of the decomposed relations.
Original R(A, B, C) with tuples:
(1, 2, 3)
(1, 3, 4)
(2, 2, 4)
Decomposition 1: R₁(A, B), R₂(B, C)
Decomposition 2: R₁(A, B), R₂(A, C)**Decomposition 1: R₁(A,B), R₂(B,C)**
R₁: (1,2), (1,3), (2,2)
R₂: (2,3), (3,4), (2,4)
R₁ ⋈ R₂ (join on B):
(1,2,3), (1,2,4), (1,3,4), (2,2,3), (2,2,4)
**5 tuples! Original had 3. SPURIOUS TUPLES: (1,2,4), (2,2,3)**
**LOSSY DECOMPOSITION ❌**
---
**Decomposition 2: R₁(A,B), R₂(A,C)**
R₁: (1,2), (1,3), (2,2)
R₂: (1,3), (1,4), (2,4)
R₁ ⋈ R₂ (join on A):
(1,2,3), (1,2,4), (1,3,3), (1,3,4), (2,2,4)
**Still wrong!** Original had (1,2,3) not (1,2,4).
**LOSSY DECOMPOSITION ❌**
---
**Lesson:** We need FDs to ensure common attributes form a key. Without FDs guaranteeing this, decomposition may be lossy.Property 2: Dependency Preservation
A decomposition preserves dependencies iff all original FDs can be enforced using only the FDs of the decomposed relations (without needing cross-table joins for validation).
Formally: Let F be the original FDs. Let Fᵢ be the FDs over attributes of Rᵢ. Dependency preservation means:
(F₁ ∪ F₂ ∪ ... ∪ Fₙ)⁺ = F⁺
Why It Matters: If a dependency isn't preserved, the DBMS cannot enforce it without expensive cross-table checks. In practice, unpreserved dependencies often become application-level bugs.
Example:
Problem: The FD A → B cannot be checked without joining R₁ and R₂. Dependency NOT preserved.
Losslessness is always achievable for BCNF and 3NF decomposition.
Dependency preservation is NOT always achievable for BCNF decomposition.
3NF decomposition ALWAYS achieves both properties. This is why 3NF is sometimes preferred over BCNF in practice—you don't sacrifice dependency preservation.
| Decomposition Type | Lossless? | Dependency Preserving? |
|---|---|---|
| 3NF (Synthesis Algorithm) | ✅ Always | ✅ Always |
| BCNF (Analysis Algorithm) | ✅ Always | ⚠️ Sometimes Lost |
| Arbitrary Decomposition | ❌ Not guaranteed | ❌ Not guaranteed |
The fundamental operation is binary decomposition—splitting one relation into exactly two. All multi-way decompositions are sequences of binary decompositions.
Binary Decomposition for BCNF Violation:
If R contains a BCNF violation X → Y (where X is not a superkey):
Actually, the standard formulation:
123456789101112131415161718192021222324252627282930313233343536373839
/** * Binary BCNF Decomposition Step * * Given: Relation R with FDs F * Violation: X → Y where X is not a superkey of R * * Decompose R into R₁ and R₂ */function binaryBCNFDecompose(R, X, Y) { // R₁ contains the violating FD as a full dependency // X will be a key of R₁ let R1 = X.union(computeClosureOnR(X, R, F)); // Actually simpler: R1 = X ∪ Y (X → Y becomes X → Y in R1) // R₂ contains everything except what's fully determined by X // This removes Y but keeps X (the determinant) let R2 = R.minus(Y.minus(X)); // R - (Y - X) // Verification: // - R1 ∩ R2 = X // - X → Y means X → R1 (since R1 = X ∪ Y) // - Therefore X is a key of R1 // - R1 ∩ R2 = X → R1 satisfies lossless condition return [R1, R2];} // Example:// R(A, B, C, D) with FDs: A → B, C → D// Key: {A, C}// // Violation: A → B (A is not superkey)// // R1 = A ∪ B = {A, B}// R2 = R - B = {A, C, D}// // Check lossless: R1 ∩ R2 = {A}// Is A → R1? A → AB? Yes! (since A → B and A → A)// Lossless ✓When you decompose R into R₁, R₂, the FDs for each component are the projections of F onto those attributes. The projection of F onto R₁ includes all FDs X → Y from F⁺ where both X and Y are subsets of R₁. Computing projections can be exponential in theory, but for interview problems, you typically work with the given FDs.
The BCNF decomposition algorithm repeatedly applies binary decomposition until all relations are in BCNF.
Algorithm Overview:
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455
/** * BCNF Decomposition Algorithm (Analysis Method) * * Input: Relation R with FDs F * Output: Set of BCNF relations whose join equals R */function bcnfDecompose(R, F) { let result = new Set([R]); let done = false; while (!done) { done = true; for (let Ri of result) { // Find a BCNF violation in Ri let violation = findBCNFViolation(Ri, F); if (violation !== null) { let { X, Y } = violation; // X → Y where X not superkey // Decompose Ri let R1 = X.union(Y); // XY let R2 = Ri.minus(Y.minus(X)); // Ri - (Y - X) // Replace Ri with R1 and R2 result.delete(Ri); result.add(R1); result.add(R2); done = false; break; // Restart iteration with new set } } } return result;} function findBCNFViolation(R, F) { // Project F onto R's attributes let projectedFDs = projectFDs(F, R); for (let fd of projectedFDs) { if (!isTrivial(fd)) { let leftClosure = computeClosure(fd.X, projectedFDs); // X is not a superkey if closure doesn't cover all of R if (!leftClosure.containsAll(R)) { return fd; // Found a violation } } } return null; // No violations, Ri is in BCNF}R(A, B, C, D, E) with FDs:
- A → B
- BC → D
- D → E
Key: {A, C}**Iteration 1: Analyze R(A, B, C, D, E)**
Check A → B:
- Is A a superkey? A⁺ = {A, B} ≠ R. No.
- **BCNF Violation!**
Decompose on A → B:
- R₁ = {A, B}
- R₂ = R - {B} = {A, C, D, E}
result = {R₁(A,B), R₂(A,C,D,E)}
---
**Iteration 2: Analyze R₁(A, B)**
FDs on {A,B}: A → B
- Is A superkey of R₁? A⁺ = {A, B} = R₁. Yes!
- **R₁ is in BCNF ✓**
**Iteration 3: Analyze R₂(A, C, D, E)**
FDs on {A,C,D,E}: BC → D... but B not in R₂.
Project: Only D → E applies (C not implying D directly without B)
Actually, reconsider FDs:
- BC → D: B is not in R₂, so this doesn't apply
- D → E: Applies to R₂
Is D a superkey of R₂? D⁺ on R₂ = {D, E} ≠ {A,C,D,E}. No.
**BCNF Violation: D → E**
Decompose on D → E:
- R₃ = {D, E}
- R₄ = R₂ - {E} = {A, C, D}
result = {R₁(A,B), R₃(D,E), R₄(A,C,D)}
---
**Iteration 4: Analyze R₃(D, E)**
- D → E: Is D superkey? D⁺ = {D,E} = R₃. Yes!
- **R₃ is in BCNF ✓**
**Iteration 5: Analyze R₄(A, C, D)**
- No FDs apply directly to R₄ (BC→D needs B)
- Key of R₄: {A, C} (determines D via... wait, no direct FD)
- Actually, with original FDs, {A,C} is a key if we trace through
Hmm, this highlights the projection complexity. Let's verify:
In R₄, what FDs hold?
- From original: BC → D. Without B, does anything imply D? No direct FD.
- So in R₄, {A,C} → D? Only if we had such an FD in closure.
Since no non-trivial FDs violate BCNF in R₄ with projected FDs:
- **R₄ is in BCNF ✓** (every determinant containing A,C would be superkey)
---
**Final Decomposition:**
- R₁(A, B) with FD: A → B
- R₃(D, E) with FD: D → E
- R₄(A, C, D) — key is {A, C, D} or {A, C} if we consider original
**All relations in BCNF ✓**In the example above, the FD BC → D was lost! It cannot be checked using only the decomposed relations (B is in R₁, C is in R₄, D is in R₃/R₄—no single relation contains all three). This is the trade-off of BCNF: you might lose dependencies.
The 3NF synthesis algorithm takes a fundamentally different approach. Instead of repeatedly splitting, it builds up relations from the canonical cover of FDs.
Key Advantage: This algorithm guarantees both losslessness AND dependency preservation.
Algorithm Overview:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071
/** * 3NF Decomposition Algorithm (Synthesis Method) * * Input: Relation R with FDs F * Output: Set of 3NF relations that is lossless and dependency-preserving */function threeNFDecompose(R, F) { // Step 1: Compute canonical cover let Fc = computeCanonicalCover(F); // Step 2: Create a relation for each FD (combine same left sides) let relations = new Map(); // leftSide -> Set of all attributes for (let fd of Fc) { let key = fd.X.toString(); // Left side as key if (!relations.has(key)) { relations.set(key, new Set(fd.X)); } // Add right side attribute relations.get(key).add(fd.Y); // Assuming single attribute Y } // Convert map to set of relations let result = new Set(); for (let [key, attrs] of relations) { result.add(attrs); } // Step 3: Ensure losslessness by adding a candidate key if needed let candidateKey = findCandidateKey(R, F); let keyIncluded = false; for (let rel of result) { if (isSubset(candidateKey, rel)) { keyIncluded = true; break; } } if (!keyIncluded) { result.add(new Set(candidateKey)); } // Step 4: Remove redundant relations (subsets of others) result = removeSubsets(result); return result;} // Example:// R(A, B, C, D) with Fc = {A → B, B → C, C → D}// // Step 2: Create relations// A → B => R1(A, B)// B → C => R2(B, C)// C → D => R3(C, D)//// Step 3: Candidate key = {A}. Is A in any relation? // R1 contains A. ✓ No need to add.//// Step 4: No subsets to remove.//// Result: {R1(A,B), R2(B,C), R3(C,D)}// // All FDs preserved:// A → B in R1 ✓// B → C in R2 ✓ // C → D in R3 ✓//// Lossless: R1 ∩ R2 = {B}. B → C → D...// Actually verified by key inclusion (A is key, in R1)R(A, B, C, D, E) with FDs:
- A → BC
- C → D
- D → E
- E → A
Decompose to 3NF using synthesis.**Step 1: Compute Canonical Cover**
Decompose right sides:
- A → B, A → C, C → D, D → E, E → A
Check for redundancy:
- A → B: Remove, check if A⁺ (under remaining) contains B? {A,C,D,E} via E→A? No B. Keep.
- All are essential.
**Fc = {A → B, A → C, C → D, D → E, E → A}**
---
**Step 2: Create Relations from Fc**
Group by left side:
- A → {B, C} => **R₁(A, B, C)**
- C → D => **R₂(C, D)**
- D → E => **R₃(D, E)**
- E → A => **R₄(E, A)**
---
**Step 3: Check for Candidate Key**
Find candidate key:
- A⁺ = {A,B,C,D,E} = R. A is a candidate key!
- Also: C⁺ = D⁺ = E⁺ = R (cyclic, all equivalent)
Is any candidate key fully contained in a relation?
- A ⊆ R₁? Yes! ✓
No need to add a key relation.
---
**Step 4: Remove Subsets**
No relation is a subset of another.
---
**Final 3NF Decomposition:**
- R₁(A, B, C)
- R₂(C, D)
- R₃(D, E)
- R₄(E, A)
**Verification:**
*Dependency Preservation:*
- A → B ✓ (in R₁)
- A → C ✓ (in R₁)
- C → D ✓ (in R₂)
- D → E ✓ (in R₃)
- E → A ✓ (in R₄)
*All FDs can be checked within a single relation!*
*Losslessness:*
- Key A is contained in R₁
- By theorem, this guarantees lossless joinThe synthesis algorithm creates one relation per FD (grouping by determinant). Since each FD gets its own table, every dependency can be checked within a single table—hence guaranteed preservation. The key relation ensures join reconstruction works.
When should you use BCNF vs. 3NF decomposition? This decision has real practical implications.
| Property | BCNF Decomposition | 3NF Decomposition |
|---|---|---|
| Resulting Normal Form | BCNF (stricter) | 3NF (slightly weaker) |
| Lossless | ✅ Always | ✅ Always |
| Dependency Preserving | ⚠️ Not always | ✅ Always |
| Algorithm Type | Analysis (top-down splitting) | Synthesis (bottom-up building) |
| Typical # of Relations | Often more (finer splitting) | Often fewer (grouped by FD) |
| Anomaly Elimination | Complete (all update anomalies gone) | Nearly complete (rare edge cases remain) |
| Constraint Enforcement | May need cross-table checks | All constraints local to tables |
In most schemas, 3NF and BCNF produce the same decomposition. The difference only arises when you have overlapping candidate keys with certain FD patterns. For interview purposes, know both algorithms and understand when they diverge.
In interviews, you may be given a decomposition and asked to verify its properties. Here's how to check both losslessness and dependency preservation.
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859
/** * Chase Algorithm for Losslessness Test * (For general n-way decompositions) * * For binary decomposition, use the simpler test: * Lossless iff (R1 ∩ R2) → R1 or (R1 ∩ R2) → R2 */function checkLossless(R, decomposition, F) { // Create tableau: one row per relation in decomposition // Each row has subscripted symbols let tableau = []; for (let i = 0; i < decomposition.length; i++) { let row = {}; for (let attr of R) { if (decomposition[i].contains(attr)) { row[attr] = attr; // Distinguished symbol (a) } else { row[attr] = attr + "_" + i; // Subscripted (b_i) } } tableau.push(row); } // Apply FDs repeatedly until no change let changed = true; while (changed) { changed = false; for (let fd of F) { // Find rows that agree on fd.X let groups = groupByAttributes(tableau, fd.X); for (let group of groups) { if (group.length > 1) { // Rows agree on X, make them agree on Y let targetValue = findDistinguished(group, fd.Y); for (let row of group) { for (let attr of fd.Y) { if (row[attr] !== targetValue[attr]) { row[attr] = targetValue[attr]; changed = true; } } } } } } } // Check if any row has all distinguished symbols for (let row of tableau) { if (Object.values(row).every(v => !v.includes("_"))) { return true; // Lossless! } } return false; // Lossy}Simpler Binary Test:
For decomposing R into just R₁ and R₂:
Dependency Preservation Test:
12345678910111213141516171819202122232425262728293031323334353637383940
/** * Check if FD X → Y is preserved in decomposition */function isPreserved(X, Y, decomposition, F) { // Compute closure of X using only projected FDs let closure = new Set(X); let changed = true; while (changed) { changed = false; for (let Ri of decomposition) { // Compute (closure ∩ Ri)+ restricted to Ri let Xcap = closure.intersection(Ri); let XcapClosure = computeClosure(Xcap, F); let contribution = XcapClosure.intersection(Ri); // Add any new attributes to closure for (let attr of contribution) { if (!closure.has(attr)) { closure.add(attr); changed = true; } } } } // FD is preserved if Y ⊆ closure return closure.containsAll(Y);} // Check all FDsfunction checkDependencyPreservation(R, decomposition, F) { for (let fd of F) { if (!isPreserved(fd.X, fd.Y, decomposition, F)) { return { preserved: false, violatingFD: fd }; } } return { preserved: true };}R(A, B, C, D)
FDs: A → B, B → C, C → D
Decomposition: {R₁(A,B), R₂(B,C,D)}**Losslessness Check:**
R₁ ∩ R₂ = {B}
Compute B⁺ under FDs:
B⁺ = {B} → {B, C} (B → C) → {B, C, D} (C → D)
B⁺ = {B, C, D}
Is B⁺ ⊇ R₁? {B,C,D} ⊇ {A,B}? No.
Is B⁺ ⊇ R₂? {B,C,D} ⊇ {B,C,D}? **Yes!**
**Lossless ✓**
---
**Dependency Preservation Check:**
FD: A → B
- A is in R₁, B is in R₁
- **Preserved ✓** (can check in R₁)
FD: B → C
- B is in both, C is in R₂
- **Preserved ✓** (can check in R₂)
FD: C → D
- C is in R₂, D is in R₂
- **Preserved ✓** (can check in R₂)
**All Dependencies Preserved ✓**Let's tackle several interview-style decomposition problems to build fluency.
R(A, B, C, D) with FDs:
- AB → C
- C → D
- D → B
Decompose to BCNF and state if dependency preservation is maintained.**Step 1: Find Candidate Keys**
Try {A,B}⁺ = {A,B,C,D} ✓ (via AB→C, C→D)
Try {A,C}⁺ = {A,C,D,B} ✓ (via C→D, D→B)
Try {A,D}⁺ = {A,D,B,C} ✓ (via D→B, then AB not directly...)
Actually: {A,D} → {A,D,B} (D→B) → {A,D,B,C} (AB→C) = R ✓
Try {A}⁺ = {A}. Not a key.
**Candidate Keys: {A,B}, {A,C}, {A,D}**
**Step 2: Find BCNF Violations**
AB → C: {A,B}⁺ = R. Superkey ✓
C → D: C⁺ = {C,D,B}. Not superkey (missing A). **Violation!**
D → B: D⁺ = {D,B}. Not superkey. **Violation!**
**Step 3: Decompose on C → D**
R₁ = {C, D}
R₂ = {A, B, C}
**Step 4: Check R₁(C,D)**
FD: C → D. C⁺ on {C,D} = {C,D} = R₁. Superkey ✓
**R₁ in BCNF ✓**
**Step 5: Check R₂(A,B,C)**
FDs on {A,B,C}: AB → C
AB⁺ on {A,B,C} = {A,B,C} = R₂. Superkey ✓
**R₂ in BCNF ✓**
**Final: R₁(C,D), R₂(A,B,C)**
**Dependency Preservation:**
- AB → C ✓ (in R₂)
- C → D ✓ (in R₁)
- D → B: D is in R₁, B is in R₂. **NOT PRESERVED ❌**
To check D → B, we'd need to join R₁ and R₂.R(Student, Subject, Teacher) with:
- Each student takes each subject from one teacher
- Each teacher teaches only one subject
- Multiple teachers can teach the same subject
FDs: {Student, Subject} → Teacher, Teacher → Subject
Decompose to (a) BCNF and (b) 3NF. Compare results.**Keys Analysis:**
{Student, Subject}⁺ = R ✓
{Student, Teacher}⁺ = {S,T,Subject} = R ✓
**Candidate Keys: {Student, Subject}, {Student, Teacher}**
---
**(a) BCNF Decomposition:**
Check Teacher → Subject:
- Teacher⁺ = {Teacher, Subject}. Not superkey.
- **BCNF Violation!**
Decompose:
- R₁(Teacher, Subject)
- R₂(Student, Teacher)
Both in BCNF.
**Dependency Check:**
- Teacher → Subject ✓ (in R₁)
- {Student, Subject} → Teacher... Subject in R₁, Teacher in R₁/R₂
To verify this FD, need to join! **NOT PRESERVED ❌**
---
**(b) 3NF Decomposition:**
Using synthesis:
- {Student, Subject} → Teacher => R₁(Student, Subject, Teacher)
- Teacher → Subject => R₂(Teacher, Subject)
**Verify 3NF for R₁(Student, Subject, Teacher):**
- {Student, Subject} → Teacher: SS is key. ✓
- Also has Teacher → Subject (via original), but Subject is prime (part of key).
In 3NF, non-superkey → prime is OK. ✓
**R₁ is in 3NF (but not BCNF).**
**Dependency Preservation:**
- {Student, Subject} → Teacher ✓ (in R₁)
- Teacher → Subject ✓ (in R₂)
**ALL PRESERVED ✓**
---
**Comparison:**
- BCNF: Cleaner (pure dependencies), but loses {S,Subj}→Teacher
- 3NF: Preserves all dependencies, slight redundancy possible
**For this schema, 3NF is preferred** to maintain constraint enforcement.What's Next:
With decomposition mastered, Page 4: Key Finding will formalize the algorithms for systematically finding all candidate keys—a prerequisite for both normal form analysis and decomposition that we've been using intuitively. You'll learn to reliably identify keys even in complex schemas.
You now understand both major decomposition algorithms and can verify decomposition properties. Practice by taking any non-BCNF schema and decomposing it both ways—then compare the results. The more you practice, the more intuitive these algorithms become.