Database Management SystemsBCNF Decomposition

BCNF Decomposition

LevelIntermediate

Duration75 mins

TopicBCNF Decomposition

2 / 5

Lossless Guarantee

Preserving Information Through Decomposition

When we decompose a relation to eliminate redundancy, we face a critical question: Can we reconstruct the original data from the decomposed pieces? If the answer is no—if information is lost or corrupted during decomposition—then the entire normalization process becomes worthless. We'd be trading redundancy for data loss, which is far worse.

The BCNF decomposition algorithm provides a powerful guarantee: every decomposition it produces is lossless. This means that joining the decomposed relations always reconstructs exactly the original relation—no missing tuples, no spurious tuples.

This page explores the lossless join property in depth. We'll understand what it means formally, why BCNF decomposition guarantees it, how to verify it independently, and what happens when decomposition is done carelessly.

What You Will Learn

By the end of this page, you will: (1) Define lossless join decomposition precisely, (2) Understand the fundamental theorem behind lossless decomposition, (3) Apply the theorem to verify decompositions, (4) Prove why BCNF decomposition is always lossless, (5) Recognize and avoid lossy decompositions.

What is Lossless Decomposition?

A decomposition is lossless (or lossless-join) if we can always recover the original relation by joining the decomposed relations. Formally:

Formal Definition

Let R be a relation schema decomposed into R₁, R₂, ..., Rₙ. The decomposition is lossless if and only if for every valid instance r of R:

r = πR₁(r) ⋈ πR₂(r) ⋈ ... ⋈ πRₙ(r)

Where π denotes projection and ⋈ denotes natural join.

This must hold for ALL possible valid instances of R, not just one particular database state.

The Two Failure Modes:

When decomposition goes wrong, two things can happen:

Lossy Decomposition (Spurious Tuples): The join produces MORE tuples than the original relation. These extra tuples are 'spurious'—they look valid but represent data that never existed.
Non-Reconstructible (Lost Tuples): The join produces FEWER tuples than the original. Information has been genuinely lost.

Interestingly, natural joins can only produce spurious tuples, never lose tuples. So our main concern is avoiding spurious data.

Lossy Decomposition ExampleConsider a relation showing which employees work on which projects under which managers.

Input

**Original Relation:** Assignments(Employee, Project, Manager)

Sample Data:
| Employee | Project | Manager |
|----------|---------|----------|
| Alice | Web | Bob |
| Alice | Mobile | Carol |
| Dave | Web | Bob |

Decompose into:
- R₁(Employee, Project)
- R₂(Project, Manager)

Output

**R₁:**
| Employee | Project |
|----------|----------|
| Alice | Web |
| Alice | Mobile |
| Dave | Web |

**R₂:**
| Project | Manager |
|----------|----------|
| Web | Bob |
| Mobile | Carol |

**Join R₁ ⋈ R₂:**
| Employee | Project | Manager |
|----------|---------|----------|
| Alice | Web | Bob |
| Alice | Mobile | Carol |
| Dave | Web | Bob |

✓ We recover exactly the original! This decomposition IS lossless.

Explanation

This worked because Project → Manager (each project has one manager). The joining attribute functionally determines one side.

Lossy Decomposition Counter-ExampleNow consider decomposing differently, without respecting functional dependencies.

Input

**Original Relation:** Assignments(Employee, Project, Manager)

Same Data:
| Employee | Project | Manager |
|----------|---------|----------|
| Alice | Web | Bob |
| Alice | Mobile | Carol |
| Dave | Web | Bob |

Decompose into:
- R₁(Employee, Manager)
- R₂(Employee, Project)

Output

**R₁:**
| Employee | Manager |
|----------|----------|
| Alice | Bob |
| Alice | Carol |
| Dave | Bob |

**R₂:**
| Employee | Project |
|----------|----------|
| Alice | Web |
| Alice | Mobile |
| Dave | Web |

**Join R₁ ⋈ R₂:**
| Employee | Manager | Project |
|----------|---------|----------|
| Alice | Bob | Web |
| Alice | Bob | Mobile | ← SPURIOUS!
| Alice | Carol | Web | ← SPURIOUS!
| Alice | Carol | Mobile |
| Dave | Bob | Web |

✗ TWO SPURIOUS TUPLES! Alice didn't work on Mobile under Bob, nor Web under Carol.

Explanation

This decomposition is LOSSY. The join created false combinations because Employee doesn't functionally determine either Manager or Project independently.

The Lossless Join Theorem

The fundamental theorem for lossless decomposition provides a simple, testable condition. For binary decomposition (splitting into exactly two relations), we have a complete characterization.

Lossless Join Theorem (Binary Decomposition)

Let R be a relation schema with functional dependency set F. A decomposition of R into R₁ and R₂ is lossless if and only if at least one of the following holds:

(R₁ ∩ R₂) → R₁ OR (R₁ ∩ R₂) → R₂

In words: The common attributes must be a superkey of at least one of the decomposed relations.

Understanding the Theorem:

Let's denote the common attributes as C = R₁ ∩ R₂. The theorem says C must functionally determine all attributes of R₁ or all attributes of R₂.

Why does this guarantee losslessness?

Consider joining R₁ and R₂ on C. For each value of C:

If C → R₁: Each C value appears in at most one tuple of R₁. The join is 'many-to-one' from R₂'s perspective.
If C → R₂: Each C value appears in at most one tuple of R₂. The join is 'many-to-one' from R₁'s perspective.

In either case, we can't create spurious combinations because one side contributes at most one tuple per C value.

The Danger Zone:

When C → R₁ fails AND C → R₂ fails, both sides can have multiple tuples with the same C value. The join becomes 'many-to-many', and different tuples on each side can combine arbitrarily—creating spurious results.

Applying the Lossless Join Theorem
Decomposition	Common Attributes (C)	C → R₁?	C → R₂?	Lossless?
R(A,B,C) → R₁(A,B), R₂(A,C) given A → B	{A}	Yes (A → AB)	No	✓ Yes
R(A,B,C) → R₁(A,B), R₂(B,C) — no FD on B	{B}	No	No	✗ No
R(A,B,C,D) → R₁(A,B,C), R₂(C,D) given C → D	{C}	No	Yes (C → CD)	✓ Yes
R(A,B,C) → R₁(A,B), R₂(A,C) given A → BC	{A}	Yes	Yes	✓ Yes

Proof of the Lossless Join Theorem

Understanding why the theorem works deepens our intuition and helps us apply it correctly. Let's prove both directions.

Proof Setup

Let R be decomposed into R₁ and R₂. Let C = R₁ ∩ R₂. Let r be any instance of R. Define:

• r₁ = πR₁(r) — projection of r onto R₁ • r₂ = πR₂(r) — projection of r onto R₂ • r' = r₁ ⋈ r₂ — the natural join

We need to prove: r = r' if and only if C → R₁ or C → R₂.

Part 1: r ⊆ r' (Always True)

For any tuple t in r:

Let t₁ = πR₁(t) (t's values on R₁ attributes)
Let t₂ = πR₂(t) (t's values on R₂ attributes)
t₁ ∈ r₁ and t₂ ∈ r₂ by definition of projection
t₁ and t₂ agree on C = R₁ ∩ R₂ (they came from the same tuple)
Therefore, t₁ ⋈ t₂ = t is in r'

Conclusion: Every original tuple survives the projection-join process. No tuples are ever lost.

Part 2: r' ⊆ r (Requires the FD Condition)

This is where the condition matters. Suppose C → R₁ holds.

Take any tuple t' in r' = r₁ ⋈ r₂. By definition of join:

t' = t₁ ⋈ t₂ for some t₁ ∈ r₁ and t₂ ∈ r₂ that agree on C
t₁ came from some original tuple s₁ ∈ r
t₂ came from some original tuple s₂ ∈ r

Since t₁ and t₂ agree on C (the common attributes):

πC(s₁) = πC(t₁) = πC(t₂) = πC(s₂)

Since C → R₁ holds in R:

s₁ and s₂ agree on C implies s₁ and s₂ agree on all of R₁
Therefore: πR₁(s₁) = πR₁(s₂)

Now, t' has:

R₁ values from t₁ = πR₁(s₁) = πR₁(s₂)
R₂ values from t₂ = πR₂(s₂)

So t' has exactly the values of s₂ on all attributes of R! Therefore t' = s₂ ∈ r.

Conclusion: Every tuple in the join came from an original tuple. No spurious tuples.

Proof Complete

We've shown that when C → R₁ (or symmetrically C → R₂), the join cannot produce spurious tuples. Combined with the fact that projections never lose tuples, this proves losslessness.

The converse (that the condition is NECESSARY) can be shown by constructing a counterexample when neither FD holds.

Why BCNF Decomposition is Always Lossless

Now we can understand why the BCNF decomposition algorithm guarantees lossless decomposition. The proof follows directly from the lossless join theorem.

The Key Observation

Recall how BCNF decomposition works: When we find a violating FD X → Y, we decompose R into:

• R₁ = XY (or X⁺ restricted to R) • R₂ = R - (Y - X) = X ∪ (R - Y)

The common attributes are: R₁ ∩ R₂ = X

Does X → R₁? We have R₁ = XY (or a superset). Since X → Y is our FD, and X → X is trivial, we have X → XY. Therefore X → R₁.

By the Lossless Join Theorem, the decomposition is lossless!

Formal Proof:

Claim: Every decomposition step in the BCNF algorithm is lossless.

Proof:

Let R be a relation with a BCNF violation X → Y (where X is not a superkey and Y - X ≠ ∅).
The algorithm decomposes R into:
- R₁ = X ∪ Y (attributes of the violating FD)
- R₂ = R - (Y - X) (original minus the dependent attributes not in X)
Common attributes: C = R₁ ∩ R₂
- R₁ = X ∪ Y
- R₂ = R - (Y - X) = X ∪ (R - Y)
- C = (X ∪ Y) ∩ (X ∪ (R - Y)) = X ∪ (Y ∩ (R - Y)) = X ∪ ∅ = X
Does C → R₁?
- C = X and R₁ = X ∪ Y
- X → X is trivial (by reflexivity)
- X → Y is our given FD
- Therefore, X → X ∪ Y, i.e., C → R₁ ✓
By the Lossless Join Theorem, R = R₁ ⋈ R₂ is lossless.

Extension to Multiple Decompositions:

Since each individual decomposition step is lossless, and the composition of lossless decompositions is lossless, the entire BCNF decomposition is lossless.

Transitivity of Losslessness

If decomposing R into (R₁, R₂) is lossless, and decomposing R₂ into (R₃, R₄) is lossless, then decomposing R into (R₁, R₃, R₄) is also lossless. This follows because r = r₁ ⋈ r₂ and r₂ = r₃ ⋈ r₄, so r = r₁ ⋈ r₃ ⋈ r₄.

Testing Losslessness: The Chase Algorithm

While the binary decomposition theorem gives us a simple test, what about decompositions into more than two relations? The chase algorithm provides a general method for testing losslessness of any decomposition.

Chase Algorithm Overview

Input: Relation R with attributes {A₁, ..., Aₙ}, decomposition {R₁, ..., Rₘ}, and FD set F.

Output: Whether the decomposition is lossless.

Procedure:

Create a tableau with m rows (one per Rᵢ) and n columns (one per attribute)
For row i, column j: Put 'aⱼ' if Aⱼ ∈ Rᵢ, else put 'bᵢⱼ' (a distinguished symbol)
Repeatedly apply FDs to make symbols equal (chase)
If any row becomes all 'a' symbols, decomposition is lossless

Detailed Example:

Consider R(A, B, C, D) decomposed into:

R₁ = {A, B}
R₂ = {B, C}
R₃ = {C, D}

With FDs: B → C, C → D

Initial Tableau:

	A	B	C	D
R₁	a₁	a₂	b₁₃	b₁₄
R₂	b₂₁	a₂	a₃	b₂₄
R₃	b₃₁	b₃₂	a₃	a₄

Chase Step 1: Apply B → C

Rows R₁ and R₂ both have a₂ in column B. Therefore, their C values must be equal. R₁ has b₁₃, R₂ has a₃. Replace b₁₃ with a₃.

	A	B	C	D
R₁	a₁	a₂	a₃	b₁₄
R₂	b₂₁	a₂	a₃	b₂₄
R₃	b₃₁	b₃₂	a₃	a₄

Chase Step 2: Apply C → D

All three rows now have a₃ in column C. Therefore, their D values must be equal. R₃ has a₄. Replace b₁₄ and b₂₄ with a₄.

	A	B	C	D
R₁	a₁	a₂	a₃	a₄
R₂	b₂₁	a₂	a₃	a₄
R₃	b₃₁	b₃₂	a₃	a₄

Result:

Row R₁ is now (a₁, a₂, a₃, a₄) — all distinguished symbols!

This means the decomposition is lossless.

Chase Algorithm Intuition

The chase simulates what happens when you join the decomposed relations. The 'a' symbols represent values that are definitely equal across original and joined tuples. The 'b' symbols represent potentially different values. When FDs force symbols to unify, we're discovering that the join preserves more information. A row of all 'a' symbols means we can reconstruct tuples exactly.

Common Misconceptions About Losslessness

Students often harbor misconceptions about what losslessness means and guarantees. Let's address the most common ones.

Misconceptions to Correct

•Misconception: 'Lossless means no data is lost during projection.' Reality: Projections always 'lose' attributes—that's what they do! Lossless means we can recover the original by joining, not that the projections individually contain all data.
•Misconception: 'Any decomposition with a common attribute is lossless.' Reality: Having common attributes is necessary but NOT sufficient. The common attributes must form a key of at least one side.
•Misconception: 'Lossless decomposition preserves all functional dependencies.' Reality: Losslessness and dependency preservation are INDEPENDENT properties. A decomposition can be lossless but not dependency-preserving (as we'll see with BCNF), or vice versa.
•Misconception: 'If the join has the right number of tuples, it's lossless.' Reality: Having the same count isn't enough. The join must produce the exact same tuples, not just the same quantity. The spurious tuples might replace some original tuples.
•Misconception: 'Losslessness depends on the actual data, not just the schema.' Reality: Losslessness is a property of the schema and FDs, verified for ALL possible instances. A decomposition is lossless if it works for every valid database state.

Wrong Thinking

'I decomposed R(A,B,C) into (A,B) and (B,C). They share B, so it must be lossless!'

This fails when B is not a key of either relation.

Correct Thinking

'I decomposed R(A,B,C) into (A,B) and (B,C). B is shared. Is B → AB or B → BC? If B → C holds, then B → BC, so this is lossless.'

Practical Implications

The lossless guarantee has significant practical implications for database design and implementation.

Practical Implications of Lossless Decomposition
Aspect	Implication	Real-World Impact
Query Reconstruction	Any query on the original relation can be rewritten as a join of decomposed relations	Views can present the 'original' table while storing normalized data
Data Integrity	No information is ever lost or fabricated during normalization	Safe to normalize existing databases without risking data corruption
Foreign Keys	The common attributes become natural foreign key relationships	Referential integrity can be enforced between decomposed tables
Join Cost	Joins are required to reconstruct original relations	Performance trade-off: reduced redundancy vs. join overhead
Normalization Reversibility	Can always 'denormalize' by materializing the join	Flexibility to optimize for different workloads

The Foreign Key Connection:

In a lossless decomposition where R is split into R₁ = XY and R₂ = X ∪ (R - Y):

X is a key of R₁
X appears in R₂ as a foreign key referencing R₁

This means lossless decomposition naturally creates proper primary key / foreign key relationships. The decomposed schema isn't just free of redundancy—it's well-structured with proper referential integrity.

Example:

When we decomposed CourseSchedule into:

CourseInfo(CourseID, Instructor)
RoomInfo(Room, Building, Capacity)
Enrollment(StudentID, CourseID, Room)

The underlined attributes in Enrollment are foreign keys referencing the respective tables. This is exactly how a well-designed database should look!

View-Based Presentation

In practice, you can create a view that joins the normalized tables back together. Applications can query the view as if the original denormalized table existed, while the underlying storage remains normalized. This provides the best of both worlds: clean storage with convenient access.

Summary and Key Takeaways

The lossless join property is the foundation that makes normalization safe and practical. Let's consolidate what we've learned.

Key Takeaways

•Definition: A decomposition is lossless if joining the pieces exactly reconstructs the original—no lost tuples, no spurious tuples.
•Binary Theorem: For R → (R₁, R₂), losslessness holds iff (R₁ ∩ R₂) → R₁ or (R₁ ∩ R₂) → R₂.
•BCNF Guarantee: The BCNF decomposition algorithm always produces lossless decomposition because we split on FD X → Y where X becomes the common key.
•Chase Algorithm: For multi-way decompositions, the chase algorithm provides a general losslessness test.
•Independence: Losslessness and dependency preservation are separate properties—one doesn't imply the other.
•Practical Value: Lossless decomposition creates well-structured schemas with proper foreign key relationships.

Page Complete

You now have a deep understanding of the lossless join guarantee in BCNF decomposition. You can verify losslessness using the binary theorem or chase algorithm, and you understand why BCNF decomposition always preserves information. Next, we'll examine the other side of the coin: what BCNF decomposition might NOT preserve—functional dependencies.

2 / 5

Loading learning content...

Database Management SystemsBCNF Decomposition

BCNF Decomposition

LevelIntermediate

Duration75 mins

TopicBCNF Decomposition

2 / 5

Lossless Guarantee

Preserving Information Through Decomposition

What You Will Learn

What is Lossless Decomposition?

A decomposition is lossless (or lossless-join) if we can always recover the original relation by joining the decomposed relations. Formally:

Formal Definition

Let R be a relation schema decomposed into R₁, R₂, ..., Rₙ. The decomposition is lossless if and only if for every valid instance r of R:

r = πR₁(r) ⋈ πR₂(r) ⋈ ... ⋈ πRₙ(r)

Where π denotes projection and ⋈ denotes natural join.

This must hold for ALL possible valid instances of R, not just one particular database state.

The Two Failure Modes:

When decomposition goes wrong, two things can happen:

Lossy Decomposition (Spurious Tuples): The join produces MORE tuples than the original relation. These extra tuples are 'spurious'—they look valid but represent data that never existed.
Non-Reconstructible (Lost Tuples): The join produces FEWER tuples than the original. Information has been genuinely lost.

Interestingly, natural joins can only produce spurious tuples, never lose tuples. So our main concern is avoiding spurious data.

Lossy Decomposition ExampleConsider a relation showing which employees work on which projects under which managers.

Input

**Original Relation:** Assignments(Employee, Project, Manager)

Sample Data:
| Employee | Project | Manager |
|----------|---------|----------|
| Alice | Web | Bob |
| Alice | Mobile | Carol |
| Dave | Web | Bob |

Decompose into:
- R₁(Employee, Project)
- R₂(Project, Manager)

Output

**R₁:**
| Employee | Project |
|----------|----------|
| Alice | Web |
| Alice | Mobile |
| Dave | Web |

**R₂:**
| Project | Manager |
|----------|----------|
| Web | Bob |
| Mobile | Carol |

**Join R₁ ⋈ R₂:**
| Employee | Project | Manager |
|----------|---------|----------|
| Alice | Web | Bob |
| Alice | Mobile | Carol |
| Dave | Web | Bob |

✓ We recover exactly the original! This decomposition IS lossless.

Explanation

This worked because Project → Manager (each project has one manager). The joining attribute functionally determines one side.

Lossy Decomposition Counter-ExampleNow consider decomposing differently, without respecting functional dependencies.

Input

**Original Relation:** Assignments(Employee, Project, Manager)

Same Data:
| Employee | Project | Manager |
|----------|---------|----------|
| Alice | Web | Bob |
| Alice | Mobile | Carol |
| Dave | Web | Bob |

Decompose into:
- R₁(Employee, Manager)
- R₂(Employee, Project)

Output

**R₁:**
| Employee | Manager |
|----------|----------|
| Alice | Bob |
| Alice | Carol |
| Dave | Bob |

**R₂:**
| Employee | Project |
|----------|----------|
| Alice | Web |
| Alice | Mobile |
| Dave | Web |

**Join R₁ ⋈ R₂:**
| Employee | Manager | Project |
|----------|---------|----------|
| Alice | Bob | Web |
| Alice | Bob | Mobile | ← SPURIOUS!
| Alice | Carol | Web | ← SPURIOUS!
| Alice | Carol | Mobile |
| Dave | Bob | Web |

✗ TWO SPURIOUS TUPLES! Alice didn't work on Mobile under Bob, nor Web under Carol.

Explanation

This decomposition is LOSSY. The join created false combinations because Employee doesn't functionally determine either Manager or Project independently.

The Lossless Join Theorem

The fundamental theorem for lossless decomposition provides a simple, testable condition. For binary decomposition (splitting into exactly two relations), we have a complete characterization.

Lossless Join Theorem (Binary Decomposition)

Let R be a relation schema with functional dependency set F. A decomposition of R into R₁ and R₂ is lossless if and only if at least one of the following holds:

(R₁ ∩ R₂) → R₁ OR (R₁ ∩ R₂) → R₂

In words: The common attributes must be a superkey of at least one of the decomposed relations.

Understanding the Theorem:

Let's denote the common attributes as C = R₁ ∩ R₂. The theorem says C must functionally determine all attributes of R₁ or all attributes of R₂.

Why does this guarantee losslessness?

Consider joining R₁ and R₂ on C. For each value of C:

If C → R₁: Each C value appears in at most one tuple of R₁. The join is 'many-to-one' from R₂'s perspective.
If C → R₂: Each C value appears in at most one tuple of R₂. The join is 'many-to-one' from R₁'s perspective.

In either case, we can't create spurious combinations because one side contributes at most one tuple per C value.

The Danger Zone:

Applying the Lossless Join Theorem
Decomposition	Common Attributes (C)	C → R₁?	C → R₂?	Lossless?
R(A,B,C) → R₁(A,B), R₂(A,C) given A → B	{A}	Yes (A → AB)	No	✓ Yes
R(A,B,C) → R₁(A,B), R₂(B,C) — no FD on B	{B}	No	No	✗ No
R(A,B,C,D) → R₁(A,B,C), R₂(C,D) given C → D	{C}	No	Yes (C → CD)	✓ Yes
R(A,B,C) → R₁(A,B), R₂(A,C) given A → BC	{A}	Yes	Yes	✓ Yes

Proof of the Lossless Join Theorem

Understanding why the theorem works deepens our intuition and helps us apply it correctly. Let's prove both directions.

Proof Setup

Let R be decomposed into R₁ and R₂. Let C = R₁ ∩ R₂. Let r be any instance of R. Define:

• r₁ = πR₁(r) — projection of r onto R₁ • r₂ = πR₂(r) — projection of r onto R₂ • r' = r₁ ⋈ r₂ — the natural join

We need to prove: r = r' if and only if C → R₁ or C → R₂.

Part 1: r ⊆ r' (Always True)

For any tuple t in r:

Let t₁ = πR₁(t) (t's values on R₁ attributes)
Let t₂ = πR₂(t) (t's values on R₂ attributes)
t₁ ∈ r₁ and t₂ ∈ r₂ by definition of projection
t₁ and t₂ agree on C = R₁ ∩ R₂ (they came from the same tuple)
Therefore, t₁ ⋈ t₂ = t is in r'

Conclusion: Every original tuple survives the projection-join process. No tuples are ever lost.

Part 2: r' ⊆ r (Requires the FD Condition)

This is where the condition matters. Suppose C → R₁ holds.

Take any tuple t' in r' = r₁ ⋈ r₂. By definition of join:

t' = t₁ ⋈ t₂ for some t₁ ∈ r₁ and t₂ ∈ r₂ that agree on C
t₁ came from some original tuple s₁ ∈ r
t₂ came from some original tuple s₂ ∈ r

Since t₁ and t₂ agree on C (the common attributes):

πC(s₁) = πC(t₁) = πC(t₂) = πC(s₂)

Since C → R₁ holds in R:

s₁ and s₂ agree on C implies s₁ and s₂ agree on all of R₁
Therefore: πR₁(s₁) = πR₁(s₂)

Now, t' has:

R₁ values from t₁ = πR₁(s₁) = πR₁(s₂)
R₂ values from t₂ = πR₂(s₂)

So t' has exactly the values of s₂ on all attributes of R! Therefore t' = s₂ ∈ r.

Conclusion: Every tuple in the join came from an original tuple. No spurious tuples.

Proof Complete

We've shown that when C → R₁ (or symmetrically C → R₂), the join cannot produce spurious tuples. Combined with the fact that projections never lose tuples, this proves losslessness.

The converse (that the condition is NECESSARY) can be shown by constructing a counterexample when neither FD holds.

Why BCNF Decomposition is Always Lossless

Now we can understand why the BCNF decomposition algorithm guarantees lossless decomposition. The proof follows directly from the lossless join theorem.

The Key Observation

Recall how BCNF decomposition works: When we find a violating FD X → Y, we decompose R into:

• R₁ = XY (or X⁺ restricted to R) • R₂ = R - (Y - X) = X ∪ (R - Y)

The common attributes are: R₁ ∩ R₂ = X

Does X → R₁? We have R₁ = XY (or a superset). Since X → Y is our FD, and X → X is trivial, we have X → XY. Therefore X → R₁.

By the Lossless Join Theorem, the decomposition is lossless!

Formal Proof:

Claim: Every decomposition step in the BCNF algorithm is lossless.

Proof:

Let R be a relation with a BCNF violation X → Y (where X is not a superkey and Y - X ≠ ∅).
The algorithm decomposes R into:
- R₁ = X ∪ Y (attributes of the violating FD)
- R₂ = R - (Y - X) (original minus the dependent attributes not in X)
Common attributes: C = R₁ ∩ R₂
- R₁ = X ∪ Y
- R₂ = R - (Y - X) = X ∪ (R - Y)
- C = (X ∪ Y) ∩ (X ∪ (R - Y)) = X ∪ (Y ∩ (R - Y)) = X ∪ ∅ = X
Does C → R₁?
- C = X and R₁ = X ∪ Y
- X → X is trivial (by reflexivity)
- X → Y is our given FD
- Therefore, X → X ∪ Y, i.e., C → R₁ ✓
By the Lossless Join Theorem, R = R₁ ⋈ R₂ is lossless.

Extension to Multiple Decompositions:

Since each individual decomposition step is lossless, and the composition of lossless decompositions is lossless, the entire BCNF decomposition is lossless.

Transitivity of Losslessness

Testing Losslessness: The Chase Algorithm

Chase Algorithm Overview

Input: Relation R with attributes {A₁, ..., Aₙ}, decomposition {R₁, ..., Rₘ}, and FD set F.

Output: Whether the decomposition is lossless.

Procedure:

Create a tableau with m rows (one per Rᵢ) and n columns (one per attribute)
For row i, column j: Put 'aⱼ' if Aⱼ ∈ Rᵢ, else put 'bᵢⱼ' (a distinguished symbol)
Repeatedly apply FDs to make symbols equal (chase)
If any row becomes all 'a' symbols, decomposition is lossless

Detailed Example:

Consider R(A, B, C, D) decomposed into:

R₁ = {A, B}
R₂ = {B, C}
R₃ = {C, D}

With FDs: B → C, C → D

Initial Tableau:

	A	B	C	D
R₁	a₁	a₂	b₁₃	b₁₄
R₂	b₂₁	a₂	a₃	b₂₄
R₃	b₃₁	b₃₂	a₃	a₄

Chase Step 1: Apply B → C

Rows R₁ and R₂ both have a₂ in column B. Therefore, their C values must be equal. R₁ has b₁₃, R₂ has a₃. Replace b₁₃ with a₃.

	A	B	C	D
R₁	a₁	a₂	a₃	b₁₄
R₂	b₂₁	a₂	a₃	b₂₄
R₃	b₃₁	b₃₂	a₃	a₄

Chase Step 2: Apply C → D

All three rows now have a₃ in column C. Therefore, their D values must be equal. R₃ has a₄. Replace b₁₄ and b₂₄ with a₄.

	A	B	C	D
R₁	a₁	a₂	a₃	a₄
R₂	b₂₁	a₂	a₃	a₄
R₃	b₃₁	b₃₂	a₃	a₄

Result:

Row R₁ is now (a₁, a₂, a₃, a₄) — all distinguished symbols!

This means the decomposition is lossless.

Chase Algorithm Intuition

Common Misconceptions About Losslessness

Students often harbor misconceptions about what losslessness means and guarantees. Let's address the most common ones.

Misconceptions to Correct

•Misconception: 'Lossless means no data is lost during projection.' Reality: Projections always 'lose' attributes—that's what they do! Lossless means we can recover the original by joining, not that the projections individually contain all data.
•Misconception: 'Any decomposition with a common attribute is lossless.' Reality: Having common attributes is necessary but NOT sufficient. The common attributes must form a key of at least one side.
•Misconception: 'Lossless decomposition preserves all functional dependencies.' Reality: Losslessness and dependency preservation are INDEPENDENT properties. A decomposition can be lossless but not dependency-preserving (as we'll see with BCNF), or vice versa.
•Misconception: 'If the join has the right number of tuples, it's lossless.' Reality: Having the same count isn't enough. The join must produce the exact same tuples, not just the same quantity. The spurious tuples might replace some original tuples.
•Misconception: 'Losslessness depends on the actual data, not just the schema.' Reality: Losslessness is a property of the schema and FDs, verified for ALL possible instances. A decomposition is lossless if it works for every valid database state.

Wrong Thinking

'I decomposed R(A,B,C) into (A,B) and (B,C). They share B, so it must be lossless!'

This fails when B is not a key of either relation.

Correct Thinking

'I decomposed R(A,B,C) into (A,B) and (B,C). B is shared. Is B → AB or B → BC? If B → C holds, then B → BC, so this is lossless.'

Practical Implications

The lossless guarantee has significant practical implications for database design and implementation.

Practical Implications of Lossless Decomposition
Aspect	Implication	Real-World Impact
Query Reconstruction	Any query on the original relation can be rewritten as a join of decomposed relations	Views can present the 'original' table while storing normalized data
Data Integrity	No information is ever lost or fabricated during normalization	Safe to normalize existing databases without risking data corruption
Foreign Keys	The common attributes become natural foreign key relationships	Referential integrity can be enforced between decomposed tables
Join Cost	Joins are required to reconstruct original relations	Performance trade-off: reduced redundancy vs. join overhead
Normalization Reversibility	Can always 'denormalize' by materializing the join	Flexibility to optimize for different workloads

The Foreign Key Connection:

In a lossless decomposition where R is split into R₁ = XY and R₂ = X ∪ (R - Y):

X is a key of R₁
X appears in R₂ as a foreign key referencing R₁

Example:

When we decomposed CourseSchedule into:

CourseInfo(CourseID, Instructor)
RoomInfo(Room, Building, Capacity)
Enrollment(StudentID, CourseID, Room)

The underlined attributes in Enrollment are foreign keys referencing the respective tables. This is exactly how a well-designed database should look!

View-Based Presentation

Summary and Key Takeaways

The lossless join property is the foundation that makes normalization safe and practical. Let's consolidate what we've learned.

Key Takeaways

•Definition: A decomposition is lossless if joining the pieces exactly reconstructs the original—no lost tuples, no spurious tuples.
•Binary Theorem: For R → (R₁, R₂), losslessness holds iff (R₁ ∩ R₂) → R₁ or (R₁ ∩ R₂) → R₂.
•BCNF Guarantee: The BCNF decomposition algorithm always produces lossless decomposition because we split on FD X → Y where X becomes the common key.
•Chase Algorithm: For multi-way decompositions, the chase algorithm provides a general losslessness test.
•Independence: Losslessness and dependency preservation are separate properties—one doesn't imply the other.
•Practical Value: Lossless decomposition creates well-structured schemas with proper foreign key relationships.

Page Complete

2 / 5