Loading content...
BCNF decomposition offers powerful guarantees: every decomposed relation is free of FD-based redundancy, and the decomposition is always lossless. But there's a catch—a significant one that every database designer must understand.
BCNF decomposition does not guarantee preservation of all functional dependencies.
This isn't a flaw in a particular algorithm or an edge case that rarely occurs. It's a fundamental, provable limitation: for some relation schemas, no BCNF decomposition can preserve all functional dependencies. You must sacrifice either BCNF or dependency preservation—you cannot always have both.
This page explores this limitation in depth. We'll understand what dependency preservation means, see concrete examples where BCNF loses dependencies, prove why this happens, and understand the practical implications.
This is not an academic footnote—it's a critical design consideration. When you lose a dependency, you lose the ability to enforce that constraint efficiently. Understanding this trade-off is essential for making informed decisions about normalization levels.
Before understanding how dependencies can be lost, we must precisely define what it means to preserve them.
Projection of FDs: Given a set of FDs F over relation R, and a subset of attributes S ⊆ R, the projection of F onto S, denoted πS(F), is:
πS(F) = {X → Y : X → Y ∈ F⁺ and XY ⊆ S}
In words: all functional dependencies that can be derived from F and involve only attributes in S.
Dependency Preservation: A decomposition of R into {R₁, R₂, ..., Rₙ} is dependency-preserving if:
(πR₁(F) ∪ πR₂(F) ∪ ... ∪ πRₙ(F))⁺ = F⁺
In words: the union of projected FDs has the same closure as the original FDs.
Intuitive Understanding:
A decomposition preserves dependencies if every functional dependency from the original schema can be:
Why Does This Matter?
Functional dependencies are constraints—they describe real-world rules that the data must satisfy. For example, 'Each course has exactly one instructor' is encoded as CourseID → Instructor.
When a dependency is preserved, we can enforce it by checking a single table. When it's NOT preserved, enforcing the constraint requires:
All of these are more expensive and error-prone than single-table constraint checking.
| Scenario | Constraint Enforcement | Performance | Reliability |
|---|---|---|---|
| Dependency Preserved | CHECK constraint or unique index on one table | O(1) to O(log n) | Guaranteed by DBMS |
| Dependency Not Preserved | Trigger joining multiple tables | O(n) or worse per check | Trigger logic can have bugs |
| Dependency Not Preserved | Application-layer validation | Variable | Can be bypassed; multiple code paths to maintain |
Let's examine the most famous example of a relation that cannot be decomposed into BCNF while preserving all dependencies.
Relation: TeachingAssignment(Student, Course, Instructor)
Semantics:
Functional Dependencies: • F₁: {Student, Course} → Instructor (each student has one instructor per course) • F₂: Instructor → Course (each instructor teaches one course)
Candidate Key: {Student, Course} (because {Student, Course}⁺ = {Student, Course, Instructor})
Analysis: Is This Relation in BCNF?
For BCNF, every determinant must be a superkey.
| FD | Determinant | Superkey? | BCNF Violation? |
|---|---|---|---|
| {Student, Course} → Instructor | {Student, Course} | Yes (it's the candidate key) | No |
| Instructor → Course | {Instructor} | No (Instructor⁺ = {Instructor, Course} ≠ all attributes) | YES |
The relation violates BCNF due to Instructor → Course.
BCNF Decomposition:
Applying the BCNF algorithm to the violation Instructor → Course:
Wait, let's be more careful. The original relation has {Student, Course, Instructor}.
Decomposing on Instructor → Course:
So our BCNF decomposition is:
Both relations are now in BCNF!
Original FD: {Student, Course} → Instructor
Question: Can we verify this FD using only R₁ or R₂?
• R₁ has attributes {Instructor, Course} — Student is missing! • R₂ has attributes {Student, Instructor} — Course is missing!
Neither relation contains all three attributes!
The dependency {Student, Course} → Instructor cannot be checked within a single table. It is NOT PRESERVED by this decomposition.
Given a decomposition, how do we systematically check whether dependencies are preserved? There are two approaches: the direct method and the algorithmic method.
Method 1: Direct Attribute Containment Check
For each FD X → Y in F:
This is a quick first pass but not complete—some FDs might be derivable even if not directly enforceable.
To test if FD X → Y is preserved in decomposition {R₁, ..., Rₙ}:
result := X
repeat
for each Rᵢ in decomposition:
result := result ∪ ((result ∩ Rᵢ)⁺ ∩ Rᵢ)
until result doesn't change
if Y ⊆ result:
X → Y is preserved
else:
X → Y is NOT preserved
This algorithm computes what we can derive about X by chaining through the decomposed relations.
Applying the Algorithm to Our Example:
Test: Is {Student, Course} → Instructor preserved?
Decomposition: R₁ = {Instructor, Course}, R₂ = {Student, Instructor} Original FDs: {Student, Course} → Instructor, Instructor → Course
Iteration 0: result = {Student, Course}
Iteration 1:
R₁ = {Instructor, Course}: result ∩ R₁ = {Course}
R₂ = {Student, Instructor}: result ∩ R₂ = {Student}
result unchanged. Stop.
Final result = {Student, Course}
Is Instructor ∈ result? NO!
Conclusion: {Student, Course} → Instructor is NOT preserved.
Even though Instructor → Course IS preserved (both attributes are in R₁), and even though the decomposition is lossless, we still lose {Student, Course} → Instructor. Losslessness and dependency preservation are independent properties!
The conflict between BCNF and dependency preservation isn't accidental—it's structural. Let's understand why some schemas inherently cannot satisfy both.
The Structural Conflict:
Consider our example again:
FD₂ violates BCNF because Instructor is not a superkey. To achieve BCNF, we must decompose to separate Instructor → Course.
But here's the problem:
No matter how we split, FD₁ spans attributes that end up in different relations after removing the BCNF violation.
When a non-key determinant (X) is part of the key (K) that determines another attribute (Y), removing the BCNF violation on X often separates Y from K.
In our example: Instructor is part of the process of connecting Student to Course (via the key {Student, Course}). When we isolate Instructor → Course, we break the three-way relationship needed to verify {Student, Course} → Instructor.
A Precise Characterization:
Dependency preservation is guaranteed to fail when:
The classic case (our example) has:
Note that γ contains Course (from β) and δ is Instructor (which is α). This creates the unseparable cycle.
| Characteristic | Description | Example in Classic Case |
|---|---|---|
| Overlapping Dependencies | FDs share attributes in complex ways | {Student, Course} → Instructor overlaps with Instructor → Course |
| Part-of-Key Determinant | BCNF-violating determinant touches the key | Instructor (BCNF violator) is determined by the key |
| Circular Attribute Usage | Attributes appear on both sides of different FDs | Course appears in both key {Student, Course} and as RHS of Instructor → Course |
When a dependency is lost, the constraint doesn't disappear from the real world—it just becomes harder to enforce. Let's examine the practical consequences.
Quantifying the Cost:
| Enforcement Method | INSERT Overhead | Code Complexity | Reliability |
|---|---|---|---|
| UNIQUE constraint (preserved FD) | ~O(log n) | Trivial | Perfect |
| Multi-table trigger (lost FD) | O(n) or worse | High | Depends on implementation |
| Application check (lost FD) | Variable | Very High | Error-prone |
| Nothing (accept data issues) | None | None | Data corruption |
The jump from 'preserved' to 'lost' isn't just a minor inconvenience—it can fundamentally change system architecture.
Consider an e-commerce system where {OrderID, ProductID} → WarehouseID (each product in an order ships from one warehouse). If this dependency is lost through normalization, the system might accidentally allow the same product in an order to be assigned to multiple warehouses—splitting the shipment illogically. Such bugs can be subtle and only appear at scale.
Can we identify schemas that will lose dependencies before committing to BCNF? Yes—there are patterns that signal trouble.
A schema is at risk of dependency loss if:
Non-key → Key-part: An FD X → A where X is not a superkey and A is part of some candidate key.
Overlapping Key + Non-key FDs: Two FDs X → Y and Z → W where X overlaps with Z or W.
Three+ Attribute Minimum: The key and the BCNF-violating FD together involve 3+ distinct attributes that must stay connected.
When you see these patterns, investigate carefully before decomposing to BCNF.
Systematic Detection Algorithm:
function will_lose_dependencies(R, F):
# Compute candidate keys
keys = find_all_candidate_keys(R, F)
# Find BCNF violations
for each FD X → Y in F where X is not a superkey:
# This is a BCNF violation
# Check if removing this violation breaks other FDs
for each FD A → B in F:
if A is a key or part of a key:
if X overlaps with A or B:
# Potential conflict!
if not can_preserve(F, R, X → Y, A → B):
return True # Will lose dependencies
return False # Safe to decompose
A simpler heuristic: if any BCNF-violating FD's LHS is contained in or overlaps significantly with a candidate key's attributes, be cautious.
R(A, B, C, D, E) Key: {A} FDs: A → BCDE, C → D
BCNF violation: C → D (C is not a superkey)
But C doesn't overlap with key A in a problematic way. Decomposition preserves A → BCDE and C → D.
R(A, B, C) Key: {A, B} FDs: {A, B} → C, C → A
BCNF violation: C → A (C is not a superkey)
C determines A, but A is part of the key. Decomposition will separate {A, B} from C, losing {A, B} → C.
When faced with a schema where BCNF will lose dependencies, you have several options. Each has trade-offs.
| Strategy | Description | Pros | Cons |
|---|---|---|---|
| Stay at 3NF | Accept minor redundancy to preserve dependencies | All FDs enforceable; simple design | Some redundancy remains; possible update anomalies |
| Use Triggers | Decompose to BCNF; enforce lost FDs via triggers | No redundancy; BCNF benefits | Performance overhead; complex implementation |
| Application Enforcement | Handle constraint checking in application code | Database stays simple | Highest risk of bugs; hard to maintain |
| Materialized Constraint Table | Create a table specifically for checking the constraint | Can use UNIQUE constraint on the mat table | Extra table; synchronization needed |
| Accept the Risk | Document but don't enforce the constraint | Simplest implementation | Data integrity not guaranteed |
The 3NF Compromise:
3NF is specifically designed to be achievable while preserving all dependencies. The synthesis algorithm for 3NF (which we'll study in the normalization chapter) guarantees:
For our problematic TeachingAssignment schema:
This is often the pragmatic choice when dependencies are critical business rules.
Ask yourself:
How important is this constraint? Critical business rules warrant the complexity of trigger enforcement. Nice-to-haves might be acceptable to lose.
What's the read/write ratio? High-read systems benefit more from BCNF (less redundant storage, faster reads). High-write systems pay more for trigger overhead.
How much redundancy would 3NF cause? If the redundancy is minimal (few rows affected), 3NF might be acceptable.
What's the team's DB expertise? Complex triggers require skilled maintenance. Simpler schemas are easier to manage.
Dependency preservation is a critical property that BCNF decomposition cannot always provide. Understanding this limitation is essential for informed database design.
You now understand the fundamental limitation of BCNF decomposition: it guarantees losslessness but NOT dependency preservation. You can identify when dependencies will be lost and evaluate strategies for handling the situation. In the next page, we'll synthesize this knowledge into a framework for making trade-off decisions between BCNF and 3NF.