Loading content...
Imagine you've successfully decomposed a large, redundant relation into smaller, well-normalized tables. You've eliminated update anomalies, reduced storage costs, and achieved lossless join decomposition—meaning you can reconstruct the original data perfectly through natural joins. But there's a hidden danger lurking in your design: you may have inadvertently made it impossible to efficiently enforce your original functional dependencies.
This scenario represents one of the most subtle yet critical challenges in database normalization. While lossless decomposition ensures data reconstruction, dependency preservation ensures constraint enforcement. Without dependency preservation, you may find that enforcing a simple integrity rule requires joining multiple tables—transforming what should be a straightforward constraint check into an expensive, error-prone operation.
By the end of this page, you will understand dependency preservation as a fundamental decomposition property, grasp its formal definition, and recognize why this property is essential for practical database design. You'll see how ignoring dependency preservation leads to constraint enforcement nightmares, and how preserving dependencies enables efficient integrity maintenance.
To understand dependency preservation, we must first understand the problem it solves. Consider a relation that tracks course assignments:
CourseAssignment(StudentID, CourseName, InstructorID, Department)
With the following functional dependencies:
Now, suppose we decompose this relation for normalization purposes into:
R₁(StudentID, CourseName, InstructorID) R₂(InstructorID, Department)
This decomposition is lossless—we can reconstruct the original data through a natural join on InstructorID. However, let's examine what happened to our functional dependencies:
| Functional Dependency | Can Check in R₁ Alone? | Can Check in R₂ Alone? | Preserved? |
|---|---|---|---|
| FD1: InstructorID → Department | No (no Department attribute) | ✓ Yes | Yes |
| FD2: CourseName → Department | No (no Department attribute) | No (no CourseName attribute) | No! |
| FD3: StudentID, CourseName → InstructorID | ✓ Yes | No | Yes |
The critical observation: FD2 (CourseName → Department) cannot be checked using either R₁ or R₂ alone. To verify this constraint, we must:
This join operation may involve millions of rows in a production database, turning a simple constraint check into an expensive operation that degrades database performance.
When a dependency is not preserved, every INSERT and UPDATE operation that could affect this constraint requires a multi-table join to validate. In high-transaction environments, this overhead can become a severe bottleneck—or worse, the constraint may be ignored entirely, leading to data corruption.
With the intuition established, let's formalize the concept precisely. This formal understanding is essential for rigorous analysis and algorithm development.
Definition: Projection of Functional Dependencies
Given a relation R with a set of functional dependencies F, and a decomposition of R into R₁, R₂, ..., Rₙ, we define the projection of F onto Rᵢ (denoted πRᵢ(F)) as:
πRᵢ(F) = { X → Y ∈ F⁺ | X ∪ Y ⊆ attributes(Rᵢ) }
In words: the projection of F onto Rᵢ contains all functional dependencies in the closure of F (F⁺) where both the determinant (X) and the dependent (Y) consist only of attributes present in Rᵢ.
Definition: Dependency-Preserving Decomposition
A decomposition of relation R into R₁, R₂, ..., Rₙ is dependency-preserving with respect to a set of functional dependencies F if and only if:
(πR₁(F) ∪ πR₂(F) ∪ ... ∪ πRₙ(F))⁺ = F⁺
This states that the closure of the union of all projected dependencies equals the closure of the original dependency set. In other words, all original dependencies can be derived from the dependencies enforceable on individual relations.
The definition uses closures rather than direct set equality because dependencies can be logically equivalent without being literally identical. For example, if we have A → B and B → C, then A → C is in the closure. We need all original constraints to be derivable—whether explicitly or through Armstrong's axioms—from the projected sets.
Practical Interpretation:
A decomposition preserves dependencies if every functional dependency in F either:
This means we don't need the exact original dependency in a single relation—we need the ability to enforce the constraint's effect using the decomposed schema.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354
def is_dependency_preserved(F, decomposition): """ Check if a decomposition preserves all functional dependencies. Parameters: - F: Set of functional dependencies on original relation - decomposition: List of attribute sets (one per decomposed relation) Returns: - True if dependency-preserving, False otherwise """ # Compute the union of all projections projected_union = set() for relation_attrs in decomposition: # Project F onto this relation for fd in compute_closure(F): # Iterate over F+ lhs, rhs = fd if lhs.issubset(relation_attrs) and rhs.issubset(relation_attrs): projected_union.add(fd) # Compute closure of projected union projected_closure = compute_closure(projected_union) # Compute closure of original F original_closure = compute_closure(F) # Decomposition is dependency-preserving iff closures are equal return projected_closure == original_closure def compute_closure(FDs): """ Compute the closure of a set of functional dependencies (F+). Uses Armstrong's axioms iteratively until no new FDs can be derived. This is the theoretical algorithm - practical implementations optimize heavily. """ closure = set(FDs) changed = True while changed: changed = False new_fds = set() for fd in closure: # Apply reflexivity, augmentation, transitivity # (simplified - full implementation would be more complex) new_fds.update(apply_armstrongs_axioms(closure, fd)) if not new_fds.issubset(closure): closure.update(new_fds) changed = True return closureUnderstanding the why behind dependency preservation transforms it from an abstract property into a practical design requirement. Let's explore the concrete implications.
In production systems handling thousands of transactions per second, the difference between a single-table constraint check and a multi-table join can mean the difference between sub-millisecond response times and multi-second delays. Non-preserved dependencies are a common source of unexplained 'slow INSERTs' that frustrate both developers and DBAs.
A visual representation helps solidify understanding of how dependencies distribute across decomposed relations. Consider the following example with a more complex schema.
Original Relation: Employee(EmpID, Name, DeptID, DeptName, ManagerID, ProjectID, ProjectName)
With functional dependencies:
Analysis of This Decomposition:
This decomposition is dependency-preserving because all original FDs are either directly enforceable in one relation or derivable from the preserved dependencies through Armstrong's axioms.
Experience with decomposition reveals recurring patterns that frequently preserve or fail to preserve dependencies. Recognizing these patterns accelerates schema design.
| Pattern | Preservation Likelihood | Why |
|---|---|---|
| Decomposition along FD boundaries | High | Each FD naturally falls into one relation |
| Key-based decomposition | High | Key attributes propagate correctly across relations |
| Transitive FD decomposition | Medium | Transitivity enables derivation but requires care |
| Arbitrary attribute splitting | Low | Random splits likely separate FD components |
| Multi-attribute determinant FDs | Low-Medium | Determinant may be split across relations |
When decomposing, always keep the determinant (left-hand side) and dependent (right-hand side) of each important functional dependency together in at least one relation. This simple heuristic prevents most preservation failures.
Anti-Patterns to Avoid:
Splitting composite keys arbitrarily — If FD is {A,B} → C, ensure A, B, and C appear together somewhere.
Focusing only on lossless property — A decomposition can be lossless but lose dependencies. Both properties must be verified.
Ignoring transitive dependencies during decomposition — When removing transitive FDs for 3NF, ensure the intermediate step attributes are properly placed.
Over-decomposition — Creating too many small relations increases the risk of splitting FD components.
These two properties are often discussed together but are fundamentally independent. Understanding their relationship is crucial for database design.
| Property | Lossless Join | Dependency Preservation |
|---|---|---|
| Goal | Reconstruct original data exactly | Enforce original constraints efficiently |
| Concern | Data integrity (no spurious tuples) | Constraint enforcement overhead |
| Testing | Check common attribute is key | Verify FDs in projected closures |
| Failure consequence | Incorrect query results from joins | Expensive or impossible constraint checks |
| 3NF guarantee | Always achievable together | Always achievable together |
| BCNF guarantee | Always achievable | Sometimes sacrificed for BCNF |
A decomposition can be lossless but not dependency-preserving, dependency-preserving but not lossless, both, or neither. They address orthogonal concerns: data reconstruction vs. constraint enforcement. Well-designed schemas aim for both.
Example Demonstrating Independence:
Consider R(A, B, C) with FDs: A → B, B → C
Decomposition 1: R₁(A, B) and R₂(A, C)
Decomposition 2: R₁(A, B) and R₂(B, C)
Decomposition 2 achieves both properties; Decomposition 1 achieves only lossless join.
Let's translate theory into practice with concrete database design guidance.
12345678910111213141516171819202122232425262728293031323334
-- Example: Verifying FD preservation through schema introspection-- This validates that DeptID → DeptName is enforceable in the Department table -- First, verify the constraint can be expressed as a functional dependencySELECT DeptID, COUNT(DISTINCT DeptName) as distinct_dept_namesFROM DepartmentGROUP BY DeptIDHAVING COUNT(DISTINCT DeptName) > 1;-- If this returns any rows, the FD is violated! -- For preserved dependencies, we can create enforcing constraints:-- Option 1: UNIQUE constraint (when determinant is not PK)ALTER TABLE Department ADD CONSTRAINT enforce_dept_name_fd UNIQUE (DeptID, DeptName); -- Option 2: For complex FDs, use triggersCREATE TRIGGER validate_dept_fdBEFORE INSERT OR UPDATE ON DepartmentFOR EACH ROWBEGIN DECLARE existing_name VARCHAR(255); SELECT DeptName INTO existing_name FROM Department WHERE DeptID = NEW.DeptID LIMIT 1; IF existing_name IS NOT NULL AND existing_name != NEW.DeptName THEN SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'FD violation: DeptID → DeptName'; END IF;END;We've established the foundational understanding of dependency preservation—a critical property for practical database design. Let's consolidate the key insights:
You now understand what dependency preservation means and why it matters for database design. In the next page, we'll explore algorithms for testing whether a decomposition preserves dependencies—giving you the tools to verify your designs rigorously.