In the previous pages, you learned what minimal covers are and how to identify extraneous attributes. Now it's time to put these pieces together into a complete, systematic algorithm that transforms any set of functional dependencies into its canonical (minimal) cover.
The canonical cover algorithm is one of the foundational algorithms in database theory. It appears in virtually every database textbook and is a prerequisite for normalization algorithms like 3NF synthesis. Understanding this algorithm deeply—not just memorizing its steps—enables you to reason about FD sets, debug normalization issues, and design optimal schemas.
This page presents the algorithm in complete detail, with rigorous correctness analysis and extensive worked examples.
By the end of this page, you will understand every step of the canonical cover algorithm, why each step is necessary, and how to execute it correctly. You will work through multiple complete examples and understand the algorithm's correctness guarantees.
The canonical cover algorithm has three main phases:
Phase 1: Decompose Right-Hand Sides
Convert all FDs to have single-attribute right-hand sides. Replace X → AB with X → A and X → B.
Phase 2: Remove Extraneous Left-Hand Attributes
For each FD, check if any attribute on the left side can be removed. If so, remove it.
Phase 3: Remove Redundant FDs
Check if any FD can be derived from the others. If so, remove it.
Post-Processing: Combine FDs with Same LHS (Optional)
For readability, FDs with identical left-hand sides can be combined: X → A and X → B become X → AB.
Key Invariant:
Throughout the algorithm, the closure of the FD set remains unchanged. Each step preserves equivalence while simplifying the representation.
```
ALGORITHM: Canonical Cover (Minimal Cover)

INPUT:  F  (set of functional dependencies)
OUTPUT: Fc (canonical cover of F)

// ═══════════════════════════════════════════════
// PHASE 1: Decompose to Single-Attribute RHS
// ═══════════════════════════════════════════════
FOR each FD (X → Y) in F where |Y| > 1:
    Remove (X → Y) from F
    FOR each attribute A in Y:
        Add (X → A) to F

// ═══════════════════════════════════════════════
// PHASE 2: Remove Extraneous LHS Attributes
// ═══════════════════════════════════════════════
REPEAT:
    changed = false
    FOR each FD (X → A) in F where |X| > 1:
        FOR each attribute B in X:
            X' = X - {B}
            IF A ∈ closure(X', F):
                Replace (X → A) with (X' → A) in F
                changed = true
                BREAK inner loop
        IF changed:
            BREAK outer loop
UNTIL NOT changed

// ═══════════════════════════════════════════════
// PHASE 3: Remove Redundant FDs
// ═══════════════════════════════════════════════
REPEAT:
    changed = false
    FOR each FD (X → A) in F:
        F' = F - {X → A}
        IF A ∈ closure(X, F'):
            F = F'   // Remove the redundant FD
            changed = true
            BREAK loop
UNTIL NOT changed

RETURN F   // This is now the canonical cover Fc
```

Removing one extraneous attribute or redundant FD may make others extraneous or redundant. The REPEAT loops ensure we continue until the set is fully minimized. This is crucial for correctness.
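The pseudocode repeatedly calls `closure(X, F)`, the attribute-closure subroutine. A minimal Python sketch of that subroutine, assuming FDs are modeled as `(lhs, rhs)` pairs of frozensets (this representation and the function name are illustrative, not from the text):

```python
def closure(attrs, fds):
    """Return the set of attributes derivable from `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # An FD "fires" when its entire LHS is already in the closure.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

F = [(frozenset("A"), frozenset("B")),
     (frozenset("B"), frozenset("C")),
     (frozenset("AC"), frozenset("D"))]
print(closure({"A"}, F))  # {'A', 'B', 'C', 'D'}: A→B, B→C, then AC→D fires
```

All three phases of the algorithm reduce to membership tests against closures computed this way.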
The first phase converts all FDs to have exactly one attribute on the right-hand side. This is justified by the decomposition rule from Armstrong's axioms:
If X → YZ, then X → Y and X → Z
Conversely, the union rule allows us to combine them back:
If X → Y and X → Z, then X → YZ
These rules guarantee that decomposition preserves the closure—no information is lost.
Why Decompose First?
With a single attribute on every right-hand side, the later phases become simpler: the extraneous-attribute and redundancy tests each reduce to one membership check ("is A in this closure?") rather than a subset check over several attributes, and removing a redundant FD never accidentally discards a still-needed part of a multi-attribute RHS.
Example:
Original: {A → BCD, BC → D, D → A}
After Phase 1:
{A → B, A → C, A → D, BC → D, D → A}
| Original FD | After Decomposition |
|---|---|
| A → BC | A → B, A → C |
| AB → CDE | AB → C, AB → D, AB → E |
| X → Y (single attr) | X → Y (no change) |
| PQ → RST | PQ → R, PQ → S, PQ → T |
Unlike later phases, Phase 1 is deterministic. Every FD with multi-attribute RHS is decomposed the same way, regardless of order. The result of Phase 1 is unique.
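Phase 1 can be sketched in a few lines of Python, using the same `(lhs, rhs)` frozenset-pair representation as above (the function name `decompose` is illustrative). Note that decomposition can expose duplicates, which are dropped here:

```python
def decompose(fds):
    """Phase 1: split every FD into single-attribute-RHS FDs, dropping duplicates."""
    out = []
    for lhs, rhs in fds:
        for attr in sorted(rhs):
            fd = (lhs, frozenset(attr))
            if fd not in out:  # e.g. A→B appearing both directly and via A→BCD
                out.append(fd)
    return out

F = [(frozenset("A"), frozenset("BCD")),
     (frozenset("BC"), frozenset("D")),
     (frozenset("D"), frozenset("A"))]
for lhs, rhs in decompose(F):
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
# A -> B, A -> C, A -> D, BC -> D, D -> A
```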
Phase 2 systematically checks each attribute in the left-hand side of each FD. For each attribute, we ask: "Can we still derive the RHS without this attribute?"
The Test:
For FD X → A where B ∈ X: remove B to get X' = X - {B}, compute the closure (X')⁺ under the full F, and check whether A ∈ (X')⁺. If it is, B is extraneous, and X → A can be replaced with X' → A.
Important Details: the closure is computed against ALL of F, including X → A itself, and the phase must restart from the beginning whenever an attribute is removed.
Example Walkthrough:
F = {A → B, B → C, AC → D}
Test AC → D:
Is A extraneous? Compute C⁺ = {C}. Is D ∈ {C}? No. A is essential.
Is C extraneous? Compute A⁺ under the full F (including AC → D): start with {A}; A → B adds B; B → C adds C; now AC → D fires and adds D. So A⁺ = {A, B, C, D}, and since D ∈ A⁺, C is extraneous. Replace AC → D with A → D.
F after Phase 2: {A → B, B → C, A → D}
When testing if B is extraneous in X → A, compute (X-{B})⁺ using ALL FDs in F, including X → A. The FD itself might be triggered if the reduced LHS, combined with other FDs, produces enough attributes.
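The Phase 2 test, including the restart-after-change behavior and duplicate handling, can be sketched in Python. This reuses the `closure` helper from earlier; the representation (frozenset pairs) and function names are illustrative:

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def remove_extraneous_lhs(fds):
    """Phase 2: shrink each LHS while the RHS stays derivable.
    The closure test uses the FULL current F, including the FD being reduced."""
    fds = list(fds)
    changed = True
    while changed:
        changed = False
        for i, (lhs, rhs) in enumerate(fds):
            if len(lhs) <= 1:
                continue
            for attr in sorted(lhs):
                reduced = lhs - {attr}
                if rhs <= closure(reduced, fds):   # attr is extraneous
                    fds[i] = (reduced, rhs)
                    if fds.count(fds[i]) > 1:      # reduction may create a duplicate
                        del fds[i]
                    changed = True
                    break
            if changed:
                break   # restart the scan against the modified F
    return fds

F = [(frozenset("A"), frozenset("B")),
     (frozenset("B"), frozenset("C")),
     (frozenset("AC"), frozenset("D"))]
print(remove_extraneous_lhs(F))  # AC → D becomes A → D, matching the walkthrough
```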
DETAILED PHASE 2 ALGORITHM: changed = trueWHILE changed: changed = false FOR each (X → A) in F: IF |X| == 1: CONTINUE // Skip single-attribute LHS FOR each B in X: X' = X - {B} closure = compute_closure(X', F) // Use FULL F IF A ∈ closure: // B is extraneous Remove (X → A) from F Add (X' → A) to F changed = true BREAK both loops // Restart outer while // After Phase 2, every FD has minimal LHSPhase 3 checks whether each FD is derivable from the others. If so, the FD is redundant and should be removed.
The Test:
For FD X → A in F: form F' = F - {X → A}, compute X⁺ under F', and check whether A ∈ X⁺. If it is, X → A is redundant and can be removed.
Why This Works:
If A can be derived from X using only the other FDs, then X → A doesn't add any new derivation power. It's logically implied by the rest of F.
Example:
F = {A → B, B → C, A → C}
Test A → C:
Form F' = {A → B, B → C} and compute A⁺ under F': A → B adds B, then B → C adds C, so A⁺ = {A, B, C} and C ∈ A⁺.
A → C is redundant. Remove it.
F after: {A → B, B → C}
When testing if X → A is redundant, compute X⁺ WITHOUT including X → A. This is the opposite of Phase 2: in Phase 2, we include the target FD; in Phase 3, we exclude it.
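The Phase 3 test looks like this in Python (same illustrative frozenset-pair representation; `closure` repeated so the sketch is self-contained):

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def remove_redundant(fds):
    """Phase 3: drop any FD derivable from the others.
    Unlike Phase 2, the closure is computed WITHOUT the FD under test."""
    fds = list(fds)
    changed = True
    while changed:
        changed = False
        for fd in fds:
            lhs, rhs = fd
            rest = [g for g in fds if g != fd]   # F' = F - {X → A}
            if rhs <= closure(lhs, rest):
                fds = rest                        # removal is permanent
                changed = True
                break                             # restart against the smaller F
    return fds

F = [(frozenset("A"), frozenset("B")),
     (frozenset("B"), frozenset("C")),
     (frozenset("A"), frozenset("C"))]
print(remove_redundant(F))  # A → C is dropped, leaving A → B and B → C
```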
```
DETAILED PHASE 3 ALGORITHM:

changed = true
WHILE changed:
    changed = false
    FOR each (X → A) in F:
        F' = F - {X → A}                   // Temporarily exclude
        closure = compute_closure(X, F')   // Use F'
        IF A ∈ closure:
            // FD is redundant - make removal permanent
            F = F'
            changed = true
            BREAK loop   // Restart

// After Phase 3, no FD is derivable from others
```

Let's work through a complete example from start to finish.
Given FD Set:
F = {
A → BC,
B → C,
A → B,
AB → C
}
Schema: R(A, B, C)
PHASE 1: Decompose RHS
F = {
A → B, from A → BC
A → C, from A → BC
B → C,
A → B, duplicate! (from original A → B)
AB → C
}
Remove duplicate:
F = {A → B, A → C, B → C, AB → C}
PHASE 2: Remove Extraneous LHS Attributes
Check AB → C (only FD with |LHS| > 1):
Is A extraneous in AB → C?
Compute B⁺ under the full F: B → C gives B⁺ = {B, C}, and C ∈ B⁺, so A is extraneous. Replace AB → C with B → C. But B → C already exists! This creates a duplicate.
F = {A → B, A → C, B → C, B → C}
Remove duplicate:
F = {A → B, A → C, B → C}
No more FDs with |LHS| > 1. Phase 2 complete.
PHASE 3: Remove Redundant FDs
Test A → C: with F' = {A → B, B → C}, A⁺ under F' is {A, B, C}, so C ∈ A⁺ and A → C is redundant. Remove it. A → B and B → C each survive the same test.
Final Canonical Cover:
Fc = {A → B, B → C}
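The full three-phase pipeline can be assembled into one runnable sketch and checked against this example. Names (`canonical_cover`, the frozenset-pair representation) are illustrative; Phase 3 additionally removes A → C as redundant (derivable via A → B, B → C), leaving {A → B, B → C}:

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def canonical_cover(fds):
    # Phase 1: single-attribute RHS, duplicates dropped.
    F = []
    for lhs, rhs in fds:
        for a in sorted(rhs):
            fd = (lhs, frozenset(a))
            if fd not in F:
                F.append(fd)
    # Phase 2: remove extraneous LHS attributes (closure uses the FULL F).
    changed = True
    while changed:
        changed = False
        for i, (lhs, rhs) in enumerate(F):
            for a in sorted(lhs):
                if len(lhs) > 1 and rhs <= closure(lhs - {a}, F):
                    F[i] = (lhs - {a}, rhs)
                    if F.count(F[i]) > 1:   # reduction produced a duplicate
                        del F[i]
                    changed = True
                    break
            if changed:
                break   # restart after any modification
    # Phase 3: remove redundant FDs (closure uses F' = F - {fd}).
    changed = True
    while changed:
        changed = False
        for fd in F:
            rest = [g for g in F if g != fd]
            if fd[1] <= closure(fd[0], rest):
                F = rest
                changed = True
                break   # restart against the smaller F
    return F

# The worked example: F = {A → BC, B → C, A → B, AB → C}
F = [(frozenset("A"), frozenset("BC")),
     (frozenset("B"), frozenset("C")),
     (frozenset("A"), frozenset("B")),
     (frozenset("AB"), frozenset("C"))]
for lhs, rhs in canonical_cover(F):
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
# A -> B
# B -> C
```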
Why does this algorithm produce a valid minimal cover? Let's examine the correctness guarantees.
Theorem: The algorithm terminates with a valid minimal cover.
Proof Sketch:
Part 1: Termination
Each phase can only decrease the "size" of F: Phase 1 runs once; Phase 2 only removes attributes from left-hand sides (and possibly duplicate FDs); Phase 3 only removes FDs.
Since there are finitely many attributes and FDs, the algorithm must terminate.
Part 2: Equivalence Preservation
Each transformation preserves the closure:
Decomposition: X → AB ≡ {X → A, X → B}, by the decomposition and union rules.
Extraneous-attribute removal: if A is extraneous in XA → Y, then X → Y is derivable from F, so removing A preserves the closure.
Redundant-FD removal: the removed FD is, by the test, derivable from the remaining FDs, so the closure is unchanged.
Part 3: Minimality
The result satisfies all minimal cover conditions: every RHS is a single attribute, no LHS contains an extraneous attribute, and no FD is derivable from the others.
Therefore, the output is a valid minimal cover of the input F.
The algorithm has polynomial time complexity. For n FDs with at most a attributes per FD over a schema of m attributes: Phase 1 is O(n·a); Phase 2 is O(n²·a·closure_cost); Phase 3 is O(n²·closure_cost). Closure computation is O(n·a), making overall complexity O(n³·a²).
Let's work through a more complex example that exercises all parts of the algorithm.
Given FD Set:
F = {
AB → C,
C → A,
BC → D,
ACD → B,
D → EG,
BE → C,
CG → BD,
CE → AG
}
Schema: R(A, B, C, D, E, G)
PHASE 1: Decompose RHS
F = {
AB → C,
C → A,
BC → D,
ACD → B,
D → E,
D → G,
BE → C,
CG → B,
CG → D,
CE → A,
CE → G
}
(11 FDs after decomposition)
PHASE 2: Remove Extraneous LHS Attributes
Test each FD with |LHS| > 1:
AB → C:
Is A extraneous? B⁺ = {B}; C ∉ B⁺. Is B extraneous? A⁺ = {A}; C ∉ A⁺. Both attributes are essential.
BC → D:
Is B extraneous? C⁺ = {C, A}; D ∉ C⁺. Is C extraneous? B⁺ = {B}; D ∉ B⁺. Both essential.
ACD → B:
Is A extraneous? Compute CD⁺ under the full F: C → A adds A, D → E and D → G add E and G, then CG → B fires and adds B. So B ∈ CD⁺ and A is extraneous. Replace ACD → B with CD → B.
Restart Phase 2 with modified F...
CD → B: (newly modified)
Is C extraneous? D⁺ = {D, E, G}; B ∉. Is D extraneous? C⁺ = {C, A}; B ∉. Both essential.
BE → C:
Is B extraneous? E⁺ = {E}; C ∉. Is E extraneous? B⁺ = {B}; C ∉. Both essential.
CG → B, CG → D:
Is C extraneous? G⁺ = {G}. Is G extraneous? C⁺ = {C, A}. Neither closure contains B or D, so both attributes are essential in both FDs.
CE → A, CE → G:
Is C extraneous in CE → A? Compute E⁺ = {E}. A ∉ {E}. No.
Is E extraneous in CE → A? Compute C⁺ = {C, A}. A ∈ C⁺, so yes.
E is extraneous. Replace CE → A with C → A.
But C → A already exists! Remove duplicate.
For CE → G: C⁺ = {C, A} does not contain G, and E⁺ = {E} does not contain G, so both attributes are essential.
After Phase 2:
F = {
AB → C,
C → A,
BC → D,
CD → B,
D → E,
D → G,
BE → C,
CG → B,
CG → D,
CE → G
}
(10 FDs)
PHASE 3: Remove Redundant FDs
Test each FD for redundancy:
AB → C:
Compute AB⁺ under F' = F - {AB → C}: AB⁺ = {A, B}; C ∉. Not redundant.
C → A: Not redundant (only source of A from C).
BC → D:
BC⁺ under F' = {B, C, A}; D ∉. Not redundant.
CD → B:
CD⁺ under F' = F - {CD → B}: C → A, D → E, D → G give {C, D, A, E, G}, then CG → B fires, so B ∈ CD⁺. CD → B is redundant. Remove it.
Restart Phase 3...
CG → B:
CG⁺ under F' = {C, G, A, D, E} (via C → A, CG → D, D → E); B ∉. Not redundant.
CG → D:
CG⁺ under F': CG → B adds B, C → A adds A, then BC → D fires, so D ∈ CG⁺. CG → D is redundant. Remove it.
Continue rechecking the remaining FDs; none of them is derivable from the others, so no further removals occur.
Final Canonical Cover:
Fc = {
AB → C,
C → A,
BC → D,
D → E,
D → G,
BE → C,
CG → B,
CE → G
}
(8 FDs, down from 11 after decomposition)
Implementing the canonical cover algorithm correctly requires attention to subtle details. Here are common mistakes and how to avoid them.
Pitfall 1: Not Restarting After Changes
After removing an extraneous attribute or redundant FD, the FD set changes. Previous tests may now give different results. Always restart the relevant phase after any modification.
Pitfall 2: Wrong F for Closure Computation
Phase 2 computes the closure against the FULL current F (including the FD being reduced); Phase 3 computes it against F' = F - {X → A} (excluding the FD being tested). Mixing these up produces incorrect results.
Pitfall 3: Forgetting to Handle Duplicates
Phase 2 can create duplicate FDs:
- AB → C becomes B → C, but B → C already exists
- CD → A becomes C → A, but C → A already exists
Always check for and remove duplicates.
Pitfall 4: Empty LHS
If an FD's LHS becomes empty (all attributes extraneous), this indicates a constant—an attribute that has only one possible value. Handle this as a special case.
Pitfall 5: Confusing Equivalence with Equality
Two minimal covers may look different but be equivalent. Don't expect a unique answer—different processing orders yield different valid results.
| Pitfall | Incorrect Behavior | Correct Behavior |
|---|---|---|
| Not Restarting | Continue testing with stale F | Restart phase after each change |
| Wrong F for Closure | Use F for Phase 3 closure | Use F' = F - {FD} for Phase 3 |
| Ignoring Duplicates | Multiple identical FDs | Remove duplicates immediately |
| Empty LHS | Crash or undefined behavior | Flag as constant attribute |
| Expecting Uniqueness | Reject valid minimal covers | Accept equivalent variations |
Always verify that the output is equivalent to the input by checking both directions: every FD in the input should be derivable from the output, and every FD in the output should be derivable from the input. This catches most implementation bugs.
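This verification can be sketched directly from the definition of equivalence (both covering checks), reusing the closure helper; names are illustrative:

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def equivalent(F, G):
    """F ≡ G iff every FD of each set follows from the other set."""
    def covers(src, tgt):
        return all(rhs <= closure(lhs, src) for lhs, rhs in tgt)
    return covers(F, G) and covers(G, F)

F  = [(frozenset("A"), frozenset("BC")), (frozenset("B"), frozenset("C"))]
Fc = [(frozenset("A"), frozenset("B")), (frozenset("B"), frozenset("C"))]
print(equivalent(F, Fc))  # True: A → C is derivable in Fc via A → B, B → C
```

Because equivalence is checked via closures rather than syntactic equality, this test also accepts the different-but-equivalent minimal covers mentioned in Pitfall 5.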
We've covered the complete canonical cover algorithm in detail. Let's consolidate the key points: decompose to single-attribute right-hand sides first; remove extraneous LHS attributes using closures over the full F; remove redundant FDs using closures over F minus the FD under test; restart the relevant phase after every change; and remember that the result is equivalent to the input but not necessarily unique.
What's Next:
The next page explores the concept of removing redundancy more deeply—examining not just how to remove it, but why redundancy arises in practical database design scenarios and strategies for preventing it upfront.
You now understand the complete canonical cover algorithm and can execute it correctly on any FD set. You understand the correctness guarantees and common implementation pitfalls.