Attribute Closure - Learning Module

Loading content...

0/241

Finding Keys from FDs

The Key Discovery Problem

Every database table should have a primary key, but how do you determine what that key should be? In simple cases, it's obvious—an EmployeeID uniquely identifies an employee. But in complex schemas with multiple interrelated attributes and dozens of functional dependencies, finding all possible keys becomes a sophisticated analytical challenge.

The key discovery problem asks: Given a relation R and a set of functional dependencies F, what are all the candidate keys of R? This question is fundamental to database design because:

Normalization requires understanding which attributes are part of keys (prime vs. non-prime)
Physical design decisions depend on key selection
Referential integrity constraints must reference valid keys
Database decomposition must preserve keys across resulting relations

What You Will Learn

By the end of this page, you will master systematic techniques for finding candidate keys from functional dependencies. You'll learn the attribute categorization approach, understand why certain attributes must always be in every key, and be able to find all candidate keys of any relation—not just one.

Key Concepts Review

Before diving into key discovery algorithms, let's ensure our definitions are crystal clear. These terms have precise meanings that are essential for correct reasoning.

Superkey: A set of attributes K is a superkey of relation R if K functionally determines all attributes of R. Formally:

K is a superkey ⟺ K⁺ = R (the closure of K is the entire relation)

Candidate Key: A candidate key is a minimal superkey—no proper subset of it is also a superkey. Formally:

K is a candidate key ⟺ K is a superkey AND for all proper subsets K' ⊂ K, K' is not a superkey

Primary Key: The primary key is the candidate key chosen by the database designer to be the main identifier. All other candidate keys become alternate keys.

Prime Attribute: An attribute is prime if it belongs to at least one candidate key.

Non-Prime Attribute: An attribute is non-prime if it does not belong to any candidate key.

Relationship Between Key Concepts
Concept	Definition	Example (R = ABCDE, key = AB)
Superkey	Determines all attributes	{AB}, {ABC}, {ABCD}, {ABE}, {ABCDE}
Candidate Key	Minimal superkey	{AB} (if no subset works)
Prime Attribute	In some candidate key	A, B
Non-Prime Attribute	In no candidate key	C, D, E

The Closure Test

Testing if K is a superkey is straightforward: compute K⁺. If K⁺ contains all attributes of R, K is a superkey. For testing if K is a candidate key, also verify that no proper subset of K has this property.

The Attribute Categorization Strategy

The most powerful technique for finding candidate keys is to categorize attributes based on their role in functional dependencies. This categorization dramatically reduces the search space.

Four Categories of Attributes:

L (Left-only): Appears only on the left side (determinant) of FDs, never on the right (dependent)
R (Right-only): Appears only on the right side (dependent) of FDs, never on the left
B (Both): Appears on both left and right sides in different FDs
N (Neither): Does not appear in any FD (neither side)

Attribute Category Properties
Category	In Determinant?	In Dependent?	Key Membership
L (Left-only)	Yes	No	MUST be in every key
R (Right-only)	No	Yes	NEVER in any key
B (Both)	Yes	Yes	MAY or may not be in key
N (Neither)	No	No	MUST be in every key

Why These Rules?

• L attributes can determine others but nothing determines them, so they must be given (in the key). • R attributes are determined by other attributes, so including them in a key is unnecessary. • N attributes are not involved in any FD, so they can only be known by being explicitly given (in the key). • B attributes are the candidates for optional inclusion—we must test whether adding them helps.

Attribute Categorization ExampleR(A, B, C, D, E, F) with F = {A → B, BC → D, D → E}

Input

Categorize all attributes

Output

L = {A}, R = {E}, B = {B, C, D}, N = {F}

Explanation

Analyzing each attribute:

• A: Left of A→B, never on right side → L (left-only) • B: Right of A→B, left of BC→D → B (both) • C: Left of BC→D, never on right side → L (left-only)... wait, C is also B?

Let me recategorize: • A: determinant in A→B only → L (left-only) • B: dependent in A→B, determinant in BC→D → B (both) • C: determinant in BC→D only → L (left-only) • D: dependent in BC→D, determinant in D→E → B (both) • E: dependent in D→E only → R (right-only) • F: not in any FD → N (neither)

Actual categories: • L = {A, C} • R = {E} • B = {B, D} • N = {F}

Must be in every key: A, C, F Never in any key: E May or may not be in key: B, D

Systematic Key Finding Algorithm

Using attribute categorization, we can formulate a systematic algorithm to find all candidate keys.

Algorithm: FIND-CANDIDATE-KEYS(R, F)

Input:  R — relation schema (set of attributes)
        F — set of functional dependencies
Output: All candidate keys of R

1. Categorize attributes into L, R, B, N sets

2. Base = L ∪ N (attributes that must be in every key)

3. If Base⁺ = R:
      Return {Base} as the only candidate key

4. Else:
      Find all minimal sets X ⊆ B such that (Base ∪ X)⁺ = R
      Return all such (Base ∪ X) as candidate keys

Key Insight: We only need to explore subsets of B (the "both" category). L, N are always included; R is never included. This dramatically reduces the search space from 2^n to 2^|B|.

Search Strategy

When exploring subsets of B, use a level-wise search:

First, try empty set (just Base)
If that doesn't work, try all single-element subsets of B
Then two-element subsets, and so on

This guarantees finding minimal keys first. Once you find a key, any superset is a superkey (not minimal), so you can prune it.

find_candidate_keys
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
from itertools import combinations
 
def find_candidate_keys(R: set, F: list[tuple[set, set]]) -> list[set]:
    """
    Find all candidate keys of relation R under FDs F.
    
    Args:
        R: Set of all attributes in the relation
        F: List of FDs as (determinant, dependent) tuples
    
    Returns:
        List of candidate keys (each key is a set of attributes)
    """
    # Step 1: Categorize attributes
    left_attrs = set()
    right_attrs = set()
    
    for det, dep in F:
        left_attrs |= det
        right_attrs |= dep
    
    L = left_attrs - right_attrs       # Left-only
    R_only = right_attrs - left_attrs  # Right-only  
    B = left_attrs & right_attrs       # Both
    N = R - (left_attrs | right_attrs) # Neither
    
    # Step 2: Compute base (must be in every key)
    base = L | N
    
    # Step 3: Check if base alone is a key
    if compute_closure(base, F) == R:
        return [base]
    
    # Step 4: Search through subsets of B
    candidate_keys = []
    B_list = list(B)
    
    # Level-wise search: smallest subsets first
    for size in range(1, len(B) + 1):
        for subset in combinations(B_list, size):
            candidate = base | set(subset)
            
            # Skip if superset of known candidate key
            if any(ck <= candidate and ck != candidate 
                   for ck in candidate_keys):
                continue
            
            # Check if this is a superkey
            if compute_closure(candidate, F) == R:
                candidate_keys.append(candidate)
    
    return candidate_keys
 
 
# Example usage
R = {'A', 'B', 'C', 'D', 'E', 'F'}
F = [
    ({'A'}, {'B'}),      # A → B
    ({'B', 'C'}, {'D'}), # BC → D
    ({'D'}, {'E'}),      # D → E
]
 
keys = find_candidate_keys(R, F)
print("Candidate keys:", [sorted(k) for k in keys])
# Output: Candidate keys: [['A', 'C', 'F']]

Complete Worked Example

Let's work through a comprehensive example that has multiple candidate keys.

Problem:

Relation: R(A, B, C, D, E)

Functional Dependencies:

F1: AB → C
F2: C → D
F3: D → E
F4: E → A

Step 1: Categorize Attributes

Attribute	Left side of	Right side of	Category
A	AB→C	E→A	B (Both)
B	AB→C	—	L (Left-only)
C	C→D	AB→C	B (Both)
D	D→E	C→D	B (Both)
E	E→A	D→E	B (Both)

Categories:

L = {B} (must be in every key)
R = ∅ (no right-only attributes)
B = {A, C, D, E}
N = ∅ (all attributes appear in some FD)

Base = L ∪ N = {B}

Step 2: Test Base Alone

Compute {B}⁺:

Start: {B}
No FD has just B on left (AB→C needs A too)
Result: {B}⁺ = {B}

Since {B}⁺ ≠ R, we need to add something from B = {A, C, D, E}.

Testing Base + Single Attributes from B
Candidate	Closure Computation	Is Superkey?
{B, A}	{BA}⁺: BA→C⊕, C→D⊕, D→E⊕ → {A,B,C,D,E}	✓ YES
{B, C}	{BC}⁺: C→D⊕, D→E⊕, E→A⊕, AB→C(already) → {A,B,C,D,E}	✓ YES
{B, D}	{BD}⁺: D→E⊕, E→A⊕, AB→C⊕, C→D(already) → {A,B,C,D,E}	✓ YES
{B, E}	{BE}⁺: E→A⊕, AB→C⊕, C→D⊕, D→E(already) → {A,B,C,D,E}	✓ YES

Result: Four Candidate Keys

All four combinations {B, A}, {B, C}, {B, D}, {B, E} are superkeys. Since they're all the same size (2 attributes) and no single attribute is a superkey, they are all candidate keys:

Candidate Keys: {AB}, {BC}, {BD}, {BE}

We don't need to test larger subsets (like {ABC}) because we've already found keys of size 2, making any superset a non-minimal superkey.

Why does this relation have so many candidate keys?

The cycle E → A, AB → C, C → D, D → E creates equivalences. Once you have B plus any attribute in the cycle, you can derive all others through the circular dependencies. This is characteristic of relations with cyclic FD structures.

Example: Relation with Unique Candidate Key

Let's examine a case with a single candidate key.

Problem:

Relation: R(A, B, C, D, E, F)

Functional Dependencies:

F1: A → B
F2: A → C
F3: C → D
F4: D → E

Step 1: Categorize Attributes

Attribute Categorization
Attribute	Appears on Left?	Appears on Right?	Category
A	Yes (A→B, A→C)	No	L (Left-only)
B	No	Yes (A→B)	R (Right-only)
C	Yes (C→D)	Yes (A→C)	B (Both)
D	Yes (D→E)	Yes (C→D)	B (Both)
E	No	Yes (D→E)	R (Right-only)
F	No	No	N (Neither)

Categories:

L = {A}
R = {B, E}
B = {C, D}
N = {F}

Base = L ∪ N = {A, F}

Step 2: Test Base Alone

Compute {A, F}⁺:

Start: {A, F}
A → B: add B → {A, B, F}
A → C: add C → {A, B, C, F}
C → D: add D → {A, B, C, D, F}
D → E: add E → {A, B, C, D, E, F}

Result: {A, F}⁺ = {A, B, C, D, E, F} = R ✓

Conclusion: Base alone is a superkey, so {A, F} is the only candidate key.

Understanding the Result

A must be in every key because nothing determines it. F must be in every key because it's not involved in any FD (neither side). Together, A and F are sufficient because A triggers a chain A→B, A→C, C→D, D→E that covers everything else.

The attributes C and D (category B) are not needed in the key because A already determines them.

Handling Complex Cases

Some relations present more challenging key discovery scenarios. Let's examine strategies for handling them.

Case 1: Large B Category

When many attributes are in category B, the number of subsets to test can become large (2^|B|). Strategies to manage this:

Level-wise search with pruning: Test smallest subsets first. Once you find a key of size k, don't test supersets of found keys.
Compute closure incrementally: If {Base, X}⁺ ⊂ R, then adding some attributes from B might still not help. Use partial closure info to prune.
Focus on essential attributes: An attribute X ∈ B might be essential (must be in every key even though it's in B) if it's the only way to reach certain attributes.

Large B Category ExampleR(A, B, C, D, E) with F = {A → B, B → A, C → D, D → C, E → E}

Input

Find candidate keys

Output

Candidate keys: {A, C, E}, {A, D, E}, {B, C, E}, {B, D, E}

Explanation

Analysis: • L = ∅ (all determinant attrs also appear on right) • R = ∅ (all dependent attrs also appear on left) • B = {A, B, C, D, E} (all appear on both sides) • N = ∅

Base = ∅, so we must test subsets of B = all attributes!

Key observations: • A and B determine each other (A→B, B→A), so either works • C and D determine each other (C→D, D→C), so either works
• E→E is trivial; E must be explicitly given

Every key needs: one of {A,B} + one of {C,D} + E This gives 2×2×1 = 4 candidate keys.

Case 2: Empty Left-Only Category

When L = ∅, every determinant attribute also appears as a dependent somewhere. This often indicates cyclic dependencies. The algorithm still works:

Base = N (neither attributes only)
If N = ∅, base is empty, and we must find minimal subsets of B that close to R
Cycles in FDs create equivalent keys (as seen above)

Exponential Worst Case

In the worst case (all attributes in category B), we might need to check 2^n candidate sets. However, this is rare in practice. Real-world schemas typically have clear L and N attributes that form most of the key, with only a few B attributes to test.

Verification and Common Mistakes

Finding keys correctly requires careful reasoning. Here are common mistakes to avoid and verification strategies to ensure correctness.

Common Mistakes

•Missing N attributes: Forgetting that attributes not in any FD must be in every key
•Stopping at first key: Finding one candidate key but missing others
•Including R attributes: Putting right-only attributes in proposed keys
•Wrong categorization: Miscounting which side an attribute appears on
•Closure errors: Making arithmetic mistakes in closure computation
•Non-minimal keys: Proposing superkeys that aren't minimal

Verification Checklist

•Superkey test: For each proposed key K, verify K⁺ = R
•Minimality test: For each proper subset K' of K, verify K'⁺ ≠ R
•L ∪ N subset: Every candidate key must contain all L ∪ N attributes
•No R attributes: No candidate key should contain R-only attributes
•Cross-check: If you found multiple keys, ensure none is a subset of another
•Count check: Compare your key count against expected (often 1-4 keys)

Spotting the MistakeA student claims {B, C} is a candidate key for R(A, B, C, D) with F = {A → B, B → C, C → D}

Input

Verify or refute this claim

Output

INCORRECT — {B, C} is not even a superkey

Explanation

Let's check:

{B, C}⁺: • Start: {B, C} • A → B: A ⊆ {B,C}? No • B → C: B ⊆ {B,C}? Yes, but C already present • C → D: C ⊆ {B,C}? Yes, add D → {B, C, D}

{B, C}⁺ = {B, C, D} ≠ R (missing A)

The student's error: ignoring that A is in category L (left-only) and MUST be in every key.

Correct key: {A} since {A}⁺ = {A, B, C, D} = R

Summary: Finding Keys Systematically

Finding candidate keys from functional dependencies is a fundamental skill in database design. Let's consolidate the key techniques:

Key Takeaways

•Superkey Test — K is a superkey if K⁺ = R. Use closure computation to verify.
•Candidate Key — A minimal superkey. No proper subset is also a superkey.
•Attribute Categorization — Classify attributes as L (left-only), R (right-only), B (both), N (neither).
•L and N are mandatory — Every candidate key must contain all L ∪ N attributes.
•R is forbidden — No candidate key contains R-only attributes (they're always derivable).
•Search B systematically — Test subsets of B from smallest to largest, with pruning.
•Multiple keys are possible — Cyclic dependencies often create multiple equivalent keys.

The Algorithm in Short:

Categorize: L, R, B, N
Base = L ∪ N
If Base⁺ = R, done (Base is the only key)
Else, find all minimal subsets X ⊆ B where (Base ∪ X)⁺ = R

What's Next:

Now that we can find keys, the next page explores superkey identification in more depth—understanding the structure of all superkeys, how they relate to candidate keys, and why this matters for schema design.

Page Complete

You now have a systematic method for finding all candidate keys of a relation from its functional dependencies. This skill is essential for normalization analysis, where distinguishing prime from non-prime attributes determines the normal form. Next, we examine superkey identification in detail.

Finding Keys from FDs

The Key Discovery Problem

The key discovery problem asks: Given a relation R and a set of functional dependencies F, what are all the candidate keys of R? This question is fundamental to database design because:

Normalization requires understanding which attributes are part of keys (prime vs. non-prime)
Physical design decisions depend on key selection
Referential integrity constraints must reference valid keys
Database decomposition must preserve keys across resulting relations

What You Will Learn

Key Concepts Review

Before diving into key discovery algorithms, let's ensure our definitions are crystal clear. These terms have precise meanings that are essential for correct reasoning.

Superkey: A set of attributes K is a superkey of relation R if K functionally determines all attributes of R. Formally:

K is a superkey ⟺ K⁺ = R (the closure of K is the entire relation)

Candidate Key: A candidate key is a minimal superkey—no proper subset of it is also a superkey. Formally:

K is a candidate key ⟺ K is a superkey AND for all proper subsets K' ⊂ K, K' is not a superkey

Primary Key: The primary key is the candidate key chosen by the database designer to be the main identifier. All other candidate keys become alternate keys.

Prime Attribute: An attribute is prime if it belongs to at least one candidate key.

Non-Prime Attribute: An attribute is non-prime if it does not belong to any candidate key.

Relationship Between Key Concepts
Concept	Definition	Example (R = ABCDE, key = AB)
Superkey	Determines all attributes	{AB}, {ABC}, {ABCD}, {ABE}, {ABCDE}
Candidate Key	Minimal superkey	{AB} (if no subset works)
Prime Attribute	In some candidate key	A, B
Non-Prime Attribute	In no candidate key	C, D, E

The Closure Test

The Attribute Categorization Strategy

The most powerful technique for finding candidate keys is to categorize attributes based on their role in functional dependencies. This categorization dramatically reduces the search space.

Four Categories of Attributes:

L (Left-only): Appears only on the left side (determinant) of FDs, never on the right (dependent)
R (Right-only): Appears only on the right side (dependent) of FDs, never on the left
B (Both): Appears on both left and right sides in different FDs
N (Neither): Does not appear in any FD (neither side)

Attribute Category Properties
Category	In Determinant?	In Dependent?	Key Membership
L (Left-only)	Yes	No	MUST be in every key
R (Right-only)	No	Yes	NEVER in any key
B (Both)	Yes	Yes	MAY or may not be in key
N (Neither)	No	No	MUST be in every key

Why These Rules?

Attribute Categorization ExampleR(A, B, C, D, E, F) with F = {A → B, BC → D, D → E}

Input

Categorize all attributes

Output

L = {A}, R = {E}, B = {B, C, D}, N = {F}

Explanation

Analyzing each attribute:

• A: Left of A→B, never on right side → L (left-only) • B: Right of A→B, left of BC→D → B (both) • C: Left of BC→D, never on right side → L (left-only)... wait, C is also B?

Actual categories: • L = {A, C} • R = {E} • B = {B, D} • N = {F}

Must be in every key: A, C, F Never in any key: E May or may not be in key: B, D

Systematic Key Finding Algorithm

Using attribute categorization, we can formulate a systematic algorithm to find all candidate keys.

Algorithm: FIND-CANDIDATE-KEYS(R, F)

Input:  R — relation schema (set of attributes)
        F — set of functional dependencies
Output: All candidate keys of R

1. Categorize attributes into L, R, B, N sets

2. Base = L ∪ N (attributes that must be in every key)

3. If Base⁺ = R:
      Return {Base} as the only candidate key

4. Else:
      Find all minimal sets X ⊆ B such that (Base ∪ X)⁺ = R
      Return all such (Base ∪ X) as candidate keys

Key Insight: We only need to explore subsets of B (the "both" category). L, N are always included; R is never included. This dramatically reduces the search space from 2^n to 2^|B|.

Search Strategy

When exploring subsets of B, use a level-wise search:

First, try empty set (just Base)
If that doesn't work, try all single-element subsets of B
Then two-element subsets, and so on

This guarantees finding minimal keys first. Once you find a key, any superset is a superkey (not minimal), so you can prune it.

find_candidate_keys
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
from itertools import combinations
 
def find_candidate_keys(R: set, F: list[tuple[set, set]]) -> list[set]:
    """
    Find all candidate keys of relation R under FDs F.
    
    Args:
        R: Set of all attributes in the relation
        F: List of FDs as (determinant, dependent) tuples
    
    Returns:
        List of candidate keys (each key is a set of attributes)
    """
    # Step 1: Categorize attributes
    left_attrs = set()
    right_attrs = set()
    
    for det, dep in F:
        left_attrs |= det
        right_attrs |= dep
    
    L = left_attrs - right_attrs       # Left-only
    R_only = right_attrs - left_attrs  # Right-only  
    B = left_attrs & right_attrs       # Both
    N = R - (left_attrs | right_attrs) # Neither
    
    # Step 2: Compute base (must be in every key)
    base = L | N
    
    # Step 3: Check if base alone is a key
    if compute_closure(base, F) == R:
        return [base]
    
    # Step 4: Search through subsets of B
    candidate_keys = []
    B_list = list(B)
    
    # Level-wise search: smallest subsets first
    for size in range(1, len(B) + 1):
        for subset in combinations(B_list, size):
            candidate = base | set(subset)
            
            # Skip if superset of known candidate key
            if any(ck <= candidate and ck != candidate 
                   for ck in candidate_keys):
                continue
            
            # Check if this is a superkey
            if compute_closure(candidate, F) == R:
                candidate_keys.append(candidate)
    
    return candidate_keys
 
 
# Example usage
R = {'A', 'B', 'C', 'D', 'E', 'F'}
F = [
    ({'A'}, {'B'}),      # A → B
    ({'B', 'C'}, {'D'}), # BC → D
    ({'D'}, {'E'}),      # D → E
]
 
keys = find_candidate_keys(R, F)
print("Candidate keys:", [sorted(k) for k in keys])
# Output: Candidate keys: [['A', 'C', 'F']]

Complete Worked Example

Let's work through a comprehensive example that has multiple candidate keys.

Problem:

Relation: R(A, B, C, D, E)

Functional Dependencies:

F1: AB → C
F2: C → D
F3: D → E
F4: E → A

Step 1: Categorize Attributes

Attribute	Left side of	Right side of	Category
A	AB→C	E→A	B (Both)
B	AB→C	—	L (Left-only)
C	C→D	AB→C	B (Both)
D	D→E	C→D	B (Both)
E	E→A	D→E	B (Both)

Categories:

L = {B} (must be in every key)
R = ∅ (no right-only attributes)
B = {A, C, D, E}
N = ∅ (all attributes appear in some FD)

Base = L ∪ N = {B}

Step 2: Test Base Alone

Compute {B}⁺:

Start: {B}
No FD has just B on left (AB→C needs A too)
Result: {B}⁺ = {B}

Since {B}⁺ ≠ R, we need to add something from B = {A, C, D, E}.

Testing Base + Single Attributes from B
Candidate	Closure Computation	Is Superkey?
{B, A}	{BA}⁺: BA→C⊕, C→D⊕, D→E⊕ → {A,B,C,D,E}	✓ YES
{B, C}	{BC}⁺: C→D⊕, D→E⊕, E→A⊕, AB→C(already) → {A,B,C,D,E}	✓ YES
{B, D}	{BD}⁺: D→E⊕, E→A⊕, AB→C⊕, C→D(already) → {A,B,C,D,E}	✓ YES
{B, E}	{BE}⁺: E→A⊕, AB→C⊕, C→D⊕, D→E(already) → {A,B,C,D,E}	✓ YES

Result: Four Candidate Keys

All four combinations {B, A}, {B, C}, {B, D}, {B, E} are superkeys. Since they're all the same size (2 attributes) and no single attribute is a superkey, they are all candidate keys:

Candidate Keys: {AB}, {BC}, {BD}, {BE}

We don't need to test larger subsets (like {ABC}) because we've already found keys of size 2, making any superset a non-minimal superkey.

Why does this relation have so many candidate keys?

Example: Relation with Unique Candidate Key

Let's examine a case with a single candidate key.

Problem:

Relation: R(A, B, C, D, E, F)

Functional Dependencies:

F1: A → B
F2: A → C
F3: C → D
F4: D → E

Step 1: Categorize Attributes

Attribute Categorization
Attribute	Appears on Left?	Appears on Right?	Category
A	Yes (A→B, A→C)	No	L (Left-only)
B	No	Yes (A→B)	R (Right-only)
C	Yes (C→D)	Yes (A→C)	B (Both)
D	Yes (D→E)	Yes (C→D)	B (Both)
E	No	Yes (D→E)	R (Right-only)
F	No	No	N (Neither)

Categories:

L = {A}
R = {B, E}
B = {C, D}
N = {F}

Base = L ∪ N = {A, F}

Step 2: Test Base Alone

Compute {A, F}⁺:

Start: {A, F}
A → B: add B → {A, B, F}
A → C: add C → {A, B, C, F}
C → D: add D → {A, B, C, D, F}
D → E: add E → {A, B, C, D, E, F}

Result: {A, F}⁺ = {A, B, C, D, E, F} = R ✓

Conclusion: Base alone is a superkey, so {A, F} is the only candidate key.

Understanding the Result

The attributes C and D (category B) are not needed in the key because A already determines them.

Handling Complex Cases

Some relations present more challenging key discovery scenarios. Let's examine strategies for handling them.

Case 1: Large B Category

When many attributes are in category B, the number of subsets to test can become large (2^|B|). Strategies to manage this:

Level-wise search with pruning: Test smallest subsets first. Once you find a key of size k, don't test supersets of found keys.
Compute closure incrementally: If {Base, X}⁺ ⊂ R, then adding some attributes from B might still not help. Use partial closure info to prune.
Focus on essential attributes: An attribute X ∈ B might be essential (must be in every key even though it's in B) if it's the only way to reach certain attributes.

Large B Category ExampleR(A, B, C, D, E) with F = {A → B, B → A, C → D, D → C, E → E}

Input

Find candidate keys

Output

Candidate keys: {A, C, E}, {A, D, E}, {B, C, E}, {B, D, E}

Explanation

Analysis: • L = ∅ (all determinant attrs also appear on right) • R = ∅ (all dependent attrs also appear on left) • B = {A, B, C, D, E} (all appear on both sides) • N = ∅

Base = ∅, so we must test subsets of B = all attributes!

Key observations: • A and B determine each other (A→B, B→A), so either works • C and D determine each other (C→D, D→C), so either works
• E→E is trivial; E must be explicitly given

Every key needs: one of {A,B} + one of {C,D} + E This gives 2×2×1 = 4 candidate keys.

Case 2: Empty Left-Only Category

When L = ∅, every determinant attribute also appears as a dependent somewhere. This often indicates cyclic dependencies. The algorithm still works:

Base = N (neither attributes only)
If N = ∅, base is empty, and we must find minimal subsets of B that close to R
Cycles in FDs create equivalent keys (as seen above)

Exponential Worst Case

Verification and Common Mistakes

Finding keys correctly requires careful reasoning. Here are common mistakes to avoid and verification strategies to ensure correctness.

Common Mistakes

•Missing N attributes: Forgetting that attributes not in any FD must be in every key
•Stopping at first key: Finding one candidate key but missing others
•Including R attributes: Putting right-only attributes in proposed keys
•Wrong categorization: Miscounting which side an attribute appears on
•Closure errors: Making arithmetic mistakes in closure computation
•Non-minimal keys: Proposing superkeys that aren't minimal

Verification Checklist

•Superkey test: For each proposed key K, verify K⁺ = R
•Minimality test: For each proper subset K' of K, verify K'⁺ ≠ R
•L ∪ N subset: Every candidate key must contain all L ∪ N attributes
•No R attributes: No candidate key should contain R-only attributes
•Cross-check: If you found multiple keys, ensure none is a subset of another
•Count check: Compare your key count against expected (often 1-4 keys)

Spotting the MistakeA student claims {B, C} is a candidate key for R(A, B, C, D) with F = {A → B, B → C, C → D}

Input

Verify or refute this claim

Output

INCORRECT — {B, C} is not even a superkey

Explanation

Let's check:

{B, C}⁺: • Start: {B, C} • A → B: A ⊆ {B,C}? No • B → C: B ⊆ {B,C}? Yes, but C already present • C → D: C ⊆ {B,C}? Yes, add D → {B, C, D}

{B, C}⁺ = {B, C, D} ≠ R (missing A)

The student's error: ignoring that A is in category L (left-only) and MUST be in every key.

Correct key: {A} since {A}⁺ = {A, B, C, D} = R

Summary: Finding Keys Systematically

Finding candidate keys from functional dependencies is a fundamental skill in database design. Let's consolidate the key techniques:

Key Takeaways

•Superkey Test — K is a superkey if K⁺ = R. Use closure computation to verify.
•Candidate Key — A minimal superkey. No proper subset is also a superkey.
•Attribute Categorization — Classify attributes as L (left-only), R (right-only), B (both), N (neither).
•L and N are mandatory — Every candidate key must contain all L ∪ N attributes.
•R is forbidden — No candidate key contains R-only attributes (they're always derivable).
•Search B systematically — Test subsets of B from smallest to largest, with pruning.
•Multiple keys are possible — Cyclic dependencies often create multiple equivalent keys.

The Algorithm in Short:

Categorize: L, R, B, N
Base = L ∪ N
If Base⁺ = R, done (Base is the only key)
Else, find all minimal subsets X ⊆ B where (Base ∪ X)⁺ = R

What's Next:

Page Complete