Normalization Problems - Learning Module

Loading content...

0/241

Key Finding

The Foundation of Every Normalization Decision

Throughout our normalization journey, we've repeatedly needed to know what the candidate keys are. Normal form analysis requires distinguishing prime from non-prime attributes—which hinges on knowing all candidate keys. Decomposition algorithms require starting from a candidate key. Dependency analysis requires checking if determinants are superkeys.

Key finding is the critical first step in every normalization problem.

Yet many students and interview candidates struggle with this step, especially when given complex FD sets with multiple candidate keys. They either miss some keys (leading to incorrect prime/non-prime classification) or waste time with inefficient approaches. This page gives you systematic, reliable algorithms that work every time.

What You Will Master

By the end of this page, you will know how to systematically identify ALL candidate keys for any relation given its functional dependencies. You'll understand multiple approaches (closure-based, attribute classification, and systematic search) and know which to apply in different scenarios. You'll also handle tricky cases like overlapping keys and cyclic dependencies.

Key Concepts Review

Let's establish precise definitions before diving into algorithms.

Superkey: A set of attributes S is a superkey of relation R iff S⁺ = R (the closure of S equals all attributes). Every tuple can be uniquely identified by S.

Candidate Key: A superkey K that is minimal—no proper subset of K is also a superkey. Removing any attribute from K makes it no longer a superkey.

Primary Key: One chosen candidate key designated as the "main" identifier. (A design choice, not a mathematical concept.)

Prime Attribute: An attribute that is part of any candidate key.

Non-Prime Attribute: An attribute that is not part of any candidate key.

Key Hierarchy
Concept	Definition	Example (R = ABCDE, Key = AB)
Superkey	Any set whose closure is R	AB, ABC, ABD, ABCD, ABCDE
Candidate Key	Minimal superkey	AB only (if no other key exists)
Primary Key	Chosen candidate key	AB (design decision)
Prime Attribute	In some candidate key	A, B
Non-Prime Attribute	In no candidate key	C, D, E

Critical Interview Insight

A relation can have multiple candidate keys. You must find ALL of them to correctly classify attributes as prime or non-prime. Missing even one candidate key can lead to incorrect normal form conclusions. Never assume there's only one key!

The Attribute Classification Approach

Before computing closures, we can narrow down candidates by classifying attributes based on their role in the FDs.

Attribute Classification:

Look at all FDs and classify each attribute:

L (Left-only): Appears only on left side of FDs, never on right
- These MUST be in every key (nothing else can determine them)
R (Right-only): Appears only on right side of FDs, never on left
- These CANNOT be in any key (they're always determined by others)
LR (Both sides): Appears on both left and right
- May or may not be in keys (need to check)
N (Neither): Appears in neither side of any FD
- These MUST be in every key (nothing else can determine them)

Starting Point for Key Finding:

Minimum Key = L ∪ N (all left-only and neither attributes)

If (L ∪ N)⁺ = R, then L ∪ N is the only candidate key. Otherwise, add LR attributes systematically.

Attribute Classification Algorithm
Pseudocode
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
/**
 * Classify attributes for key finding
 */
function classifyAttributes(R, F) {
    let leftSides = new Set();
    let rightSides = new Set();
    
    // Collect all appearances
    for (let fd of F) {
        for (let attr of fd.left) leftSides.add(attr);
        for (let attr of fd.right) rightSides.add(attr);
    }
    
    let classification = {
        L: [],   // Left only - MUST be in key
        R: [],   // Right only - CANNOT be in key
        LR: [],  // Both sides - might be in key
        N: []    // Neither - MUST be in key
    };
    
    for (let attr of R) {
        let inLeft = leftSides.has(attr);
        let inRight = rightSides.has(attr);
        
        if (inLeft && !inRight) classification.L.push(attr);
        else if (!inLeft && inRight) classification.R.push(attr);
        else if (inLeft && inRight) classification.LR.push(attr);
        else classification.N.push(attr);  // Neither
    }
    
    return classification;
}
 
// Minimum key starting point:
// minKey = L ∪ N (must be in every candidate key)
// 
// If minKey⁺ = R, then minKey is THE candidate key.
// Otherwise, try adding subsets of LR to minKey.

Attribute Classification ExampleUsing classification to find keys efficiently

Input

R(A, B, C, D, E, F)
FDs: A → B, BC → D, D → E, E → F

Output

**Step 1: Classify Attributes**

Left sides: A, B, C, D, E
Right sides: B, D, E, F

- A: Left only (L) — **Must be in key**
- B: Both (LR) — Might be in key
- C: Left only (L) — **Must be in key**
- D: Both (LR) — Might be in key
- E: Both (LR) — Might be in key
- F: Right only (R) — **Cannot be in key**

**L = {A, C}** (must be in every key)
**R = {F}** (never in any key)
**LR = {B, D, E}** (check if needed)
**N = {}** (empty)

---

**Step 2: Test Minimum Key**

minKey = L ∪ N = {A, C}

Compute {A, C}⁺:
- {A, C} → {A, C, B} (A → B)
- {A, C, B} → {A, C, B, D} (BC → D)
- {A, C, B, D} → {A, C, B, D, E} (D → E)
- {A, C, B, D, E} → {A, C, B, D, E, F} (E → F)

{A, C}⁺ = {A, B, C, D, E, F} = R ✓

**Candidate Key: {A, C}**

Since minKey alone covers R, {A, C} is the **only** candidate key.
No need to test LR additions.

The Systematic Closure-Based Approach

When attribute classification alone doesn't uniquely determine the key (because (L ∪ N)⁺ ≠ R), we need to systematically search for all candidate keys.

The Algorithm:

Start with minimum base = L ∪ N (must be in every key)
If base⁺ = R, base is the only candidate key
Otherwise, systematically add each subset of LR attributes to base
Check if each combination is a superkey (closure = R)
Keep only minimal superkeys (candidate keys)

Optimization: Since we need minimal keys, we can search smallest to largest. Once we find a superkey of size k, any superset of it is not a candidate key.

Complete Key Finding Algorithm
Pseudocode
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
/**
 * Find ALL candidate keys
 */
function findAllCandidateKeys(R, F) {
    let classification = classifyAttributes(R, F);
    let base = classification.L.concat(classification.N);
    let LR = classification.LR;
    
    // Check if base alone is a key
    if (computeClosure(base, F).equals(R)) {
        return [new Set(base)];  // Only one candidate key
    }
    
    // Need to add from LR attributes
    let candidateKeys = [];
    
    // Generate all subsets of LR, from smallest to largest
    for (let size = 1; size <= LR.length; size++) {
        for (let subset of combinations(LR, size)) {
            let potentialKey = new Set([...base, ...subset]);
            
            // Skip if superset of existing candidate key
            if (candidateKeys.some(k => isSubset(k, potentialKey))) {
                continue;
            }
            
            // Check if it's a superkey
            if (computeClosure(potentialKey, F).equals(R)) {
                // Check minimality (no proper subset is a superkey)
                let isMinimal = true;
                for (let attr of potentialKey) {
                    let reduced = new Set(potentialKey);
                    reduced.delete(attr);
                    if (computeClosure(reduced, F).equals(R)) {
                        isMinimal = false;
                        break;
                    }
                }
                
                if (isMinimal) {
                    candidateKeys.push(potentialKey);
                }
            }
        }
    }
    
    return candidateKeys;
}

Finding Multiple Candidate KeysWhen there are multiple minimal keys

Input

R(A, B, C, D)
FDs: A → B, B → A, C → D, D → C

Output

**Step 1: Classify Attributes**

Left: A, B, C, D
Right: B, A, D, C

- A: Both (LR)
- B: Both (LR)
- C: Both (LR)
- D: Both (LR)

**L = {}**, **R = {}**, **LR = {A, B, C, D}**, **N = {}**

base = L ∪ N = {} (empty!)

---

**Step 2: Systematic Search**

Need to find minimal subsets of LR that cover R.

**Size 1:**
- A⁺ = {A, B} ≠ R
- B⁺ = {A, B} ≠ R
- C⁺ = {C, D} ≠ R
- D⁺ = {C, D} ≠ R

No size-1 keys.

**Size 2:**
- {A, C}⁺ = {A, B} ∪ {C, D} = {A, B, C, D} = R ✓ **Candidate Key!**
- {A, D}⁺ = {A, B} ∪ {C, D} = R ✓ **Candidate Key!**
- {B, C}⁺ = {A, B} ∪ {C, D} = R ✓ **Candidate Key!**
- {B, D}⁺ = {A, B} ∪ {C, D} = R ✓ **Candidate Key!**
- {A, B}⁺ = {A, B} ≠ R
- {C, D}⁺ = {C, D} ≠ R

---

**Candidate Keys: {A,C}, {A,D}, {B,C}, {B,D}**

**Prime Attributes:** A, B, C, D (all!)
**Non-Prime Attributes:** None

This relation is in 3NF (and actually BCNF) because every attribute is prime!

Special Cases and Edge Cases

Several edge cases frequently appear in interviews. Recognizing these patterns saves time and prevents errors.

Edge Cases to Watch For

•Cyclic FDs (A → B → C → A): Creates multiple equivalent keys. Any single attribute in the cycle can determine all others in the cycle, but you still need attributes outside the cycle.
•Empty base (L ∪ N = {}): All attributes appear on both sides. Must search all combinations of LR.
•All attributes in base: If L ∪ N = R, you might think R is the key, but check if a smaller set works! (Some L attributes might be determined by other L attributes.)
•No FDs at all: Every attribute must be in the key. The only candidate key is R itself.
•Trivial FDs only: Same as no FDs—the full relation is the only key.
•Overlapping candidate keys: Common when there are multiple paths to determine the same attributes (like A → B and C → B with different keys involving A vs C).

Edge Case: Cyclic DependenciesFinding keys with circular FDs

Input

R(A, B, C, D, E)
FDs: A → B, B → C, C → A, D → E

Output

**Step 1: Classify Attributes**

L = {D} (only determines, never determined by non-cycle)
R = {E} (only determined, never determines)
LR = {A, B, C} (in the cycle)
N = {} 

**Step 2: Analyze the Cycle**

A → B → C → A forms a cycle.
Any one of A, B, or C can determine all three.

**Step 3: Test Keys**

base = {D}
D⁺ = {D, E} ≠ R. Need more.

Try {A, D}:
- A⁺ = {A, B, C} (full cycle)
- {A, D}⁺ = {A, B, C, D, E} = R ✓

Is it minimal?
- {A}⁺ = {A, B, C} ≠ R (missing D, E)
- {D}⁺ = {D, E} ≠ R
- Both needed. **{A, D} is a candidate key.**

**Step 4: Check Other Cycle Members**

{B, D}⁺ = {B, C, A, D, E} = R ✓ **Candidate key!**
{C, D}⁺ = {C, A, B, D, E} = R ✓ **Candidate key!**

---

**Candidate Keys: {A,D}, {B,D}, {C,D}**

Note: D must be in every key (L attribute).
Any cycle member can substitute for others.

Edge Case: All Left-Only but with Internal DependenciesWhen L attributes determine each other

Input

R(A, B, C, D)
FDs: A → B, A → C, A → D

All of B, C, D are right-only. Only A is left-only.
What's the key?

Output

**Classification:**
- L = {A}
- R = {B, C, D}
- LR = {}
- N = {}

**Test:**
base = {A}
A⁺ = {A, B, C, D} = R ✓

**Candidate Key: {A}**

This is straightforward—single-attribute key where one attribute determines everything else.

Key Finding in Interview Context

In interviews, key finding is often a preliminary step before the "real" question (like normalization or decomposition). Here's how to approach it efficiently.

Interview Key Finding Strategy

•Step 1: Quick Classification — Identify L, R, LR, N attributes. This takes 30 seconds and immediately tells you the minimum key base.
•Step 2: Compute Base Closure — If (L ∪ N)⁺ = R, you're done! This is the common case.
•Step 3: If More Needed — Try adding one LR attribute at a time to base. Check closure after each addition.
•Step 4: Check for Multiple Keys — If you found one key, ask: could different LR attributes create alternative keys? Check if any LR attribute can substitute for another.
•Step 5: Verify Minimality — For each potential key, confirm no attribute can be removed.

Time-Saving Tip

In ~80% of interview problems, the base (L ∪ N) alone is the candidate key. Always check this first! Only dive into systematic search if the base doesn't cover all attributes.

Common Interview Mistakes:

Stopping after finding one key — There might be more! Always check for alternatives.
Including R attributes in keys — Attributes that only appear on the right side of FDs are NEVER in keys.
Forgetting N attributes — If an attribute isn't mentioned in any FD, it MUST be in every key.
Not verifying minimality — Finding a superkey isn't finding a candidate key. Always verify no attribute can be removed.
Inefficient brute force — Don't try all possible subsets. Use classification to narrow down quickly.

Keys and Normal Form Connection

Understanding why key finding matters for normalization crystallizes both concepts.

Keys in 2NF Analysis:

2NF violations require partial dependencies—when part of a composite key determines a non-prime attribute.

If all candidate keys are single attributes, 2NF is automatically satisfied
You need composite keys to have 2NF violations
Key finding tells you if 2NF analysis is even necessary

Keys in 3NF Analysis:

3NF requires: For X → A, either X is superkey OR A is prime.

Prime attributes are those in ANY candidate key
You must find ALL keys to correctly classify prime/non-prime
Missing a candidate key = possibly misclassifying a prime attribute = wrong 3NF conclusion

Keys in BCNF Analysis:

BCNF requires: Every determinant is a superkey.

You need to know all superkeys to verify BCNF
A superkey is any superset of a candidate key
Knowing all candidate keys → knowing all superkeys

Key Finding Impact on Normal Form Analysis
Normal Form	Why Keys Matter	Consequence of Missing a Key
2NF	Need to know if keys are composite	Might miss that 2NF is automatically satisfied
3NF	Need all keys for prime classification	Might misclassify prime as non-prime, report false violation
BCNF	Need to verify if determinants are superkeys	Might fail to recognize a superkey, report false violation

Key Finding Changing NF ConclusionHow missing a key leads to wrong answers

Input

R(A, B, C, D)
FDs: A → B, C → D, BD → A

Question: Is this in 3NF?

Output

**Common Mistake (incomplete key finding):**

Quickly find {A, C} as a key (A→B, C→D covers most).
Assume it's the only key.
Prime = {A, C}, Non-prime = {B, D}

Check BD → A:
- Is BD superkey? BD⁺ = ? Let's see... BD → A, A → B, so BDA → B... 
  Actually BD⁺ with A→B... BD contains B and D.
  Need to check properly.

---

**Correct Analysis (complete key finding):**

**Classify:**
- A: LR (left in A→B, right in BD→A)
- B: LR (left in BD→A, right in A→B)
- C: L (left only)
- D: LR (left in BD→A, right in C→D)

L = {C}, base = {C}
C⁺ = {C, D} ≠ R. Need more.

Try {A, C}:
{A, C}⁺ = {A, B, C, D} = R ✓ Candidate key!

Try {B, C} (since B is LR like A):
{B, C}⁺ = {B, C, D} (C→D) → any more? B alone doesn't give A.
Not a superkey. ✗

Try {B, D, C}:
{B, D, C}⁺ = {B, D, C, A} (BD→A) → {A, B, C, D} = R ✓
But is it minimal? {B, D}⁺ = {B, D, A} → {A, B, D} ≠ R (no C). 
{B, C}⁺ = {B, C, D} ≠ R.
{D, C}⁺ = {C, D} ≠ R.
All three needed. **{B, C, D} is a candidate key!**

**Candidate Keys: {A, C} and {B, C, D}**

Prime = {A, B, C, D} (all!)
Non-prime = {} (none!)

**Re-check 3NF:**
- A → B: A not superkey, but B is prime ✓
- C → D: C not superkey, but D is prime ✓
- BD → A: BD not superkey, but A is prime ✓

**IN 3NF!**

If we had missed key {B,C,D}, we might have thought B and D were non-prime and concluded 3NF violation!

Advanced Key Finding Techniques

For complex schemas, these advanced techniques can help.

Technique 1: Dependency Graph Analysis

Draw a graph where nodes are attributes and edges represent FDs. Attributes with no incoming edges (sources) must be in every key. Attributes with only incoming edges (sinks) are never in keys.

Technique 2: Equivalence Classes

When FDs form cycles (A ↔ B), treat the cycle as a single unit. Any member can substitute for others.

Technique 3: Attribute Elimination

Start with all attributes as a potential key. Remove attributes one by one if their removal doesn't break superkey property. What remains is a candidate key.

Attribute Elimination Method
Pseudocode
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
/**
 * Find ONE candidate key via elimination
 * (Useful when you just need one key quickly)
 */
function findOneCandidateKey(R, F) {
    let key = new Set(R);  // Start with all attributes
    
    for (let attr of R) {
        // Try removing this attribute
        let reduced = new Set(key);
        reduced.delete(attr);
        
        // Check if still a superkey
        if (computeClosure(reduced, F).equals(R)) {
            // Can remove this attribute
            key = reduced;
        }
        // Otherwise, keep the attribute
    }
    
    return key;  // This is a candidate key
}
 
// Note: This finds ONE candidate key.
// To find ALL keys, you need the systematic approach.
// But for quickly establishing one key, this works.
 
// Example:
// R = {A, B, C, D}, F = {A → B, C → D}
// Start: key = {A, B, C, D}
// Try remove A: {B, C, D}⁺ = {B, C, D} ≠ R. Keep A.
// Try remove B: {A, C, D}⁺ = {A, B, C, D} = R. Remove B!
// key = {A, C, D}
// Try remove C: {A, D}⁺ = {A, B, D} ≠ R. Keep C.
// Try remove D: {A, C}⁺ = {A, B, C, D} = R. Remove D!
// key = {A, C}
// Final: {A, C} is a candidate key.

Order Matters in Elimination

The elimination method finds one candidate key, but different elimination orders might yield different keys if multiple exist. To find all keys, use the systematic search from Section 3, or run elimination multiple times with different orders.

Practice Problems and Summary

Let's practice with interview-style problems.

Practice Problem 1Find all candidate keys

Input

R(A, B, C, D, E, F)
FDs: AB → C, C → D, D → E, CF → B

Output

**Step 1: Classify**

- A: L (only in AB→C left, never right)
- B: LR (left in AB→C, right in CF→B)
- C: LR (left in C→D, CF→B; right in AB→C)
- D: LR (left in D→E, right in C→D)
- E: R (only right)
- F: L (only in CF→B left)

**L = {A, F}**, R = {E}, LR = {B, C, D}

**Step 2: Test Base**

{A, F}⁺ = {A, F}. Need more.

**Step 3: Add LR Attributes**

Try {A, F, B}:
{A, F, B}⁺ = {A,F,B} → {A,F,B,C} (AB→C) → {A,F,B,C,D} (C→D) → {A,F,B,C,D,E} (D→E)
= R ✓ **Superkey!**

Minimal? Try removing:
- {A,F}⁺ = {A,F} ≠ R
- {A,B}⁺ = {A,B,C,D,E} ≠ R (no F)
- {F,B}⁺ = {F,B} ≠ R

**{A, F, B} is a candidate key!**

Try {A, F, C}:
{A,F,C}⁺ = {A,F,C} → {A,F,C,D} (C→D) → {A,F,C,D,E} (D→E) → {A,F,C,D,E,B} (CF→B)
= R ✓ **Superkey!**

Minimal?
- {A,F}⁺ = {A,F} ≠ R
- {A,C}⁺ = {A,C,D,E} ≠ R
- {F,C}⁺ = {F,C,D,E,B} ≠ R (no A)

**{A, F, C} is a candidate key!**

Any others? {A,F,D} would give A,F,D,E but not B or C. Need C for B (CF→B).

**Candidate Keys: {A, B, F} and {A, C, F}**

Prime: {A, B, C, F}
Non-Prime: {D, E}

Practice Problem 2Keys with only binary FDs

Input

R(A, B, C, D)
FDs: A → B, B → C, C → D, D → A

Output

**This is a full cycle!**

A → B → C → D → A

Any single attribute determines all others:
- A⁺ = {A,B,C,D} = R ✓
- B⁺ = {B,C,D,A} = R ✓
- C⁺ = {C,D,A,B} = R ✓
- D⁺ = {D,A,B,C} = R ✓

**Candidate Keys: {A}, {B}, {C}, {D}**

All four single-attribute keys!

**Prime Attributes:** All (A, B, C, D)
**Non-Prime:** None

This relation is automatically in 3NF and BCNF (every determinant is a superkey, every attribute is prime).

Key Takeaways

•Attribute classification is your first step — L (left-only) and N (neither) must be in every key. R (right-only) can never be in a key.
•Test the base first — (L ∪ N)⁺ = R is the common case. Only search systematically if needed.
•Find ALL candidate keys — Missing keys leads to wrong prime/non-prime classification and incorrect NF conclusions.
•Verify minimality — A superkey isn't necessarily a candidate key. Check that no attribute can be removed.
•Cyclic FDs create equivalent alternatives — Any cycle member can substitute for others in the key.
•Keys are foundational — Every normalization analysis starts with key finding. Master this first.

What's Next:

With key finding mastered, Page 5: Trade-off Analysis brings together everything we've learned. You'll learn to make informed decisions about when to normalize further vs. when to stop, when to accept denormalization, and how to communicate these trade-offs in interviews—the practical wisdom that separates textbook knowledge from engineering judgment.

Page Complete

You now have systematic algorithms for finding all candidate keys in any relation. Practice on diverse FD sets to build fluency. Key finding should become automatic—a quick 1-2 minute step before any normalization analysis.

Key Finding

The Foundation of Every Normalization Decision

Key finding is the critical first step in every normalization problem.

What You Will Master

Key Concepts Review

Let's establish precise definitions before diving into algorithms.

Superkey: A set of attributes S is a superkey of relation R iff S⁺ = R (the closure of S equals all attributes). Every tuple can be uniquely identified by S.

Candidate Key: A superkey K that is minimal—no proper subset of K is also a superkey. Removing any attribute from K makes it no longer a superkey.

Primary Key: One chosen candidate key designated as the "main" identifier. (A design choice, not a mathematical concept.)

Prime Attribute: An attribute that is part of any candidate key.

Non-Prime Attribute: An attribute that is not part of any candidate key.

Key Hierarchy
Concept	Definition	Example (R = ABCDE, Key = AB)
Superkey	Any set whose closure is R	AB, ABC, ABD, ABCD, ABCDE
Candidate Key	Minimal superkey	AB only (if no other key exists)
Primary Key	Chosen candidate key	AB (design decision)
Prime Attribute	In some candidate key	A, B
Non-Prime Attribute	In no candidate key	C, D, E

Critical Interview Insight

The Attribute Classification Approach

Before computing closures, we can narrow down candidates by classifying attributes based on their role in the FDs.

Attribute Classification:

Look at all FDs and classify each attribute:

L (Left-only): Appears only on left side of FDs, never on right
- These MUST be in every key (nothing else can determine them)
R (Right-only): Appears only on right side of FDs, never on left
- These CANNOT be in any key (they're always determined by others)
LR (Both sides): Appears on both left and right
- May or may not be in keys (need to check)
N (Neither): Appears in neither side of any FD
- These MUST be in every key (nothing else can determine them)

Starting Point for Key Finding:

Minimum Key = L ∪ N (all left-only and neither attributes)

If (L ∪ N)⁺ = R, then L ∪ N is the only candidate key. Otherwise, add LR attributes systematically.

Attribute Classification Algorithm
Pseudocode
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
/**
 * Classify attributes for key finding
 */
function classifyAttributes(R, F) {
    let leftSides = new Set();
    let rightSides = new Set();
    
    // Collect all appearances
    for (let fd of F) {
        for (let attr of fd.left) leftSides.add(attr);
        for (let attr of fd.right) rightSides.add(attr);
    }
    
    let classification = {
        L: [],   // Left only - MUST be in key
        R: [],   // Right only - CANNOT be in key
        LR: [],  // Both sides - might be in key
        N: []    // Neither - MUST be in key
    };
    
    for (let attr of R) {
        let inLeft = leftSides.has(attr);
        let inRight = rightSides.has(attr);
        
        if (inLeft && !inRight) classification.L.push(attr);
        else if (!inLeft && inRight) classification.R.push(attr);
        else if (inLeft && inRight) classification.LR.push(attr);
        else classification.N.push(attr);  // Neither
    }
    
    return classification;
}
 
// Minimum key starting point:
// minKey = L ∪ N (must be in every candidate key)
// 
// If minKey⁺ = R, then minKey is THE candidate key.
// Otherwise, try adding subsets of LR to minKey.

Attribute Classification ExampleUsing classification to find keys efficiently

Input

R(A, B, C, D, E, F)
FDs: A → B, BC → D, D → E, E → F

Output

**Step 1: Classify Attributes**

Left sides: A, B, C, D, E
Right sides: B, D, E, F

- A: Left only (L) — **Must be in key**
- B: Both (LR) — Might be in key
- C: Left only (L) — **Must be in key**
- D: Both (LR) — Might be in key
- E: Both (LR) — Might be in key
- F: Right only (R) — **Cannot be in key**

**L = {A, C}** (must be in every key)
**R = {F}** (never in any key)
**LR = {B, D, E}** (check if needed)
**N = {}** (empty)

---

**Step 2: Test Minimum Key**

minKey = L ∪ N = {A, C}

Compute {A, C}⁺:
- {A, C} → {A, C, B} (A → B)
- {A, C, B} → {A, C, B, D} (BC → D)
- {A, C, B, D} → {A, C, B, D, E} (D → E)
- {A, C, B, D, E} → {A, C, B, D, E, F} (E → F)

{A, C}⁺ = {A, B, C, D, E, F} = R ✓

**Candidate Key: {A, C}**

Since minKey alone covers R, {A, C} is the **only** candidate key.
No need to test LR additions.

The Systematic Closure-Based Approach

When attribute classification alone doesn't uniquely determine the key (because (L ∪ N)⁺ ≠ R), we need to systematically search for all candidate keys.

The Algorithm:

Start with minimum base = L ∪ N (must be in every key)
If base⁺ = R, base is the only candidate key
Otherwise, systematically add each subset of LR attributes to base
Check if each combination is a superkey (closure = R)
Keep only minimal superkeys (candidate keys)

Optimization: Since we need minimal keys, we can search smallest to largest. Once we find a superkey of size k, any superset of it is not a candidate key.

Complete Key Finding Algorithm
Pseudocode
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
/**
 * Find ALL candidate keys
 */
function findAllCandidateKeys(R, F) {
    let classification = classifyAttributes(R, F);
    let base = classification.L.concat(classification.N);
    let LR = classification.LR;
    
    // Check if base alone is a key
    if (computeClosure(base, F).equals(R)) {
        return [new Set(base)];  // Only one candidate key
    }
    
    // Need to add from LR attributes
    let candidateKeys = [];
    
    // Generate all subsets of LR, from smallest to largest
    for (let size = 1; size <= LR.length; size++) {
        for (let subset of combinations(LR, size)) {
            let potentialKey = new Set([...base, ...subset]);
            
            // Skip if superset of existing candidate key
            if (candidateKeys.some(k => isSubset(k, potentialKey))) {
                continue;
            }
            
            // Check if it's a superkey
            if (computeClosure(potentialKey, F).equals(R)) {
                // Check minimality (no proper subset is a superkey)
                let isMinimal = true;
                for (let attr of potentialKey) {
                    let reduced = new Set(potentialKey);
                    reduced.delete(attr);
                    if (computeClosure(reduced, F).equals(R)) {
                        isMinimal = false;
                        break;
                    }
                }
                
                if (isMinimal) {
                    candidateKeys.push(potentialKey);
                }
            }
        }
    }
    
    return candidateKeys;
}

Finding Multiple Candidate KeysWhen there are multiple minimal keys

Input

R(A, B, C, D)
FDs: A → B, B → A, C → D, D → C

Output

**Step 1: Classify Attributes**

Left: A, B, C, D
Right: B, A, D, C

- A: Both (LR)
- B: Both (LR)
- C: Both (LR)
- D: Both (LR)

**L = {}**, **R = {}**, **LR = {A, B, C, D}**, **N = {}**

base = L ∪ N = {} (empty!)

---

**Step 2: Systematic Search**

Need to find minimal subsets of LR that cover R.

**Size 1:**
- A⁺ = {A, B} ≠ R
- B⁺ = {A, B} ≠ R
- C⁺ = {C, D} ≠ R
- D⁺ = {C, D} ≠ R

No size-1 keys.

**Size 2:**
- {A, C}⁺ = {A, B} ∪ {C, D} = {A, B, C, D} = R ✓ **Candidate Key!**
- {A, D}⁺ = {A, B} ∪ {C, D} = R ✓ **Candidate Key!**
- {B, C}⁺ = {A, B} ∪ {C, D} = R ✓ **Candidate Key!**
- {B, D}⁺ = {A, B} ∪ {C, D} = R ✓ **Candidate Key!**
- {A, B}⁺ = {A, B} ≠ R
- {C, D}⁺ = {C, D} ≠ R

---

**Candidate Keys: {A,C}, {A,D}, {B,C}, {B,D}**

**Prime Attributes:** A, B, C, D (all!)
**Non-Prime Attributes:** None

This relation is in 3NF (and actually BCNF) because every attribute is prime!

Special Cases and Edge Cases

Several edge cases frequently appear in interviews. Recognizing these patterns saves time and prevents errors.

Edge Cases to Watch For

•Cyclic FDs (A → B → C → A): Creates multiple equivalent keys. Any single attribute in the cycle can determine all others in the cycle, but you still need attributes outside the cycle.
•Empty base (L ∪ N = {}): All attributes appear on both sides. Must search all combinations of LR.
•All attributes in base: If L ∪ N = R, you might think R is the key, but check if a smaller set works! (Some L attributes might be determined by other L attributes.)
•No FDs at all: Every attribute must be in the key. The only candidate key is R itself.
•Trivial FDs only: Same as no FDs—the full relation is the only key.
•Overlapping candidate keys: Common when there are multiple paths to determine the same attributes (like A → B and C → B with different keys involving A vs C).

Edge Case: Cyclic DependenciesFinding keys with circular FDs

Input

R(A, B, C, D, E)
FDs: A → B, B → C, C → A, D → E

Output

**Step 1: Classify Attributes**

L = {D} (only determines, never determined by non-cycle)
R = {E} (only determined, never determines)
LR = {A, B, C} (in the cycle)
N = {} 

**Step 2: Analyze the Cycle**

A → B → C → A forms a cycle.
Any one of A, B, or C can determine all three.

**Step 3: Test Keys**

base = {D}
D⁺ = {D, E} ≠ R. Need more.

Try {A, D}:
- A⁺ = {A, B, C} (full cycle)
- {A, D}⁺ = {A, B, C, D, E} = R ✓

Is it minimal?
- {A}⁺ = {A, B, C} ≠ R (missing D, E)
- {D}⁺ = {D, E} ≠ R
- Both needed. **{A, D} is a candidate key.**

**Step 4: Check Other Cycle Members**

{B, D}⁺ = {B, C, A, D, E} = R ✓ **Candidate key!**
{C, D}⁺ = {C, A, B, D, E} = R ✓ **Candidate key!**

---

**Candidate Keys: {A,D}, {B,D}, {C,D}**

Note: D must be in every key (L attribute).
Any cycle member can substitute for others.

Edge Case: All Left-Only but with Internal DependenciesWhen L attributes determine each other

Input

R(A, B, C, D)
FDs: A → B, A → C, A → D

All of B, C, D are right-only. Only A is left-only.
What's the key?

Output

**Classification:**
- L = {A}
- R = {B, C, D}
- LR = {}
- N = {}

**Test:**
base = {A}
A⁺ = {A, B, C, D} = R ✓

**Candidate Key: {A}**

This is straightforward—single-attribute key where one attribute determines everything else.

Key Finding in Interview Context

In interviews, key finding is often a preliminary step before the "real" question (like normalization or decomposition). Here's how to approach it efficiently.

Interview Key Finding Strategy

•Step 1: Quick Classification — Identify L, R, LR, N attributes. This takes 30 seconds and immediately tells you the minimum key base.
•Step 2: Compute Base Closure — If (L ∪ N)⁺ = R, you're done! This is the common case.
•Step 3: If More Needed — Try adding one LR attribute at a time to base. Check closure after each addition.
•Step 4: Check for Multiple Keys — If you found one key, ask: could different LR attributes create alternative keys? Check if any LR attribute can substitute for another.
•Step 5: Verify Minimality — For each potential key, confirm no attribute can be removed.

Time-Saving Tip

In ~80% of interview problems, the base (L ∪ N) alone is the candidate key. Always check this first! Only dive into systematic search if the base doesn't cover all attributes.

Common Interview Mistakes:

Stopping after finding one key — There might be more! Always check for alternatives.
Including R attributes in keys — Attributes that only appear on the right side of FDs are NEVER in keys.
Forgetting N attributes — If an attribute isn't mentioned in any FD, it MUST be in every key.
Not verifying minimality — Finding a superkey isn't finding a candidate key. Always verify no attribute can be removed.
Inefficient brute force — Don't try all possible subsets. Use classification to narrow down quickly.

Keys and Normal Form Connection

Understanding why key finding matters for normalization crystallizes both concepts.

Keys in 2NF Analysis:

2NF violations require partial dependencies—when part of a composite key determines a non-prime attribute.

If all candidate keys are single attributes, 2NF is automatically satisfied
You need composite keys to have 2NF violations
Key finding tells you if 2NF analysis is even necessary

Keys in 3NF Analysis:

3NF requires: For X → A, either X is superkey OR A is prime.

Prime attributes are those in ANY candidate key
You must find ALL keys to correctly classify prime/non-prime
Missing a candidate key = possibly misclassifying a prime attribute = wrong 3NF conclusion

Keys in BCNF Analysis:

BCNF requires: Every determinant is a superkey.

You need to know all superkeys to verify BCNF
A superkey is any superset of a candidate key
Knowing all candidate keys → knowing all superkeys

Key Finding Impact on Normal Form Analysis
Normal Form	Why Keys Matter	Consequence of Missing a Key
2NF	Need to know if keys are composite	Might miss that 2NF is automatically satisfied
3NF	Need all keys for prime classification	Might misclassify prime as non-prime, report false violation
BCNF	Need to verify if determinants are superkeys	Might fail to recognize a superkey, report false violation

Key Finding Changing NF ConclusionHow missing a key leads to wrong answers

Input

R(A, B, C, D)
FDs: A → B, C → D, BD → A

Question: Is this in 3NF?

Output

**Common Mistake (incomplete key finding):**

Quickly find {A, C} as a key (A→B, C→D covers most).
Assume it's the only key.
Prime = {A, C}, Non-prime = {B, D}

Check BD → A:
- Is BD superkey? BD⁺ = ? Let's see... BD → A, A → B, so BDA → B... 
  Actually BD⁺ with A→B... BD contains B and D.
  Need to check properly.

---

**Correct Analysis (complete key finding):**

**Classify:**
- A: LR (left in A→B, right in BD→A)
- B: LR (left in BD→A, right in A→B)
- C: L (left only)
- D: LR (left in BD→A, right in C→D)

L = {C}, base = {C}
C⁺ = {C, D} ≠ R. Need more.

Try {A, C}:
{A, C}⁺ = {A, B, C, D} = R ✓ Candidate key!

Try {B, C} (since B is LR like A):
{B, C}⁺ = {B, C, D} (C→D) → any more? B alone doesn't give A.
Not a superkey. ✗

Try {B, D, C}:
{B, D, C}⁺ = {B, D, C, A} (BD→A) → {A, B, C, D} = R ✓
But is it minimal? {B, D}⁺ = {B, D, A} → {A, B, D} ≠ R (no C). 
{B, C}⁺ = {B, C, D} ≠ R.
{D, C}⁺ = {C, D} ≠ R.
All three needed. **{B, C, D} is a candidate key!**

**Candidate Keys: {A, C} and {B, C, D}**

Prime = {A, B, C, D} (all!)
Non-prime = {} (none!)

**Re-check 3NF:**
- A → B: A not superkey, but B is prime ✓
- C → D: C not superkey, but D is prime ✓
- BD → A: BD not superkey, but A is prime ✓

**IN 3NF!**

If we had missed key {B,C,D}, we might have thought B and D were non-prime and concluded 3NF violation!

Advanced Key Finding Techniques

For complex schemas, these advanced techniques can help.

Technique 1: Dependency Graph Analysis

Draw a graph where nodes are attributes and edges represent FDs. Attributes with no incoming edges (sources) must be in every key. Attributes with only incoming edges (sinks) are never in keys.

Technique 2: Equivalence Classes

When FDs form cycles (A ↔ B), treat the cycle as a single unit. Any member can substitute for others.

Technique 3: Attribute Elimination

Start with all attributes as a potential key. Remove attributes one by one if their removal doesn't break superkey property. What remains is a candidate key.

Attribute Elimination Method
Pseudocode
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
/**
 * Find ONE candidate key via elimination
 * (Useful when you just need one key quickly)
 */
function findOneCandidateKey(R, F) {
    let key = new Set(R);  // Start with all attributes
    
    for (let attr of R) {
        // Try removing this attribute
        let reduced = new Set(key);
        reduced.delete(attr);
        
        // Check if still a superkey
        if (computeClosure(reduced, F).equals(R)) {
            // Can remove this attribute
            key = reduced;
        }
        // Otherwise, keep the attribute
    }
    
    return key;  // This is a candidate key
}
 
// Note: This finds ONE candidate key.
// To find ALL keys, you need the systematic approach.
// But for quickly establishing one key, this works.
 
// Example:
// R = {A, B, C, D}, F = {A → B, C → D}
// Start: key = {A, B, C, D}
// Try remove A: {B, C, D}⁺ = {B, C, D} ≠ R. Keep A.
// Try remove B: {A, C, D}⁺ = {A, B, C, D} = R. Remove B!
// key = {A, C, D}
// Try remove C: {A, D}⁺ = {A, B, D} ≠ R. Keep C.
// Try remove D: {A, C}⁺ = {A, B, C, D} = R. Remove D!
// key = {A, C}
// Final: {A, C} is a candidate key.

Order Matters in Elimination

Practice Problems and Summary

Let's practice with interview-style problems.

Practice Problem 1Find all candidate keys

Input

R(A, B, C, D, E, F)
FDs: AB → C, C → D, D → E, CF → B

Output

**Step 1: Classify**

- A: L (only in AB→C left, never right)
- B: LR (left in AB→C, right in CF→B)
- C: LR (left in C→D, CF→B; right in AB→C)
- D: LR (left in D→E, right in C→D)
- E: R (only right)
- F: L (only in CF→B left)

**L = {A, F}**, R = {E}, LR = {B, C, D}

**Step 2: Test Base**

{A, F}⁺ = {A, F}. Need more.

**Step 3: Add LR Attributes**

Try {A, F, B}:
{A, F, B}⁺ = {A,F,B} → {A,F,B,C} (AB→C) → {A,F,B,C,D} (C→D) → {A,F,B,C,D,E} (D→E)
= R ✓ **Superkey!**

Minimal? Try removing:
- {A,F}⁺ = {A,F} ≠ R
- {A,B}⁺ = {A,B,C,D,E} ≠ R (no F)
- {F,B}⁺ = {F,B} ≠ R

**{A, F, B} is a candidate key!**

Try {A, F, C}:
{A,F,C}⁺ = {A,F,C} → {A,F,C,D} (C→D) → {A,F,C,D,E} (D→E) → {A,F,C,D,E,B} (CF→B)
= R ✓ **Superkey!**

Minimal?
- {A,F}⁺ = {A,F} ≠ R
- {A,C}⁺ = {A,C,D,E} ≠ R
- {F,C}⁺ = {F,C,D,E,B} ≠ R (no A)

**{A, F, C} is a candidate key!**

Any others? {A,F,D} would give A,F,D,E but not B or C. Need C for B (CF→B).

**Candidate Keys: {A, B, F} and {A, C, F}**

Prime: {A, B, C, F}
Non-Prime: {D, E}

Practice Problem 2Keys with only binary FDs

Input

R(A, B, C, D)
FDs: A → B, B → C, C → D, D → A

Output

**This is a full cycle!**

A → B → C → D → A

Any single attribute determines all others:
- A⁺ = {A,B,C,D} = R ✓
- B⁺ = {B,C,D,A} = R ✓
- C⁺ = {C,D,A,B} = R ✓
- D⁺ = {D,A,B,C} = R ✓

**Candidate Keys: {A}, {B}, {C}, {D}**

All four single-attribute keys!

**Prime Attributes:** All (A, B, C, D)
**Non-Prime:** None

This relation is automatically in 3NF and BCNF (every determinant is a superkey, every attribute is prime).

Key Takeaways

•Attribute classification is your first step — L (left-only) and N (neither) must be in every key. R (right-only) can never be in a key.
•Test the base first — (L ∪ N)⁺ = R is the common case. Only search systematically if needed.
•Find ALL candidate keys — Missing keys leads to wrong prime/non-prime classification and incorrect NF conclusions.
•Verify minimality — A superkey isn't necessarily a candidate key. Check that no attribute can be removed.
•Cyclic FDs create equivalent alternatives — Any cycle member can substitute for others in the key.
•Keys are foundational — Every normalization analysis starts with key finding. Master this first.

What's Next:

Page Complete