Database Management Systems > Second Normal Form (2NF)

Second Normal Form: Eliminating Partial Dependencies

Level: Intermediate

Duration: 55 mins

Topic: Second Normal Form (2NF)

Page 3 of 5

Identifying Second Normal Form Violations

From Theory to Detection

Knowing the definition of 2NF is one thing; reliably detecting violations in real-world schemas is another. A database may have dozens of tables, each with multiple candidate keys and numerous functional dependencies. How do you systematically identify which relations violate 2NF and pinpoint the exact problematic dependencies?

This page transforms the theoretical definition into a practical detection toolkit. You'll learn systematic algorithms, visual inspection techniques, and common patterns that signal 2NF violations.

What You Will Master

By the end of this page, you will be able to analyze any relation and systematically identify all 2NF violations. You'll master both rigorous algorithmic approaches and practical shortcuts for common scenarios. You'll also recognize the telltale signs of partial dependencies in existing database schemas.

The Complete Detection Algorithm

Let's formalize a complete, step-by-step algorithm for detecting 2NF violations. Given a complete and correct set of functional dependencies F, it is systematic and finds ALL partial dependencies.

Algorithm: DETECT-2NF-VIOLATIONS(R, F)

Input:

R: A relation schema with attributes {A₁, A₂, ..., Aₙ}
F: A set of functional dependencies over R

Output:

A set of partial dependencies (2NF violations)

2NF Violation Detection Algorithm
DETECT-2NF-VIOLATIONS(R, F):
    violations = {}
    
    // Step 1: Find all candidate keys
    CK = FIND-CANDIDATE-KEYS(R, F)
    
    // Step 2: Identify prime and non-prime attributes
    prime_attrs = UNION of all attributes in all candidate keys
    non_prime_attrs = R.attributes - prime_attrs
    
    // Step 3: Early exit if no non-prime attributes
    if non_prime_attrs is empty:
        return {}  // Automatically in 2NF
    
    // Step 4: Check each composite candidate key
    for each K in CK where |K| >= 2:
        
        // Step 5: Generate all proper non-empty subsets of K
        subsets = POWER-SET(K) - {K} - {∅}
        
        // Step 6: For each subset, check dependencies to non-prime attrs
        for each S in subsets:
            
            // Step 7: Compute closure of S (once per subset)
            S_closure = ATTRIBUTE-CLOSURE(S, F)
            
            // Step 8: Record every non-prime attribute found in the closure
            for each A in non_prime_attrs:
                if A in S_closure:
                    violations.add((S, A))  // S → A is a partial dependency
    
    return violations

Algorithm Breakdown:

Step 1: Find all candidate keys using the candidate key finding algorithm (compute closures of attribute combinations to find minimal superkeys).

Step 2: Prime attributes are those appearing in any candidate key. Non-prime are the rest.

Step 3: Optimization—if there are no non-prime attributes, 2NF is satisfied.

Step 4: Only composite keys (2+ attributes) can have partial dependencies.

Step 5: Generate subsets. For a key {A, B, C}, the proper non-empty subsets are {A}, {B}, {C}, {A,B}, {A,C}, {B,C}.

Steps 6-8: Compute the attribute closure of each subset and test whether it contains each non-prime attribute.

Complexity Analysis:

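For a key of size k, there are 2^k - 2 proper non-empty subsets
For each subset, we compute one attribute closure (polynomial in the size of F)
Overall: Exponential in key size, but keys are typically small (2-4 attributes). Finding all candidate keys in Step 1 can also take exponential time in the number of attributes in the worst case.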
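
The pseudocode above can be sketched in Python. This is a minimal illustration under stated assumptions, not a reference implementation: the names `closure`, `candidate_keys`, `detect_2nf_violations`, and `fd` are hypothetical, candidate keys are found by brute force, and the relation R(A, B, C, D) with AB → C and A → D is an invented example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def candidate_keys(attributes, fds):
    """Brute-force search, smallest sets first, for minimal superkeys."""
    attributes = frozenset(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = frozenset(combo)
            # A superset of a key already found cannot be minimal.
            if any(k <= candidate for k in keys):
                continue
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys

def detect_2nf_violations(attributes, fds):
    keys = candidate_keys(attributes, fds)
    prime = frozenset().union(*keys)
    non_prime = frozenset(attributes) - prime
    violations = set()
    for key in keys:
        # Sizes 1 .. len(key)-1 give exactly the proper non-empty subsets.
        for size in range(1, len(key)):
            for combo in combinations(sorted(key), size):
                subset = frozenset(combo)
                for attr in closure(subset, fds) & non_prime:
                    violations.add((subset, attr))
    return violations

def fd(lhs, rhs):
    """Hypothetical helper: build an FD from two strings of single-letter attributes."""
    return (frozenset(lhs), frozenset(rhs))

# Invented example: R(A, B, C, D) with AB -> C and A -> D.
fds = [fd("AB", "C"), fd("A", "D")]
violations = detect_2nf_violations("ABCD", fds)
print(violations)
```

Here the only candidate key is {A, B}, and the single reported violation is {A} → D. Single-attribute keys contribute nothing because the size range is empty for them, which mirrors Step 4.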

Worked Example: Complete 2NF Analysis

Let's apply the algorithm to a comprehensive example.

Relation: OrderLine(OrderID, ProductID, CustomerName, ProductName, Quantity, UnitPrice, OrderDate)

Given Functional Dependencies (F):

•FD1: {OrderID, ProductID} → {Quantity} — A specific product in a specific order has a quantity
•FD2: {OrderID} → {CustomerName, OrderDate} — Each order belongs to one customer, placed on one date
•FD3: {ProductID} → {ProductName, UnitPrice} — Each product has one name and standard price

Step 1: Find Candidate Keys

We need to find the minimal set(s) of attributes whose closure is all attributes.

Test {OrderID, ProductID}⁺:

Start: {OrderID, ProductID}
Apply FD1: Add Quantity → {OrderID, ProductID, Quantity}
Apply FD2: Add CustomerName, OrderDate → {OrderID, ProductID, Quantity, CustomerName, OrderDate}
Apply FD3: Add ProductName, UnitPrice → {OrderID, ProductID, Quantity, CustomerName, OrderDate, ProductName, UnitPrice}

{OrderID, ProductID}⁺ = ALL attributes ✓

Is it minimal? Test subsets:

{OrderID}⁺ = {OrderID, CustomerName, OrderDate} — Not all attributes
{ProductID}⁺ = {ProductID, ProductName, UnitPrice} — Not all attributes

Neither subset determines all attributes, so {OrderID, ProductID} is a candidate key.

Candidate Key(s): {OrderID, ProductID} (composite key with 2 attributes)
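
The closure computation in Step 1 can be reproduced with a short Python sketch. The function name `closure` and the variable names are illustrative choices; the attributes and functional dependencies are the ones given for OrderLine.

```python
def closure(attrs, fds):
    """Repeatedly apply every FD whose left side is already covered."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

ALL_ATTRS = {
    "OrderID", "ProductID", "CustomerName", "ProductName",
    "Quantity", "UnitPrice", "OrderDate",
}
FDS = [
    ({"OrderID", "ProductID"}, {"Quantity"}),                # FD1
    ({"OrderID"}, {"CustomerName", "OrderDate"}),            # FD2
    ({"ProductID"}, {"ProductName", "UnitPrice"}),           # FD3
]

key = {"OrderID", "ProductID"}
is_superkey = closure(key, FDS) == ALL_ATTRS
# Minimal if dropping any single attribute loses the superkey property.
is_minimal = all(closure(key - {a}, FDS) != ALL_ATTRS for a in key)
print(is_superkey, is_minimal)
```

Checking single-attribute removals is enough for minimality, because any smaller subset of a non-superkey is also a non-superkey.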

Step 2: Identify Prime and Non-Prime Attributes

Attribute	In Candidate Key?	Classification
OrderID	Yes	Prime
ProductID	Yes	Prime
CustomerName	No	Non-Prime
ProductName	No	Non-Prime
Quantity	No	Non-Prime
UnitPrice	No	Non-Prime
OrderDate	No	Non-Prime

Step 3: Early Exit? No — we have non-prime attributes.

Step 4: Composite Key Check

Key {OrderID, ProductID} has size 2, so partial dependencies are possible.

Step 5: Proper Subsets of Key

Proper non-empty subsets of {OrderID, ProductID}:

{OrderID}
{ProductID}

Step 6-8: Closure Tests for Each Non-Prime Attribute
Subset	Closure	CustomerName	ProductName	Quantity	UnitPrice	OrderDate
{OrderID}	{OrderID, CustomerName, OrderDate}	✗ VIOLATION	—	—	—	✗ VIOLATION
{ProductID}	{ProductID, ProductName, UnitPrice}	—	✗ VIOLATION	—	✗ VIOLATION	—

2NF Violations Found

Four partial dependencies detected:

•{OrderID} → CustomerName
•{OrderID} → OrderDate
•{ProductID} → ProductName
•{ProductID} → UnitPrice

Note: Quantity depends on the FULL key {OrderID, ProductID}, so it's NOT a violation.

Conclusion: The relation OrderLine is NOT in 2NF. Four partial dependencies exist that must be eliminated through decomposition.

Visual Pattern Recognition

While the algorithm is rigorous, experienced database designers often recognize 2NF violations through visual patterns. These heuristics provide quick detection for common scenarios.

Pattern 1: Entity Mixing

When a single table contains attributes that describe multiple independent entities, partial dependencies are likely.

Red Flag Signs:

Table name concatenates two entity names (e.g., "OrderProduct", "StudentCourse")
Attributes can be grouped by what they describe
Some groups have natural single-attribute identifiers within them

Entity Mixing Pattern Analysis
Attribute Group	Describes	Natural Key	Partial Dependency Risk
OrderID, CustomerName, OrderDate	Orders	OrderID	High — Order info depends only on OrderID
ProductID, ProductName, UnitPrice	Products	ProductID	High — Product info depends only on ProductID
Quantity	The relationship	{OrderID, ProductID}	Low — Quantity is the relationship attribute

Pattern 2: Repeating Non-Key Values

Look at sample data. If you see the same non-key value appearing across multiple rows, trace WHY.

Example: Spotting Repetition

Sample Data Showing Repetition
OrderID	ProductID	CustomerName	ProductName	Qty
O001	P101	Alice	Widget	5
O001	P102	Alice	Gadget	3
O001	P103	Alice	Gizmo	2
O002	P101	Bob	Widget	10
O003	P101	Carol	Widget	7

Analysis:

"Alice" repeats 3 times — always with OrderID O001 → CustomerName depends on OrderID alone
"Widget" repeats 3 times — always with ProductID P101 → ProductName depends on ProductID alone

Repetition of non-key values is a strong indicator of partial dependencies.
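
The same check can be done mechanically on the sample rows. The sketch below uses a hypothetical helper, `determines`, that reports whether one column's value is fixed by another column's value in the given rows. It only describes this sample, so a True result is a hint rather than proof of a functional dependency.

```python
rows = [
    {"OrderID": "O001", "ProductID": "P101", "CustomerName": "Alice", "ProductName": "Widget", "Qty": 5},
    {"OrderID": "O001", "ProductID": "P102", "CustomerName": "Alice", "ProductName": "Gadget", "Qty": 3},
    {"OrderID": "O001", "ProductID": "P103", "CustomerName": "Alice", "ProductName": "Gizmo", "Qty": 2},
    {"OrderID": "O002", "ProductID": "P101", "CustomerName": "Bob", "ProductName": "Widget", "Qty": 10},
    {"OrderID": "O003", "ProductID": "P101", "CustomerName": "Carol", "ProductName": "Widget", "Qty": 7},
]

def determines(rows, lhs, rhs):
    """True if equal lhs values always come with equal rhs values in these rows."""
    seen = {}
    for row in rows:
        first_value = seen.setdefault(row[lhs], row[rhs])
        if first_value != row[rhs]:
            return False
    return True

for lhs, rhs in [("OrderID", "CustomerName"), ("ProductID", "ProductName"), ("OrderID", "Qty")]:
    print(lhs, "->", rhs, determines(rows, lhs, rhs))
```

CustomerName and ProductName each follow one part of the key, while Qty varies within order O001 and so needs the full key.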

Pattern 3: Attribute Name Prefixes

Attribute naming conventions often reveal the entity they belong to:

StudentCourse(StudentID, CourseID, StudentName, StudentEmail, CourseName, CourseCredits, Grade)
                                   └──── Student attrs ────┘  └──── Course attrs ─────┘

When attribute names share prefixes matching part of a composite key, those attributes likely depend on only that part of the key.
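
As a rough illustration of this heuristic, the sketch below groups columns by the prefix they share with a key column. The function `group_by_key_prefix` and the assumption that key columns end in "ID" are hypothetical conventions chosen for the example; naming alone never establishes a dependency, so the output is only a list of candidates to verify.

```python
def group_by_key_prefix(key_columns, other_columns, suffix="ID"):
    """Group non-key columns under the key column whose name prefix they share."""
    groups = {key: [] for key in key_columns}
    ungrouped = []
    for col in other_columns:
        for key in key_columns:
            prefix = key[: -len(suffix)] if key.endswith(suffix) else key
            if prefix and col.startswith(prefix):
                groups[key].append(col)
                break
        else:
            # No key prefix matched: likely a relationship attribute.
            ungrouped.append(col)
    return groups, ungrouped

groups, ungrouped = group_by_key_prefix(
    ["StudentID", "CourseID"],
    ["StudentName", "StudentEmail", "CourseName", "CourseCredits", "Grade"],
)
print(groups)
print(ungrouped)
```

Grade matches neither prefix, which fits its role as an attribute of the student-course relationship.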

Quick Visual Scan

For any table with a composite key, ask: 'Are there attributes that belong to just one entity referenced by part of the key?' If you can mentally group attributes by the entity they describe, and some groups map to only part of the key, partial dependencies are likely.

Common 2NF Violation Scenarios

Certain database design patterns frequently lead to 2NF violations. Recognizing these patterns helps you anticipate and prevent violations proactively.

Scenario 1: Many-to-Many Relationship Tables with Entity Attributes

Many-to-Many Anti-Pattern
-- VIOLATION: Entity attributes embedded in junction table
CREATE TABLE Enrollment (
    StudentID    INT,
    CourseID     INT,
    StudentName  VARCHAR(100),  -- Partial dependency on StudentID
    StudentEmail VARCHAR(100),  -- Partial dependency on StudentID
    CourseName   VARCHAR(100),  -- Partial dependency on CourseID
    Credits      INT,           -- Partial dependency on CourseID
    Grade        CHAR(2),       -- Full dependency on (StudentID, CourseID)
    PRIMARY KEY (StudentID, CourseID)
);
 
-- 2NF COMPLIANT DESIGN
CREATE TABLE Student (
    StudentID    INT PRIMARY KEY,
    StudentName  VARCHAR(100),
    StudentEmail VARCHAR(100)
);
 
CREATE TABLE Course (
    CourseID   INT PRIMARY KEY,
    CourseName VARCHAR(100),
    Credits    INT
);
 
CREATE TABLE Enrollment (
    StudentID INT,
    CourseID  INT,
    Grade     CHAR(2),
    PRIMARY KEY (StudentID, CourseID),
    FOREIGN KEY (StudentID) REFERENCES Student(StudentID),
    FOREIGN KEY (CourseID) REFERENCES Course(CourseID)
);

Scenario 2: Flattened Hierarchies

When hierarchical data is flattened into a single table with a composite key representing the path:

Flattened Hierarchy Anti-Pattern
-- VIOLATION: Department info repeated for each employee
CREATE TABLE OrgChart (
    DeptID      INT,
    EmpID       INT,
    DeptName    VARCHAR(50),   -- Partial dependency on DeptID
    DeptBudget  DECIMAL(12,2), -- Partial dependency on DeptID
    EmpName     VARCHAR(100),  -- Partial dependency on EmpID
    EmpSalary   DECIMAL(10,2), -- Partial dependency on EmpID
    EmpRole     VARCHAR(50),   -- Full dependency (role in this dept)
    PRIMARY KEY (DeptID, EmpID)
);
 
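-- Note: The comments above assume an employee can belong to several departments.
-- If each employee belongs to exactly one department, EmpID → DeptID holds,
-- EmpID alone is the candidate key, and DeptName/DeptBudget become transitive
-- dependencies (a 3NF concern) rather than partial ones.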

Scenario 3: Time-Series Data with Entity Attributes

Time-Series Anti-Pattern
-- VIOLATION: Sensor info repeated for each reading
CREATE TABLE SensorReading (
    SensorID       INT,
    ReadingTime    TIMESTAMP,
    SensorType     VARCHAR(50),   -- Partial dependency on SensorID
    SensorLocation VARCHAR(100),  -- Partial dependency on SensorID
    Temperature    DECIMAL(5,2),  -- Full dependency on (SensorID, ReadingTime)
    Humidity       DECIMAL(5,2),  -- Full dependency on (SensorID, ReadingTime)
    PRIMARY KEY (SensorID, ReadingTime)
);
 
-- Every reading for Sensor #1 repeats "Temperature Sensor" 
-- and "Building A, Room 101" unnecessarily

Watch for These Patterns

Key indicators of likely 2NF violations:

•Junction/association tables that include entity details beyond foreign keys
•Tables with composite keys where some columns describe just one part of the key
•Flattened reports or exports converted directly to database tables
•Time-series tables that repeat entity metadata with each measurement

Using SQL to Detect Violations

You can use SQL queries to empirically detect potential 2NF violations in existing data. While this doesn't prove functional dependencies exist (which require schema knowledge), it can reveal patterns suggesting violations.

Technique 1: Detect Single-Column Determinants

For a table with composite key (A, B), check if column C has the same value for all rows with the same A value:

SQL: Detect Partial Dependency on Key Part
-- Potential partial dependency: OrderID → CustomerName
-- If true, each OrderID should have exactly one distinct CustomerName
 
SELECT OrderID, COUNT(DISTINCT CustomerName) AS DistinctCustomers
FROM OrderLine
GROUP BY OrderID
HAVING COUNT(DISTINCT CustomerName) > 1;
 
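-- If this returns 0 rows, the current data is consistent with OrderID → CustomerName
-- Combined with domain knowledge (orders have one customer),
-- this supports treating it as a partial dependency
-- Note: COUNT(DISTINCT ...) ignores NULLs, so a mix of NULL and one name is not flagged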
 
-- Repeat for other suspected partial dependencies:
SELECT ProductID, COUNT(DISTINCT ProductName) AS DistinctNames
FROM OrderLine
GROUP BY ProductID
HAVING COUNT(DISTINCT ProductName) > 1;
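
To try the query without a production database, the sketch below loads the five sample rows from the repetition example into an in-memory SQLite database using Python's standard `sqlite3` module. The helper `counterexamples` and the column types are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OrderLine (OrderID TEXT, ProductID TEXT, CustomerName TEXT, "
    "ProductName TEXT, Quantity INTEGER, PRIMARY KEY (OrderID, ProductID))"
)
conn.executemany(
    "INSERT INTO OrderLine VALUES (?, ?, ?, ?, ?)",
    [
        ("O001", "P101", "Alice", "Widget", 5),
        ("O001", "P102", "Alice", "Gadget", 3),
        ("O001", "P103", "Alice", "Gizmo", 2),
        ("O002", "P101", "Bob", "Widget", 10),
        ("O003", "P101", "Carol", "Widget", 7),
    ],
)

def counterexamples(determinant, dependent):
    """Rows where one determinant value maps to several dependent values.

    Column names are interpolated into the SQL text, so pass only trusted names.
    """
    query = f"""
        SELECT {determinant}, COUNT(DISTINCT {dependent})
        FROM OrderLine
        GROUP BY {determinant}
        HAVING COUNT(DISTINCT {dependent}) > 1
    """
    return conn.execute(query).fetchall()

order_to_customer = counterexamples("OrderID", "CustomerName")
order_to_quantity = counterexamples("OrderID", "Quantity")
print(order_to_customer, order_to_quantity)
```

An empty result for CustomerName is consistent with OrderID → CustomerName in this sample, while Quantity produces a counterexample for order O001, as expected for an attribute that needs the full key.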

Technique 2: Measure Repetition Factor

Quantify how much redundancy exists due to suspected partial dependencies:

SQL: Measure Redundancy
-- How many times is each CustomerName repeated in OrderLine?
SELECT 
    OrderID,
    CustomerName,
    COUNT(*) AS TimesRepeated
FROM OrderLine
GROUP BY OrderID, CustomerName
ORDER BY TimesRepeated DESC;
 
-- Summary: Total storage waste from CustomerName repetition
SELECT 
    'CustomerName' AS Attribute,
    COUNT(*) AS TotalRows,
    COUNT(DISTINCT OrderID) AS UniqueDeterminants,
    COUNT(*) - COUNT(DISTINCT OrderID) AS WastedRepeats,
    ROUND(100.0 * (COUNT(*) - COUNT(DISTINCT OrderID)) / COUNT(*), 2) AS WastePercentage
FROM OrderLine;
 
-- Example output might show:
-- Attribute: CustomerName
-- TotalRows: 10000
-- UniqueDeterminants: 1500
-- WastedRepeats: 8500
-- WastePercentage: 85.00%  (85% of CustomerName storage is redundant!)
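
The summary query can be checked by hand on a small sample. The sketch below runs it against the five rows from the repetition example in an in-memory SQLite database. The trimmed three-column table is an assumption made to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OrderLine (OrderID TEXT, ProductID TEXT, CustomerName TEXT)")
conn.executemany(
    "INSERT INTO OrderLine VALUES (?, ?, ?)",
    [
        ("O001", "P101", "Alice"),
        ("O001", "P102", "Alice"),
        ("O001", "P103", "Alice"),
        ("O002", "P101", "Bob"),
        ("O003", "P101", "Carol"),
    ],
)

# 5 rows, 3 distinct orders: 2 of the 5 stored CustomerName values are repeats.
summary = conn.execute(
    """
    SELECT COUNT(*),
           COUNT(DISTINCT OrderID),
           COUNT(*) - COUNT(DISTINCT OrderID),
           ROUND(100.0 * (COUNT(*) - COUNT(DISTINCT OrderID)) / COUNT(*), 2)
    FROM OrderLine
    """
).fetchone()
print(summary)
```

The percentage counts repeated values, not bytes, so it is a rough indicator of redundancy rather than a storage measurement.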

Technique 3: Cross-Reference Check

Verify that suspected partial dependencies are consistent (no anomalies already present):

SQL: Check for Existing Anomalies
-- Are there any inconsistencies in ProductName for the same ProductID?
-- (Would indicate either not a valid FD or data corruption)
 
SELECT ProductID, ProductName, COUNT(*) AS Occurrences
FROM OrderLine
GROUP BY ProductID, ProductName
ORDER BY ProductID;
 
-- If a ProductID appears with multiple ProductNames, either:
-- 1. ProductID → ProductName is NOT a valid FD, OR
-- 2. Data corruption has already occurred
 
-- Find specific violations:
SELECT ProductID
FROM OrderLine
GROUP BY ProductID
HAVING COUNT(DISTINCT ProductName) > 1;
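
To see what an existing anomaly looks like, the sketch below adds one deliberately misspelled product name ("Widgit", an invented row) to the sample data and runs both queries in an in-memory SQLite database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OrderLine (OrderID TEXT, ProductID TEXT, ProductName TEXT)")
conn.executemany(
    "INSERT INTO OrderLine VALUES (?, ?, ?)",
    [
        ("O001", "P101", "Widget"),
        ("O001", "P102", "Gadget"),
        ("O002", "P101", "Widget"),
        ("O003", "P101", "Widget"),
        ("O004", "P101", "Widgit"),  # invented inconsistent row
    ],
)

# Which ProductIDs carry more than one name?
inconsistent = conn.execute(
    "SELECT ProductID FROM OrderLine "
    "GROUP BY ProductID HAVING COUNT(DISTINCT ProductName) > 1"
).fetchall()

# Breakdown for the offending ProductID, most frequent spelling first.
breakdown = conn.execute(
    "SELECT ProductID, ProductName, COUNT(*) AS Occurrences FROM OrderLine "
    "WHERE ProductID = 'P101' GROUP BY ProductID, ProductName "
    "ORDER BY Occurrences DESC"
).fetchall()
print(inconsistent, breakdown)
```

The occurrence counts help distinguish a likely typo (one stray spelling) from a dependency that does not actually hold (several names with comparable frequency), though that judgment still needs domain knowledge.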

SQL Detection Limitations

SQL queries on data can SUGGEST functional dependencies but cannot PROVE them. A FD is a schema-level constraint about ALL possible data, not just current data. You might have consistent data now but no enforcement mechanism. Combine SQL analysis with domain knowledge and schema review.

Handling Multiple Candidate Keys

When a relation has multiple candidate keys, you must check for partial dependencies against EACH composite candidate key.

Example: Relation with Multiple Keys

Relation: Employee(SSN, EmpNumber, Name, Department, DeptLocation)

Candidate Keys:
  {SSN}           -- Single attribute (no partial deps possible)
  {EmpNumber}     -- Single attribute (no partial deps possible)
  {Department, Name}  -- Composite (partial deps possible!)

Functional Dependencies:
  SSN → EmpNumber, Name, Department, DeptLocation
  EmpNumber → SSN, Name, Department, DeptLocation
  {Department, Name} → SSN, EmpNumber, DeptLocation
  Department → DeptLocation

Analysis:

The first two candidate keys are single-attribute → no partial deps possible
The third candidate key {Department, Name} is composite
- Proper subsets: {Department}, {Name}
- Check: Does {Department} → any non-prime attribute?
- {Department} → DeptLocation ✓ YES!

But wait: What is the non-prime attribute?

Prime attributes: SSN, EmpNumber, Department, Name (all in some candidate key)
Non-prime attributes: DeptLocation only

Check: {Department} → DeptLocation, and DeptLocation is non-prime

Conclusion: This relation has a partial dependency:

{Department} is a proper subset of candidate key {Department, Name}
{Department} → DeptLocation (where DeptLocation is non-prime)
Therefore, the relation is NOT in 2NF
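
The Employee analysis can be verified with a short Python sketch. It takes the three candidate keys as stated above, confirms each is a minimal superkey, then derives the non-prime attributes and the partial dependencies. The names `closure`, `is_candidate_key`, and `partial` are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

ATTRS = frozenset({"SSN", "EmpNumber", "Name", "Department", "DeptLocation"})
FDS = [
    (frozenset({"SSN"}), frozenset({"EmpNumber", "Name", "Department", "DeptLocation"})),
    (frozenset({"EmpNumber"}), frozenset({"SSN", "Name", "Department", "DeptLocation"})),
    (frozenset({"Department", "Name"}), frozenset({"SSN", "EmpNumber", "DeptLocation"})),
    (frozenset({"Department"}), frozenset({"DeptLocation"})),
]
KEYS = [frozenset({"SSN"}), frozenset({"EmpNumber"}), frozenset({"Department", "Name"})]

def is_candidate_key(key):
    """A candidate key is a superkey that stops being one if any attribute is dropped."""
    if closure(key, FDS) != ATTRS:
        return False
    return all(closure(key - {a}, FDS) != ATTRS for a in key)

keys_ok = all(is_candidate_key(k) for k in KEYS)
prime = frozenset().union(*KEYS)
non_prime = ATTRS - prime

# For a two-attribute key, the proper non-empty subsets are its single attributes.
partial = {
    (frozenset({part}), attr)
    for key in KEYS if len(key) >= 2
    for part in key
    for attr in closure({part}, FDS) & non_prime
}
print(keys_ok, sorted(non_prime), partial)
```

The only non-prime attribute is DeptLocation, and the only partial dependency found is {Department} → DeptLocation, matching the conclusion above.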

All Candidate Keys Matter

A common mistake is checking only the primary key. The 2NF definition requires that non-prime attributes be fully dependent on EVERY candidate key. Even if the primary key is single-attribute, other composite candidate keys might have partial dependencies.

Systematic Approach for Multiple Keys:

List ALL candidate keys
Separate single-attribute keys (automatically safe) from composite keys
Identify non-prime attributes (not in ANY candidate key)
For EACH composite candidate key, check all its proper subsets against each non-prime attribute
If ANY check reveals a partial dependency, the relation violates 2NF

Edge Cases and Tricky Situations

Let's examine scenarios that often cause confusion during 2NF analysis:

Edge Case 1: Trivial Dependencies

Every attribute trivially depends on itself: A → A. The closure of any key subset therefore always contains the subset's own attributes. This is NOT a 2NF violation because:

The attribute A is part of the candidate key (prime attribute)
2NF only concerns non-prime attributes

Edge Case 2: Dependencies Between Non-Key Attributes

Consider: R(A, B, C, D) with key {A, B} and FD C → D

This is NOT a partial dependency (C is not a subset of the key). However, this IS a transitive dependency (addressed by 3NF, not 2NF). For 2NF purposes, check only if key subsets determine non-prime attributes.

Edge Case Decision Table
Dependency	Key(s)	Type	2NF Issue?
{A} → C	{A, B}	Partial (A is a proper subset of the key, C is non-prime)	YES: 2NF violation
{A, B} → C	{A, B}	Full (entire key)	NO: this is the desired form
C → D	{A, B}	Transitive (C is not part of any key)	NO: 3NF concern
{B} → C	{A, B}, {A, C}	Prime to prime (C belongs to candidate key {A, C})	NO: only non-prime targets matter
{A, B} → A	{A, B}	Trivial	NO: holds in every relation
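
The decision logic behind this kind of table can be written as a small classifier. This is a rough sketch, assuming the candidate keys are already known and each dependency has a single attribute on the right. The function `classify` and its labels are hypothetical.

```python
def classify(lhs, rhs_attr, candidate_keys):
    """Classify the dependency lhs -> rhs_attr for 2NF purposes."""
    lhs = frozenset(lhs)
    prime = frozenset().union(*candidate_keys)
    if rhs_attr in lhs:
        return "trivial"
    if rhs_attr in prime:
        return "prime target (not a 2NF concern)"
    if any(lhs < key for key in candidate_keys):      # proper subset of some key
        return "partial (2NF violation)"
    if any(key <= lhs for key in candidate_keys):     # contains a whole key
        return "full (determinant is a superkey)"
    return "non-key determinant (3NF concern)"

one_key = [frozenset("AB")]
two_keys = [frozenset("AB"), frozenset("AC")]

print(classify("A", "C", one_key))
print(classify("AB", "C", one_key))
print(classify("C", "D", one_key))
print(classify("AB", "A", one_key))
print(classify("B", "C", two_keys))
```

The order of the checks matters: trivial and prime-target cases are ruled out first, so only a non-prime attribute determined by a proper subset of a candidate key is reported as a violation.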

Edge Case 3: Overlapping Candidate Keys

When candidate keys share attributes:

R(A, B, C, D)
Candidate Keys: {A, B}, {A, C}

Prime attributes: A, B, C
Non-prime attributes: D only

Proper subsets to check:

From {A, B}: {A}, {B}
From {A, C}: {A}, {C}

Note that {A} appears twice, but we only need to check it once.
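
Collecting the subsets into a set removes the duplicate automatically. A minimal sketch, with `proper_subsets` as a hypothetical helper name:

```python
from itertools import combinations

def proper_subsets(key):
    """All proper non-empty subsets of key, as frozensets."""
    items = sorted(key)
    return {
        frozenset(combo)
        for size in range(1, len(items))
        for combo in combinations(items, size)
    }

keys = [{"A", "B"}, {"A", "C"}]
# The set union keeps {A} once even though both keys contribute it.
to_check = set().union(*(proper_subsets(k) for k in keys))
print(sorted(sorted(s) for s in to_check))
```

Each distinct subset then needs only one closure computation, regardless of how many candidate keys contain it.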

Edge Case 4: All Attributes Are Prime

R(A, B, C)
Candidate Keys: {A, B}, {C}

All three attributes are in at least one candidate key. There are NO non-prime attributes. Therefore, the relation is automatically in 2NF regardless of any other dependencies.

When in Doubt, Follow the Algorithm

Edge cases can be confusing. When uncertain, fall back to the systematic algorithm:

Find all candidate keys
Identify non-prime attributes
Check if any proper key subset determines any non-prime attribute

Given correct functional dependencies, the algorithm will always give the correct answer.

Summary: Mastering 2NF Violation Detection

You now have a complete toolkit for identifying 2NF violations. Let's consolidate:

Key Takeaways

•Systematic Algorithm: Use the DETECT-2NF-VIOLATIONS algorithm for guaranteed complete detection. Compute closures of all proper key subsets.
•Visual Patterns: Look for entity mixing, repeated non-key values, and attribute name prefixes that match key components.
•Common Scenarios: Watch for junction tables with entity attributes, flattened hierarchies, and time-series data with repeated entity info.
•SQL Analysis: Use COUNT(DISTINCT) queries to empirically detect deterministic relationships suggesting partial dependencies.
•Multiple Keys: Check ALL composite candidate keys, not just the primary key.
•Edge Cases: Focus only on non-prime attributes. Dependencies between primes or non-key determinants are not 2NF concerns.

What's Next:

Now that you can identify all 2NF violations in any schema, we need to fix them. The next page covers decomposition for 2NF—the systematic process of breaking a violating relation into well-designed relations that satisfy 2NF while preserving all data and dependencies.

Detection Skills Acquired

You can now reliably identify 2NF violations using both rigorous algorithms and practical pattern recognition. This skill is essential for database design review, legacy system analysis, and normalization projects. You're ready to learn how to fix the violations you find.

3 / 5

Loading learning content...

Database Management SystemsSecond Normal Form (2NF)

Second Normal Form: Eliminating Partial Dependencies

LevelIntermediate

Duration55 mins

TopicSecond Normal Form (2NF)

3 / 5

Identifying Second Normal Form Violations

From Theory to Detection

This page transforms the theoretical definition into a practical detection toolkit. You'll learn systematic algorithms, visual inspection techniques, and common patterns that signal 2NF violations.

What You Will Master

The Complete Detection Algorithm

Let's formalize a complete, step-by-step algorithm for detecting 2NF violations. This algorithm is systematic and guarantees finding ALL violations.

Algorithm: DETECT-2NF-VIOLATIONS(R, F)

Input:

R: A relation schema with attributes {A₁, A₂, ..., Aₙ}
F: A set of functional dependencies over R

Output:

A set of partial dependencies (2NF violations)

2NF Violation Detection Algorithm
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
DETECT-2NF-VIOLATIONS(R, F):
    violations = {}
    
    // Step 1: Find all candidate keys
    CK = FIND-CANDIDATE-KEYS(R, F)
    
    // Step 2: Identify prime and non-prime attributes
    prime_attrs = UNION of all attributes in all candidate keys
    non_prime_attrs = R.attributes - prime_attrs
    
    // Step 3: Early exit if no non-prime attributes
    if non_prime_attrs is empty:
        return {}  // Automatically in 2NF
    
    // Step 4: Check each composite candidate key
    for each K in CK where |K| >= 2:
        
        // Step 5: Generate all proper non-empty subsets of K
        subsets = POWER-SET(K) - {K} - {∅}
        
        // Step 6: For each subset, check dependencies to non-prime attrs
        for each S in subsets:
            for each A in non_prime_attrs:
                
                // Step 7: Compute closure of S
                S_closure = ATTRIBUTE-CLOSURE(S, F)
                
                // Step 8: Check if A is in the closure
                if A in S_closure:
                    violations.add((S, A))  // S → A is a partial dependency
    
    return violations

Algorithm Breakdown:

Step 1: Find all candidate keys using the candidate key finding algorithm (compute closures of attribute combinations to find minimal superkeys).

Step 2: Prime attributes are those appearing in any candidate key. Non-prime are the rest.

Step 3: Optimization—if there are no non-prime attributes, 2NF is satisfied.

Step 4: Only composite keys (2+ attributes) can have partial dependencies.

Step 5: Generate subsets. For a key {A, B, C}, the proper subsets are {A}, {B}, {C}, {A,B}, {A,C}, {B,C}.

Step 6-8: Use attribute closure to test if each subset determines each non-prime attribute.

Complexity Analysis:

For a key of size k, there are 2^k - 2 proper non-empty subsets
For each subset, we compute closure (polynomial in |F|)
Overall: Exponential in key size, but keys are typically small (2-4 attributes)

Worked Example: Complete 2NF Analysis

Let's apply the algorithm to a comprehensive example.

Relation: OrderLine(OrderID, ProductID, CustomerName, ProductName, Quantity, UnitPrice, OrderDate)

Given Functional Dependencies (F):

•FD1: {OrderID, ProductID} → {Quantity} — A specific product in a specific order has a quantity
•FD2: {OrderID} → {CustomerName, OrderDate} — Each order belongs to one customer, placed on one date
•FD3: {ProductID} → {ProductName, UnitPrice} — Each product has one name and standard price

Step 1: Find Candidate Keys

We need to find the minimal set(s) of attributes whose closure is all attributes.

Test {OrderID, ProductID}⁺:

Start: {OrderID, ProductID}
Apply FD1: Add Quantity → {OrderID, ProductID, Quantity}
Apply FD2: Add CustomerName, OrderDate → {OrderID, ProductID, Quantity, CustomerName, OrderDate}
Apply FD3: Add ProductName, UnitPrice → {OrderID, ProductID, Quantity, CustomerName, OrderDate, ProductName, UnitPrice}

{OrderID, ProductID}⁺ = ALL attributes ✓

Is it minimal? Test subsets:

{OrderID}⁺ = {OrderID, CustomerName, OrderDate} — Not all attributes
{ProductID}⁺ = {ProductID, ProductName, UnitPrice} — Not all attributes

Neither subset determines all attributes, so {OrderID, ProductID} is a candidate key.

Candidate Key(s): {OrderID, ProductID} (composite key with 2 attributes)

Step 2: Identify Prime and Non-Prime Attributes

Attribute	In Candidate Key?	Classification
OrderID	Yes	Prime
ProductID	Yes	Prime
CustomerName	No	Non-Prime
ProductName	No	Non-Prime
Quantity	No	Non-Prime
UnitPrice	No	Non-Prime
OrderDate	No	Non-Prime

Step 3: Early Exit? No — we have non-prime attributes.

Step 4: Composite Key Check

Key {OrderID, ProductID} has size 2, so partial dependencies are possible.

Step 5: Proper Subsets of Key

Proper non-empty subsets of {OrderID, ProductID}:

{OrderID}
{ProductID}

Step 6-8: Closure Tests for Each Non-Prime Attribute
Subset	Closure	CustomerName	ProductName	Quantity	UnitPrice	OrderDate
{OrderID}	{OrderID, CustomerName, OrderDate}	✗ VIOLATION	—	—	—	✗ VIOLATION
{ProductID}	{ProductID, ProductName, UnitPrice}	—	✗ VIOLATION	—	✗ VIOLATION	—

2NF Violations Found

Four partial dependencies detected: • {OrderID} → CustomerName • {OrderID} → OrderDate • {ProductID} → ProductName • {ProductID} → UnitPrice

Note: Quantity depends on the FULL key {OrderID, ProductID}, so it's NOT a violation.

Conclusion: The relation OrderLine is NOT in 2NF. Four partial dependencies exist that must be eliminated through decomposition.

Visual Pattern Recognition

While the algorithm is rigorous, experienced database designers often recognize 2NF violations through visual patterns. These heuristics provide quick detection for common scenarios.

Pattern 1: Entity Mixing

When a single table contains attributes that describe multiple independent entities, partial dependencies are likely.

Red Flag Signs:

Table name concatenates two entity names (e.g., "OrderProduct", "StudentCourse")
Attributes can be grouped by what they describe
Some groups have natural single-attribute identifiers within them

Entity Mixing Pattern Analysis
Attribute Group	Describes	Natural Key	Partial Dependency Risk
OrderID, CustomerName, OrderDate	Orders	OrderID	High — Order info depends only on OrderID
ProductID, ProductName, UnitPrice	Products	ProductID	High — Product info depends only on ProductID
Quantity	The relationship	{OrderID, ProductID}	Low — Quantity is the relationship attribute

Pattern 2: Repeating Non-Key Values

Look at sample data. If you see the same non-key value appearing across multiple rows, trace WHY.

Example: Spotting Repetition

Sample Data Showing Repetition
OrderID	ProductID	CustomerName	ProductName	Qty
O001	P101	Alice	Widget	5
O001	P102	Alice	Gadget	3
O001	P103	Alice	Gizmo	2
O002	P101	Bob	Widget	10
O003	P101	Carol	Widget	7

Analysis:

"Alice" repeats 3 times — always with OrderID O001 → CustomerName depends on OrderID alone
"Widget" repeats 3 times — always with ProductID P101 → ProductName depends on ProductID alone

Repetition of non-key values is a strong indicator of partial dependencies.

Pattern 3: Attribute Name Prefixes

Attribute naming conventions often reveal the entity they belong to:

StudentCourse(StudentID, CourseID, StudentName, StudentEmail, CourseName, CourseCredits, Grade)
                         └── Student attrs ──┘  └── Course attrs ──┘

When attribute names share prefixes matching part of a composite key, those attributes likely depend on only that part of the key.

Quick Visual Scan

Common 2NF Violation Scenarios

Certain database design patterns frequently lead to 2NF violations. Recognizing these patterns helps you anticipate and prevent violations proactively.

Scenario 1: Many-to-Many Relationship Tables with Entity Attributes

Many-to-Many Anti-Pattern
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
-- VIOLATION: Entity attributes embedded in junction table
CREATE TABLE Enrollment (
    StudentID    INT,
    CourseID     INT,
    StudentName  VARCHAR(100),  -- Partial dependency on StudentID
    StudentEmail VARCHAR(100),  -- Partial dependency on StudentID
    CourseName   VARCHAR(100),  -- Partial dependency on CourseID
    Credits      INT,           -- Partial dependency on CourseID
    Grade        CHAR(2),       -- Full dependency on (StudentID, CourseID)
    PRIMARY KEY (StudentID, CourseID)
);
 
-- 2NF COMPLIANT DESIGN
CREATE TABLE Student (
    StudentID    INT PRIMARY KEY,
    StudentName  VARCHAR(100),
    StudentEmail VARCHAR(100)
);
 
CREATE TABLE Course (
    CourseID   INT PRIMARY KEY,
    CourseName VARCHAR(100),
    Credits    INT
);
 
CREATE TABLE Enrollment (
    StudentID INT,
    CourseID  INT,
    Grade     CHAR(2),
    PRIMARY KEY (StudentID, CourseID),
    FOREIGN KEY (StudentID) REFERENCES Student(StudentID),
    FOREIGN KEY (CourseID) REFERENCES Course(CourseID)
);

Scenario 2: Flattened Hierarchies

When hierarchical data is flattened into a single table with a composite key representing the path:

Flattened Hierarchy Anti-Pattern
1
2
3
4
5
6
7
8
9
10
11
12
13
14
-- VIOLATION: Department info repeated for each employee
CREATE TABLE OrgChart (
    DeptID      INT,
    EmpID       INT,
    DeptName    VARCHAR(50),   -- Partial dependency on DeptID
    DeptBudget  DECIMAL(12,2), -- Partial dependency on DeptID
    EmpName     VARCHAR(100),  -- Partial dependency on EmpID
    EmpSalary   DECIMAL(10,2), -- Partial dependency on EmpID
    EmpRole     VARCHAR(50),   -- Full dependency (role in this dept)
    PRIMARY KEY (DeptID, EmpID)
);
 
-- Note: If employees can only be in one department,
-- then EmpName and EmpSalary also partially depend on EmpID

Scenario 3: Time-Series Data with Entity Attributes

Time-Series Anti-Pattern
1
2
3
4
5
6
7
8
9
10
11
12
13
-- VIOLATION: Sensor info repeated for each reading
CREATE TABLE SensorReading (
    SensorID      INT,
    ReadingTime   TIMESTAMP,
    SensorType    VARCHAR(50),   -- Partial dependency on SensorID
    SensorLocation VARCHAR(100), -- Partial dependency on SensorID
    Temperature   DECIMAL(5,2),  -- Full dependency on (SensorID, ReadingTime)
    Humidity      DECIMAL(5,2),  -- Full dependency on (SensorID, ReadingTime)
    PRIMARY KEY (SensorID, ReadingTime)
);
 
-- Every reading for Sensor #1 repeats "Temperature Sensor" 
-- and "Building A, Room 101" unnecessarily

Watch for These Patterns

Using SQL to Detect Violations

Technique 1: Detect Single-Column Determinants

For a table with composite key (A, B), check if column C has the same value for all rows with the same A value:

SQL: Detect Partial Dependency on Key Part
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
-- Potential partial dependency: OrderID → CustomerName
-- If true, each OrderID should have exactly one distinct CustomerName
 
SELECT OrderID, COUNT(DISTINCT CustomerName) AS DistinctCustomers
FROM OrderLine
GROUP BY OrderID
HAVING COUNT(DISTINCT CustomerName) > 1;
 
-- If this returns 0 rows, CustomerName likely depends on OrderID alone
-- Combined with domain knowledge (orders have one customer), 
-- this confirms the partial dependency
 
-- Repeat for other suspected partial dependencies:
SELECT ProductID, COUNT(DISTINCT ProductName) AS DistinctNames
FROM OrderLine
GROUP BY ProductID
HAVING COUNT(DISTINCT ProductName) > 1;

Technique 2: Measure Repetition Factor

Quantify how much redundancy exists due to suspected partial dependencies:

SQL: Measure Redundancy
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
-- How many times is each CustomerName repeated in OrderLine?
SELECT 
    OrderID,
    CustomerName,
    COUNT(*) AS TimesRepeated
FROM OrderLine
GROUP BY OrderID, CustomerName
ORDER BY TimesRepeated DESC;
 
-- Summary: Total storage waste from CustomerName repetition
SELECT 
    'CustomerName' AS Attribute,
    COUNT(*) AS TotalRows,
    COUNT(DISTINCT OrderID) AS UniqueDeterminants,
    COUNT(*) - COUNT(DISTINCT OrderID) AS WastedRepeats,
    ROUND(100.0 * (COUNT(*) - COUNT(DISTINCT OrderID)) / COUNT(*), 2) AS WastePercentage
FROM OrderLine;
 
-- Example output might show:
-- Attribute: CustomerName
-- TotalRows: 10000
-- UniqueDeterminants: 1500
-- WastedRepeats: 8500
-- WastePercentage: 85.00%  (85% of CustomerName storage is redundant!)
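
The arithmetic behind the summary query is simple enough to sketch outside the database. The snippet below is an illustrative Python version; the function name `redundancy_summary` and the sample rows are hypothetical, and the numbers only count as waste if the suspected dependency (here OrderID → CustomerName) really holds.

```python
def redundancy_summary(rows, determinant):
    """Count how often values determined by `determinant` are repeated.

    rows: list of dicts, one per table row.
    determinant: name of the column suspected of determining other columns.
    """
    total = len(rows)
    unique = len({row[determinant] for row in rows})
    wasted = total - unique  # every copy beyond the first is redundant
    percentage = round(100.0 * wasted / total, 2) if total else 0.0
    return {
        "total_rows": total,
        "unique_determinants": unique,
        "wasted_repeats": wasted,
        "waste_percentage": percentage,
    }

# Hypothetical data: 20 order lines spread over 3 orders.
rows = (
    [{"OrderID": 1, "CustomerName": "Alice"}] * 10
    + [{"OrderID": 2, "CustomerName": "Bob"}] * 6
    + [{"OrderID": 3, "CustomerName": "Carol"}] * 4
)

summary = redundancy_summary(rows, "OrderID")
print(summary)
```

With 20 rows and 3 distinct orders, 17 of the stored CustomerName values are repeats, which is the same 85% ratio as in the example output above.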

Technique 3: Cross-Reference Check

Verify that suspected partial dependencies are consistent (no anomalies already present):

SQL: Check for Existing Anomalies
-- Are there any inconsistencies in ProductName for the same ProductID?
-- (Would indicate either not a valid FD or data corruption)
 
SELECT ProductID, ProductName, COUNT(*) AS Occurrences
FROM OrderLine
GROUP BY ProductID, ProductName
ORDER BY ProductID;
 
-- If a ProductID appears with multiple ProductNames, either:
-- 1. ProductID → ProductName is NOT a valid FD, OR
-- 2. Data corruption has already occurred
 
-- Find specific violations:
SELECT ProductID
FROM OrderLine
GROUP BY ProductID
HAVING COUNT(DISTINCT ProductName) > 1;
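
When a conflict is found, seeing the conflicting values side by side often helps decide between the two explanations above. The following Python sketch groups the dependent values per determinant; the function name `conflicting_values` and the sample rows are assumptions for illustration.

```python
from collections import defaultdict

def conflicting_values(rows, determinant, dependent):
    """Map each determinant value to its dependent values when more than one exists."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[determinant]].add(row[dependent])
    # Keep only the determinant values that contradict determinant -> dependent.
    return {key: sorted(values) for key, values in seen.items() if len(values) > 1}

# Hypothetical rows: ProductID 10 carries a misspelled name in one row.
rows = [
    {"ProductID": 10, "ProductName": "Widget"},
    {"ProductID": 10, "ProductName": "Widget"},
    {"ProductID": 10, "ProductName": "Widgit"},
    {"ProductID": 11, "ProductName": "Gadget"},
]

conflicts = conflicting_values(rows, "ProductID", "ProductName")
print(conflicts)
```

Near-identical values such as a misspelling point toward an update anomaly, whereas clearly unrelated values suggest that the dependency was never valid.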

SQL Detection Limitations

Handling Multiple Candidate Keys

When a relation has multiple candidate keys, you must check for partial dependencies against EACH composite candidate key.

Example: Relation with Multiple Keys

Relation: Employee(SSN, EmpNumber, Name, Department, DeptLocation)

Candidate Keys:
  {SSN}           -- Single attribute (no partial deps possible)
  {EmpNumber}     -- Single attribute (no partial deps possible)
  {Department, Name}  -- Composite (partial deps possible!)

Functional Dependencies:
  SSN → EmpNumber, Name, Department, DeptLocation
  EmpNumber → SSN, Name, Department, DeptLocation
  {Department, Name} → SSN, EmpNumber, DeptLocation
  Department → DeptLocation

Analysis:

The first two candidate keys are single-attribute → no partial deps possible
The third candidate key {Department, Name} is composite
- Proper subsets: {Department}, {Name}
- Check: Does {Department} → any non-prime attribute?
- {Department} → DeptLocation ✓ YES!

But wait: is DeptLocation actually a non-prime attribute?

Prime attributes: SSN, EmpNumber, Department, Name (all in some candidate key)
Non-prime attributes: DeptLocation only

Check: {Department} → DeptLocation, and DeptLocation is non-prime

Conclusion: This relation has a partial dependency:

{Department} is a proper subset of candidate key {Department, Name}
{Department} → DeptLocation (where DeptLocation is non-prime)
Therefore, the relation is NOT in 2NF
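
The analysis above can be double-checked by computing attribute closures. This is a minimal Python sketch that takes the candidate keys as given (it does not derive them); the `closure` helper is a standard fixed-point computation written here for illustration.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by `attrs`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is covered and it adds something new.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

all_attrs = {"SSN", "EmpNumber", "Name", "Department", "DeptLocation"}
fds = [
    (frozenset({"SSN"}), frozenset({"EmpNumber", "Name", "Department", "DeptLocation"})),
    (frozenset({"EmpNumber"}), frozenset({"SSN", "Name", "Department", "DeptLocation"})),
    (frozenset({"Department", "Name"}), frozenset({"SSN", "EmpNumber", "DeptLocation"})),
    (frozenset({"Department"}), frozenset({"DeptLocation"})),
]
candidate_keys = [{"SSN"}, {"EmpNumber"}, {"Department", "Name"}]

prime = set().union(*candidate_keys)
non_prime = all_attrs - prime

# Proper subsets of the only composite key {Department, Name}.
dept_closure = closure({"Department"}, fds)
name_closure = closure({"Name"}, fds)

# Non-prime attributes determined by {Department} alone are partial dependencies.
partial = (dept_closure - {"Department"}) & non_prime
print(non_prime, dept_closure, name_closure, partial)
```

Only {Department} reaches a non-prime attribute, which matches the conclusion that Department → DeptLocation is the single 2NF violation.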

All Candidate Keys Matter

Systematic Approach for Multiple Keys:

List ALL candidate keys
Separate single-attribute keys (automatically safe) from composite keys
Identify non-prime attributes (not in ANY candidate key)
For EACH composite candidate key, check all its proper subsets against each non-prime attribute
If ANY check reveals a partial dependency, the relation violates 2NF
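
The five steps above can be sketched end to end in Python. This is an illustrative brute-force version, not a definitive implementation: it enumerates attribute subsets to find candidate keys, which is fine for textbook-sized relations but exponential in the number of attributes. The function names are assumptions made for this sketch.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the set of attributes functionally determined by `attrs`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Step 1: find all minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = frozenset(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys

def partial_dependencies(attributes, fds):
    """Steps 2 to 5: return (key subset, non-prime attribute) pairs that violate 2NF."""
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)
    non_prime = set(attributes) - prime
    violations = set()
    for key in keys:
        if len(key) < 2:
            continue  # single-attribute keys have no non-empty proper subsets
        for size in range(1, len(key)):
            for combo in combinations(sorted(key), size):
                subset = frozenset(combo)
                for attr in closure(subset, fds) & non_prime:
                    violations.add((subset, attr))
    return violations

attributes = {"SSN", "EmpNumber", "Name", "Department", "DeptLocation"}
fds = [
    (frozenset({"SSN"}), frozenset({"EmpNumber", "Name", "Department", "DeptLocation"})),
    (frozenset({"EmpNumber"}), frozenset({"SSN", "Name", "Department", "DeptLocation"})),
    (frozenset({"Department", "Name"}), frozenset({"SSN", "EmpNumber", "DeptLocation"})),
    (frozenset({"Department"}), frozenset({"DeptLocation"})),
]

keys = candidate_keys(attributes, fds)
violations = partial_dependencies(attributes, fds)
print(keys)
print(violations)
```

Run on the Employee relation, the sketch recovers the three candidate keys listed earlier and reports Department → DeptLocation as the only partial dependency.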

Edge Cases and Tricky Situations

Let's examine scenarios that often cause confusion during 2NF analysis:

Edge Case 1: Trivial Dependencies

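Every attribute trivially depends on itself (A → A), and more generally X → A is trivial whenever A is a member of X. A trivial dependency is never a 2NF violation because:

If X is a proper subset of a candidate key, then A belongs to that key, so A is a prime attribute
2NF only concerns non-prime attributes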

Edge Case 2: Dependencies Between Non-Key Attributes

Consider: R(A, B, C, D) with key {A, B} and FD C → D

Edge Case Decision Table
Dependency	Key	Type	2NF Issue?
{A} → C	{A, B}	Partial (A is a proper subset of the key, C is non-prime)	YES: 2NF violation
{A, B} → C	{A, B}	Full (entire key)	NO: this is the desired form
C → D	{A, B}	Transitive (C is not part of any key)	NO: 3NF concern
{A} → B	{A, B}	Prime to prime (hypothetical: if it held, A alone would be the key)	NO: only non-prime dependents matter
{A, B} → A	{A, B}	Trivial	NO: trivial dependencies are ignored

Edge Case 3: Overlapping Candidate Keys

When candidate keys share attributes:

R(A, B, C, D)
Candidate Keys: {A, B}, {A, C}

Prime attributes: A, B, C
Non-prime attributes: D only

Proper subsets to check:

From {A, B}: {A}, {B}
From {A, C}: {A}, {C}

Note that {A} appears twice, but we only need to check it once.
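
Collecting the subsets into a set removes the duplicates automatically. The short Python sketch below illustrates this for the two overlapping keys; the function name `proper_subsets_to_check` is an assumption for this example.

```python
from itertools import combinations

def proper_subsets_to_check(candidate_keys):
    """Return the distinct non-empty proper subsets of all candidate keys."""
    subsets = set()
    for key in candidate_keys:
        # Sizes 1 .. len(key) - 1 exclude the empty set and the key itself.
        for size in range(1, len(key)):
            for combo in combinations(sorted(key), size):
                subsets.add(frozenset(combo))
    return subsets

keys = [{"A", "B"}, {"A", "C"}]
subsets = proper_subsets_to_check(keys)
print(sorted(sorted(s) for s in subsets))
```

For these keys the result contains {A}, {B}, and {C}, each exactly once, so the closure of {A} only has to be computed a single time.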

Edge Case 4: All Attributes Are Prime

R(A, B, C)
Candidate Keys: {A, B}, {C}

All three attributes are in at least one candidate key. There are NO non-prime attributes. Therefore, the relation is automatically in 2NF regardless of any other dependencies.

When in Doubt, Follow the Algorithm

Edge cases can be confusing. When uncertain, fall back to the systematic algorithm:

Find all candidate keys
Identify non-prime attributes
Check whether any proper subset of a composite candidate key determines any non-prime attribute

Given a complete and correct set of functional dependencies, the algorithm will always give the correct answer.

Summary: Mastering 2NF Violation Detection

You now have a complete toolkit for identifying 2NF violations. Let's consolidate:

Key Takeaways

• Systematic Algorithm: Use the DETECT-2NF-VIOLATIONS algorithm for guaranteed complete detection. Compute closures of all proper subsets of each composite candidate key.
• Visual Patterns: Look for entity mixing, repeated non-key values, and attribute name prefixes that match key components.
• Common Scenarios: Watch for junction tables with entity attributes, flattened hierarchies, and time-series data with repeated entity info.
• SQL Analysis: Use COUNT(DISTINCT) queries to check whether the data is consistent with a suspected partial dependency. Data can refute a dependency but cannot prove it, so confirm the result with domain knowledge.
• Multiple Keys: Check ALL composite candidate keys, not just the primary key.
• Edge Cases: Focus only on non-prime attributes. Dependencies between prime attributes, and dependencies whose determinant is not part of any candidate key, are not 2NF concerns.

