Loading learning content...
Having explored transitive dependencies and their destructive potential, we now arrive at Third Normal Form (3NF)—the normal form specifically designed to eliminate these indirect dependency chains while maintaining practical usability.
Third Normal Form represents a crucial milestone in database normalization. Unlike BCNF (which we'll study later), 3NF strikes a careful balance: it removes transitive dependencies on non-prime attributes while preserving functional dependencies through decomposition. This balance makes 3NF the most commonly used normal form in production database design.
By the end of this page, you will understand multiple equivalent definitions of 3NF, the relationship between 3NF and 2NF, the role of prime and non-prime attributes, and why 3NF remains the practical standard for most database designs. You'll be able to precisely determine whether any relation is in 3NF.
The original definition of Third Normal Form, proposed by E.F. Codd in 1971, is elegant in its simplicity:
Codd's Definition of 3NF:
A relation R is in Third Normal Form if and only if:
- R is in Second Normal Form (2NF), AND
- Every non-prime attribute of R is non-transitively dependent on every candidate key of R
This definition builds directly on what we learned about 2NF and transitive dependencies. Let's unpack each component:
3NF inherits all requirements from 2NF, which inherits all requirements from 1NF. Therefore: Every relation in 3NF is automatically in 2NF, and every relation in 2NF is automatically in 1NF. The normal forms form a strict hierarchy: 1NF ⊃ 2NF ⊃ 3NF.
In Simpler Terms:
A relation is in 3NF if:
Or even simpler: Non-prime attributes must depend on the key, the whole key, and nothing but the key.
This famous mnemonic, often attributed to Bill Kent, captures 1NF, 2NF, and 3NF together:
While Codd's original definition is historically important, database theory has evolved to express 3NF more directly. The modern definition doesn't require checking 2NF as a prerequisite—it captures everything in a single statement:
Modern Definition of 3NF:
A relation R with functional dependencies F is in Third Normal Form if and only if:
For every non-trivial functional dependency X → A in F⁺:
- X is a superkey, OR
- A is a prime attribute (A is part of some candidate key)
This definition is more general and powerful because it doesn't rely on the normal form hierarchy—it directly characterizes 3NF through properties of functional dependencies.
| Term | Definition | Example |
|---|---|---|
| Non-trivial FD | X → A where A ∉ X (A is not contained in X) | StudentID → Name is non-trivial; {A, B} → A is trivial |
| F⁺ (closure) | All FDs that can be derived from F using Armstrong's Axioms | If F = {A → B, B → C}, then F⁺ includes A → C |
| Superkey | An attribute set that functionally determines all other attributes | {StudentID} or {StudentID, Name} in a Student relation |
| Prime attribute | An attribute that appears in at least one candidate key | In R(A, B, C) with keys {A} and {B}, both A and B are prime |
Why This Definition is Equivalent:
Let's verify that the modern definition captures both 2NF and the 3NF transitive dependency condition:
Capturing 2NF (No Partial Dependencies):
Capturing 3NF (No Transitive Dependencies on Non-Prime):
Notice that transitive dependencies to prime attributes are allowed in 3NF! If A is part of a candidate key, then X → A is acceptable even if X is not a superkey. This is a key difference from BCNF, which requires X to be a superkey for ALL non-trivial FDs, regardless of whether A is prime.
The distinction between prime and non-prime attributes is central to understanding 3NF. Let's examine this distinction rigorously.
Definitions:
Prime Attribute: An attribute that is a member of at least one candidate key
Non-Prime Attribute: An attribute that is NOT a member of ANY candidate key
Note that an attribute can be prime by virtue of belonging to just one candidate key, even if it's not in others.
| Relation | Candidate Keys | Prime Attributes | Non-Prime Attributes |
|---|---|---|---|
| Student(SID, SSN, Name, GPA) | {SID}, {SSN} | SID, SSN | Name, GPA |
| Enrollment(SID, CID, Grade) | {SID, CID} | SID, CID | Grade |
| Employee(EmpID, DeptID, Salary, DeptName) | {EmpID} | EmpID | DeptID, Salary, DeptName |
| Flight(FlightNo, Date, Pilot, CoPilot) | {FlightNo, Date} | FlightNo, Date | Pilot, CoPilot |
A primary key is always a candidate key, but not all candidate key attributes are in the primary key. An attribute is prime if it's in ANY candidate key, not just the primary key. When determining 3NF, you must consider ALL candidate keys.
Why the Distinction Matters for 3NF:
The 3NF definition treats prime and non-prime attributes differently:
Non-prime attributes must depend directly on a superkey. Any transitive dependency through a non-superkey intermediate violates 3NF.
Prime attributes can depend on non-superkey determinants. The reasoning is that dependencies involving prime attributes are "acceptable" because they participate in the key structure of the relation.
This asymmetry is what distinguishes 3NF from BCNF:
While 3NF builds on 2NF, they address different classes of problematic dependencies. Understanding their relationship clarifies what each normal form accomplishes.
The Progression:
1NF: Ensures atomic values and proper relational structure 2NF: Removes partial dependencies (non-prime depends on part of key) 3NF: Removes transitive dependencies (non-prime depends on non-key)
Key Insight: Why 2NF Doesn't Imply 3NF
A relation can be in 2NF (no partial dependencies) but still have transitive dependencies. Consider:
Employee(EmpID, DeptID, DeptName, Salary)
2NF Check: ✓ Passes. With a single-attribute key, partial dependencies cannot exist.
3NF Check: ✗ Fails! EmpID → DeptID → DeptName is a transitive dependency. DeptID is not a superkey, and DeptName is non-prime.
This demonstrates that 2NF handles a different class of problem than 3NF. Both must be achieved for a well-designed relation.
When a relation has only single-attribute candidate keys (no composite keys), it's automatically in 2NF because partial dependencies cannot exist—you can't have a 'part' of a single-attribute key. However, such relations can still have transitive dependencies and thus violate 3NF.
To rigorously determine whether a relation is in Third Normal Form, follow this systematic procedure:
Algorithm: Test for 3NF
Input: Relation R(A₁, A₂, ..., Aₙ), Set of FDs F
Output: TRUE if R is in 3NF, FALSE otherwise (with violating FD)
1. Find all candidate keys of R
- Use attribute closure to identify minimal superkeys
2. Classify attributes as Prime or Non-Prime
- Prime: appears in at least one candidate key
- Non-Prime: all other attributes
3. For each non-trivial FD (X → A) in F (or F⁺):
a. Is X a superkey of R?
- If YES: This FD satisfies 3NF (continue to next FD)
b. Is A a prime attribute?
- If YES: This FD satisfies 3NF (continue to next FD)
c. If NEITHER: This FD violates 3NF
- Return FALSE with (X → A) as the violating FD
4. If all FDs pass: Return TRUE
You don't always need to compute the full closure F⁺. Start by testing only the FDs explicitly in F. If all of those pass the 3NF test, the relation is in 3NF (because derived FDs inherit the property). Only compute additional derived FDs if you need to find candidate keys.
Let's work through several examples to solidify your understanding of the 3NF test procedure.
Example 1: Relation that Passes 3NF
Example 2: Relation that Fails 3NF
Example 3: Relation with Multiple Candidate Keys
Example 3 demonstrates how FDs like EmpID → EmpSSN don't violate 3NF because EmpSSN is prime. This same FD WOULD violate BCNF (because EmpID alone is not a superkey). This is why 3NF is sometimes called a 'relaxation' of BCNF.
Third Normal Form has become the most widely used normal form in production database design. This isn't by accident—3NF achieves a crucial balance that makes it uniquely practical.
The Key Guarantee:
For any relation schema R and set of FDs F, there always exists a decomposition of R that is:
- In 3NF
- Lossless-join (no information lost)
- Dependency-preserving (all FDs can be checked within single relations)
This triple guarantee is remarkable. BCNF, which is "stricter" than 3NF, cannot always provide dependency preservation. Many real-world schemas must sacrifice dependency preservation to achieve BCNF, which has practical consequences for constraint enforcement.
| Property | 3NF | BCNF |
|---|---|---|
| Eliminates transitive dependencies on non-prime | ✓ Yes | ✓ Yes |
| Eliminates transitive dependencies on prime | ✗ No (allowed) | ✓ Yes |
| Lossless decomposition always possible | ✓ Yes | ✓ Yes |
| Dependency-preserving decomposition always possible | ✓ Yes | ✗ Not always |
| Easier to achieve | ✓ Yes | ✗ More restrictive |
| More redundancy possible | ✓ Slightly more | ✗ Minimal |
Why Dependency Preservation Matters:
When FDs are preserved in the decomposition, each FD can be enforced by checking a single relation. Consider:
For most applications, the ability to efficiently enforce constraints outweighs the marginal extra redundancy that 3NF allows compared to BCNF.
Industry Practice:
Most production databases target 3NF as their normalization goal:
3NF typically eliminates 80%+ of data redundancy issues while preserving 100% of functional dependencies. Going to BCNF might eliminate slightly more redundancy but at the cost of dependency preservation and design complexity. For most use cases, 3NF is the optimal stopping point.
We've thoroughly explored the formal definitions of Third Normal Form from multiple perspectives. Let's consolidate the key insights:
What's Next:
Now that you understand the formal definition of Third Normal Form, we need to learn how to identify violations in practice. The next page focuses on systematic techniques for recognizing when relations violate 3NF and understanding the specific dependencies that cause violations.
You now have a rigorous understanding of the 3NF definition from both classic and modern perspectives. You can classify attributes as prime or non-prime, test FDs for 3NF compliance, and understand why 3NF is the practical standard for database normalization.