Database Management SystemsFunctional Dependencies

Functional Dependency Concept

LevelIntermediate

Duration60 mins

TopicFunctional Dependencies

1 / 5

FD Definition

The Mathematical Heart of Database Design

Beneath every well-designed relational database lies a hidden mathematical structure—a set of rules that govern how attributes relate to one another. These rules, called Functional Dependencies (FDs), are the invisible scaffolding upon which data integrity, normalization theory, and query optimization are built.

Functional dependencies aren't merely academic abstractions. They're the reason your banking database doesn't allow duplicate account numbers with different balances. They're why a properly designed system prevents an employee from having two different salaries in the same pay period. They're what database engines use to determine whether an index will make your query faster.

Without understanding functional dependencies, you're designing databases by intuition rather than science—and intuition fails at scale.

What You Will Learn

By the end of this page, you will understand the formal definition of functional dependency, recognize why this mathematical concept is essential for database design, and be able to identify functional dependencies in real-world data scenarios. This knowledge forms the foundation for all normalization theory that follows.

Building Intuition: What Does 'Functional' Mean?

Before diving into formal definitions, let's build intuition. The word functional comes from mathematics, where a function maps each input to exactly one output. If you know the input, you can determine the output—there's no ambiguity.

Consider these everyday examples:

Given a Social Security Number, you can determine exactly one Person's Name (assuming perfect data)
Given an ISBN, you can determine exactly one Book Title
Given an Employee ID, you can determine exactly one Employee's Department
Given a Date of Birth, you cannot determine a Person's Name (many people share birthdays)

The first three examples exhibit functional dependencies: knowing one attribute value lets you uniquely determine another. The last example does not—the relationship is many-to-many, not deterministic.

The Function Analogy

Think of functional dependency like a mathematical function f(x) = y. If you plug in the same x, you always get the same y. In database terms: if you have the same SSN, you must have the same PersonName. The SSN 'determines' or 'functionally determines' the PersonName.

Key Insight: Determinism, Not Correlation

Functional dependency is about determinism, not correlation. Just because two attributes often appear together doesn't mean one determines the other. Consider:

Height and Weight are correlated (taller people tend to weigh more)
But Height does not functionally determine Weight—the same height corresponds to many weights

In contrast:

StudentID functionally determines StudentEmail in a university database
Every student ID maps to exactly one email address (by policy)

This distinction is crucial. Database theory deals with certainties, not probabilities.

The Formal Definition of Functional Dependency

Now we're ready for the formal definition that appears in every database textbook. Understanding this precisely is essential—normalization algorithms depend on it.

Definition:

Let R be a relation schema with attributes {A₁, A₂, ..., Aₙ}. Let X and Y be subsets of these attributes (X ⊆ R and Y ⊆ R).

We say that X functionally determines Y, written X → Y, if and only if:

For any two tuples t₁ and t₂ in any valid instance r of R: If t₁[X] = t₂[X], then t₁[Y] = t₂[Y]

In plain English: whenever two rows have the same values for attributes X, they must also have the same values for attributes Y.

Unpacking the Definition

This definition has three critical implications:

Universal Quantification: The condition must hold for ALL pairs of tuples—no exceptions. A single counter-example violates the dependency.
Instance Independence: The dependency is a constraint on ALL valid instances of the relation, not just the current data. Future data must also satisfy it.
Set-Based: X and Y can be single attributes or sets of attributes. {StudentID, CourseID} → {Grade} is valid—both StudentID AND CourseID together determine the Grade.

fd_definition_example.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
-- Consider this EMPLOYEE relation:
-- EMPLOYEE(EmpID, EmpName, DeptID, DeptName, Salary)
 
-- Functional Dependencies that likely hold:
-- EmpID → EmpName         (Employee ID determines Employee Name)
-- EmpID → DeptID          (Employee ID determines Department ID)  
-- EmpID → Salary          (Employee ID determines Salary)
-- DeptID → DeptName       (Department ID determines Department Name)
 
-- Functional Dependency that does NOT hold:
-- DeptID → Salary         (Many employees in same dept have different salaries)
-- EmpName → EmpID         (Two employees could share the same name)
 
-- Verifying a Functional Dependency (conceptually):
-- For EmpID → EmpName to hold, this query must return 0 rows:
SELECT e1.EmpID, e1.EmpName, e2.EmpName
FROM EMPLOYEE e1
JOIN EMPLOYEE e2 ON e1.EmpID = e2.EmpID
WHERE e1.EmpName <> e2.EmpName;
 
-- If any rows are returned, we'd have the same EmpID 
-- with different EmpNames, violating the FD

Mathematical Precision Matters

Notice the careful phrasing: for any two tuples. This means:

We're comparing every possible pair of rows in the relation
We're checking: do they agree on X? If yes, they must agree on Y
If they disagree on X, the condition is trivially satisfied (the antecedent is false)

This last point is subtle but important. The dependency X → Y says nothing about tuples that differ on X. It only constrains tuples that match on X.

Visualizing Functional Dependencies

Let's make functional dependencies concrete with a tabular example. Consider this instance of a COURSE_ENROLLMENT relation:

COURSE_ENROLLMENT Relation Instance
StudentID	StudentName	CourseID	CourseName	Instructor	Grade
S001	Alice Chen	CS101	Intro to Programming	Dr. Smith	A
S001	Alice Chen	CS201	Data Structures	Dr. Jones	A-
S002	Bob Wilson	CS101	Intro to Programming	Dr. Smith	B+
S002	Bob Wilson	CS301	Algorithms	Dr. Brown	B
S003	Carol Davis	CS201	Data Structures	Dr. Jones	A
S003	Carol Davis	CS101	Intro to Programming	Dr. Smith	A-

Analyzing Potential Functional Dependencies:

Does StudentID → StudentName hold?

Let's check all pairs of tuples with matching StudentID:

Tuples 1 & 2: StudentID = S001, StudentName = 'Alice Chen' in both ✓
Tuples 3 & 4: StudentID = S002, StudentName = 'Bob Wilson' in both ✓
Tuples 5 & 6: StudentID = S003, StudentName = 'Carol Davis' in both ✓

Yes, StudentID → StudentName holds in this instance.

Does CourseID → Instructor hold?

Check tuples with matching CourseID:

Tuples 1, 3, 6: CourseID = CS101, Instructor = 'Dr. Smith' in all ✓
Tuples 2, 5: CourseID = CS201, Instructor = 'Dr. Jones' in both ✓
Tuple 4: CourseID = CS301, only one tuple (trivially satisfied)

Yes, CourseID → Instructor holds in this instance.

Does StudentID → Grade hold?

Check tuples with matching StudentID:

Tuples 1 & 2: StudentID = S001, but Grade = 'A' vs 'A-' ✗

No, StudentID → Grade does NOT hold. A student can have different grades in different courses.

Instance vs. Schema Constraints

Critical distinction: An FD holding in a particular instance doesn't guarantee it's a true constraint. Perhaps we haven't yet observed a counter-example. True FDs come from understanding the MEANING of the data—the real-world constraints the database models. More on this in later pages.

Essential Properties of Functional Dependencies

Functional dependencies have several important properties that directly impact how we reason about database design. Understanding these properties sets the stage for Armstrong's Axioms (covered in the next module).

Fundamental FD Properties

•Reflexivity (Trivial FDs) — If Y ⊆ X, then X → Y always holds. Example: {EmpID, EmpName} → {EmpID}. This is 'trivial' because Y is already part of X.
•Left-Side Supersets — If X → Y holds and Z is any set of attributes, then XZ → Y also holds. Adding attributes to the left side preserves the dependency. Example: If SSN → Name, then {SSN, BirthDate} → Name.
•Non-Symmetry — X → Y does NOT imply Y → X. Example: StudentID → Name doesn't mean Name → StudentID (many students can share names).
•Right-Side Unions — If X → Y and X → Z, then X → YZ. If SSN determines Name AND SSN determines BirthDate, then SSN determines both together.
•Right-Side Decomposition — If X → YZ, then X → Y and X → Z. The union property works in reverse too.

fd_properties_demo.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
-- Demonstrating FD Properties with EMPLOYEE(EmpID, Name, DeptID, DeptName, Salary)
 
-- REFLEXIVITY (Trivial FD):
-- {EmpID, Name} → {EmpID} 
-- Always true because EmpID is part of the determining set
 
-- LEFT-SIDE SUPERSET:
-- Given: EmpID → Name
-- Then: {EmpID, DeptID} → Name (adding DeptID preserves the FD)
-- Proof: If two rows match on both EmpID AND DeptID,
--        they must match on EmpID alone,
--        therefore they must match on Name.
 
-- NON-SYMMETRY Example:
-- EmpID → DeptID  (employee works in one department)
-- DeptID → EmpID  DOES NOT HOLD (department has many employees)
 
-- RIGHT-SIDE UNION:
-- Given: EmpID → Name AND EmpID → DeptID
-- Therefore: EmpID → {Name, DeptID}
-- One EmpID determines both attributes together
 
-- RIGHT-SIDE DECOMPOSITION:
-- Given: EmpID → {Name, DeptID, Salary}
-- Therefore: 
--   EmpID → Name
--   EmpID → DeptID  
--   EmpID → Salary

Why These Properties Matter

These aren't abstract mathematical curiosities. They're tools for reasoning about FDs. When you're analyzing a schema, you'll use right-side decomposition to simplify complex FDs and left-side supersets to understand composite keys. Armstrong's Axioms (next module) formalize these into a complete reasoning system.

Trivial vs. Non-Trivial Functional Dependencies

An important distinction in FD theory separates trivial from non-trivial dependencies. This distinction affects normalization and is tested in database design problems.

Trivial FDs

•Definition: X → Y is trivial if Y ⊆ X
•The right side is a subset of the left side
•Always true by definition—cannot be violated
•Example: {A, B} → {A}
•Example: {A, B} → {B}
•Example: {A, B} → {A, B}
•Provide no real constraint on data
•Generally ignored in normalization

Non-Trivial FDs

•Definition: X → Y is non-trivial if Y ⊄ X (at least one attribute in Y is not in X)
•The right side contains something 'new'
•Represents a genuine data constraint
•Example: {StudentID} → {Name}
•Example: {A, B} → {C}
•Example: {A, B} → {A, C} (partially non-trivial)
•These are the FDs that matter for normalization
•Drive decomposition decisions

Completely Non-Trivial FDs:

An FD X → Y is completely non-trivial if X ∩ Y = ∅ (no overlap between left and right sides).

{StudentID} → {Name, Email} is completely non-trivial
{A, B} → {B, C} is NON-trivial but NOT completely non-trivial (B appears on both sides)

When analyzing FDs for normalization, we often focus on completely non-trivial dependencies because they represent genuine constraints without redundancy.

Practical Relevance

In practice, when we list FDs for a schema, we almost always mean non-trivial FDs. Saying 'SSN → SSN' provides no information. When database theory references 'the set of FDs for relation R,' assume non-trivial FDs unless stated otherwise.

The Relationship Between FDs and Keys

Functional dependencies and keys are deeply connected. In fact, the formal definition of a key relies entirely on functional dependencies.

Definition: Superkey

A set of attributes K is a superkey of relation R if K functionally determines ALL attributes of R:

K → R (equivalently, K → {all attributes in R})

Definition: Candidate Key

A candidate key is a minimal superkey—removing any attribute from it would cause it to stop being a superkey.

Definition: Primary Key

A primary key is a candidate key chosen by the database designer as the main identifier for tuples.

Keys Defined Through Functional Dependencies
Concept	FD Definition	Example
Superkey	K → R (K determines all attributes)	In EMPLOYEE, {EmpID, Name} is a superkey because it determines everything
Candidate Key	Minimal K such that K → R	In EMPLOYEE, {EmpID} is a candidate key—minimal, determines all
Primary Key	Chosen candidate key	EmpID designated as the primary key
Alternate Key	Non-chosen candidate keys	If both SSN and EmpID are candidate keys, the one not chosen as PK

Key Insight: Keys Are Special FDs

When we declare a primary key, we're implicitly declaring a functional dependency:

CREATE TABLE Employee (
    EmpID INT PRIMARY KEY,
    Name VARCHAR(100),
    ...
);

This implicitly states: EmpID → {all other attributes}.

The database enforces this FD by preventing duplicate EmpID values. If two rows could have the same EmpID, the FD would be violated.

Composite Keys and FDs:

For a composite primary key like (StudentID, CourseID):

CREATE TABLE Enrollment (
    StudentID INT,
    CourseID INT,
    Grade CHAR(2),
    PRIMARY KEY (StudentID, CourseID)
);

The implicit FD is: {StudentID, CourseID} → {Grade}. Neither StudentID alone nor CourseID alone determines Grade—only their combination does.

Keys Don't Capture All FDs

Declaring a primary key captures the 'main' FD but not all FDs. If your table has DeptID and DeptName where DeptID → DeptName, this FD exists regardless of what you chose as the primary key. These 'extra' FDs cause normalization problems. Understanding all FDs—not just key-based ones—is essential for proper design.

Why Functional Dependencies Matter

Functional dependencies aren't just theoretical constructs—they're practical tools that solve real database design problems. Understanding their importance motivates the normalization theory we'll study next.

Practical Applications of Functional Dependencies

•Detecting Redundancy — FDs reveal when the same information is stored multiple times. If DeptID → DeptName and you store both in EMPLOYEE, every employee in Accounting repeats 'Accounting' as their DeptName. This is the redundancy that normalization eliminates.
•Preventing Update Anomalies — Without proper FD analysis, updating DeptName for one employee but not others creates inconsistency. FDs identify these risks before they become production bugs.
•Guiding Decomposition — Normalization algorithms use FDs to decide how to split tables. The FD DeptID → DeptName tells us: extract a separate DEPARTMENT table with DeptID as key.
•Defining Integrity Constraints — Every FD is a constraint the DBMS should enforce. FDs inform what indexes to create, what uniqueness constraints to add, and how triggers should validate data.
•Query Optimization — Database optimizers use FDs to simplify queries. If A → B and you GROUP BY A, the optimizer knows B is also determined—no need for aggregation functions on B.
•Schema Documentation — FDs formally document the meaning of data. A set of FDs is a precise specification that any developer can use to understand relationships.

fd_practical_impact.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
-- Example: FD-Based Redundancy Analysis
-- Original table with redundancy:
CREATE TABLE EmployeeProject (
    EmpID INT,
    EmpName VARCHAR(100),
    DeptID INT,
    DeptName VARCHAR(100),
    ProjectID INT,
    ProjectBudget DECIMAL(12,2),
    PRIMARY KEY (EmpID, ProjectID)
);
 
-- FDs in this table:
-- EmpID → EmpName, DeptID        (Employee info)
-- DeptID → DeptName               (Department info - REDUNDANT!)
-- ProjectID → ProjectBudget       (Project info - REDUNDANT!)
-- {EmpID, ProjectID} → Grade      (Assignment-specific)
 
-- Problem: If employee E001 works on 5 projects, we store:
-- - EmpName 5 times (redundancy from EmpID → EmpName)
-- - DeptID 5 times (redundancy from EmpID → DeptID)  
-- - DeptName 5 times (redundancy, depends on DeptID)
 
-- Proper decomposition based on FD analysis:
CREATE TABLE Employee (
    EmpID INT PRIMARY KEY,
    EmpName VARCHAR(100),
    DeptID INT -- Foreign key
);
 
CREATE TABLE Department (
    DeptID INT PRIMARY KEY,
    DeptName VARCHAR(100)
);
 
CREATE TABLE Project (
    ProjectID INT PRIMARY KEY,
    ProjectBudget DECIMAL(12,2)
);
 
CREATE TABLE Assignment (
    EmpID INT,
    ProjectID INT,
    -- Only assignment-specific data here
    PRIMARY KEY (EmpID, ProjectID)
);

Summary: The Foundation of FD Theory

We've established the fundamental concept of functional dependencies. Let's consolidate what we've learned:

Key Takeaways

•Functional Dependency Defined — X → Y means whenever two tuples agree on X, they must agree on Y. It's deterministic, not probabilistic.
•Function Analogy — Like a mathematical function, the determinant (left side) uniquely determines the dependent (right side). Same input always yields same output.
•Trivial vs Non-Trivial — Trivial FDs (Y ⊆ X) always hold and provide no constraint. Non-trivial FDs represent real data relationships.
•FDs and Keys — Keys are defined via FDs. A candidate key functionally determines all other attributes. Declaring a PK implicitly declares an FD.
•Instance vs. Constraint — An FD holding in current data doesn't prove it's a true constraint. True FDs come from understanding what the data means.
•Practical Importance — FDs drive redundancy detection, normalization, constraint enforcement, and query optimization. They're the language of database design theory.

What's Next:

With the definition established, we'll next explore the notation X → Y in depth—learning to read, write, and interpret functional dependencies precisely. This notation is the universal language used across all database literature and essential for working with normalization algorithms.

Page Complete

You now understand the formal definition of functional dependency—the mathematical foundation of database normalization. This concept will appear in every subsequent topic: Armstrong's Axioms, attribute closure, normal forms, and decomposition algorithms all build on this definition. Keep it sharp.

1 / 5

Loading learning content...

Database Management SystemsFunctional Dependencies

Functional Dependency Concept

LevelIntermediate

Duration60 mins

TopicFunctional Dependencies

1 / 5

FD Definition

The Mathematical Heart of Database Design

Without understanding functional dependencies, you're designing databases by intuition rather than science—and intuition fails at scale.

What You Will Learn

Building Intuition: What Does 'Functional' Mean?

Consider these everyday examples:

Given a Social Security Number, you can determine exactly one Person's Name (assuming perfect data)
Given an ISBN, you can determine exactly one Book Title
Given an Employee ID, you can determine exactly one Employee's Department
Given a Date of Birth, you cannot determine a Person's Name (many people share birthdays)

The Function Analogy

Key Insight: Determinism, Not Correlation

Functional dependency is about determinism, not correlation. Just because two attributes often appear together doesn't mean one determines the other. Consider:

Height and Weight are correlated (taller people tend to weigh more)
But Height does not functionally determine Weight—the same height corresponds to many weights

In contrast:

StudentID functionally determines StudentEmail in a university database
Every student ID maps to exactly one email address (by policy)

This distinction is crucial. Database theory deals with certainties, not probabilities.

The Formal Definition of Functional Dependency

Now we're ready for the formal definition that appears in every database textbook. Understanding this precisely is essential—normalization algorithms depend on it.

Definition:

Let R be a relation schema with attributes {A₁, A₂, ..., Aₙ}. Let X and Y be subsets of these attributes (X ⊆ R and Y ⊆ R).

We say that X functionally determines Y, written X → Y, if and only if:

For any two tuples t₁ and t₂ in any valid instance r of R: If t₁[X] = t₂[X], then t₁[Y] = t₂[Y]

In plain English: whenever two rows have the same values for attributes X, they must also have the same values for attributes Y.

Unpacking the Definition

This definition has three critical implications:

Universal Quantification: The condition must hold for ALL pairs of tuples—no exceptions. A single counter-example violates the dependency.
Instance Independence: The dependency is a constraint on ALL valid instances of the relation, not just the current data. Future data must also satisfy it.
Set-Based: X and Y can be single attributes or sets of attributes. {StudentID, CourseID} → {Grade} is valid—both StudentID AND CourseID together determine the Grade.

fd_definition_example.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
-- Consider this EMPLOYEE relation:
-- EMPLOYEE(EmpID, EmpName, DeptID, DeptName, Salary)
 
-- Functional Dependencies that likely hold:
-- EmpID → EmpName         (Employee ID determines Employee Name)
-- EmpID → DeptID          (Employee ID determines Department ID)  
-- EmpID → Salary          (Employee ID determines Salary)
-- DeptID → DeptName       (Department ID determines Department Name)
 
-- Functional Dependency that does NOT hold:
-- DeptID → Salary         (Many employees in same dept have different salaries)
-- EmpName → EmpID         (Two employees could share the same name)
 
-- Verifying a Functional Dependency (conceptually):
-- For EmpID → EmpName to hold, this query must return 0 rows:
SELECT e1.EmpID, e1.EmpName, e2.EmpName
FROM EMPLOYEE e1
JOIN EMPLOYEE e2 ON e1.EmpID = e2.EmpID
WHERE e1.EmpName <> e2.EmpName;
 
-- If any rows are returned, we'd have the same EmpID 
-- with different EmpNames, violating the FD

Mathematical Precision Matters

Notice the careful phrasing: for any two tuples. This means:

We're comparing every possible pair of rows in the relation
We're checking: do they agree on X? If yes, they must agree on Y
If they disagree on X, the condition is trivially satisfied (the antecedent is false)

This last point is subtle but important. The dependency X → Y says nothing about tuples that differ on X. It only constrains tuples that match on X.

Visualizing Functional Dependencies

Let's make functional dependencies concrete with a tabular example. Consider this instance of a COURSE_ENROLLMENT relation:

COURSE_ENROLLMENT Relation Instance
StudentID	StudentName	CourseID	CourseName	Instructor	Grade
S001	Alice Chen	CS101	Intro to Programming	Dr. Smith	A
S001	Alice Chen	CS201	Data Structures	Dr. Jones	A-
S002	Bob Wilson	CS101	Intro to Programming	Dr. Smith	B+
S002	Bob Wilson	CS301	Algorithms	Dr. Brown	B
S003	Carol Davis	CS201	Data Structures	Dr. Jones	A
S003	Carol Davis	CS101	Intro to Programming	Dr. Smith	A-

Analyzing Potential Functional Dependencies:

Does StudentID → StudentName hold?

Let's check all pairs of tuples with matching StudentID:

Tuples 1 & 2: StudentID = S001, StudentName = 'Alice Chen' in both ✓
Tuples 3 & 4: StudentID = S002, StudentName = 'Bob Wilson' in both ✓
Tuples 5 & 6: StudentID = S003, StudentName = 'Carol Davis' in both ✓

Yes, StudentID → StudentName holds in this instance.

Does CourseID → Instructor hold?

Check tuples with matching CourseID:

Tuples 1, 3, 6: CourseID = CS101, Instructor = 'Dr. Smith' in all ✓
Tuples 2, 5: CourseID = CS201, Instructor = 'Dr. Jones' in both ✓
Tuple 4: CourseID = CS301, only one tuple (trivially satisfied)

Yes, CourseID → Instructor holds in this instance.

Does StudentID → Grade hold?

Check tuples with matching StudentID:

Tuples 1 & 2: StudentID = S001, but Grade = 'A' vs 'A-' ✗

No, StudentID → Grade does NOT hold. A student can have different grades in different courses.

Instance vs. Schema Constraints

Essential Properties of Functional Dependencies

Fundamental FD Properties

•Reflexivity (Trivial FDs) — If Y ⊆ X, then X → Y always holds. Example: {EmpID, EmpName} → {EmpID}. This is 'trivial' because Y is already part of X.
•Left-Side Supersets — If X → Y holds and Z is any set of attributes, then XZ → Y also holds. Adding attributes to the left side preserves the dependency. Example: If SSN → Name, then {SSN, BirthDate} → Name.
•Non-Symmetry — X → Y does NOT imply Y → X. Example: StudentID → Name doesn't mean Name → StudentID (many students can share names).
•Right-Side Unions — If X → Y and X → Z, then X → YZ. If SSN determines Name AND SSN determines BirthDate, then SSN determines both together.
•Right-Side Decomposition — If X → YZ, then X → Y and X → Z. The union property works in reverse too.

fd_properties_demo.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
-- Demonstrating FD Properties with EMPLOYEE(EmpID, Name, DeptID, DeptName, Salary)
 
-- REFLEXIVITY (Trivial FD):
-- {EmpID, Name} → {EmpID} 
-- Always true because EmpID is part of the determining set
 
-- LEFT-SIDE SUPERSET:
-- Given: EmpID → Name
-- Then: {EmpID, DeptID} → Name (adding DeptID preserves the FD)
-- Proof: If two rows match on both EmpID AND DeptID,
--        they must match on EmpID alone,
--        therefore they must match on Name.
 
-- NON-SYMMETRY Example:
-- EmpID → DeptID  (employee works in one department)
-- DeptID → EmpID  DOES NOT HOLD (department has many employees)
 
-- RIGHT-SIDE UNION:
-- Given: EmpID → Name AND EmpID → DeptID
-- Therefore: EmpID → {Name, DeptID}
-- One EmpID determines both attributes together
 
-- RIGHT-SIDE DECOMPOSITION:
-- Given: EmpID → {Name, DeptID, Salary}
-- Therefore: 
--   EmpID → Name
--   EmpID → DeptID  
--   EmpID → Salary

Why These Properties Matter

Trivial vs. Non-Trivial Functional Dependencies

An important distinction in FD theory separates trivial from non-trivial dependencies. This distinction affects normalization and is tested in database design problems.

Trivial FDs

•Definition: X → Y is trivial if Y ⊆ X
•The right side is a subset of the left side
•Always true by definition—cannot be violated
•Example: {A, B} → {A}
•Example: {A, B} → {B}
•Example: {A, B} → {A, B}
•Provide no real constraint on data
•Generally ignored in normalization

Non-Trivial FDs

•Definition: X → Y is non-trivial if Y ⊄ X (at least one attribute in Y is not in X)
•The right side contains something 'new'
•Represents a genuine data constraint
•Example: {StudentID} → {Name}
•Example: {A, B} → {C}
•Example: {A, B} → {A, C} (partially non-trivial)
•These are the FDs that matter for normalization
•Drive decomposition decisions

Completely Non-Trivial FDs:

An FD X → Y is completely non-trivial if X ∩ Y = ∅ (no overlap between left and right sides).

{StudentID} → {Name, Email} is completely non-trivial
{A, B} → {B, C} is NON-trivial but NOT completely non-trivial (B appears on both sides)

When analyzing FDs for normalization, we often focus on completely non-trivial dependencies because they represent genuine constraints without redundancy.

Practical Relevance

The Relationship Between FDs and Keys

Functional dependencies and keys are deeply connected. In fact, the formal definition of a key relies entirely on functional dependencies.

Definition: Superkey

A set of attributes K is a superkey of relation R if K functionally determines ALL attributes of R:

K → R (equivalently, K → {all attributes in R})

Definition: Candidate Key

A candidate key is a minimal superkey—removing any attribute from it would cause it to stop being a superkey.

Definition: Primary Key

A primary key is a candidate key chosen by the database designer as the main identifier for tuples.

Keys Defined Through Functional Dependencies
Concept	FD Definition	Example
Superkey	K → R (K determines all attributes)	In EMPLOYEE, {EmpID, Name} is a superkey because it determines everything
Candidate Key	Minimal K such that K → R	In EMPLOYEE, {EmpID} is a candidate key—minimal, determines all
Primary Key	Chosen candidate key	EmpID designated as the primary key
Alternate Key	Non-chosen candidate keys	If both SSN and EmpID are candidate keys, the one not chosen as PK

Key Insight: Keys Are Special FDs

When we declare a primary key, we're implicitly declaring a functional dependency:

CREATE TABLE Employee (
    EmpID INT PRIMARY KEY,
    Name VARCHAR(100),
    ...
);

This implicitly states: EmpID → {all other attributes}.

The database enforces this FD by preventing duplicate EmpID values. If two rows could have the same EmpID, the FD would be violated.

Composite Keys and FDs:

For a composite primary key like (StudentID, CourseID):

CREATE TABLE Enrollment (
    StudentID INT,
    CourseID INT,
    Grade CHAR(2),
    PRIMARY KEY (StudentID, CourseID)
);

The implicit FD is: {StudentID, CourseID} → {Grade}. Neither StudentID alone nor CourseID alone determines Grade—only their combination does.

Keys Don't Capture All FDs

Why Functional Dependencies Matter

Practical Applications of Functional Dependencies

•Detecting Redundancy — FDs reveal when the same information is stored multiple times. If DeptID → DeptName and you store both in EMPLOYEE, every employee in Accounting repeats 'Accounting' as their DeptName. This is the redundancy that normalization eliminates.
•Preventing Update Anomalies — Without proper FD analysis, updating DeptName for one employee but not others creates inconsistency. FDs identify these risks before they become production bugs.
•Guiding Decomposition — Normalization algorithms use FDs to decide how to split tables. The FD DeptID → DeptName tells us: extract a separate DEPARTMENT table with DeptID as key.
•Defining Integrity Constraints — Every FD is a constraint the DBMS should enforce. FDs inform what indexes to create, what uniqueness constraints to add, and how triggers should validate data.
•Query Optimization — Database optimizers use FDs to simplify queries. If A → B and you GROUP BY A, the optimizer knows B is also determined—no need for aggregation functions on B.
•Schema Documentation — FDs formally document the meaning of data. A set of FDs is a precise specification that any developer can use to understand relationships.

fd_practical_impact.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
-- Example: FD-Based Redundancy Analysis
-- Original table with redundancy:
CREATE TABLE EmployeeProject (
    EmpID INT,
    EmpName VARCHAR(100),
    DeptID INT,
    DeptName VARCHAR(100),
    ProjectID INT,
    ProjectBudget DECIMAL(12,2),
    PRIMARY KEY (EmpID, ProjectID)
);
 
-- FDs in this table:
-- EmpID → EmpName, DeptID        (Employee info)
-- DeptID → DeptName               (Department info - REDUNDANT!)
-- ProjectID → ProjectBudget       (Project info - REDUNDANT!)
-- {EmpID, ProjectID} → Grade      (Assignment-specific)
 
-- Problem: If employee E001 works on 5 projects, we store:
-- - EmpName 5 times (redundancy from EmpID → EmpName)
-- - DeptID 5 times (redundancy from EmpID → DeptID)  
-- - DeptName 5 times (redundancy, depends on DeptID)
 
-- Proper decomposition based on FD analysis:
CREATE TABLE Employee (
    EmpID INT PRIMARY KEY,
    EmpName VARCHAR(100),
    DeptID INT -- Foreign key
);
 
CREATE TABLE Department (
    DeptID INT PRIMARY KEY,
    DeptName VARCHAR(100)
);
 
CREATE TABLE Project (
    ProjectID INT PRIMARY KEY,
    ProjectBudget DECIMAL(12,2)
);
 
CREATE TABLE Assignment (
    EmpID INT,
    ProjectID INT,
    -- Only assignment-specific data here
    PRIMARY KEY (EmpID, ProjectID)
);

Summary: The Foundation of FD Theory

We've established the fundamental concept of functional dependencies. Let's consolidate what we've learned:

Key Takeaways

•Functional Dependency Defined — X → Y means whenever two tuples agree on X, they must agree on Y. It's deterministic, not probabilistic.
•Function Analogy — Like a mathematical function, the determinant (left side) uniquely determines the dependent (right side). Same input always yields same output.
•Trivial vs Non-Trivial — Trivial FDs (Y ⊆ X) always hold and provide no constraint. Non-trivial FDs represent real data relationships.
•FDs and Keys — Keys are defined via FDs. A candidate key functionally determines all other attributes. Declaring a PK implicitly declares an FD.
•Instance vs. Constraint — An FD holding in current data doesn't prove it's a true constraint. True FDs come from understanding what the data means.
•Practical Importance — FDs drive redundancy detection, normalization, constraint enforcement, and query optimization. They're the language of database design theory.

What's Next:

Page Complete

1 / 5