Database Management SystemsFunctional Dependencies

Functional Dependency Concept

LevelIntermediate

Duration60 mins

TopicFunctional Dependencies

2 / 5

Notation (X → Y)

The Language of Functional Dependencies

Every discipline has its notation—mathematics has equations, music has sheet music, chemistry has molecular formulas. In database theory, the notation X → Y is the fundamental expression for functional dependencies. This seemingly simple arrow carries profound meaning.

Mastering FD notation isn't about memorizing symbols. It's about fluency. When you see {StudentID, CourseID} → {Grade, Semester}, you should instantly understand what constraint this represents, what violations would look like, and why it matters for schema design.

This page transforms you from someone who recognizes FD notation to someone who thinks in it—reading FDs as naturally as reading English sentences.

What You Will Learn

By the end of this page, you will master all aspects of FD notation: reading single and composite attribute dependencies, understanding set-based notation conventions, interpreting shorthand forms, and avoiding common notational mistakes that lead to design errors.

Anatomy of the Arrow: Breaking Down X → Y

The functional dependency notation consists of three elements: the left-hand side (LHS), the arrow, and the right-hand side (RHS). Each carries specific meaning.

Components of FD Notation: X → Y
Component	Name	Also Called	Meaning
X	Left-Hand Side (LHS)	Determinant	The set of attributes whose values determine other attributes
→	Arrow	Functionally Determines	Indicates a deterministic relationship; reads as 'determines'
Y	Right-Hand Side (RHS)	Dependent / Consequent	The set of attributes whose values are determined by X

Reading the Arrow:

The arrow is read as "functionally determines" or simply "determines." Various equivalent readings:

Notation	Readings
`A → B`	"A determines B"
`A → B`	"A functionally determines B"
`A → B`	"B is functionally dependent on A"
`A → B`	"B depends on A"
`A → B`	"The value of A determines the value of B"

All these readings express the same constraint: knowing A's value is sufficient to know B's value.

Directionality Matters

The arrow is directional. A → B does NOT mean B → A. Think of it like a one-way street. StudentID → StudentName means ID determines Name, but the reverse (Name determines ID) is typically false—multiple students can share names.

fd_notation_examples.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
-- FD Notation Examples with SQL Context
 
-- Simple FDs:
-- SSN → Name
-- Translation: Social Security Number determines Person's Name
-- In a table: if two rows have the same SSN, they MUST have the same Name
 
-- EmpID → DeptID
-- Translation: Employee ID determines Department ID
-- Meaning: Each employee belongs to exactly one department
 
-- ISBN → BookTitle  
-- Translation: ISBN determines Book Title
-- Meaning: Each ISBN maps to exactly one book title
 
-- Reading FDs aloud:
-- "SSN determines Name"
-- "EmpID functionally determines DeptID"
-- "BookTitle is functionally dependent on ISBN"

Single Attributes vs. Composite (Set) Attributes

Functional dependencies can involve single attributes or sets of attributes on either side of the arrow. Understanding this distinction is crucial for correctly expressing real-world constraints.

Single Attribute FDs

•Format: A → B
•One attribute determines another
•Example: EmpID → EmpName
•Example: SSN → BirthDate
•Example: ISBN → Title
•Simplest form of FD
•Common in highly normalized schemas

Composite Attribute FDs

•Format: {A, B} → C or AB → C
•Multiple attributes together determine another
•Example: {StudentID, CourseID} → Grade
•Example: {FlightNo, Date} → Gate
•Example: {City, Street} → ZipCode
•Common in many-to-many relationships
•Neither attribute alone is sufficient

Set Notation Conventions:

When dealing with sets of attributes, different textbooks use different notation:

Notation Style	Example	Meaning
Set braces	`{A, B} → {C, D}`	Explicit set notation, mathematically precise
Concatenation	`AB → CD`	Compact form, common in textbooks
Mixed	`{A, B} → C`	Set on left, single attribute on right

All three mean the same thing: attributes A and B together determine attributes C and D.

Critical Understanding: The LHS is a CONJUNCTION

When we write {StudentID, CourseID} → Grade:

We mean StudentID AND CourseID together determine Grade
Neither StudentID alone nor CourseID alone determines Grade
A student has many grades (different courses)
A course has many grades (different students)
But a specific student in a specific course has ONE grade

Common Mistake: Reading Composite LHS as 'OR'

Never read {A, B} → C as 'A OR B determines C.' It means 'A AND B together determine C.' The combination is required. If either A or B individually determined C, we'd write them as separate FDs: A → C and B → C.

composite_fd_examples.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
-- Composite FD Examples
 
-- Example 1: Course Enrollment
-- {StudentID, CourseID} → Grade
-- A student's grade in a course is determined by the student-course pair
-- StudentID alone? No - same student has different grades in different courses
-- CourseID alone? No - same course has different grades for different students
-- BOTH together? Yes - exactly one grade per student per course
 
-- Example 2: Flight Information  
-- {FlightNumber, DepartureDate} → {Gate, DepartureTime, Status}
-- Gate assignment depends on which flight AND which date
-- Same flight number on different dates may use different gates
 
-- Example 3: Address Components
-- {City, StreetAddress} → ZipCode
-- Two houses on the same street in the same city share a zip code
-- Note: City alone doesn't determine ZipCode (many zips per city)
-- Note: Street alone doesn't determine ZipCode (same street names in different cities)
 
-- Example 4: Multiple Dependent Attributes
-- EmpID → {Name, DeptID, Salary, HireDate}
-- One determinant, multiple dependents
-- This is equivalent to four separate FDs:
--   EmpID → Name
--   EmpID → DeptID
--   EmpID → Salary
--   EmpID → HireDate

Notation Shorthands and Conventions

Database literature uses several notational conventions and shorthands. Recognizing these prevents confusion when reading different textbooks or academic papers.

Common FD Notation Conventions
Convention	Example	Full Form	Notes
Concatenation for sets	AB → C	{A, B} → {C}	Letters next to each other form a set
Single letter attributes	A, B, C	Arbitrary attribute names	Used in theoretical discussions
Uppercase letters	X, Y, Z	Arbitrary attribute sets	Often represent sets of multiple attributes
RHS decomposition	A → BC	A → B and A → C	Multiple dependents listed together
Multiple FDs combined	A → B, C → D	Two separate FDs	Comma separates distinct FDs
Attribute from relation	R.Attr	Attribute Attr from relation R	Used when context needs clarification

Lowercase vs Uppercase Convention:

In many theoretical texts:

Lowercase letters (a, b, c) represent single attributes
Uppercase letters (X, Y, Z) represent sets of attributes (possibly single, possibly multiple)

Thus:

a → b definitely means a single attribute determines another single attribute
X → Y might mean any of: single → single, single → set, set → single, or set → set

RHS Decomposition Equivalence:

A fundamental property: the RHS can always be decomposed.

A → BC  is equivalent to  A → B  AND  A → C

This is formally proven (it's one of Armstrong's derived rules). It means:

Writing EmpID → {Name, DeptID} is shorthand for two FDs
You can always break apart the RHS into individual attributes
This is NOT true for the LHS! AB → C is NOT equivalent to A → C AND B → C

LHS Cannot Be Decomposed

Remember: You CAN decompose the RHS (A → BC means A → B and A → C). You CANNOT decompose the LHS (AB → C does NOT mean A → C or B → C). The distinction is critical for normalization.

Sets of Functional Dependencies (F)

Real database schemas have multiple functional dependencies, collected into a set typically denoted F. Understanding how to work with FD sets is essential for normalization algorithms.

Notation for FD Sets:

An FD set is written with curly braces:

F = {
    A → B,
    A → C,
    B → D,
    {C, D} → E
}

Or more compactly:

F = { A → BC, B → D, CD → E }

Both represent the same constraints using RHS decomposition equivalence.

fd_set_example.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
-- Example: FD Set for an Order Management Schema
-- Relation: ORDER_ITEM(OrderID, ProductID, CustomerID, CustomerName, 
--                      ProductName, Price, Quantity, OrderDate)
 
-- FD Set F:
-- F = {
--     OrderID → CustomerID,           -- Order placed by one customer
--     OrderID → CustomerName,         -- (derived via transitivity)
--     OrderID → OrderDate,            -- Order has one date
--     CustomerID → CustomerName,      -- Customer has one name
--     ProductID → ProductName,        -- Product has one name
--     ProductID → Price,              -- Product has one price
--     {OrderID, ProductID} → Quantity -- Each product ordered has quantity
-- }
 
-- Compact form using RHS grouping:
-- F = {
--     OrderID → {CustomerID, CustomerName, OrderDate},
--     CustomerID → CustomerName,
--     ProductID → {ProductName, Price},
--     {OrderID, ProductID} → Quantity
-- }
 
-- Note: This schema has redundancy issues that normalization fixes!
-- CustomerName is determined by both OrderID (transitively) and CustomerID

Notation for Discussing FD Sets:

Symbol	Meaning
`F`	A set of functional dependencies
`F ⊨ X → Y`	F logically implies X → Y (X → Y can be derived from F)
`F+` (F-plus)	The closure of F—all FDs that can be derived from F
`X+` (X-plus)	Attribute closure—all attributes determined by X under F
`F ≡ G`	F and G are equivalent (they imply the same FDs)

These symbols appear throughout normalization theory. We'll cover closures and equivalence in depth in later modules.

F+ Can Be Huge

The closure F+ includes all derivable FDs, including trivial ones. For a schema with n attributes, F+ can contain up to 2^n × 2^n FDs theoretically. We never compute F+ explicitly—we use algorithms to test FD membership without enumeration.

Reading Complex FD Expressions

Let's practice reading increasingly complex FD expressions. This fluency is essential for working with normalization algorithms.

Complex FD Expression Practice
Expression	English Reading	Meaning
A → B	A determines B	Knowing A's value tells you B's value
AB → C	A and B together determine C	Both A and B are needed; neither alone suffices
A → BCD	A determines B, C, and D	Equivalent to three separate FDs
AB → CD	A and B together determine C and D	Composite determinant, multiple dependents
A → B, B → C	A determines B; B determines C	Two FDs; note A transitively determines C
ABC → DE, E → F	ABC determines DE; E determines F	Complex FD set with transitivity

Practice Exercise: Real-World Interpretation

Consider a UNIVERSITY_COURSE schema with FD set:

F = {
    CourseID → {CourseName, Credits, DeptID},
    DeptID → DeptName,
    {CourseID, Semester} → {Instructor, Room, Time},
    Instructor → Office
}

Let's interpret each FD:

CourseID → {CourseName, Credits, DeptID}
- Each course has exactly one name, credit count, and home department
- The course 'CS101' will always be 'Intro to Programming' with 3 credits in the CS department
DeptID → DeptName
- Each department ID maps to exactly one department name
- Department 'CS' is always 'Computer Science'
{CourseID, Semester} → {Instructor, Room, Time}
- A course in a specific semester has one assigned instructor, room, and time
- CS101 in Fall 2024 might have Dr. Smith in Room 101 at 10am
- CS101 in Spring 2025 could have different values
Instructor → Office
- Each instructor has exactly one office
- Dr. Smith is always in Office 305

Transitivity in FD Sets

Notice the chain: CourseID → DeptID and DeptID → DeptName. By transitivity, CourseID → DeptName even though it's not explicitly stated. FD sets often contain implicit dependencies. Armstrong's Axioms (next module) formalize this derivation.

Common Notation Mistakes to Avoid

Several notation mistakes repeatedly appear in student work and even professional discussions. Being aware of them prevents errors in database design.

Critical Notation Mistakes

•Mistake 1: Assuming symmetry — Writing A → B and assuming B → A. The arrow is one-directional. StudentID → Name does NOT mean Name → StudentID.
•Mistake 2: Decomposing the LHS — Believing AB → C means A → C or B → C. Wrong! The LHS cannot be decomposed. {StudentID, CourseID} → Grade does NOT mean StudentID → Grade.
•Mistake 3: Confusing FD with correlation — Two attributes appearing together doesn't make an FD. Height and Weight often correlate but neither determines the other.
•Mistake 4: Ignoring set semantics — Reading {A, B} as 'A or B' instead of 'A and B'. The braces denote conjunction, not disjunction.
•Mistake 5: Writing partial dependencies — Writing A → B when actually AB → B (trivial). Make sure the dependent isn't already in the determinant.
•Mistake 6: Conflating with foreign keys — A foreign key like EmpDeptID → DeptID isn't an FD in the traditional sense—it's a referential constraint across tables, not within a single relation.

notation_mistakes.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
-- WRONG interpretations with corrections
 
-- MISTAKE 1: Assuming symmetry
-- Given: StudentID → StudentEmail
-- WRONG conclusion: StudentEmail → StudentID
-- WHY WRONG: Multiple students could share same email (shared account, typos)
-- A real system might enforce email uniqueness, but that's a separate constraint
 
-- MISTAKE 2: LHS decomposition
-- Given: {StudentID, CourseID} → Grade
-- WRONG conclusion: StudentID → Grade
-- WHY WRONG: A student has many grades (one per course)
-- Same student ID, different courses = different grades
-- The FD only holds when BOTH attributes are known
 
-- MISTAKE 3: Correlation vs determination
-- Observation: Salary and JobLevel appear related in data
-- WRONG: Salary → JobLevel or JobLevel → Salary
-- REALITY: 
--   JobLevel → SalaryRange (a job level has a salary RANGE)
--   But same exact salary could exist at different levels (overlap)
-- Always verify: could two rows have same LHS but different RHS?
 
-- MISTAKE 4: Set semantics
-- Given: {City, ZipCode} written in FD
-- WRONG reading: "City OR ZipCode"
-- CORRECT reading: "City AND ZipCode together"
-- {City, ZipCode} → State means you need BOTH to determine State
 
-- MISTAKE 5: Including RHS in LHS (trivial FD stated as meaningful)
-- WRONG: {StudentID, Name} → Name
-- WHY WRONG: This is trivially true, always holds, provides no constraint
-- CORRECT: StudentID → Name (the actual meaningful dependency)

The LHS Decomposition Trap

The single most common error is treating AB → C as meaning A → C or B → C. Always remember: if you need multiple attributes to determine something, removing any one breaks the determination. This directly affects normalization—2NF violations arise from partial dependencies where PART of a key determines non-key attributes.

Notation Variations Across Resources

Different textbooks, academic papers, and database vendors use slightly different notation conventions. Being aware of these variations helps when reading diverse sources.

Notation Variations in Database Literature
Convention	Example	Used In	Our Convention
Arrow with bar	A ⟶ B	Some academic papers	A → B
Colon notation	A : B	Rare, older texts	A → B
Dependency word	A FD B	Verbose descriptions	A → B
Greek letters	α → β	Theoretical CS papers	X → Y or A → B
Subscript notation	A₁A₂ → B	Papers with many attrs	AB → C or {A₁, A₂} → B
Formal set	{A} → {B, C}	LaTeX documents	{A} → {B, C}

Vendor-Specific Documentation:

Database vendors rarely use formal FD notation in their documentation. Instead, they describe dependencies in prose:

Oracle: "The EMPLOYEE_ID column uniquely identifies each employee"
- Translates to: EMPLOYEE_ID → {all other attributes}
PostgreSQL: "Declare PRIMARY KEY(col1, col2) to ensure uniqueness of the combination"
- Translates to: {col1, col2} → {all other attributes}
MySQL: "UNIQUE constraints ensure no two rows have the same value"
- Translates to: Unique column → other attributes (when appropriate)

As a database professional, you mentally translate between formal notation and vendor prose.

Standard Reference

The notation used in this course follows the standard convention from major database textbooks (Silberschatz, Ramakrishnan, Elmasri & Navathe). When in doubt, use explicit set notation {A, B} → {C, D} to eliminate ambiguity.

Summary: Mastering FD Notation

We've developed fluency in functional dependency notation. Let's consolidate the key concepts:

Key Takeaways

•Arrow Semantics — X → Y reads as 'X determines Y' or 'Y is functionally dependent on X.' The arrow is directional—it does not imply the reverse.
•LHS (Determinant) — The left side can be a single attribute or a set. A composite LHS means all attributes together are needed to determine the RHS.
•RHS (Dependent) — The right side can be decomposed: A → BC equals A → B and A → C. This is NOT true for the LHS.
•Set Notation — {A, B} means A AND B together, representing conjunction. The comma inside braces does NOT mean 'or.'
•FD Sets — Multiple FDs are collected in set F. Notation like F+ and X+ denotes closures we'll explore later.
•Avoid Common Mistakes — Never assume symmetry, never decompose LHS, and always verify genuine determination (not just correlation).

What's Next:

Having mastered the notation, we'll now explore the terminology more deeply. The next page examines Determinant and Dependent—the two sides of the functional dependency relationship—with precise definitions and practical implications for database design.

Page Complete

You now have fluency in FD notation—the language of database theory. You can read, write, and interpret functional dependencies correctly, avoiding common pitfalls. This notation literacy is prerequisite for all advanced topics: Armstrong's Axioms, normalization algorithms, and decomposition theory.

2 / 5

Loading learning content...

Database Management SystemsFunctional Dependencies

Functional Dependency Concept

LevelIntermediate

Duration60 mins

TopicFunctional Dependencies

2 / 5

Notation (X → Y)

The Language of Functional Dependencies

This page transforms you from someone who recognizes FD notation to someone who thinks in it—reading FDs as naturally as reading English sentences.

What You Will Learn

Anatomy of the Arrow: Breaking Down X → Y

The functional dependency notation consists of three elements: the left-hand side (LHS), the arrow, and the right-hand side (RHS). Each carries specific meaning.

Components of FD Notation: X → Y
Component	Name	Also Called	Meaning
X	Left-Hand Side (LHS)	Determinant	The set of attributes whose values determine other attributes
→	Arrow	Functionally Determines	Indicates a deterministic relationship; reads as 'determines'
Y	Right-Hand Side (RHS)	Dependent / Consequent	The set of attributes whose values are determined by X

Reading the Arrow:

The arrow is read as "functionally determines" or simply "determines." Various equivalent readings:

Notation	Readings
`A → B`	"A determines B"
`A → B`	"A functionally determines B"
`A → B`	"B is functionally dependent on A"
`A → B`	"B depends on A"
`A → B`	"The value of A determines the value of B"

All these readings express the same constraint: knowing A's value is sufficient to know B's value.

Directionality Matters

fd_notation_examples.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
-- FD Notation Examples with SQL Context
 
-- Simple FDs:
-- SSN → Name
-- Translation: Social Security Number determines Person's Name
-- In a table: if two rows have the same SSN, they MUST have the same Name
 
-- EmpID → DeptID
-- Translation: Employee ID determines Department ID
-- Meaning: Each employee belongs to exactly one department
 
-- ISBN → BookTitle  
-- Translation: ISBN determines Book Title
-- Meaning: Each ISBN maps to exactly one book title
 
-- Reading FDs aloud:
-- "SSN determines Name"
-- "EmpID functionally determines DeptID"
-- "BookTitle is functionally dependent on ISBN"

Single Attributes vs. Composite (Set) Attributes

Functional dependencies can involve single attributes or sets of attributes on either side of the arrow. Understanding this distinction is crucial for correctly expressing real-world constraints.

Single Attribute FDs

•Format: A → B
•One attribute determines another
•Example: EmpID → EmpName
•Example: SSN → BirthDate
•Example: ISBN → Title
•Simplest form of FD
•Common in highly normalized schemas

Composite Attribute FDs

•Format: {A, B} → C or AB → C
•Multiple attributes together determine another
•Example: {StudentID, CourseID} → Grade
•Example: {FlightNo, Date} → Gate
•Example: {City, Street} → ZipCode
•Common in many-to-many relationships
•Neither attribute alone is sufficient

Set Notation Conventions:

When dealing with sets of attributes, different textbooks use different notation:

Notation Style	Example	Meaning
Set braces	`{A, B} → {C, D}`	Explicit set notation, mathematically precise
Concatenation	`AB → CD`	Compact form, common in textbooks
Mixed	`{A, B} → C`	Set on left, single attribute on right

All three mean the same thing: attributes A and B together determine attributes C and D.

Critical Understanding: The LHS is a CONJUNCTION

When we write {StudentID, CourseID} → Grade:

We mean StudentID AND CourseID together determine Grade
Neither StudentID alone nor CourseID alone determines Grade
A student has many grades (different courses)
A course has many grades (different students)
But a specific student in a specific course has ONE grade

Common Mistake: Reading Composite LHS as 'OR'

composite_fd_examples.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
-- Composite FD Examples
 
-- Example 1: Course Enrollment
-- {StudentID, CourseID} → Grade
-- A student's grade in a course is determined by the student-course pair
-- StudentID alone? No - same student has different grades in different courses
-- CourseID alone? No - same course has different grades for different students
-- BOTH together? Yes - exactly one grade per student per course
 
-- Example 2: Flight Information  
-- {FlightNumber, DepartureDate} → {Gate, DepartureTime, Status}
-- Gate assignment depends on which flight AND which date
-- Same flight number on different dates may use different gates
 
-- Example 3: Address Components
-- {City, StreetAddress} → ZipCode
-- Two houses on the same street in the same city share a zip code
-- Note: City alone doesn't determine ZipCode (many zips per city)
-- Note: Street alone doesn't determine ZipCode (same street names in different cities)
 
-- Example 4: Multiple Dependent Attributes
-- EmpID → {Name, DeptID, Salary, HireDate}
-- One determinant, multiple dependents
-- This is equivalent to four separate FDs:
--   EmpID → Name
--   EmpID → DeptID
--   EmpID → Salary
--   EmpID → HireDate

Notation Shorthands and Conventions

Database literature uses several notational conventions and shorthands. Recognizing these prevents confusion when reading different textbooks or academic papers.

Common FD Notation Conventions
Convention	Example	Full Form	Notes
Concatenation for sets	AB → C	{A, B} → {C}	Letters next to each other form a set
Single letter attributes	A, B, C	Arbitrary attribute names	Used in theoretical discussions
Uppercase letters	X, Y, Z	Arbitrary attribute sets	Often represent sets of multiple attributes
RHS decomposition	A → BC	A → B and A → C	Multiple dependents listed together
Multiple FDs combined	A → B, C → D	Two separate FDs	Comma separates distinct FDs
Attribute from relation	R.Attr	Attribute Attr from relation R	Used when context needs clarification

Lowercase vs Uppercase Convention:

In many theoretical texts:

Lowercase letters (a, b, c) represent single attributes
Uppercase letters (X, Y, Z) represent sets of attributes (possibly single, possibly multiple)

Thus:

a → b definitely means a single attribute determines another single attribute
X → Y might mean any of: single → single, single → set, set → single, or set → set

RHS Decomposition Equivalence:

A fundamental property: the RHS can always be decomposed.

A → BC  is equivalent to  A → B  AND  A → C

This is formally proven (it's one of Armstrong's derived rules). It means:

Writing EmpID → {Name, DeptID} is shorthand for two FDs
You can always break apart the RHS into individual attributes
This is NOT true for the LHS! AB → C is NOT equivalent to A → C AND B → C

LHS Cannot Be Decomposed

Remember: You CAN decompose the RHS (A → BC means A → B and A → C). You CANNOT decompose the LHS (AB → C does NOT mean A → C or B → C). The distinction is critical for normalization.

Sets of Functional Dependencies (F)

Real database schemas have multiple functional dependencies, collected into a set typically denoted F. Understanding how to work with FD sets is essential for normalization algorithms.

Notation for FD Sets:

An FD set is written with curly braces:

F = {
    A → B,
    A → C,
    B → D,
    {C, D} → E
}

Or more compactly:

F = { A → BC, B → D, CD → E }

Both represent the same constraints using RHS decomposition equivalence.

fd_set_example.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
-- Example: FD Set for an Order Management Schema
-- Relation: ORDER_ITEM(OrderID, ProductID, CustomerID, CustomerName, 
--                      ProductName, Price, Quantity, OrderDate)
 
-- FD Set F:
-- F = {
--     OrderID → CustomerID,           -- Order placed by one customer
--     OrderID → CustomerName,         -- (derived via transitivity)
--     OrderID → OrderDate,            -- Order has one date
--     CustomerID → CustomerName,      -- Customer has one name
--     ProductID → ProductName,        -- Product has one name
--     ProductID → Price,              -- Product has one price
--     {OrderID, ProductID} → Quantity -- Each product ordered has quantity
-- }
 
-- Compact form using RHS grouping:
-- F = {
--     OrderID → {CustomerID, CustomerName, OrderDate},
--     CustomerID → CustomerName,
--     ProductID → {ProductName, Price},
--     {OrderID, ProductID} → Quantity
-- }
 
-- Note: This schema has redundancy issues that normalization fixes!
-- CustomerName is determined by both OrderID (transitively) and CustomerID

Notation for Discussing FD Sets:

Symbol	Meaning
`F`	A set of functional dependencies
`F ⊨ X → Y`	F logically implies X → Y (X → Y can be derived from F)
`F+` (F-plus)	The closure of F—all FDs that can be derived from F
`X+` (X-plus)	Attribute closure—all attributes determined by X under F
`F ≡ G`	F and G are equivalent (they imply the same FDs)

These symbols appear throughout normalization theory. We'll cover closures and equivalence in depth in later modules.

F+ Can Be Huge

Reading Complex FD Expressions

Let's practice reading increasingly complex FD expressions. This fluency is essential for working with normalization algorithms.

Complex FD Expression Practice
Expression	English Reading	Meaning
A → B	A determines B	Knowing A's value tells you B's value
AB → C	A and B together determine C	Both A and B are needed; neither alone suffices
A → BCD	A determines B, C, and D	Equivalent to three separate FDs
AB → CD	A and B together determine C and D	Composite determinant, multiple dependents
A → B, B → C	A determines B; B determines C	Two FDs; note A transitively determines C
ABC → DE, E → F	ABC determines DE; E determines F	Complex FD set with transitivity

Practice Exercise: Real-World Interpretation

Consider a UNIVERSITY_COURSE schema with FD set:

F = {
    CourseID → {CourseName, Credits, DeptID},
    DeptID → DeptName,
    {CourseID, Semester} → {Instructor, Room, Time},
    Instructor → Office
}

Let's interpret each FD:

CourseID → {CourseName, Credits, DeptID}
- Each course has exactly one name, credit count, and home department
- The course 'CS101' will always be 'Intro to Programming' with 3 credits in the CS department
DeptID → DeptName
- Each department ID maps to exactly one department name
- Department 'CS' is always 'Computer Science'
{CourseID, Semester} → {Instructor, Room, Time}
- A course in a specific semester has one assigned instructor, room, and time
- CS101 in Fall 2024 might have Dr. Smith in Room 101 at 10am
- CS101 in Spring 2025 could have different values
Instructor → Office
- Each instructor has exactly one office
- Dr. Smith is always in Office 305

Transitivity in FD Sets

Common Notation Mistakes to Avoid

Several notation mistakes repeatedly appear in student work and even professional discussions. Being aware of them prevents errors in database design.

Critical Notation Mistakes

•Mistake 1: Assuming symmetry — Writing A → B and assuming B → A. The arrow is one-directional. StudentID → Name does NOT mean Name → StudentID.
•Mistake 2: Decomposing the LHS — Believing AB → C means A → C or B → C. Wrong! The LHS cannot be decomposed. {StudentID, CourseID} → Grade does NOT mean StudentID → Grade.
•Mistake 3: Confusing FD with correlation — Two attributes appearing together doesn't make an FD. Height and Weight often correlate but neither determines the other.
•Mistake 4: Ignoring set semantics — Reading {A, B} as 'A or B' instead of 'A and B'. The braces denote conjunction, not disjunction.
•Mistake 5: Writing partial dependencies — Writing A → B when actually AB → B (trivial). Make sure the dependent isn't already in the determinant.
•Mistake 6: Conflating with foreign keys — A foreign key like EmpDeptID → DeptID isn't an FD in the traditional sense—it's a referential constraint across tables, not within a single relation.

notation_mistakes.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
-- WRONG interpretations with corrections
 
-- MISTAKE 1: Assuming symmetry
-- Given: StudentID → StudentEmail
-- WRONG conclusion: StudentEmail → StudentID
-- WHY WRONG: Multiple students could share same email (shared account, typos)
-- A real system might enforce email uniqueness, but that's a separate constraint
 
-- MISTAKE 2: LHS decomposition
-- Given: {StudentID, CourseID} → Grade
-- WRONG conclusion: StudentID → Grade
-- WHY WRONG: A student has many grades (one per course)
-- Same student ID, different courses = different grades
-- The FD only holds when BOTH attributes are known
 
-- MISTAKE 3: Correlation vs determination
-- Observation: Salary and JobLevel appear related in data
-- WRONG: Salary → JobLevel or JobLevel → Salary
-- REALITY: 
--   JobLevel → SalaryRange (a job level has a salary RANGE)
--   But same exact salary could exist at different levels (overlap)
-- Always verify: could two rows have same LHS but different RHS?
 
-- MISTAKE 4: Set semantics
-- Given: {City, ZipCode} written in FD
-- WRONG reading: "City OR ZipCode"
-- CORRECT reading: "City AND ZipCode together"
-- {City, ZipCode} → State means you need BOTH to determine State
 
-- MISTAKE 5: Including RHS in LHS (trivial FD stated as meaningful)
-- WRONG: {StudentID, Name} → Name
-- WHY WRONG: This is trivially true, always holds, provides no constraint
-- CORRECT: StudentID → Name (the actual meaningful dependency)

The LHS Decomposition Trap

Notation Variations Across Resources

Different textbooks, academic papers, and database vendors use slightly different notation conventions. Being aware of these variations helps when reading diverse sources.

Notation Variations in Database Literature
Convention	Example	Used In	Our Convention
Arrow with bar	A ⟶ B	Some academic papers	A → B
Colon notation	A : B	Rare, older texts	A → B
Dependency word	A FD B	Verbose descriptions	A → B
Greek letters	α → β	Theoretical CS papers	X → Y or A → B
Subscript notation	A₁A₂ → B	Papers with many attrs	AB → C or {A₁, A₂} → B
Formal set	{A} → {B, C}	LaTeX documents	{A} → {B, C}

Vendor-Specific Documentation:

Database vendors rarely use formal FD notation in their documentation. Instead, they describe dependencies in prose:

Oracle: "The EMPLOYEE_ID column uniquely identifies each employee"
- Translates to: EMPLOYEE_ID → {all other attributes}
PostgreSQL: "Declare PRIMARY KEY(col1, col2) to ensure uniqueness of the combination"
- Translates to: {col1, col2} → {all other attributes}
MySQL: "UNIQUE constraints ensure no two rows have the same value"
- Translates to: Unique column → other attributes (when appropriate)

As a database professional, you mentally translate between formal notation and vendor prose.

Standard Reference

Summary: Mastering FD Notation

We've developed fluency in functional dependency notation. Let's consolidate the key concepts:

Key Takeaways

•Arrow Semantics — X → Y reads as 'X determines Y' or 'Y is functionally dependent on X.' The arrow is directional—it does not imply the reverse.
•LHS (Determinant) — The left side can be a single attribute or a set. A composite LHS means all attributes together are needed to determine the RHS.
•RHS (Dependent) — The right side can be decomposed: A → BC equals A → B and A → C. This is NOT true for the LHS.
•Set Notation — {A, B} means A AND B together, representing conjunction. The comma inside braces does NOT mean 'or.'
•FD Sets — Multiple FDs are collected in set F. Notation like F+ and X+ denotes closures we'll explore later.
•Avoid Common Mistakes — Never assume symmetry, never decompose LHS, and always verify genuine determination (not just correlation).

What's Next:

Page Complete

2 / 5