Database Management SystemsExpressive Power

Expressive Power of Relational Languages

LevelAdvanced

Duration90 mins

TopicExpressive Power

3 / 5

Equivalence Proof

From Theorem to Algorithm

In the previous page, we stated Codd's Theorem: relational algebra and safe relational calculus possess identical expressive power. Such a claim demands proof—and not just any proof, but a constructive proof that provides actual algorithms for translating queries between these paradigms.

This page presents the key elements of that proof. We'll walk through the translations in both directions, demonstrating with concrete examples how calculus queries become algebraic expressions and vice versa. This isn't merely academic exercise—these translation algorithms mirror what real database systems do when compiling SQL into execution plans.

What You Will Learn

By the end of this page, you will understand the structural induction technique used to prove equivalence, the specific translation rules for each algebraic operator and calculus construct, worked examples showing complete translations, and the algorithmic nature of these proofs that enables implementation in database compilers.

Structure of the Equivalence Proof

The equivalence proof proceeds in two directions, each using structural induction on the syntax of query expressions:

Direction 1: Algebra → Calculus

Base case: Every relation R has the calculus equivalent { t | R(t) }
Inductive step: For each algebraic operator, show that if operands have calculus equivalents, so does the result

Direction 2: Calculus → Algebra (Safe Queries Only)

Convert the calculus formula to a normal form (usually domain-independent form)
Systematically translate each logical construct to algebraic operations
Handle quantifiers through projection (∃) and difference (∀)

Both directions are constructive: they don't just prove equivalence exists, they provide explicit translation procedures.

Structural Induction Recap

Structural induction proves properties about recursively defined structures. For algebraic expressions: (1) prove the property holds for base expressions (relations), (2) prove that if it holds for subexpressions, it holds for any expression built from them. This covers all possible expressions in the language.

Notation conventions for this page:

Symbol	Meaning
R, S	Base relations
E, E₁, E₂	Algebraic expressions
φ, ψ	Calculus formulas
t, u, v	Tuple variables (TRC)
x, y, z	Domain variables (DRC)
σ, π, ×, ∪, −	Algebraic operators
∧, ∨, ¬, ∃, ∀	Logical connectives
[E]	The calculus equivalent of algebraic expression E
〈φ〉	The algebraic equivalent of calculus formula φ

Direction 1: Algebra → Calculus

We prove that every relational algebra expression has an equivalent tuple relational calculus query. The proof proceeds by induction on the structure of algebraic expressions.

Base Case

For any base relation R with schema R(A₁, A₂, ..., Aₙ), the equivalent TRC query is: { t | R(t) }. This simply retrieves all tuples in R.

Inductive Step: Translation Rules

Assume E₁ and E₂ are algebraic expressions with TRC equivalents [E₁] = { t | φ₁(t) } and [E₂] = { u | φ₂(u) }. We show each algebraic operator has a calculus equivalent:

1. Union (E₁ ∪ E₂)

[E₁ ∪ E₂] = { t | φ₁(t) ∨ φ₂(t) }

A tuple is in the union iff it satisfies the first formula OR the second.

2. Set Difference (E₁ − E₂)

[E₁ − E₂] = { t | φ₁(t) ∧ ¬φ₂(t) }

A tuple is in the difference iff it satisfies the first formula AND NOT the second.

3. Cartesian Product (E₁ × E₂)

[E₁ × E₂] = { (t, u) | φ₁(t) ∧ φ₂(u) }

A combined tuple is in the product iff the first part satisfies φ₁ and the second satisfies φ₂.

4. Selection σ_C(E₁) where C is a condition

[σ_C(E₁)] = { t | φ₁(t) ∧ C(t) }

A tuple is selected iff it satisfies the original formula AND the selection condition.

5. Projection π_{A₁,...,Aₖ}(E₁)

[π_{A₁,...,Aₖ}(E₁)] = { (t.A₁, ..., t.Aₖ) | ∃t(φ₁(t)) }

The projected attributes appear iff there EXISTS a tuple in E₁ with those values.

Complete Algebra → Calculus Translation Table
Algebraic Expression	TRC Equivalent	Key Insight
R (base relation)	{ t \| R(t) }	Direct membership test
E₁ ∪ E₂	{ t \| φ₁(t) ∨ φ₂(t) }	Union = disjunction
E₁ − E₂	{ t \| φ₁(t) ∧ ¬φ₂(t) }	Difference = negated conjunction
E₁ × E₂	{ (t,u) \| φ₁(t) ∧ φ₂(u) }	Product = independent conjunction
σ_C(E₁)	{ t \| φ₁(t) ∧ C(t) }	Selection = added condition
π_{A}(E₁)	{ t.A \| ∃t(φ₁(t)) }	Projection = existential hiding
E₁ ⋈_C E₂	{ (t,u) \| φ₁(t) ∧ φ₂(u) ∧ C(t,u) }	Join = filtered product
E₁ ∩ E₂	{ t \| φ₁(t) ∧ φ₂(t) }	Intersection = conjunction

Worked Example: Algebra to Calculus

Let's translate a complete algebraic expression to TRC step by step.

Given algebraic query:

π_{name}(σ_{salary > 50000 ∧ dept = 'Engineering'}(Employee))

Goal: Find names of Engineering employees earning more than 50000.

Step-by-step translation:

Step 1: Translate the base relation

[Employee] = { e | Employee(e) }

Step 2: Translate the selection

[σ_{salary > 50000 ∧ dept = 'Engineering'}(Employee)]
= { e | Employee(e) ∧ e.salary > 50000 ∧ e.dept = 'Engineering' }

Step 3: Translate the projection

[π_{name}(σ_{...}(Employee))]
= { (e.name) | ∃e(Employee(e) ∧ e.salary > 50000 ∧ e.dept = 'Engineering') }

Final TRC query:

{ t | ∃e(Employee(e) ∧ e.salary > 50000 ∧ e.dept = 'Engineering' ∧ t.name = e.name) }

Or in a cleaner form using attribute binding:

{ ⟨n⟩ | ∃e(Employee(e) ∧ e.salary > 50000 ∧ e.dept = 'Engineering' ∧ e.name = n) }

Verification

Both the original algebraic expression and the TRC query describe the same set: names of employees in Engineering with salary above 50000. The translation preserves semantics—this is what the proof guarantees.

A more complex example with join:

Algebraic query:

π_{E.name, D.dname}(σ_{E.dno = D.dno}(Employee × Department))

Translation:

Step 1: [Employee] = { e | Employee(e) }
Step 2: [Department] = { d | Department(d) }
Step 3: [Employee × Department] = { (e, d) | Employee(e) ∧ Department(d) }
Step 4: [σ_{E.dno = D.dno}(...)] = { (e, d) | Employee(e) ∧ Department(d) ∧ e.dno = d.dno }
Step 5: [π_{name, dname}(...)] = { (n, dn) | ∃e∃d(Employee(e) ∧ Department(d) ∧ e.dno = d.dno 
                                              ∧ e.name = n ∧ d.dname = dn) }

Final TRC query:

{ ⟨n, dn⟩ | ∃e∃d(Employee(e) ∧ Department(d) ∧ e.dno = d.dno ∧ e.name = n ∧ d.dname = dn) }

This reads naturally: 'Find name-department pairs where an employee exists in that department.'

Direction 2: Calculus → Algebra

The reverse direction—translating calculus to algebra—is more challenging because calculus formulas have arbitrary logical structure. The key technique is to systematically process each type of formula construct.

Critical Prerequisite: Safety

This translation only works for SAFE calculus queries. For unsafe queries (infinite or domain-dependent results), no algebraic equivalent exists. The translation algorithm will produce nonsensical or infinite operations for unsafe inputs.

The translation algorithm (overview):

Normalize the formula — Convert to a canonical form where quantifiers have specific structure
Compute the active domain — Determine all possible values that variables can take
Translate atomic formulas — Map base predicates to relations
Translate connectives — Map ∧, ∨, ¬ to algebraic operations
Translate quantifiers — Map ∃ to projection, ∀ to complemented existential

Translation rules for atomic formulas:

〈R(t)〉 = R                    // Base relation membership
〈t.A = c〉 = σ_{A=c}(DOM)      // Constant comparison
〈t.A = u.B〉 = σ_{A=B}(DOM × DOM)  // Attribute comparison
〈t.A < c〉 = σ_{A<c}(DOM)      // Inequality

Where DOM represents the active domain relation.

Translation rules for connectives:

〈φ₁ ∧ φ₂〉 = 〈φ₁〉 ∩ 〈φ₂〉 = 〈φ₁〉 − (〈φ₁〉 − 〈φ₂〉)  // Conjunction
〈φ₁ ∨ φ₂〉 = 〈φ₁〉 ∪ 〈φ₂〉                           // Disjunction  
〈¬φ₁〉 = DOM − 〈φ₁〉                                  // Negation (requires active domain)

Translation rules for quantifiers:

〈∃x.φ(x,y)〉 = π_{y}(〈φ(x,y)〉)     // Existential = projection
〈∀x.φ(x,y)〉 = DOM − π_{y}(DOM × DOM_{x} − 〈φ(x,y)〉)  // Universal (complex)

The universal quantifier translation uses: ∀x.φ ≡ ¬∃x.¬φ

Key Translation Insights

•Existential quantifiers become projections—hiding the quantified variable eliminates it from the result while preserving tuples that had some binding.
•Universal quantifiers are rewritten as negated existentials: ∀x.φ(x) ≡ ¬∃x.¬φ(x). They become difference operations.
•Negation requires the active domain — to compute 'not in R', we need to know the universe from which to subtract R. This is the active domain.
•Conjunction becomes intersection or filtered join — tuples must satisfy both conditions.
•Disjunction becomes union — tuples satisfying either condition qualify.

Worked Example: Calculus to Algebra

Let's translate a TRC query to relational algebra.

Given TRC query:

{ e.name | Employee(e) ∧ e.salary > 50000 ∧ ∃d(Department(d) ∧ d.dno = e.dno ∧ d.dname = 'Research') }

Goal: Names of employees earning over 50000 who work in Research.

Step-by-step translation:

Step 1: Identify the outermost structure

The query projects e.name from a formula with:

Base predicate Employee(e)
Condition e.salary > 50000
Existentially quantified subformula about Department

Step 2: Translate the existential subformula

∃d(Department(d) ∧ d.dno = e.dno ∧ d.dname = 'Research')

This checks if there exists a Research department matching the employee's dno.

〈∃d(Department(d) ∧ d.dno = e.dno ∧ d.dname = 'Research')〉
= π_{dno}(σ_{dname='Research'}(Department))

Projecting dno gives us the department numbers of Research departments.

Step 3: Combine with Employee conditions

Employee(e) ∧ e.salary > 50000 ∧ (e.dno ∈ Research departments)

Algebraically:

σ_{salary > 50000}(Employee) ⋈_{dno = dno} π_{dno}(σ_{dname='Research'}(Department))

Or using semi-join:

σ_{salary > 50000}(Employee) ⋉_{dno} σ_{dname='Research'}(Department)

Step 4: Apply the final projection

π_{name}(σ_{salary > 50000}(Employee) ⋈_{dno = dno} π_{dno}(σ_{dname='Research'}(Department)))

Final algebraic expression:

π_{name}(
  σ_{salary > 50000}(Employee) 
  ⋈_{Employee.dno = Department.dno} 
  σ_{dname = 'Research'}(Department)
)

Verification

Both expressions compute the same result: names of high-earning Research employees. The TRC describes WHAT, the algebra describes HOW. The translation is semantics-preserving.

Handling Universal Quantifiers

Universal quantifiers (∀) require special attention. They appear in queries like 'find students who passed ALL courses' or 'find suppliers who supply ALL parts.' Translating them to algebra involves the division operation or its equivalent using difference.

Example: Universal quantifier query

{ s.sname | Student(s) ∧ ∀c(Course(c) → Enrolled(s.sid, c.cid)) }

Translation approach using ∀x.φ ≡ ¬∃x.¬φ:

∀c(Course(c) → Enrolled(s.sid, c.cid))
≡ ∀c(¬Course(c) ∨ Enrolled(s.sid, c.cid))
≡ ¬∃c(Course(c) ∧ ¬Enrolled(s.sid, c.cid))
≡ ¬∃c(c is a course AND s is not enrolled in c)

Algebraic translation:

-- All courses
π_{cid}(Course)

-- Courses each student is enrolled in
π_{sid, cid}(Enrolled)

-- Students missing at least one course
-- (Students paired with all courses) - (Actual enrollments)
Missing = π_{sid}((π_{sid}(Student) × π_{cid}(Course)) − Enrolled)

-- Students enrolled in ALL courses = All students - Those missing any
π_{sid}(Student) − Missing

-- Get names
π_{sname}(Student ⋈ (π_{sid}(Student) − Missing))

Recognizing this as division:

The operation 'find students enrolled in all courses' is exactly what the division operator (÷) computes:

π_{sid, cid}(Enrolled) ÷ π_{cid}(Course)

This returns sids of students whose enrolled courses include ALL courses.

Division Pattern

When you see a universal quantifier over a membership test (∀x ∈ S, predicate(y,x)), consider the division operation. Division explicitly computes 'which y values are associated with ALL x values in S.'

The Active Domain Problem

A subtle but crucial aspect of the calculus-to-algebra translation involves the active domain—the set of values that actually appear in the database. This matters when translating negation.

The problem:

Consider the TRC query:

{ t | R(t) ∧ ¬S(t) }

This asks for tuples in R but not in S—simply the difference R − S.

But consider:

{ t | ¬R(t) }

This asks for tuples NOT in R. But from what universe? All possible tuples? That's infinite.

The active domain solution:

For safe queries, all free variables are constrained to range over values appearing in the database. The active domain ADOM is:

ADOM = π_{A₁}(R₁) ∪ π_{A₂}(R₁) ∪ ... ∪ π_{B₁}(R₂) ∪ ...

Union of all attribute values from all relations.

Safe negation translation:

〈¬R(t)〉 = ADOM^n − R   // where n is the arity of R

This computes all tuples of active domain values that are not in R—a finite, computable set.

Example:

{ t | Employee(t) ∧ ¬Manager(t) }

Translation:

Employee − Manager

No need for ADOM here because Employee already constrains the domain.

Safety Ensures Finite ADOM Suffices

Safe queries guarantee that every free variable is 'limited'—constrained by some positive occurrence in a relation. This ensures the active domain contains all relevant values. Unsafe queries fail precisely because they reference values outside ADOM, leading to infinite or domain-dependent results.

Completeness and Correctness of Translation

For the equivalence proof to be complete, we must establish that the translations preserve semantics in all cases.

Properties of the Translation

•Soundness (Algebra → Calculus): For every database D, the algebraic expression E and its calculus translation [E] produce identical results: E(D) = E.
•Soundness (Calculus → Algebra): For every database D, the calculus formula φ and its algebraic translation 〈φ〉 produce identical results: φ(D) = 〈φ〉(D).
•Completeness: Every well-formed algebraic expression has a translation, and every safe calculus formula has a translation. No valid input is rejected.
•Termination: Both translation algorithms terminate for any well-formed input. No infinite loops or unbounded recursion.

The proof technique:

Each translation rule is verified individually:

Base case verification:
- [R] = { t | R(t) } trivially represents the same set of tuples
Inductive case verification:
- Assume [E₁] correctly represents E₁ and [E₂] correctly represents E₂
- Show [E₁ ∪ E₂] = { t | φ₁(t) ∨ φ₂(t) } correctly represents E₁ ∪ E₂
- A tuple t is in E₁ ∪ E₂ iff t ∈ E₁ or t ∈ E₂
- By inductive hypothesis, t ∈ E₁ iff φ₁(t) and t ∈ E₂ iff φ₂(t)
- Therefore t ∈ E₁ ∪ E₂ iff φ₁(t) ∨ φ₂(t) ✓
Similar verification for each operator and construct

The full proof appears in database theory textbooks (Ullman, Abiteboul-Hull-Vianu) with all edge cases addressed.

Key theorems established:

Theorem (Algebra-Calculus Equivalence): For every relational schema S:

For every RA expression E over S, there exists a safe TRC formula φ such that E ≡ { t | φ(t) }

For every safe TRC formula φ over S, there exists an RA expression E such that { t | φ(t) } ≡ E

The translations are effective (algorithmic) and semantics-preserving.

Summary: Algorithms Behind the Theorem

We've now seen the constructive proof of Codd's Theorem—the algorithms that translate between algebra and calculus.

Key Takeaways

•Algebra → Calculus proceeds by structural induction: relations become predicates, operators become logical connectives (∪→∨, −→∧¬, ×→∧, σ→∧condition, π→∃).
•Calculus → Algebra normalizes formulas and translates constructs: predicates become relations, connectives become set operations, quantifiers become projections and differences.
•Universal quantifiers require special handling via ∀x.φ ≡ ¬∃x.¬φ, which translates to difference operations (or division).
•The active domain provides the finite universe for negation in calculus, enabling translation to the algebraic difference operation.
•Both translations are constructive — they provide actual algorithms, not just existence proofs. Database compilers implement these algorithms.
•Correctness is verified by induction — each translation rule preserves semantics, proven by case analysis on the structure of expressions.

What's next:

With the equivalence established, we turn to an important question: What can't relational algebra (and equivalently, safe calculus) express? Page 4 examines the inherent limitations of the relational model—queries that are impossible to express regardless of how cleverly we write them. These limitations motivate the extensions we'll explore in Page 5.

Page Complete

You now understand the formal proof of Codd's Theorem at a level suitable for database professionals. The translation algorithms we examined are not merely theoretical—they mirror the actual compilation process in database systems, where SQL (calculus-like) becomes operator trees (algebra) for optimization and execution.

3 / 5

Loading learning content...

Database Management SystemsExpressive Power

Expressive Power of Relational Languages

LevelAdvanced

Duration90 mins

TopicExpressive Power

3 / 5

Equivalence Proof

From Theorem to Algorithm

What You Will Learn

Structure of the Equivalence Proof

The equivalence proof proceeds in two directions, each using structural induction on the syntax of query expressions:

Direction 1: Algebra → Calculus

Base case: Every relation R has the calculus equivalent { t | R(t) }
Inductive step: For each algebraic operator, show that if operands have calculus equivalents, so does the result

Direction 2: Calculus → Algebra (Safe Queries Only)

Convert the calculus formula to a normal form (usually domain-independent form)
Systematically translate each logical construct to algebraic operations
Handle quantifiers through projection (∃) and difference (∀)

Both directions are constructive: they don't just prove equivalence exists, they provide explicit translation procedures.

Structural Induction Recap

Notation conventions for this page:

Symbol	Meaning
R, S	Base relations
E, E₁, E₂	Algebraic expressions
φ, ψ	Calculus formulas
t, u, v	Tuple variables (TRC)
x, y, z	Domain variables (DRC)
σ, π, ×, ∪, −	Algebraic operators
∧, ∨, ¬, ∃, ∀	Logical connectives
[E]	The calculus equivalent of algebraic expression E
〈φ〉	The algebraic equivalent of calculus formula φ

Direction 1: Algebra → Calculus

We prove that every relational algebra expression has an equivalent tuple relational calculus query. The proof proceeds by induction on the structure of algebraic expressions.

Base Case

For any base relation R with schema R(A₁, A₂, ..., Aₙ), the equivalent TRC query is: { t | R(t) }. This simply retrieves all tuples in R.

Inductive Step: Translation Rules

Assume E₁ and E₂ are algebraic expressions with TRC equivalents [E₁] = { t | φ₁(t) } and [E₂] = { u | φ₂(u) }. We show each algebraic operator has a calculus equivalent:

1. Union (E₁ ∪ E₂)

[E₁ ∪ E₂] = { t | φ₁(t) ∨ φ₂(t) }

A tuple is in the union iff it satisfies the first formula OR the second.

2. Set Difference (E₁ − E₂)

[E₁ − E₂] = { t | φ₁(t) ∧ ¬φ₂(t) }

A tuple is in the difference iff it satisfies the first formula AND NOT the second.

3. Cartesian Product (E₁ × E₂)

[E₁ × E₂] = { (t, u) | φ₁(t) ∧ φ₂(u) }

A combined tuple is in the product iff the first part satisfies φ₁ and the second satisfies φ₂.

4. Selection σ_C(E₁) where C is a condition

[σ_C(E₁)] = { t | φ₁(t) ∧ C(t) }

A tuple is selected iff it satisfies the original formula AND the selection condition.

5. Projection π_{A₁,...,Aₖ}(E₁)

[π_{A₁,...,Aₖ}(E₁)] = { (t.A₁, ..., t.Aₖ) | ∃t(φ₁(t)) }

The projected attributes appear iff there EXISTS a tuple in E₁ with those values.

Complete Algebra → Calculus Translation Table
Algebraic Expression	TRC Equivalent	Key Insight
R (base relation)	{ t \| R(t) }	Direct membership test
E₁ ∪ E₂	{ t \| φ₁(t) ∨ φ₂(t) }	Union = disjunction
E₁ − E₂	{ t \| φ₁(t) ∧ ¬φ₂(t) }	Difference = negated conjunction
E₁ × E₂	{ (t,u) \| φ₁(t) ∧ φ₂(u) }	Product = independent conjunction
σ_C(E₁)	{ t \| φ₁(t) ∧ C(t) }	Selection = added condition
π_{A}(E₁)	{ t.A \| ∃t(φ₁(t)) }	Projection = existential hiding
E₁ ⋈_C E₂	{ (t,u) \| φ₁(t) ∧ φ₂(u) ∧ C(t,u) }	Join = filtered product
E₁ ∩ E₂	{ t \| φ₁(t) ∧ φ₂(t) }	Intersection = conjunction

Worked Example: Algebra to Calculus

Let's translate a complete algebraic expression to TRC step by step.

Given algebraic query:

π_{name}(σ_{salary > 50000 ∧ dept = 'Engineering'}(Employee))

Goal: Find names of Engineering employees earning more than 50000.

Step-by-step translation:

Step 1: Translate the base relation

[Employee] = { e | Employee(e) }

Step 2: Translate the selection

[σ_{salary > 50000 ∧ dept = 'Engineering'}(Employee)]
= { e | Employee(e) ∧ e.salary > 50000 ∧ e.dept = 'Engineering' }

Step 3: Translate the projection

[π_{name}(σ_{...}(Employee))]
= { (e.name) | ∃e(Employee(e) ∧ e.salary > 50000 ∧ e.dept = 'Engineering') }

Final TRC query:

{ t | ∃e(Employee(e) ∧ e.salary > 50000 ∧ e.dept = 'Engineering' ∧ t.name = e.name) }

Or in a cleaner form using attribute binding:

{ ⟨n⟩ | ∃e(Employee(e) ∧ e.salary > 50000 ∧ e.dept = 'Engineering' ∧ e.name = n) }

Verification

A more complex example with join:

Algebraic query:

π_{E.name, D.dname}(σ_{E.dno = D.dno}(Employee × Department))

Translation:

Step 1: [Employee] = { e | Employee(e) }
Step 2: [Department] = { d | Department(d) }
Step 3: [Employee × Department] = { (e, d) | Employee(e) ∧ Department(d) }
Step 4: [σ_{E.dno = D.dno}(...)] = { (e, d) | Employee(e) ∧ Department(d) ∧ e.dno = d.dno }
Step 5: [π_{name, dname}(...)] = { (n, dn) | ∃e∃d(Employee(e) ∧ Department(d) ∧ e.dno = d.dno 
                                              ∧ e.name = n ∧ d.dname = dn) }

Final TRC query:

{ ⟨n, dn⟩ | ∃e∃d(Employee(e) ∧ Department(d) ∧ e.dno = d.dno ∧ e.name = n ∧ d.dname = dn) }

This reads naturally: 'Find name-department pairs where an employee exists in that department.'

Direction 2: Calculus → Algebra

Critical Prerequisite: Safety

The translation algorithm (overview):

Normalize the formula — Convert to a canonical form where quantifiers have specific structure
Compute the active domain — Determine all possible values that variables can take
Translate atomic formulas — Map base predicates to relations
Translate connectives — Map ∧, ∨, ¬ to algebraic operations
Translate quantifiers — Map ∃ to projection, ∀ to complemented existential

Translation rules for atomic formulas:

〈R(t)〉 = R                    // Base relation membership
〈t.A = c〉 = σ_{A=c}(DOM)      // Constant comparison
〈t.A = u.B〉 = σ_{A=B}(DOM × DOM)  // Attribute comparison
〈t.A < c〉 = σ_{A<c}(DOM)      // Inequality

Where DOM represents the active domain relation.

Translation rules for connectives:

〈φ₁ ∧ φ₂〉 = 〈φ₁〉 ∩ 〈φ₂〉 = 〈φ₁〉 − (〈φ₁〉 − 〈φ₂〉)  // Conjunction
〈φ₁ ∨ φ₂〉 = 〈φ₁〉 ∪ 〈φ₂〉                           // Disjunction  
〈¬φ₁〉 = DOM − 〈φ₁〉                                  // Negation (requires active domain)

Translation rules for quantifiers:

〈∃x.φ(x,y)〉 = π_{y}(〈φ(x,y)〉)     // Existential = projection
〈∀x.φ(x,y)〉 = DOM − π_{y}(DOM × DOM_{x} − 〈φ(x,y)〉)  // Universal (complex)

The universal quantifier translation uses: ∀x.φ ≡ ¬∃x.¬φ

Key Translation Insights

•Existential quantifiers become projections—hiding the quantified variable eliminates it from the result while preserving tuples that had some binding.
•Universal quantifiers are rewritten as negated existentials: ∀x.φ(x) ≡ ¬∃x.¬φ(x). They become difference operations.
•Negation requires the active domain — to compute 'not in R', we need to know the universe from which to subtract R. This is the active domain.
•Conjunction becomes intersection or filtered join — tuples must satisfy both conditions.
•Disjunction becomes union — tuples satisfying either condition qualify.

Worked Example: Calculus to Algebra

Let's translate a TRC query to relational algebra.

Given TRC query:

{ e.name | Employee(e) ∧ e.salary > 50000 ∧ ∃d(Department(d) ∧ d.dno = e.dno ∧ d.dname = 'Research') }

Goal: Names of employees earning over 50000 who work in Research.

Step-by-step translation:

Step 1: Identify the outermost structure

The query projects e.name from a formula with:

Base predicate Employee(e)
Condition e.salary > 50000
Existentially quantified subformula about Department

Step 2: Translate the existential subformula

∃d(Department(d) ∧ d.dno = e.dno ∧ d.dname = 'Research')

This checks if there exists a Research department matching the employee's dno.

〈∃d(Department(d) ∧ d.dno = e.dno ∧ d.dname = 'Research')〉
= π_{dno}(σ_{dname='Research'}(Department))

Projecting dno gives us the department numbers of Research departments.

Step 3: Combine with Employee conditions

Employee(e) ∧ e.salary > 50000 ∧ (e.dno ∈ Research departments)

Algebraically:

σ_{salary > 50000}(Employee) ⋈_{dno = dno} π_{dno}(σ_{dname='Research'}(Department))

Or using semi-join:

σ_{salary > 50000}(Employee) ⋉_{dno} σ_{dname='Research'}(Department)

Step 4: Apply the final projection

π_{name}(σ_{salary > 50000}(Employee) ⋈_{dno = dno} π_{dno}(σ_{dname='Research'}(Department)))

Final algebraic expression:

π_{name}(
  σ_{salary > 50000}(Employee) 
  ⋈_{Employee.dno = Department.dno} 
  σ_{dname = 'Research'}(Department)
)

Verification

Both expressions compute the same result: names of high-earning Research employees. The TRC describes WHAT, the algebra describes HOW. The translation is semantics-preserving.

Handling Universal Quantifiers

Example: Universal quantifier query

{ s.sname | Student(s) ∧ ∀c(Course(c) → Enrolled(s.sid, c.cid)) }

Translation approach using ∀x.φ ≡ ¬∃x.¬φ:

∀c(Course(c) → Enrolled(s.sid, c.cid))
≡ ∀c(¬Course(c) ∨ Enrolled(s.sid, c.cid))
≡ ¬∃c(Course(c) ∧ ¬Enrolled(s.sid, c.cid))
≡ ¬∃c(c is a course AND s is not enrolled in c)

Algebraic translation:

-- All courses
π_{cid}(Course)

-- Courses each student is enrolled in
π_{sid, cid}(Enrolled)

-- Students missing at least one course
-- (Students paired with all courses) - (Actual enrollments)
Missing = π_{sid}((π_{sid}(Student) × π_{cid}(Course)) − Enrolled)

-- Students enrolled in ALL courses = All students - Those missing any
π_{sid}(Student) − Missing

-- Get names
π_{sname}(Student ⋈ (π_{sid}(Student) − Missing))

Recognizing this as division:

The operation 'find students enrolled in all courses' is exactly what the division operator (÷) computes:

π_{sid, cid}(Enrolled) ÷ π_{cid}(Course)

This returns sids of students whose enrolled courses include ALL courses.

Division Pattern

The Active Domain Problem

A subtle but crucial aspect of the calculus-to-algebra translation involves the active domain—the set of values that actually appear in the database. This matters when translating negation.

The problem:

Consider the TRC query:

{ t | R(t) ∧ ¬S(t) }

This asks for tuples in R but not in S—simply the difference R − S.

But consider:

{ t | ¬R(t) }

This asks for tuples NOT in R. But from what universe? All possible tuples? That's infinite.

The active domain solution:

For safe queries, all free variables are constrained to range over values appearing in the database. The active domain ADOM is:

ADOM = π_{A₁}(R₁) ∪ π_{A₂}(R₁) ∪ ... ∪ π_{B₁}(R₂) ∪ ...

Union of all attribute values from all relations.

Safe negation translation:

〈¬R(t)〉 = ADOM^n − R   // where n is the arity of R

This computes all tuples of active domain values that are not in R—a finite, computable set.

Example:

{ t | Employee(t) ∧ ¬Manager(t) }

Translation:

Employee − Manager

No need for ADOM here because Employee already constrains the domain.

Safety Ensures Finite ADOM Suffices

Completeness and Correctness of Translation

For the equivalence proof to be complete, we must establish that the translations preserve semantics in all cases.

Properties of the Translation

•Soundness (Algebra → Calculus): For every database D, the algebraic expression E and its calculus translation [E] produce identical results: E(D) = E.
•Soundness (Calculus → Algebra): For every database D, the calculus formula φ and its algebraic translation 〈φ〉 produce identical results: φ(D) = 〈φ〉(D).
•Completeness: Every well-formed algebraic expression has a translation, and every safe calculus formula has a translation. No valid input is rejected.
•Termination: Both translation algorithms terminate for any well-formed input. No infinite loops or unbounded recursion.

The proof technique:

Each translation rule is verified individually:

Base case verification:
- [R] = { t | R(t) } trivially represents the same set of tuples
Inductive case verification:
- Assume [E₁] correctly represents E₁ and [E₂] correctly represents E₂
- Show [E₁ ∪ E₂] = { t | φ₁(t) ∨ φ₂(t) } correctly represents E₁ ∪ E₂
- A tuple t is in E₁ ∪ E₂ iff t ∈ E₁ or t ∈ E₂
- By inductive hypothesis, t ∈ E₁ iff φ₁(t) and t ∈ E₂ iff φ₂(t)
- Therefore t ∈ E₁ ∪ E₂ iff φ₁(t) ∨ φ₂(t) ✓
Similar verification for each operator and construct

The full proof appears in database theory textbooks (Ullman, Abiteboul-Hull-Vianu) with all edge cases addressed.

Key theorems established:

Theorem (Algebra-Calculus Equivalence): For every relational schema S:

For every RA expression E over S, there exists a safe TRC formula φ such that E ≡ { t | φ(t) }

For every safe TRC formula φ over S, there exists an RA expression E such that { t | φ(t) } ≡ E

The translations are effective (algorithmic) and semantics-preserving.

Summary: Algorithms Behind the Theorem

We've now seen the constructive proof of Codd's Theorem—the algorithms that translate between algebra and calculus.

Key Takeaways

•Algebra → Calculus proceeds by structural induction: relations become predicates, operators become logical connectives (∪→∨, −→∧¬, ×→∧, σ→∧condition, π→∃).
•Calculus → Algebra normalizes formulas and translates constructs: predicates become relations, connectives become set operations, quantifiers become projections and differences.
•Universal quantifiers require special handling via ∀x.φ ≡ ¬∃x.¬φ, which translates to difference operations (or division).
•The active domain provides the finite universe for negation in calculus, enabling translation to the algebraic difference operation.
•Both translations are constructive — they provide actual algorithms, not just existence proofs. Database compilers implement these algorithms.
•Correctness is verified by induction — each translation rule preserves semantics, proven by case analysis on the structure of expressions.

What's next:

Page Complete

3 / 5