Tuple Relational Calculus

Level: Intermediate

Duration: 75 mins

Topic: Tuple Relational Calculus

Atomic Formulas

The Primitive Building Blocks of TRC

In the grand architecture of Tuple Relational Calculus, atomic formulas (or simply atoms) are the irreducible primitives—the fundamental assertions that cannot be broken down further. They are the base cases of the recursive formula grammar, the foundation upon which all complex expressions are built.

Understanding atoms with precision is essential because:

Every TRC query ultimately reduces to atoms: Complex formulas with nested quantifiers and multiple connectives eventually bottom out at atomic comparisons and range predicates
Atoms determine computational cost: Query execution translates atoms into table scans, index lookups, and comparisons—understanding atoms helps predict performance
Atoms bridge theory and practice: Each SQL WHERE clause condition corresponds to an atomic formula; mastering atoms means mastering the WHERE clause
Type systems operate at the atom level: Schema validation, domain checking, and NULL handling are all atomic-level concerns

This page provides comprehensive coverage of both types of atoms in TRC: range predicates and comparison atoms.

What You Will Learn

By the end of this page, you will understand the precise semantics of range predicates and comparison atoms, master three-valued logic for NULL handling, recognize atom patterns in SQL, and appreciate how atoms form the computational foundation of query evaluation.

Range Predicates: The Membership Atom

The range predicate is the first type of atomic formula. It asserts that a tuple variable is bound to (i.e., is a member of) a specific relation.

Formal Definition:

R(t) ≡ "tuple variable t is a member of relation R"

Semantics:

Given a database instance D, the range predicate R(t) evaluates as:

TRUE if the current binding of t is a tuple that exists in R within D
FALSE if the current binding of t is not in R

Range predicates are membership tests—they answer "Is this tuple in this relation?"

Role in Queries:

Range predicates serve dual purposes:

Domain Declaration: They specify which tuples a variable can bind to
Filtering: They constrain results to tuples actually in the database

Range Predicate Evaluation: Understanding how range predicates evaluate

Input

Database state:
  Employee = {(E1, 'Alice', 50000), (E2, 'Bob', 60000)}
  Contractor = {(C1, 'Charlie', 70000)}

Atom to evaluate: Employee(t) where t = (E1, 'Alice', 50000)

Output

TRUE (tuple exists in relation)

Explanation

Since (E1, 'Alice', 50000) is indeed in the Employee relation, the predicate evaluates to TRUE. If t were bound to (C1, 'Charlie', 70000), it would be FALSE—that tuple is in Contractor, not Employee.
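
The membership test above can be sketched in a few lines of Python. Modeling relations as sets of tuples and the helper name `range_predicate` are assumptions made for illustration, not part of TRC itself.

```python
# Relations modeled as Python sets of tuples (an illustrative assumption).
employee = {("E1", "Alice", 50000), ("E2", "Bob", 60000)}
contractor = {("C1", "Charlie", 70000)}


def range_predicate(relation, t):
    """Evaluate R(t): TRUE iff the tuple bound to t is a member of R."""
    return t in relation


alice_in_employee = range_predicate(employee, ("E1", "Alice", 50000))
charlie_in_employee = range_predicate(employee, ("C1", "Charlie", 70000))

print(alice_in_employee)    # True
print(charlie_in_employee)  # False
```

The same tuple value gives different answers depending on which relation is tested, which is exactly what the range predicate expresses.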

Schema Constraints:

A range predicate R(t) implicitly constrains t to have the same schema as R. If R has schema (A₁: D₁, A₂: D₂, ..., Aₙ: Dₙ), then t must also have these attributes with compatible types.

This enables attribute references in subsequent conditions:

Employee(e) ∧ e.salary > 50000
                ↑
        Valid because Employee schema includes 'salary'

Multiple Range Predicates:

Queries often include multiple range predicates:

{ e, d | Employee(e) ∧ Department(d) ∧ e.dept_id = d.dept_id }

Here, Employee(e) and Department(d) are separate atoms. Both must be TRUE (conjunction) for the formula to be satisfied.
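
One way to read this query operationally is as a nested iteration: each range predicate supplies the candidates for its variable, and the comparison atom filters the pairs. The sketch below assumes hypothetical sample data and `namedtuple` schemas that do not come from the original text.

```python
from collections import namedtuple

# Hypothetical schemas and data, chosen only for illustration.
Employee = namedtuple("Employee", ["emp_id", "name", "dept_id"])
Department = namedtuple("Department", ["dept_id", "dept_name"])

employees = {Employee(1, "Alice", 10), Employee(2, "Bob", 20), Employee(3, "Carol", 99)}
departments = {Department(10, "Sales"), Department(20, "Research")}

# { e, d | Employee(e) ∧ Department(d) ∧ e.dept_id = d.dept_id }
result = {
    (e, d)
    for e in employees           # range predicate Employee(e)
    for d in departments         # range predicate Department(d)
    if e.dept_id == d.dept_id    # comparison atom
}

pairs = sorted((e.name, d.dept_name) for e, d in result)
print(pairs)  # [('Alice', 'Sales'), ('Bob', 'Research')]
```

Carol is absent from the result because no Department tuple makes the comparison atom TRUE for her.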

Negative Range Tests:

Negated range predicates test non-membership:

¬Employee(t)  ≡  "t is NOT in Employee"

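Used alone on a free variable, this is unsafe: the complement of Employee is unbounded, so the query would have no finite answer. It is valid when t is also constrained by a positive range predicate, for example Contractor(t) ∧ ¬Employee(t).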

Range Predicates Are Not Optional

Every tuple variable used in a query must have at least one positive (non-negated) range predicate. Without it, the variable's domain is undefined, and attribute references become meaningless. This is a fundamental well-formedness requirement.

Comparison Atoms: The Relational Operators

The comparison atom is the second type of atomic formula. It asserts a relationship between two values using a comparison operator.

Formal Definition:

term₁ θ term₂

Where:

term₁ and term₂ are terms (attribute references or constants)
θ is a comparison operator from {=, ≠, <, >, ≤, ≥}

Term Types:

Attribute Term: t.A — the value of attribute A in tuple variable t
Constant Term: 42, 'Alice', DATE '2024-01-15', NULL

Comparison Patterns:

Pattern	Example	Description
Attribute-Attribute	`e.dept_id = d.dept_id`	Compare attributes from different tuples (join)
Attribute-Constant	`e.salary > 50000`	Compare attribute to literal (selection)
Constant-Attribute	`100 <= e.age`	Same as above (order reversed)
Attribute-Self	`p.start < p.end`	Compare attributes within same tuple

Comparison Operators and Their Semantics
Operator	Name	Semantics	Example
`=`	Equality	TRUE iff values are identical	`e.name = 'Alice'`
`≠` / `<>`	Inequality	TRUE iff values differ	`e.dept_id ≠ 5`
`<`	Less Than	TRUE iff left < right (strict)	`e.age < 30`
`>`	Greater Than	TRUE iff left > right (strict)	`e.salary > 50000`
`≤` / `<=`	Less or Equal	TRUE iff left ≤ right	`e.tenure <= 5`
`≥` / `>=`	Greater or Equal	TRUE iff left ≥ right	`e.age >= 21`

Type Requirements:

Comparison atoms require type compatibility between operands:

Numeric types: INTEGER, DECIMAL, FLOAT can be compared freely
String types: VARCHAR, CHAR use lexicographic ordering
Date/Time types: Temporal comparison (earlier/later)
Boolean: Only equality/inequality typically meaningful

Mismatched types are generally errors:

✗ e.name > 50000      -- String vs. Number: Type error
✓ e.salary > 50000    -- Number vs. Number: Valid
✓ e.name > 'M'        -- String vs. String: Valid (lexicographic)

Join Atoms:

Attribute-to-attribute comparisons often express joins:

e.dept_id = d.dept_id    -- Equijoin condition
e.salary > m.salary      -- Theta-join (inequality)
e₁.manager_id = e₂.emp_id -- Self-join

These atoms connect tuples across (or within) relations, enabling relational combination.

comparison-atoms.trc
-- EQUALITY ATOMS (Selection and Join)
{ e | Employee(e) ∧ e.dept_id = 5 }                     -- Attribute = Constant
{ e, d | Employee(e) ∧ Department(d) ∧ 
         e.dept_id = d.dept_id }                         -- Attribute = Attribute (Join)
 
-- INEQUALITY ATOMS
{ e | Employee(e) ∧ e.status ≠ 'Terminated' }           -- Exclude specific value
{ e | Employee(e) ∧ e.salary > e.base_salary }          -- Within-tuple comparison
 
-- RANGE ATOMS  
{ e | Employee(e) ∧ e.age ≥ 21 ∧ e.age ≤ 65 }          -- Age range
{ p | Project(p) ∧ p.start_date < p.end_date }          -- Temporal ordering
 
-- STRING AND DATE COMPARISONS
{ e | Employee(e) ∧ e.name >= 'M' ∧ e.name < 'N' }      -- Names starting with 'M' (case-sensitive ordering assumed)
{ t | Transaction(t) ∧ t.timestamp >= DATE '2024-01-01' }  -- Date lower bound

Three-Valued Logic and NULL Handling

Real databases contain NULL values—markers for missing, unknown, or inapplicable data. NULLs introduce three-valued logic (3VL) into atomic formula evaluation.

The Problem with NULL:

Consider: Is NULL = 5 true or false?

We don't know what NULL represents
It could be 5, so we can't say FALSE
It could be something else, so we can't say TRUE
The only honest answer is: UNKNOWN

Three-Valued Logic:

Classical TRC is two-valued, because it assumes no NULLs. SQL, and TRC as extended on this page to handle NULLs, use three truth values:

Value	Meaning
TRUE	Condition definitely holds
FALSE	Condition definitely doesn't hold
UNKNOWN	Cannot determine (typically involves NULL)

NULL Comparison Results
Expression	Result	Explanation
`NULL = NULL`	UNKNOWN	We don't know if two unknowns are equal
`NULL = 5`	UNKNOWN	NULL might or might not be 5
`NULL <> 5`	UNKNOWN	NULL might or might not differ from 5
`NULL > 0`	UNKNOWN	Can't order unknown value
`5 = 5`	TRUE	Known values can be compared
`5 <> 5`	FALSE	Known values can be compared

NULL ≠ NULL is UNKNOWN, not TRUE!

A common mistake is to assume that NULL = NULL is TRUE or FALSE, or that NULL ≠ NULL is TRUE. Both comparisons evaluate to UNKNOWN, because two unknown values may or may not be equal. This behavior affects queries in subtle ways.

3VL Truth Tables:

Logical connectives extend to three values:

Negation (¬):

P	¬P
T	F
U	U
F	T

Conjunction (∧):

P ∧ Q	T	U	F
T	T	U	F
U	U	U	F
F	F	F	F

Disjunction (∨):

P ∨ Q	T	U	F
T	T	T	T
U	T	U	U
F	T	U	F

Key Insight: FALSE dominates AND (F ∧ anything = F), TRUE dominates OR (T ∨ anything = T). UNKNOWN propagates otherwise.
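
The three truth tables can be written down as small functions. This is a minimal sketch that models UNKNOWN as Python's `None`; the function names `not3`, `and3`, `or3`, and `compare_eq` are hypothetical and chosen for this example.

```python
# UNKNOWN is modeled as None; TRUE and FALSE as Python's True and False.
UNKNOWN = None


def not3(p):
    """Three-valued negation: UNKNOWN stays UNKNOWN."""
    return UNKNOWN if p is UNKNOWN else (not p)


def and3(p, q):
    """Three-valued conjunction: FALSE dominates."""
    if p is False or q is False:
        return False
    if p is UNKNOWN or q is UNKNOWN:
        return UNKNOWN
    return True


def or3(p, q):
    """Three-valued disjunction: TRUE dominates."""
    if p is True or q is True:
        return True
    if p is UNKNOWN or q is UNKNOWN:
        return UNKNOWN
    return False


def compare_eq(a, b):
    """Equality atom: UNKNOWN if either operand is NULL (None)."""
    if a is None or b is None:
        return UNKNOWN
    return a == b


print(and3(False, UNKNOWN))    # False
print(or3(True, UNKNOWN))      # True
print(compare_eq(None, None))  # None, i.e. UNKNOWN
```

Checking FALSE first in `and3` and TRUE first in `or3` is what lets a definite value win even when the other operand is UNKNOWN.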

Effect on Query Results:

TRC (and SQL) include a tuple in the result only when the formula evaluates to TRUE. UNKNOWN is treated like FALSE for inclusion purposes.

Employee(e) ∧ e.commission > 0

If e.commission is NULL for some employee, this evaluates to UNKNOWN, and that employee is excluded from results.

null-handling.sql
-- Employee table with potential NULLs
-- | emp_id | name    | commission |
-- |--------|---------|------------|
-- | 1      | Alice   | 500        |
-- | 2      | Bob     | NULL       |
-- | 3      | Charlie | 0          |
 
-- TRC: { e | Employee(e) ∧ e.commission > 0 }
-- Returns only Alice (commission 500 > 0 is TRUE)
-- Bob excluded: NULL > 0 is UNKNOWN
-- Charlie excluded: 0 > 0 is FALSE
 
SELECT * FROM Employee WHERE commission > 0;
-- Result: Only Alice
 
-- To include NULLs, explicit IS NULL test needed:
-- TRC: { e | Employee(e) ∧ (e.commission > 0 ∨ e.commission IS NULL) }
SELECT * FROM Employee WHERE commission > 0 OR commission IS NULL;
-- Result: Alice and Bob
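
The two queries above can be checked against a real engine. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the column types are an assumption, since the original table lists only names and values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (emp_id INTEGER PRIMARY KEY, name TEXT, commission INTEGER)"
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(1, "Alice", 500), (2, "Bob", None), (3, "Charlie", 0)],  # None is stored as NULL
)

# commission > 0 is TRUE only for Alice; UNKNOWN for Bob; FALSE for Charlie.
positive = [row[0] for row in conn.execute(
    "SELECT name FROM Employee WHERE commission > 0 ORDER BY emp_id")]

# Adding an explicit IS NULL test brings Bob back into the result.
positive_or_null = [row[0] for row in conn.execute(
    "SELECT name FROM Employee WHERE commission > 0 OR commission IS NULL ORDER BY emp_id")]

print(positive)          # ['Alice']
print(positive_or_null)  # ['Alice', 'Bob']
conn.close()
```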

Extended Atomic Predicates

Pure TRC limits atoms to range predicates and simple comparisons. Extended TRC (and SQL) add richer atomic predicates to handle real-world needs.

IS NULL / IS NOT NULL:

Explicit NULL testing:

t.A IS NULL       -- TRUE iff t.A is NULL
t.A IS NOT NULL   -- TRUE iff t.A has a non-NULL value

IS NULL and IS NOT NULL are the only reliable way to test for NULL. The atom t.A = NULL always evaluates to UNKNOWN, even when t.A is NULL!

BETWEEN:

Range inclusion (syntactic sugar):

t.A BETWEEN x AND y  ≡  t.A >= x ∧ t.A <= y

IN (Set Membership):

Value in a set:

t.A IN (v₁, v₂, ..., vₙ)  ≡  t.A = v₁ ∨ t.A = v₂ ∨ ... ∨ t.A = vₙ
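
Because IN expands into a disjunction of equality atoms, it inherits three-valued behavior when the list contains NULL. The sketch below is one way to see this, assuming UNKNOWN is modeled as `None`; `eq3`, `or3`, `in3`, and `not_in3` are hypothetical helper names.

```python
from functools import reduce

UNKNOWN = None  # models the UNKNOWN truth value


def eq3(a, b):
    """Equality atom under three-valued logic."""
    return UNKNOWN if a is None or b is None else a == b


def or3(p, q):
    """Three-valued disjunction: TRUE dominates."""
    if p is True or q is True:
        return True
    return UNKNOWN if p is UNKNOWN or q is UNKNOWN else False


def in3(value, candidates):
    """t.A IN (v1, ..., vn) evaluated as eq3(v, v1) ∨ ... ∨ eq3(v, vn)."""
    return reduce(or3, (eq3(value, v) for v in candidates), False)


def not_in3(value, candidates):
    """NOT IN is the three-valued negation of IN."""
    result = in3(value, candidates)
    return UNKNOWN if result is UNKNOWN else not result


print(in3(2, [1, 2, 3]))       # True
print(in3(4, [1, None]))       # None, i.e. UNKNOWN
print(not_in3(4, [1, None]))   # None, so the row would be filtered out
```

The last line shows why a NULL inside a NOT IN list can silently empty a result: the negation of UNKNOWN is still UNKNOWN, never TRUE.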

LIKE (Pattern Matching):

String pattern matching:

t.name LIKE 'A%'     -- Names starting with 'A'
t.email LIKE '%@%'   -- Contains '@'
t.code LIKE 'AB_C'   -- 'AB' + any char + 'C'

Patterns: % = any sequence, _ = any single character
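
The two wildcards map naturally onto regular expressions. The following sketch translates a LIKE pattern into an anchored regex; it assumes case-sensitive matching and ignores ESCAPE clauses, both of which vary between database systems, and the names `like_to_regex` and `like` are hypothetical.

```python
import re


def like_to_regex(pattern):
    """Translate a LIKE pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")           # any sequence, including the empty one
        elif ch == "_":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # literal character
    return re.compile("".join(parts), re.DOTALL)


def like(value, pattern):
    """Evaluate value LIKE pattern; a NULL value yields UNKNOWN (None)."""
    if value is None:
        return None
    # fullmatch anchors the pattern at both ends, as LIKE does.
    return like_to_regex(pattern).fullmatch(value) is not None


print(like("Alice", "A%"))      # True
print(like("a@b.com", "%@%"))   # True
print(like("ABXC", "AB_C"))     # True
print(like("ABC", "AB_C"))      # False
```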

Extended Atomic Predicates
Predicate	Syntax	Meaning	Example
NULL Test	`t.A IS NULL`	Attribute is NULL	`e.commission IS NULL`
Not NULL	`t.A IS NOT NULL`	Attribute has value	`e.email IS NOT NULL`
Between	`t.A BETWEEN x AND y`	Value in range [x,y]	`e.age BETWEEN 21 AND 65`
Set In	`t.A IN (v₁,...)`	Value in set	`e.dept IN (1,2,3)`
Not In	`t.A NOT IN (v₁,...)`	Value not in set	`e.status NOT IN ('X','Y')`
Like	`t.A LIKE pattern`	String matches pattern	`e.name LIKE 'A%'`

Computed Terms (Extended):

Some TRC extensions allow computations within atoms:

e.salary * 1.1 > 60000          -- Computed value comparison
e.first_name || ' ' || e.last_name = 'John Doe'  -- String concatenation
YEAR(e.hire_date) = 2024        -- Function application

These extend the term grammar from simple attribute references to expressions.

Type Casts:

CAST(t.quantity AS DECIMAL) > 10.5           -- Standard SQL CAST
t.date_string::DATE = DATE '2024-01-15'      -- PostgreSQL-style cast shorthand

Aggregate-Based Atoms (Advanced):

While aggregates are typically handled separately, some extended formalisms allow:

e.salary > (SELECT AVG(salary) FROM Employee)  -- Scalar subquery comparison

This conceptually treats the aggregate as a computed constant for comparison.

Extended vs. Pure TRC

Pure TRC is theoretically minimal: just range predicates and basic comparisons. BETWEEN and IN are pure syntactic sugar, since each expands into comparison atoms joined by ∧ or ∨. IS NULL, LIKE, computed terms, and aggregates go beyond what pure TRC atoms can state, so they are genuine extensions rather than abbreviations. All of them vastly improve usability, and SQL incorporates them directly.

Atom Evaluation Semantics

Understanding precisely how atoms are evaluated illuminates query execution and optimization.

Range Predicate Evaluation:

To evaluate R(t) given binding of t to tuple τ:

Look up relation R in the current database instance D
Check if τ ∈ R (membership test)
Return TRUE if yes, FALSE if no

Implementation: This might involve:

Full table scan (worst case)
Index lookup if τ's key is indexed
Hash probe if hash index exists

Comparison Atom Evaluation:

To evaluate term₁ θ term₂:

Resolve term₁:
- If attribute (t.A): Look up attribute A in tuple bound to t
- If constant: Use the constant value directly
Resolve term₂: Same process
Check NULL:
- If either value is NULL, result is typically UNKNOWN
- Exception: IS NULL tests
Apply comparison:
- Execute the comparison operator on the two values
- Return TRUE, FALSE, or UNKNOWN

atom-evaluation.pseudo
// Pseudocode for atom evaluation
 
function evaluateRangePredicate(relation R, tupleVar t, binding B):
    τ = B[t]  // Get the tuple bound to variable t
    return τ ∈ R  // Membership test
 
function evaluateComparisonAtom(term1, op, term2, binding B):
    // Resolve terms
    v1 = resolveTerm(term1, B)
    v2 = resolveTerm(term2, B)
    
    // NULL handling
    if v1 IS NULL or v2 IS NULL:
        return UNKNOWN  // Most comparisons
    
    // Apply operator
    switch op:
        case '=':  return v1 == v2
        case '≠':  return v1 != v2  
        case '<':  return v1 < v2
        case '>':  return v1 > v2
        case '≤':  return v1 <= v2
        case '≥':  return v1 >= v2
 
function resolveTerm(term, binding B):
    if term is AttributeRef(var, attr):
        τ = B[var]  // Get tuple for variable
        return τ[attr]  // Extract attribute value
    else if term is Constant(value):
        return value
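
The pseudocode above can be turned into a small runnable Python sketch. The term encoding (`("attr", var, name)` and `("const", value)`), the use of dictionaries for tuples, and all function names are assumptions made for this illustration.

```python
import operator

UNKNOWN = None  # models the UNKNOWN truth value

OPERATORS = {
    "=": operator.eq, "≠": operator.ne,
    "<": operator.lt, ">": operator.gt,
    "≤": operator.le, "≥": operator.ge,
}


def resolve_term(term, binding):
    """A term is ('attr', var, name) or ('const', value)."""
    if term[0] == "attr":
        _, var, name = term
        return binding[var][name]  # attribute of the tuple bound to var
    return term[1]                 # constant value


def evaluate_range_predicate(relation, var, binding):
    """R(t): membership of the bound tuple in the relation."""
    return binding[var] in relation


def evaluate_comparison_atom(term1, op, term2, binding):
    """term1 θ term2 with NULL-aware evaluation."""
    v1 = resolve_term(term1, binding)
    v2 = resolve_term(term2, binding)
    if v1 is None or v2 is None:
        return UNKNOWN
    return OPERATORS[op](v1, v2)


# Hypothetical data: a relation as a list of dict-shaped tuples.
employees = [{"name": "Alice", "salary": 50000, "commission": None}]
binding = {"e": employees[0]}

in_relation = evaluate_range_predicate(employees, "e", binding)
high_salary = evaluate_comparison_atom(("attr", "e", "salary"), ">", ("const", 40000), binding)
has_commission = evaluate_comparison_atom(("attr", "e", "commission"), ">", ("const", 0), binding)

print(in_relation, high_salary, has_commission)  # True True None
```

Keeping the operator table separate from the NULL check mirrors the pseudocode: the NULL test runs once, before any operator is applied.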

Optimization at the Atom Level:

Query optimizers focus heavily on atoms because they determine physical operations:

Atom Type	Potential Optimization
`R(t) ∧ t.pk = value`	Primary key lookup (O(log n) with a B-tree index, O(1) expected with a hash index)
`R(t) ∧ t.indexed_col = value`	Index seek (O(log n))
`R(t) ∧ t.unindexed > value`	Full scan with filter
`t₁.fk = t₂.pk`	Hash join or merge join
`t.col IN (values)`	Multiple index probes or IN-list optimization

Selectivity Estimation:

Optimizers estimate selectivity—the fraction of tuples satisfying an atom:

e.dept_id = 5: If 10 departments with uniform distribution, ~10% selectivity
e.salary > 100000: Depends on salary distribution, maybe 5%
e.name LIKE 'A%': ~1/26 ≈ 4% if uniform first-letter distribution

More selective atoms (those with a lower selectivity fraction, hence fewer matching rows) are often evaluated first.
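
Selectivity as a fraction can be computed directly on sample data. The sketch below uses a hypothetical ten-row table and a helper named `selectivity`; real optimizers rely on stored statistics rather than scanning rows like this.

```python
def selectivity(rows, predicate):
    """Fraction of rows for which the predicate evaluates to TRUE."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if predicate(r) is True) / len(rows)


# Hypothetical data: one employee per department, salaries 50000..140000.
rows = [{"dept_id": d, "salary": 40000 + 10000 * d} for d in range(1, 11)]

sel_dept = selectivity(rows, lambda r: r["dept_id"] == 5)
sel_salary = selectivity(rows, lambda r: r["salary"] > 100000)

# Order atoms so that the one matching the fewest rows comes first.
ordered = sorted([("dept_id = 5", sel_dept), ("salary > 100000", sel_salary)],
                 key=lambda item: item[1])

print(sel_dept)       # 0.1
print(sel_salary)     # 0.4
print(ordered[0][0])  # dept_id = 5
```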

Atom Order Matters for Performance

While TRC is declarative (no specified order), query execution is physical. Evaluating the most selective atoms first reduces the work for subsequent atoms. Optimizers reorder conditions, but understanding this helps in query tuning and index design.

Atoms in SQL WHERE Clauses

Every SQL WHERE clause condition is an atomic formula (or combination thereof). Understanding this connection makes SQL natural.

Direct Mappings:

TRC Atoms to SQL Mapping
TRC Atom	SQL Equivalent	Example
`R(t)`	`FROM R AS t`	`FROM Employee AS e`
`t.A = value`	`t.A = value`	`WHERE e.dept_id = 5`
`t.A = s.B`	`t.A = s.B`	`WHERE e.dept_id = d.dept_id`
`t.A > value`	`t.A > value`	`WHERE e.salary > 50000`
`t.A IS NULL`	`t.A IS NULL`	`WHERE e.commission IS NULL`
`t.A LIKE 'pat'`	`t.A LIKE 'pat'`	`WHERE e.name LIKE 'A%'`

atoms-in-sql.sql

-- TRC Query with multiple atoms:
-- { e | Employee(e) ∧ 
--       e.salary > 50000 ∧ 
--       e.dept_id = 5 ∧ 
--       e.status ≠ 'Inactive' ∧
--       e.commission IS NOT NULL }
 
-- SQL Translation:
SELECT *
FROM Employee AS e  -- Range predicate
WHERE e.salary > 50000  -- Comparison atom
  AND e.dept_id = 5     -- Comparison atom  
  AND e.status <> 'Inactive'  -- Comparison atom
  AND e.commission IS NOT NULL;  -- NULL test atom
 
-- Join atoms in SQL:
-- TRC: { e, d | Employee(e) ∧ Department(d) ∧ e.dept_id = d.dept_id }
SELECT e.*, d.*
FROM Employee AS e, Department AS d  -- Range predicates
WHERE e.dept_id = d.dept_id;  -- Join atom (comparison atom)
 
-- Or using explicit JOIN syntax:
SELECT e.*, d.*
FROM Employee AS e
INNER JOIN Department AS d ON e.dept_id = d.dept_id;  -- Join atom in ON clause
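
That the comma-style and explicit JOIN forms express the same join atom can be checked with Python's standard `sqlite3` module. The table contents below are hypothetical, and the employee with a NULL dept_id is included to show that the join atom evaluates to UNKNOWN for that row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE Employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    INSERT INTO Department VALUES (5, 'Sales'), (7, 'Research');
    INSERT INTO Employee VALUES (1, 'Alice', 5), (2, 'Bob', 7), (3, 'Carol', NULL);
""")

# Range predicates in FROM, join atom in WHERE.
implicit = conn.execute("""
    SELECT e.name, d.dept_name
    FROM Employee AS e, Department AS d
    WHERE e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

# Same join atom, placed in the ON clause.
explicit = conn.execute("""
    SELECT e.name, d.dept_name
    FROM Employee AS e
    INNER JOIN Department AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()
conn.close()

print(implicit)              # [('Alice', 'Sales'), ('Bob', 'Research')]
print(implicit == explicit)  # True
```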

SQL Condition Composition:

SQL's WHERE clause is a conjunction of atoms (or compound conditions):

WHERE atom1 AND atom2 AND atom3 ...

This corresponds to TRC's:

R(t) ∧ atom1 ∧ atom2 ∧ atom3 ...

Disjunctions and negations work similarly:

WHERE (e.dept_id = 5 OR e.dept_id = 7) AND NOT e.status = 'Inactive'

Corresponds to TRC:

Employee(e) ∧ (e.dept_id = 5 ∨ e.dept_id = 7) ∧ ¬(e.status = 'Inactive')

SQL Inherits TRC Semantics

SQL's comparison operators and WHERE-clause conditions follow the structure of TRC atoms, and SQL layers NULL handling and three-valued logic on top of them. When a SQL condition evaluates to UNKNOWN (due to NULL), the row is excluded, exactly as in the NULL-extended TRC described on this page. Understanding TRC atoms means understanding SQL conditions at the deepest level.

Common Atom Patterns and Idioms

Certain atom patterns recur frequently across queries. Recognizing these idioms accelerates query development.

Pattern 1: Primary Key Lookup

R(t) ∧ t.pk = specific_value

Selects at most one tuple. Highly efficient with B-tree index.

Pattern 2: Foreign Key Join

R(t₁) ∧ S(t₂) ∧ t₁.fk = t₂.pk

Classic equijoin linking related tables.

Pattern 3: Range Filter

R(t) ∧ t.date >= start_date ∧ t.date < end_date

Bounded range, often on indexed date columns.

Pattern 4: Set Membership

R(t) ∧ (t.category = 'A' ∨ t.category = 'B' ∨ t.category = 'C')

Multiple alternatives (IN-list equivalent).

Pattern 5: Optional Value (NULL handling)

R(t) ∧ (t.optional_field = value ∨ t.optional_field IS NULL)

Handles NULLs explicitly when needed.

Pattern 6: Self-Join Comparison

R(t₁) ∧ R(t₂) ∧ t₁.pk ≠ t₂.pk ∧ t₁.attr = t₂.attr

Finds different tuples with matching attributes.
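
A detail worth seeing concretely is that this pattern reports every match twice, once in each order. The sketch below uses a hypothetical list of (pk, email) tuples to show the effect, and how replacing ≠ with < reports each pair once.

```python
# Hypothetical relation of (pk, email) tuples.
customers = [(1, "a@example.com"), (2, "b@example.com"), (3, "a@example.com")]

# R(t1) ∧ R(t2) ∧ t1.pk ≠ t2.pk ∧ t1.attr = t2.attr
both_orders = [
    (t1[0], t2[0])
    for t1 in customers
    for t2 in customers
    if t1[0] != t2[0] and t1[1] == t2[1]
]

# Using t1.pk < t2.pk instead keeps one representative of each unordered pair.
once = [
    (t1[0], t2[0])
    for t1 in customers
    for t2 in customers
    if t1[0] < t2[0] and t1[1] == t2[1]
]

print(both_orders)  # [(1, 3), (3, 1)]
print(once)         # [(1, 3)]
```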

Efficient Atom Patterns

•Equality on indexed columns
•Primary/foreign key joins
•Range on leading index column
•Small IN-lists on indexed column
•Prefix LIKE patterns (e.g., 'ABC%')

Inefficient Atom Patterns

•Inequality (≠) can't use index effectively
•LIKE with leading wildcard ('%ABC')
•Functions on indexed columns (YEAR(date))
•OR across different columns
•Negated IN (NOT IN large_list)

Pattern Recognition is Key

Expert query writers recognize these patterns instantly and choose indexing strategies accordingly. When you see a query taking too long, examine which atoms are being evaluated and whether they match efficient patterns.

Atoms and Query Equivalence

Understanding which atom formulations are equivalent (produce identical results) enables query optimization and simplification.

Basic Equivalences:

t.A > 5 ∧ t.A > 10  ≡  t.A > 10      -- Subsumption
t.A = 5 ∧ t.A = 10  ≡  FALSE          -- Contradiction (UNKNOWN if t.A is NULL; no tuple qualifies either way)
t.A = 5 ∨ t.A = 10  ≡  t.A IN (5,10)  -- OR to IN
t.A >= 5 ∧ t.A <= 10 ≡  t.A BETWEEN 5 AND 10

De Morgan with Atoms:

¬(t.A = 5 ∧ t.B = 10)  ≡  t.A ≠ 5 ∨ t.B ≠ 10
¬(t.A = 5 ∨ t.B = 10)  ≡  t.A ≠ 5 ∧ t.B ≠ 10

NULL-aware Equivalences:

t.A = t.A  -- NOT always TRUE! UNKNOWN if t.A is NULL
¬(t.A IS NULL)  ≡  t.A IS NOT NULL

Range Condition Simplifications:

t.A >= 5 ∧ t.A > 3  ≡  t.A >= 5      -- The >= 5 is stronger
t.A < 10 ∧ t.A <= 10  ≡  t.A < 10    -- The < 10 is stronger

Atom Simplification: Simplifying redundant atoms

Input

Original: R(t) ∧ t.salary > 40000 ∧ t.salary > 50000 ∧ t.salary >= 50000

Step 1: t.salary > 50000 implies t.salary > 40000 (remove weaker)
Step 2: t.salary > 50000 implies t.salary >= 50000 (remove weaker)

Output

R(t) ∧ t.salary > 50000

Explanation

The strongest constraint subsumes the weaker ones. Only t.salary > 50000 is needed.
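
The subsumption step can be sketched as picking the strongest lower bound among atoms on the same attribute. The representation of atoms as `(op, value)` pairs and the name `strongest_lower_bound` are assumptions for this example; it handles only `>` and `>=`.

```python
def strongest_lower_bound(atoms):
    """Return the strongest atom among (op, value) pairs with op in {'>', '>='}.

    A larger value is stronger; at equal values, strict '>' is stronger than '>='.
    """
    return max(atoms, key=lambda atom: (atom[1], atom[0] == ">"))


# t.salary > 40000 ∧ t.salary > 50000 ∧ t.salary >= 50000
atoms = [(">", 40000), (">", 50000), (">=", 50000)]
best = strongest_lower_bound(atoms)

print(best)  # ('>', 50000)
```

The simplification stays valid under three-valued logic: if t.salary is NULL, both the original conjunction and the single remaining atom evaluate to UNKNOWN.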

Query Optimizer Transformations:

Optimizers apply these equivalences automatically:

Constant Folding: t.A > 3+2 → t.A > 5
Contradiction Detection: t.A = 5 AND t.A = 10 → return empty immediately
Subsumption Elimination: Remove redundant weaker conditions
IN-list Conversion: Multiple ORs on same column → IN
Range Consolidation: Multiple range conditions → single range

These transformations preserve semantics while improving execution efficiency.

Summary: Mastering Atomic Formulas

Key Takeaways

•Two Types of Atoms: Range predicates (R(t)) assert relation membership; comparison atoms (term θ term) assert value relationships
•Comparison Operators: =, ≠, <, >, ≤, ≥ compare attribute values and constants with precise semantics
•Three-Valued Logic: NULL introduces UNKNOWN; any NULL comparison typically yields UNKNOWN, excluding rows from results
•Extended Atoms: IS NULL, BETWEEN, IN, LIKE extend pure TRC for practical querying; BETWEEN and IN are syntactic sugar, while IS NULL and LIKE add new atomic predicates
•Evaluation Semantics: Atoms translate to physical operations—lookups, scans, comparisons—that determine query cost
•SQL Correspondence: WHERE clause conditions are atoms; understanding atoms means understanding SQL filtering
•Pattern Recognition: Efficient vs. inefficient atom patterns guide index design and query tuning
•Equivalence Rules: Atom transformations enable optimization while preserving query semantics

What's Next:

With atomic formulas thoroughly covered, we'll now explore Complex Formulas—how atoms combine through connectives and quantifiers to express sophisticated queries. We'll see the full power of TRC emerge from these primitive building blocks.

Page Complete

You now understand atomic formulas at the deepest level—from formal semantics to practical SQL, from NULL handling to optimization patterns. This foundational knowledge empowers you to analyze, write, and optimize queries with precision.

4 / 5

Loading learning content...

Database Management SystemsTuple Relational Calculus

Tuple Relational Calculus

LevelIntermediate

Duration75 mins

TopicTuple Relational Calculus

4 / 5

Atomic Formulas

The Primitive Building Blocks of TRC

Understanding atoms with precision is essential because:

Every TRC query ultimately reduces to atoms: Complex formulas with nested quantifiers and multiple connectives eventually bottom out at atomic comparisons and range predicates
Atoms determine computational cost: Query execution translates atoms into table scans, index lookups, and comparisons—understanding atoms helps predict performance
Atoms bridge theory and practice: Each SQL WHERE clause condition corresponds to an atomic formula; mastering atoms means mastering the WHERE clause
Type systems operate at the atom level: Schema validation, domain checking, and NULL handling are all atomic-level concerns

This page provides comprehensive coverage of both types of atoms in TRC: range predicates and comparison atoms.

What You Will Learn

Range Predicates: The Membership Atom

The range predicate is the first type of atomic formula. It asserts that a tuple variable is bound to (i.e., is a member of) a specific relation.

Formal Definition:

R(t) ≡ "tuple variable t is a member of relation R"

Semantics:

Given a database instance D, the range predicate R(t) evaluates as:

TRUE if the current binding of t is a tuple that exists in R within D
FALSE if the current binding of t is not in R

Range predicates are membership tests—they answer "Is this tuple in this relation?"

Role in Queries:

Range predicates serve dual purposes:

Domain Declaration: They specify which tuples a variable can bind to
Filtering: They constrain results to tuples actually in the database

Range Predicate EvaluationUnderstanding how range predicates evaluate

Input

Database state:
  Employee = {(E1, 'Alice', 50000), (E2, 'Bob', 60000)}
  Contractor = {(C1, 'Charlie', 70000)}

Atom to evaluate: Employee(t) where t = (E1, 'Alice', 50000)

Output

TRUE (tuple exists in relation)

Explanation

Schema Constraints:

This enables attribute references in subsequent conditions:

Employee(e) ∧ e.salary > 50000
                ↑
        Valid because Employee schema includes 'salary'

Multiple Range Predicates:

Queries often include multiple range predicates:

{ e, d | Employee(e) ∧ Department(d) ∧ e.dept_id = d.dept_id }

Here, Employee(e) and Department(d) are separate atoms. Both must be TRUE (conjunction) for the formula to be satisfied.

Negative Range Tests:

Negated range predicates test non-membership:

¬Employee(t)  ≡  "t is NOT in Employee"

This is conceptually problematic for free variables (infinite complement), but valid for bound variables in specific contexts.

Range Predicates Are Not Optional

Comparison Atoms: The Relational Operators

The comparison atom is the second type of atomic formula. It asserts a relationship between two values using a comparison operator.

Formal Definition:

term₁ θ term₂

Where:

term₁ and term₂ are terms (attribute references or constants)
θ is a comparison operator from {=, ≠, <, >, ≤, ≥}

Term Types:

Attribute Term: t.A — the value of attribute A in tuple variable t
Constant Term: 42, 'Alice', DATE '2024-01-15', NULL

Comparison Patterns:

Pattern	Example	Description
Attribute-Attribute	`e.dept_id = d.dept_id`	Compare attributes from different tuples (join)
Attribute-Constant	`e.salary > 50000`	Compare attribute to literal (selection)
Constant-Attribute	`100 <= e.age`	Same as above (order reversed)
Attribute-Self	`p.start < p.end`	Compare attributes within same tuple

Comparison Operators and Their Semantics
Operator	Name	Semantics	Example
`=`	Equality	TRUE iff values are identical	`e.name = 'Alice'`
`≠` / `<>`	Inequality	TRUE iff values differ	`e.dept_id ≠ 5`
`<`	Less Than	TRUE iff left < right (strict)	`e.age < 30`
`>`	Greater Than	TRUE iff left > right (strict)	`e.salary > 50000`
`≤` / `<=`	Less or Equal	TRUE iff left ≤ right	`e.tenure <= 5`
`≥` / `>=`	Greater or Equal	TRUE iff left ≥ right	`e.age >= 21`

Type Requirements:

Comparison atoms require type compatibility between operands:

Numeric types: INTEGER, DECIMAL, FLOAT can be compared freely
String types: VARCHAR, CHAR use lexicographic ordering
Date/Time types: Temporal comparison (earlier/later)
Boolean: Only equality/inequality typically meaningful

Mismatched types are generally errors:

✗ e.name > 50000      -- String vs. Number: Type error
✓ e.salary > 50000    -- Number vs. Number: Valid
✓ e.name > 'M'        -- String vs. String: Valid (lexicographic)

Join Atoms:

Attribute-to-attribute comparisons often express joins:

e.dept_id = d.dept_id    -- Equijoin condition
e.salary > m.salary      -- Theta-join (inequality)
e₁.manager_id = e₂.emp_id -- Self-join

These atoms connect tuples across (or within) relations, enabling relational combination.

comparison-atoms.trc
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
-- EQUALITY ATOMS (Selection and Join)
{ e | Employee(e) ∧ e.dept_id = 5 }                     -- Attribute = Constant
{ e, d | Employee(e) ∧ Department(d) ∧ 
         e.dept_id = d.dept_id }                         -- Attribute = Attribute (Join)
 
-- INEQUALITY ATOMS
{ e | Employee(e) ∧ e.status ≠ 'Terminated' }           -- Exclude specific value
{ e | Employee(e) ∧ e.salary > e.base_salary }          -- Within-tuple comparison
 
-- RANGE ATOMS  
{ e | Employee(e) ∧ e.age ≥ 21 ∧ e.age ≤ 65 }          -- Age range
{ p | Project(p) ∧ p.start_date < p.end_date }          -- Temporal ordering
 
-- MIXED TYPE COMPARISONS
{ e | Employee(e) ∧ e.name >= 'M' ∧ e.name < 'N' }      -- Last names starting with M
{ t | Transaction(t) ∧ t.timestamp >= DATE '2024-01-01' }  -- Date range

Three-Valued Logic and NULL Handling

Real databases contain NULL values—markers for missing, unknown, or inapplicable data. NULLs introduce three-valued logic (3VL) into atomic formula evaluation.

The Problem with NULL:

Consider: Is NULL = 5 true or false?

We don't know what NULL represents
It could be 5, so we can't say FALSE
It could be something else, so we can't say TRUE
The only honest answer is: UNKNOWN

Three-Valued Logic:

TRC (and SQL) use three truth values:

Value	Meaning
TRUE	Condition definitely holds
FALSE	Condition definitely doesn't hold
UNKNOWN	Cannot determine (typically involves NULL)

NULL Comparison Results
Expression	Result	Explanation
`NULL = NULL`	UNKNOWN	We don't know if two unknowns are equal
`NULL = 5`	UNKNOWN	NULL might or might not be 5
`NULL <> 5`	UNKNOWN	NULL might or might not differ from 5
`NULL > 0`	UNKNOWN	Can't order unknown value
`5 = 5`	TRUE	Known values can be compared
`5 <> 5`	FALSE	Known values can be compared

NULL ≠ NULL is UNKNOWN, not TRUE!

3VL Truth Tables:

Logical connectives extend to three values:

Negation (¬):

P	¬P
T	F
U	U
F	T

Conjunction (∧):

P ∧ Q	T	U	F
T	T	U	F
U	U	U	F
F	F	F	F

Disjunction (∨):

P ∨ Q	T	U	F
T	T	T	T
U	T	U	U
F	T	U	F

Key Insight: FALSE dominates AND (F ∧ anything = F), TRUE dominates OR (T ∨ anything = T). UNKNOWN propagates otherwise.

Effect on Query Results:

TRC (and SQL) include a tuple in the result only when the formula evaluates to TRUE. UNKNOWN is treated like FALSE for inclusion purposes.

Employee(e) ∧ e.commission > 0

If e.commission is NULL for some employee, this evaluates to UNKNOWN, and that employee is excluded from results.

null-handling.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
-- Employee table with potential NULLs
-- | emp_id | name    | commission |
-- |--------|---------|------------|
-- | 1      | Alice   | 500        |
-- | 2      | Bob     | NULL       |
-- | 3      | Charlie | 0          |
 
-- TRC: { e | Employee(e) ∧ e.commission > 0 }
-- Returns only Alice (commission 500 > 0 is TRUE)
-- Bob excluded: NULL > 0 is UNKNOWN
-- Charlie excluded: 0 > 0 is FALSE
 
SELECT * FROM Employee WHERE commission > 0;
-- Result: Only Alice
 
-- To include NULLs, explicit IS NULL test needed:
-- TRC: { e | Employee(e) ∧ (e.commission > 0 ∨ e.commission IS NULL) }
SELECT * FROM Employee WHERE commission > 0 OR commission IS NULL;
-- Result: Alice and Bob

Extended Atomic Predicates

Pure TRC limits atoms to range predicates and simple comparisons. Extended TRC (and SQL) add richer atomic predicates to handle real-world needs.

IS NULL / IS NOT NULL:

Explicit NULL testing:

t.A IS NULL       -- TRUE iff t.A is NULL
t.A IS NOT NULL   -- TRUE iff t.A has a non-NULL value

These are the only way to definitively test for NULL. The atom t.A = NULL is always UNKNOWN!

BETWEEN:

Range inclusion (syntactic sugar):

t.A BETWEEN x AND y  ≡  t.A >= x ∧ t.A <= y

IN (Set Membership):

Value in a set:

t.A IN (v₁, v₂, ..., vₙ)  ≡  t.A = v₁ ∨ t.A = v₂ ∨ ... ∨ t.A = vₙ

LIKE (Pattern Matching):

String pattern matching:

t.name LIKE 'A%'     -- Names starting with 'A'
t.email LIKE '%@%'   -- Contains '@'
t.code LIKE 'AB_C'   -- 'AB' + any char + 'C'

Patterns: % = any sequence, _ = any single character

Extended Atomic Predicates
Predicate	Syntax	Meaning	Example
NULL Test	`t.A IS NULL`	Attribute is NULL	`e.commission IS NULL`
Not NULL	`t.A IS NOT NULL`	Attribute has value	`e.email IS NOT NULL`
Between	`t.A BETWEEN x AND y`	Value in range [x,y]	`e.age BETWEEN 21 AND 65`
Set In	`t.A IN (v₁,...)`	Value in set	`e.dept IN (1,2,3)`
Not In	`t.A NOT IN (v₁,...)`	Value not in set	`e.status NOT IN ('X','Y')`
Like	`t.A LIKE pattern`	String matches pattern	`e.name LIKE 'A%'`

Computed Terms (Extended):

Some TRC extensions allow computations within atoms:

e.salary * 1.1 > 60000          -- Computed value comparison
e.first_name || ' ' || e.last_name = 'John Doe'  -- String concatenation
YEAR(e.hire_date) = 2024        -- Function application

These extend the term grammar from simple attribute references to expressions.

Type Casts:

CAST(t.quantity AS DECIMAL) > 10.5
t.date_string::DATE = DATE '2024-01-15'

Aggregate-Based Atoms (Advanced):

While aggregates are typically handled separately, some extended formalisms allow:

e.salary > AVG(SELECT salary FROM Employee)  -- Scalar subquery comparison

This conceptually treats the aggregate as a computed constant for comparison.

Extended vs. Pure TRC

Atom Evaluation Semantics

Understanding precisely how atoms are evaluated illuminates query execution and optimization.

Range Predicate Evaluation:

To evaluate R(t) given binding of t to tuple τ:

Look up relation R in the current database instance D
Check if τ ∈ R (membership test)
Return TRUE if yes, FALSE if no

Implementation: This might involve:

Full table scan (worst case)
Index lookup if τ's key is indexed
Hash probe if hash index exists

Comparison Atom Evaluation:

To evaluate term₁ θ term₂:

Resolve term₁:
- If attribute (t.A): Look up attribute A in tuple bound to t
- If constant: Use the constant value directly
Resolve term₂: Same process
Check NULL:
- If either value is NULL, result is typically UNKNOWN
- Exception: IS NULL tests
Apply comparison:
- Execute the comparison operator on the two values
- Return TRUE, FALSE, or UNKNOWN

atom-evaluation.pseudo
// Pseudocode for atom evaluation
 
function evaluateRangePredicate(relation R, tupleVar t, binding B):
    τ = B[t]  // Get the tuple bound to variable t
    return τ ∈ R  // Membership test
 
function evaluateComparisonAtom(term1, op, term2, binding B):
    // Resolve terms
    v1 = resolveTerm(term1, B)
    v2 = resolveTerm(term2, B)
    
    // NULL handling
    if v1 IS NULL or v2 IS NULL:
        return UNKNOWN  // Most comparisons
    
    // Apply operator
    switch op:
        case '=':  return v1 == v2
        case '≠':  return v1 != v2  
        case '<':  return v1 < v2
        case '>':  return v1 > v2
        case '≤':  return v1 <= v2
        case '≥':  return v1 >= v2
 
function resolveTerm(term, binding B):
    if term is AttributeRef(var, attr):
        τ = B[var]  // Get tuple for variable
        return τ[attr]  // Extract attribute value
    else if term is Constant(value):
        return value
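The pseudocode above can be turned into a small runnable sketch. The representation is an assumption made for illustration: tuples are Python dicts, NULL and UNKNOWN are both modelled as None, and a term is either ("attr", var, name) or ("const", value).

```python
import operator
from typing import Any, Dict, List, Optional, Tuple

Tuple_ = Dict[str, Any]          # one tuple: attribute name -> value
Binding = Dict[str, Tuple_]      # tuple variable -> bound tuple
Term = Tuple[Any, ...]           # ("attr", var, name) or ("const", value)

OPS = {"=": operator.eq, "≠": operator.ne, "<": operator.lt,
       ">": operator.gt, "≤": operator.le, "≥": operator.ge}

def evaluate_range_predicate(relation: List[Tuple_], var: str,
                             binding: Binding) -> bool:
    """R(t): is the tuple bound to var a member of the relation?"""
    return binding[var] in relation

def resolve_term(term: Term, binding: Binding) -> Any:
    """Resolve an attribute reference or a constant to a value."""
    if term[0] == "attr":
        _, var, attr = term
        return binding[var][attr]
    if term[0] == "const":
        return term[1]
    raise ValueError(f"unknown term kind: {term[0]!r}")

def evaluate_comparison_atom(term1: Term, op: str, term2: Term,
                             binding: Binding) -> Optional[bool]:
    """Return True, False, or None (UNKNOWN) for term1 op term2."""
    v1 = resolve_term(term1, binding)
    v2 = resolve_term(term2, binding)
    if v1 is None or v2 is None:
        return None  # any comparison with NULL is UNKNOWN
    return OPS[op](v1, v2)

employee = [{"name": "Ann", "salary": 60000, "commission": None}]
b = {"e": employee[0]}
print(evaluate_range_predicate(employee, "e", b))                            # True
print(evaluate_comparison_atom(("attr", "e", "salary"), ">",
                               ("const", 50000), b))                         # True
print(evaluate_comparison_atom(("attr", "e", "commission"), "=",
                               ("const", 0), b))                             # None
```

Unlike the pseudocode, this version raises an error for an unrecognized term kind instead of silently returning nothing.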

Optimization at the Atom Level:

Query optimizers focus heavily on atoms because they determine physical operations:

Atom Type	Potential Optimization
`R(t) ∧ t.pk = value`	Primary key lookup (O(log n) with a B-tree index, O(1) expected with a hash index)
`R(t) ∧ t.indexed_col = value`	Index seek (O(log n))
`R(t) ∧ t.unindexed > value`	Full scan with filter
`t₁.fk = t₂.pk`	Hash join or merge join
`t.col IN (values)`	Multiple index probes or IN-list optimization
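The first three rows of the table can be observed in SQLite by asking the planner how it would execute each filter. This is a sketch under assumptions: the table and index names are hypothetical, and the exact wording of the plan text varies between SQLite versions, so only its leading keyword (SEARCH for an index or key lookup, SCAN for a full scan) is inspected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (id INTEGER PRIMARY KEY, dept_id INTEGER, salary REAL)"
)
conn.execute("CREATE INDEX idx_employee_dept ON Employee (dept_id)")

def plan(where: str) -> str:
    """Return the planner's description of the first step for a filter."""
    row = conn.execute(
        f"EXPLAIN QUERY PLAN SELECT * FROM Employee WHERE {where}"
    ).fetchone()
    return row[3]  # the fourth column holds the human-readable detail

pk_plan = plan("id = 1")             # primary key lookup
index_plan = plan("dept_id = 5")     # seek on the secondary index
scan_plan = plan("salary > 50000")   # no usable index: full scan

for text in (pk_plan, index_plan, scan_plan):
    print(text)
```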

Selectivity Estimation:

Optimizers estimate selectivity—the fraction of tuples satisfying an atom:

e.dept_id = 5: If 10 departments with uniform distribution, ~10% selectivity
e.salary > 100000: Depends on salary distribution, maybe 5%
e.name LIKE 'A%': ~1/26 ≈ 4% if uniform first-letter distribution

More selective atoms (those with a lower selectivity fraction, i.e., fewer matching rows) are often evaluated first.
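As a rough illustration, selectivity can be measured directly on a small data set and used to order atoms. Real optimizers rely on stored statistics such as histograms rather than scanning the data; the 100 synthetic employees and the helper name selectivity are assumptions of this sketch.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]
Atom = Callable[[Row], bool]

# Synthetic data: 10 departments, salaries from 30000 to 129000.
employees: List[Row] = [
    {"dept_id": i % 10, "salary": 30000 + 1000 * i} for i in range(100)
]

def selectivity(rows: List[Row], atom: Atom) -> float:
    """Fraction of rows for which the atom is TRUE."""
    return sum(1 for r in rows if atom(r)) / len(rows)

atoms: Dict[str, Atom] = {
    "e.salary > 100000": lambda e: e["salary"] > 100000,
    "e.dept_id = 5": lambda e: e["dept_id"] == 5,
}

estimates = {name: selectivity(employees, atom) for name, atom in atoms.items()}

# Evaluate the atom that keeps the fewest rows first.
evaluation_order = sorted(estimates, key=estimates.get)

print(estimates)         # {'e.salary > 100000': 0.29, 'e.dept_id = 5': 0.1}
print(evaluation_order)  # ['e.dept_id = 5', 'e.salary > 100000']
```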

Atom Order Matters for Performance

Atoms in SQL WHERE Clauses

Every condition in a SQL WHERE clause is an atomic formula or a combination of atomic formulas. Seeing this correspondence makes SQL filtering easier to reason about.

Direct Mappings:

TRC Atoms to SQL Mapping
TRC Atom	SQL Equivalent	Example
`R(t)`	`FROM R AS t`	`FROM Employee AS e`
`t.A = value`	`t.A = value`	`WHERE e.dept_id = 5`
`t.A = s.B`	`t.A = s.B`	`WHERE e.dept_id = d.dept_id`
`t.A > value`	`t.A > value`	`WHERE e.salary > 50000`
`t.A IS NULL`	`t.A IS NULL`	`WHERE e.commission IS NULL`
`t.A LIKE 'pat'`	`t.A LIKE 'pat'`	`WHERE e.name LIKE 'A%'`

atoms-in-sql.sql

-- TRC Query with multiple atoms:
-- { e | Employee(e) ∧ 
--       e.salary > 50000 ∧ 
--       e.dept_id = 5 ∧ 
--       e.status ≠ 'Inactive' ∧
--       e.commission IS NOT NULL }
 
-- SQL Translation:
SELECT *
FROM Employee AS e  -- Range predicate
WHERE e.salary > 50000  -- Comparison atom
  AND e.dept_id = 5     -- Comparison atom  
  AND e.status <> 'Inactive'  -- Comparison atom
  AND e.commission IS NOT NULL;  -- NULL test atom
 
-- Join atoms in SQL:
-- TRC: { e, d | Employee(e) ∧ Department(d) ∧ e.dept_id = d.dept_id }
SELECT e.*, d.*
FROM Employee AS e, Department AS d  -- Range predicates
WHERE e.dept_id = d.dept_id;  -- Join atom (comparison atom)
 
-- Or using explicit JOIN syntax:
SELECT e.*, d.*
FROM Employee AS e
INNER JOIN Department AS d ON e.dept_id = d.dept_id;  -- Join atom in ON clause

SQL Condition Composition:

A SQL WHERE clause is most often a conjunction of atoms (or compound conditions):

WHERE atom1 AND atom2 AND atom3 ...

This corresponds to TRC's:

R(t) ∧ atom1 ∧ atom2 ∧ atom3 ...

Disjunctions and negations work similarly:

WHERE (e.dept_id = 5 OR e.dept_id = 7) AND NOT e.status = 'Inactive'

Corresponds to TRC:

Employee(e) ∧ (e.dept_id = 5 ∨ e.dept_id = 7) ∧ ¬(e.status = 'Inactive')
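The same formula can be read almost word for word as a Python comprehension: iterating over the relation plays the role of the range predicate, and the condition combines the comparison atoms. The rows are hypothetical and contain no NULLs, so ordinary two-valued logic is sufficient here.

```python
# Hypothetical Employee relation as a list of dicts.
Employee = [
    {"name": "Ann", "dept_id": 5, "status": "Active"},
    {"name": "Bob", "dept_id": 7, "status": "Inactive"},
    {"name": "Cy", "dept_id": 3, "status": "Active"},
    {"name": "Di", "dept_id": 7, "status": "Active"},
]

# { e | Employee(e) ∧ (e.dept_id = 5 ∨ e.dept_id = 7) ∧ ¬(e.status = 'Inactive') }
result = [
    e
    for e in Employee                                  # Employee(e)
    if (e["dept_id"] == 5 or e["dept_id"] == 7)        # disjunction of atoms
    and not (e["status"] == "Inactive")                # negated atom
]

names = [e["name"] for e in result]
print(names)  # prints: ['Ann', 'Di']
```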

SQL Inherits TRC Semantics

Common Atom Patterns and Idioms

Certain atom patterns recur frequently across queries. Recognizing these idioms accelerates query development.

Pattern 1: Primary Key Lookup

R(t) ∧ t.pk = specific_value

Selects at most one tuple. Highly efficient with a B-tree or hash index on the key.

Pattern 2: Foreign Key Join

R(t₁) ∧ S(t₂) ∧ t₁.fk = t₂.pk

Classic equijoin linking related tables.

Pattern 3: Range Filter

R(t) ∧ t.date >= start_date ∧ t.date < end_date

Bounded range, often on indexed date columns.

Pattern 4: Set Membership

R(t) ∧ (t.category = 'A' ∨ t.category = 'B' ∨ t.category = 'C')

Multiple alternatives (IN-list equivalent).

Pattern 5: Optional Value (NULL handling)

R(t) ∧ (t.optional_field = value ∨ t.optional_field IS NULL)

Handles NULLs explicitly when needed.

Pattern 6: Self-Join Comparison

R(t₁) ∧ R(t₂) ∧ t₁.pk ≠ t₂.pk ∧ t₁.attr = t₂.attr

Finds different tuples with matching attributes.
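Pattern 6 is the usual way to find duplicates. The sketch below runs it with Python's built-in sqlite3 module on a hypothetical Customer table, and also shows a common variation: replacing ≠ with < on the key reports each matching pair once instead of twice.

```python
import sqlite3

# Hypothetical data: customers 1 and 3 share an email address.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?)",
    [(1, "a@example.com"), (2, "b@example.com"), (3, "a@example.com")],
)

# R(t1) ∧ R(t2) ∧ t1.pk ≠ t2.pk ∧ t1.attr = t2.attr
both_orders = sorted(conn.execute(
    "SELECT t1.id, t2.id FROM Customer AS t1, Customer AS t2 "
    "WHERE t1.id <> t2.id AND t1.email = t2.email"
))

# Same pattern with t1.pk < t2.pk: each unordered pair appears once.
once = sorted(conn.execute(
    "SELECT t1.id, t2.id FROM Customer AS t1, Customer AS t2 "
    "WHERE t1.id < t2.id AND t1.email = t2.email"
))

print(both_orders)  # prints: [(1, 3), (3, 1)]
print(once)         # prints: [(1, 3)]
```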

Efficient Atom Patterns

•Equality on indexed columns
•Primary/foreign key joins
•Range on leading index column
•Small IN-lists on indexed column
•Prefix LIKE patterns (e.g., 'ABC%')

Inefficient Atom Patterns

•Inequality (≠) usually cannot use an index effectively
•LIKE with leading wildcard ('%ABC')
•Functions on indexed columns (YEAR(date))
•OR across different columns
•Negated IN (NOT IN large_list)
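The "function on an indexed column" item is often fixable by rewriting the atom as a range on the bare column. The sketch below compares the two forms in SQLite. Assumptions: SQLite has no YEAR function, so strftime('%Y', ...) stands in for it; dates are stored as ISO-8601 text; the table and index names are hypothetical; and only the leading keyword of the plan text is inspected because its wording varies by version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (name TEXT, hire_date TEXT)")
conn.execute("CREATE INDEX idx_employee_hire ON Employee (hire_date)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [("Ann", "2023-11-30"), ("Bob", "2024-01-15"),
     ("Cy", "2024-12-31"), ("Di", "2025-01-01")],
)

wrapped = "strftime('%Y', hire_date) = '2024'"   # function hides the column
as_range = "hire_date >= '2024-01-01' AND hire_date < '2025-01-01'"

def names(where: str) -> list:
    """Names of employees whose row satisfies the condition."""
    query = f"SELECT name FROM Employee WHERE {where}"
    return sorted(r[0] for r in conn.execute(query))

def plan(where: str) -> str:
    """Planner's description of the first step for the filter."""
    query = f"EXPLAIN QUERY PLAN SELECT * FROM Employee WHERE {where}"
    return conn.execute(query).fetchone()[3]

print(names(wrapped), names(as_range))  # same rows: ['Bob', 'Cy'] twice
print(plan(wrapped))                    # begins with SCAN
print(plan(as_range))                   # begins with SEARCH
```

Both atoms select the same tuples, but only the range form lets the planner seek into the index.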

Pattern Recognition is Key

Atoms and Query Equivalence

Understanding which atom formulations are equivalent (produce identical results) enables query optimization and simplification.

Basic Equivalences:

t.A > 5 ∧ t.A > 10  ≡  t.A > 10      -- Subsumption
t.A = 5 ∧ t.A = 10  ≡  FALSE          -- Contradiction
t.A = 5 ∨ t.A = 10  ≡  t.A IN (5,10)  -- OR to IN
t.A >= 5 ∧ t.A <= 10 ≡  t.A BETWEEN 5 AND 10

De Morgan with Atoms:

¬(t.A = 5 ∧ t.B = 10)  ≡  t.A ≠ 5 ∨ t.B ≠ 10
¬(t.A = 5 ∨ t.B = 10)  ≡  t.A ≠ 5 ∧ t.B ≠ 10

NULL-aware Equivalences:

t.A = t.A  -- NOT always TRUE! UNKNOWN if t.A is NULL
¬(t.A IS NULL)  ≡  t.A IS NOT NULL
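These claims can be checked exhaustively on a small domain with a hand-written three-valued logic. In this sketch UNKNOWN and NULL are both modelled as Python's None, and the function names (not3, and3, or3, eq3, ne3) are hypothetical helpers.

```python
from itertools import product

def not3(a):
    return None if a is None else (not a)

def and3(a, b):
    if a is False or b is False:
        return False           # FALSE dominates AND
    if a is None or b is None:
        return None
    return True

def or3(a, b):
    if a is True or b is True:
        return True            # TRUE dominates OR
    if a is None or b is None:
        return None
    return False

def eq3(x, y):
    return None if x is None or y is None else x == y

def ne3(x, y):
    return None if x is None or y is None else x != y

# Try every combination of a few values, including NULL, for t.A and t.B.
values = [5, 10, 7, None]
de_morgan_holds = all(
    not3(and3(eq3(a, 5), eq3(b, 10))) == or3(ne3(a, 5), ne3(b, 10))
    and not3(or3(eq3(a, 5), eq3(b, 10))) == and3(ne3(a, 5), ne3(b, 10))
    for a, b in product(values, repeat=2)
)

self_equality_on_null = eq3(None, None)

print(de_morgan_holds)        # True: De Morgan survives three-valued logic
print(self_equality_on_null)  # None: t.A = t.A is UNKNOWN when t.A is NULL
```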

Range Comparison Simplifications:

t.A >= 5 ∧ t.A > 3  ≡  t.A >= 5      -- The >= 5 is stronger
t.A < 10 ∧ t.A <= 10  ≡  t.A < 10    -- The < 10 is stronger

Atom Simplification: Simplifying redundant atoms

Input

Original: R(t) ∧ t.salary > 40000 ∧ t.salary > 50000 ∧ t.salary >= 50000

Step 1: t.salary > 50000 implies t.salary > 40000 (remove weaker)
Step 2: t.salary > 50000 implies t.salary >= 50000 (remove weaker)

Output

R(t) ∧ t.salary > 50000

Explanation

The strongest constraint subsumes the weaker ones. Only t.salary > 50000 is needed.
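The subsumption step in this example can be sketched as a small function. It handles only lower bounds (> and >=) on a single attribute, and the name strongest_lower_bound is a hypothetical helper, not part of any optimizer's API.

```python
from typing import List, Tuple

Bound = Tuple[str, float]  # (operator, constant), operator is ">" or ">="

def strongest_lower_bound(atoms: List[Bound]) -> Bound:
    """Pick the one lower-bound atom that implies all the others.

    A larger constant is stronger; for equal constants, the strict
    comparison ">" is stronger than ">=".
    """
    return max(atoms, key=lambda atom: (atom[1], atom[0] == ">"))

def satisfies(value: float, atom: Bound) -> bool:
    """Evaluate a single lower-bound atom against a non-NULL value."""
    op, constant = atom
    return value > constant if op == ">" else value >= constant

original = [(">", 40000), (">", 50000), (">=", 50000)]
simplified = strongest_lower_bound(original)
print(simplified)  # prints: ('>', 50000)

# Spot check: the single atom agrees with the full conjunction.
samples = [39999, 40000, 40001, 49999, 50000, 50001, 90000]
agrees = all(
    satisfies(v, simplified) == all(satisfies(v, a) for a in original)
    for v in samples
)
print(agrees)  # prints: True
```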

Query Optimizer Transformations:

Optimizers apply these equivalences automatically:

Constant Folding: t.A > 3+2 → t.A > 5
Contradiction Detection: t.A = 5 AND t.A = 10 → return empty immediately
Subsumption Elimination: Remove redundant weaker conditions
IN-list Conversion: Multiple ORs on same column → IN
Range Consolidation: Multiple range conditions → single range

These transformations preserve semantics while improving execution efficiency.

Summary: Mastering Atomic Formulas

Key Takeaways

•Two Types of Atoms: Range predicates (R(t)) assert relation membership; comparison atoms (term θ term) assert value relationships
•Comparison Operators: =, ≠, <, >, ≤, ≥ compare attribute values and constants with precise semantics
•Three-Valued Logic: NULL introduces UNKNOWN; any NULL comparison typically yields UNKNOWN, excluding rows from results
•Extended Atoms: IS NULL, BETWEEN, IN, LIKE extend pure TRC for practical querying without increasing theoretical power
•Evaluation Semantics: Atoms translate to physical operations—lookups, scans, comparisons—that determine query cost
•SQL Correspondence: WHERE clause conditions are atoms; understanding atoms means understanding SQL filtering
•Pattern Recognition: Efficient vs. inefficient atom patterns guide index design and query tuning
•Equivalence Rules: Atom transformations enable optimization while preserving query semantics
