Loading learning content...
In the grand architecture of Tuple Relational Calculus, atomic formulas (or simply atoms) are the irreducible primitives—the fundamental assertions that cannot be broken down further. They are the base cases of the recursive formula grammar, the foundation upon which all complex expressions are built.
Understanding atoms with precision is essential because:
Every TRC query ultimately reduces to atoms: Complex formulas with nested quantifiers and multiple connectives eventually bottom out at atomic comparisons and range predicates
Atoms determine computational cost: Query execution translates atoms into table scans, index lookups, and comparisons—understanding atoms helps predict performance
Atoms bridge theory and practice: Each SQL WHERE clause condition corresponds to an atomic formula; mastering atoms means mastering the WHERE clause
Type systems operate at the atom level: Schema validation, domain checking, and NULL handling are all atomic-level concerns
This page provides comprehensive coverage of both types of atoms in TRC: range predicates and comparison atoms.
By the end of this page, you will understand the precise semantics of range predicates and comparison atoms, master three-valued logic for NULL handling, recognize atom patterns in SQL, and appreciate how atoms form the computational foundation of query evaluation.
The range predicate is the first type of atomic formula. It asserts that a tuple variable is bound to (i.e., is a member of) a specific relation.
Formal Definition:
R(t) ≡ "tuple variable t is a member of relation R"
Semantics:
Given a database instance D, the range predicate R(t) evaluates as:
t is a tuple that exists in R within Dt is not in RRange predicates are membership tests—they answer "Is this tuple in this relation?"
Role in Queries:
Range predicates serve dual purposes:
Database state:
Employee = {(E1, 'Alice', 50000), (E2, 'Bob', 60000)}
Contractor = {(C1, 'Charlie', 70000)}
Atom to evaluate: Employee(t) where t = (E1, 'Alice', 50000)TRUE (tuple exists in relation)Since (E1, 'Alice', 50000) is indeed in the Employee relation, the predicate evaluates to TRUE. If t were bound to (C1, 'Charlie', 70000), it would be FALSE—that tuple is in Contractor, not Employee.
Schema Constraints:
A range predicate R(t) implicitly constrains t to have the same schema as R. If R has schema (A₁: D₁, A₂: D₂, ..., Aₙ: Dₙ), then t must also have these attributes with compatible types.
This enables attribute references in subsequent conditions:
Employee(e) ∧ e.salary > 50000
↑
Valid because Employee schema includes 'salary'
Multiple Range Predicates:
Queries often include multiple range predicates:
{ e, d | Employee(e) ∧ Department(d) ∧ e.dept_id = d.dept_id }
Here, Employee(e) and Department(d) are separate atoms. Both must be TRUE (conjunction) for the formula to be satisfied.
Negative Range Tests:
Negated range predicates test non-membership:
¬Employee(t) ≡ "t is NOT in Employee"
This is conceptually problematic for free variables (infinite complement), but valid for bound variables in specific contexts.
Every tuple variable used in a query must have at least one positive (non-negated) range predicate. Without it, the variable's domain is undefined, and attribute references become meaningless. This is a fundamental well-formedness requirement.
The comparison atom is the second type of atomic formula. It asserts a relationship between two values using a comparison operator.
Formal Definition:
term₁ θ term₂
Where:
term₁ and term₂ are terms (attribute references or constants)θ is a comparison operator from {=, ≠, <, >, ≤, ≥}Term Types:
t.A — the value of attribute A in tuple variable t42, 'Alice', DATE '2024-01-15', NULLComparison Patterns:
| Pattern | Example | Description |
|---|---|---|
| Attribute-Attribute | e.dept_id = d.dept_id | Compare attributes from different tuples (join) |
| Attribute-Constant | e.salary > 50000 | Compare attribute to literal (selection) |
| Constant-Attribute | 100 <= e.age | Same as above (order reversed) |
| Attribute-Self | p.start < p.end | Compare attributes within same tuple |
| Operator | Name | Semantics | Example |
|---|---|---|---|
= | Equality | TRUE iff values are identical | e.name = 'Alice' |
≠ / <> | Inequality | TRUE iff values differ | e.dept_id ≠ 5 |
< | Less Than | TRUE iff left < right (strict) | e.age < 30 |
> | Greater Than | TRUE iff left > right (strict) | e.salary > 50000 |
≤ / <= | Less or Equal | TRUE iff left ≤ right | e.tenure <= 5 |
≥ / >= | Greater or Equal | TRUE iff left ≥ right | e.age >= 21 |
Type Requirements:
Comparison atoms require type compatibility between operands:
Mismatched types are generally errors:
✗ e.name > 50000 -- String vs. Number: Type error
✓ e.salary > 50000 -- Number vs. Number: Valid
✓ e.name > 'M' -- String vs. String: Valid (lexicographic)
Join Atoms:
Attribute-to-attribute comparisons often express joins:
e.dept_id = d.dept_id -- Equijoin condition
e.salary > m.salary -- Theta-join (inequality)
e₁.manager_id = e₂.emp_id -- Self-join
These atoms connect tuples across (or within) relations, enabling relational combination.
12345678910111213141516
-- EQUALITY ATOMS (Selection and Join){ e | Employee(e) ∧ e.dept_id = 5 } -- Attribute = Constant{ e, d | Employee(e) ∧ Department(d) ∧ e.dept_id = d.dept_id } -- Attribute = Attribute (Join) -- INEQUALITY ATOMS{ e | Employee(e) ∧ e.status ≠ 'Terminated' } -- Exclude specific value{ e | Employee(e) ∧ e.salary > e.base_salary } -- Within-tuple comparison -- RANGE ATOMS { e | Employee(e) ∧ e.age ≥ 21 ∧ e.age ≤ 65 } -- Age range{ p | Project(p) ∧ p.start_date < p.end_date } -- Temporal ordering -- MIXED TYPE COMPARISONS{ e | Employee(e) ∧ e.name >= 'M' ∧ e.name < 'N' } -- Last names starting with M{ t | Transaction(t) ∧ t.timestamp >= DATE '2024-01-01' } -- Date rangeReal databases contain NULL values—markers for missing, unknown, or inapplicable data. NULLs introduce three-valued logic (3VL) into atomic formula evaluation.
The Problem with NULL:
Consider: Is NULL = 5 true or false?
Three-Valued Logic:
TRC (and SQL) use three truth values:
| Value | Meaning |
|---|---|
| TRUE | Condition definitely holds |
| FALSE | Condition definitely doesn't hold |
| UNKNOWN | Cannot determine (typically involves NULL) |
| Expression | Result | Explanation |
|---|---|---|
NULL = NULL | UNKNOWN | We don't know if two unknowns are equal |
NULL = 5 | UNKNOWN | NULL might or might not be 5 |
NULL <> 5 | UNKNOWN | NULL might or might not differ from 5 |
NULL > 0 | UNKNOWN | Can't order unknown value |
5 = 5 | TRUE | Known values can be compared |
5 <> 5 | FALSE | Known values can be compared |
A common mistake: assuming NULL = NULL is FALSE or NULL ≠ NULL is TRUE. Both evaluate to UNKNOWN. This is because two unknown values might be the same unknown thing. This behavior affects queries in subtle ways.
3VL Truth Tables:
Logical connectives extend to three values:
Negation (¬):
| P | ¬P |
|---|---|
| T | F |
| U | U |
| F | T |
Conjunction (∧):
| P ∧ Q | T | U | F |
|---|---|---|---|
| T | T | U | F |
| U | U | U | F |
| F | F | F | F |
Disjunction (∨):
| P ∨ Q | T | U | F |
|---|---|---|---|
| T | T | T | T |
| U | T | U | U |
| F | T | U | F |
Key Insight: FALSE dominates AND (F ∧ anything = F), TRUE dominates OR (T ∨ anything = T). UNKNOWN propagates otherwise.
Effect on Query Results:
TRC (and SQL) include a tuple in the result only when the formula evaluates to TRUE. UNKNOWN is treated like FALSE for inclusion purposes.
Employee(e) ∧ e.commission > 0
If e.commission is NULL for some employee, this evaluates to UNKNOWN, and that employee is excluded from results.
12345678910111213141516171819
-- Employee table with potential NULLs-- | emp_id | name | commission |-- |--------|---------|------------|-- | 1 | Alice | 500 |-- | 2 | Bob | NULL |-- | 3 | Charlie | 0 | -- TRC: { e | Employee(e) ∧ e.commission > 0 }-- Returns only Alice (commission 500 > 0 is TRUE)-- Bob excluded: NULL > 0 is UNKNOWN-- Charlie excluded: 0 > 0 is FALSE SELECT * FROM Employee WHERE commission > 0;-- Result: Only Alice -- To include NULLs, explicit IS NULL test needed:-- TRC: { e | Employee(e) ∧ (e.commission > 0 ∨ e.commission IS NULL) }SELECT * FROM Employee WHERE commission > 0 OR commission IS NULL;-- Result: Alice and BobPure TRC limits atoms to range predicates and simple comparisons. Extended TRC (and SQL) add richer atomic predicates to handle real-world needs.
IS NULL / IS NOT NULL:
Explicit NULL testing:
t.A IS NULL -- TRUE iff t.A is NULL
t.A IS NOT NULL -- TRUE iff t.A has a non-NULL value
These are the only way to definitively test for NULL. The atom t.A = NULL is always UNKNOWN!
BETWEEN:
Range inclusion (syntactic sugar):
t.A BETWEEN x AND y ≡ t.A >= x ∧ t.A <= y
IN (Set Membership):
Value in a set:
t.A IN (v₁, v₂, ..., vₙ) ≡ t.A = v₁ ∨ t.A = v₂ ∨ ... ∨ t.A = vₙ
LIKE (Pattern Matching):
String pattern matching:
t.name LIKE 'A%' -- Names starting with 'A'
t.email LIKE '%@%' -- Contains '@'
t.code LIKE 'AB_C' -- 'AB' + any char + 'C'
Patterns: % = any sequence, _ = any single character
| Predicate | Syntax | Meaning | Example |
|---|---|---|---|
| NULL Test | t.A IS NULL | Attribute is NULL | e.commission IS NULL |
| Not NULL | t.A IS NOT NULL | Attribute has value | e.email IS NOT NULL |
| Between | t.A BETWEEN x AND y | Value in range [x,y] | e.age BETWEEN 21 AND 65 |
| Set In | t.A IN (v₁,...) | Value in set | e.dept IN (1,2,3) |
| Not In | t.A NOT IN (v₁,...) | Value not in set | e.status NOT IN ('X','Y') |
| Like | t.A LIKE pattern | String matches pattern | e.name LIKE 'A%' |
Computed Terms (Extended):
Some TRC extensions allow computations within atoms:
e.salary * 1.1 > 60000 -- Computed value comparison
e.first_name || ' ' || e.last_name = 'John Doe' -- String concatenation
YEAR(e.hire_date) = 2024 -- Function application
These extend the term grammar from simple attribute references to expressions.
Type Casts:
CAST(t.quantity AS DECIMAL) > 10.5
t.date_string::DATE = DATE '2024-01-15'
Aggregate-Based Atoms (Advanced):
While aggregates are typically handled separately, some extended formalisms allow:
e.salary > AVG(SELECT salary FROM Employee) -- Scalar subquery comparison
This conceptually treats the aggregate as a computed constant for comparison.
Pure TRC is theoretically minimal—just range predicates and basic comparisons. Extended forms add IS NULL, LIKE, etc. for practicality. The extensions don't increase expressive power (you could simulate many with complex formulas) but vastly improve usability. SQL incorporates these extensions directly.
Understanding precisely how atoms are evaluated illuminates query execution and optimization.
Range Predicate Evaluation:
To evaluate R(t) given binding of t to tuple τ:
Implementation: This might involve:
Comparison Atom Evaluation:
To evaluate term₁ θ term₂:
Resolve term₁:
t.A): Look up attribute A in tuple bound to tResolve term₂: Same process
Check NULL:
Apply comparison:
123456789101112131415161718192021222324252627282930
// Pseudocode for atom evaluation function evaluateRangePredicate(relation R, tupleVar t, binding B): τ = B[t] // Get the tuple bound to variable t return τ ∈ R // Membership test function evaluateComparisonAtom(term1, op, term2, binding B): // Resolve terms v1 = resolveTerm(term1, B) v2 = resolveTerm(term2, B) // NULL handling if v1 IS NULL or v2 IS NULL: return UNKNOWN // Most comparisons // Apply operator switch op: case '=': return v1 == v2 case '≠': return v1 != v2 case '<': return v1 < v2 case '>': return v1 > v2 case '≤': return v1 <= v2 case '≥': return v1 >= v2 function resolveTerm(term, binding B): if term is AttributeRef(var, attr): τ = B[var] // Get tuple for variable return τ[attr] // Extract attribute value else if term is Constant(value): return valueOptimization at the Atom Level:
Query optimizers focus heavily on atoms because they determine physical operations:
| Atom Type | Potential Optimization |
|---|---|
R(t) ∧ t.pk = value | Primary key lookup (O(1)) |
R(t) ∧ t.indexed_col = value | Index seek (O(log n)) |
R(t) ∧ t.unindexed > value | Full scan with filter |
t₁.fk = t₂.pk | Hash join or merge join |
t.col IN (values) | Multiple index probes or IN-list optimization |
Selectivity Estimation:
Optimizers estimate selectivity—the fraction of tuples satisfying an atom:
e.dept_id = 5: If 10 departments with uniform distribution, ~10% selectivitye.salary > 100000: Depends on salary distribution, maybe 5%e.name LIKE 'A%': ~1/26 ≈ 4% if uniform first-letter distributionHigher selectivity (fewer matching rows) atoms are often evaluated first.
While TRC is declarative (no specified order), query execution is physical. Evaluating high-selectivity atoms first reduces the work for subsequent atoms. Optimizers reorder conditions, but understanding this helps in query tuning and index design.
Every SQL WHERE clause condition is an atomic formula (or combination thereof). Understanding this connection makes SQL natural.
Direct Mappings:
| TRC Atom | SQL Equivalent | Example |
|---|---|---|
R(t) | FROM R AS t | FROM Employee AS e |
t.A = value | t.A = value | WHERE e.dept_id = 5 |
t.A = s.B | t.A = s.B | WHERE e.dept_id = d.dept_id |
t.A > value | t.A > value | WHERE e.salary > 50000 |
t.A IS NULL | t.A IS NULL | WHERE e.commission IS NULL |
t.A LIKE 'pat' | t.A LIKE 'pat' | WHERE e.name LIKE 'A%' |
12345678910111213141516171819202122232425
-- TRC Query with multiple atoms:-- { e | Employee(e) ∧ -- e.salary > 50000 ∧ -- e.dept_id = 5 ∧ -- e.status ≠ 'Inactive' ∧-- e.commission IS NOT NULL } -- SQL Translation:SELECT *FROM Employee AS e -- Range predicateWHERE e.salary > 50000 -- Comparison atom AND e.dept_id = 5 -- Comparison atom AND e.status <> 'Inactive' -- Comparison atom AND e.commission IS NOT NULL; -- NULL test atom -- Join atoms in SQL:-- TRC: { e, d | Employee(e) ∧ Department(d) ∧ e.dept_id = d.dept_id }SELECT e.*, d.*FROM Employee AS e, Department AS d -- Range predicatesWHERE e.dept_id = d.dept_id; -- Join atom (comparison atom) -- Or using explicit JOIN syntax:SELECT e.*, d.*FROM Employee AS eINNER JOIN Department AS d ON e.dept_id = d.dept_id; -- Join atom in ON clauseSQL Condition Composition:
SQL's WHERE clause is a conjunction of atoms (or compound conditions):
WHERE atom1 AND atom2 AND atom3 ...
This corresponds to TRC's:
R(t) ∧ atom1 ∧ atom2 ∧ atom3 ...
Disjunctions and negations work similarly:
WHERE (e.dept_id = 5 OR e.dept_id = 7) AND NOT e.status = 'Inactive'
Corresponds to TRC:
Employee(e) ∧ (e.dept_id = 5 ∨ e.dept_id = 7) ∧ ¬(e.status = 'Inactive')
SQL's NULL handling, three-valued logic, and comparison operator behavior all derive from TRC's formal semantics. When a SQL condition returns UNKNOWN (due to NULL), the row is excluded—exactly as TRC specifies. Understanding TRC atoms means understanding SQL conditions at the deepest level.
Certain atom patterns recur frequently across queries. Recognizing these idioms accelerates query development.
Pattern 1: Primary Key Lookup
R(t) ∧ t.pk = specific_value
Selects at most one tuple. Highly efficient with B-tree index.
Pattern 2: Foreign Key Join
R(t₁) ∧ S(t₂) ∧ t₁.fk = t₂.pk
Classic equijoin linking related tables.
Pattern 3: Range Filter
R(t) ∧ t.date >= start_date ∧ t.date < end_date
Bounded range, often on indexed date columns.
Pattern 4: Set Membership
R(t) ∧ (t.category = 'A' ∨ t.category = 'B' ∨ t.category = 'C')
Multiple alternatives (IN-list equivalent).
Pattern 5: Optional Value (NULL handling)
R(t) ∧ (t.optional_field = value ∨ t.optional_field IS NULL)
Handles NULLs explicitly when needed.
Pattern 6: Self-Join Comparison
R(t₁) ∧ R(t₂) ∧ t₁.pk ≠ t₂.pk ∧ t₁.attr = t₂.attr
Finds different tuples with matching attributes.
Expert query writers recognize these patterns instantly and choose indexing strategies accordingly. When you see a query taking too long, examine which atoms are being evaluated and whether they match efficient patterns.
Understanding which atom formulations are equivalent (produce identical results) enables query optimization and simplification.
Basic Equivalences:
t.A > 5 ∧ t.A > 10 ≡ t.A > 10 -- Subsumption
t.A = 5 ∧ t.A = 10 ≡ FALSE -- Contradiction
t.A = 5 ∨ t.A = 10 ≡ t.A IN (5,10) -- OR to IN
t.A >= 5 ∧ t.A <= 10 ≡ t.A BETWEEN 5 AND 10
De Morgan with Atoms:
¬(t.A = 5 ∧ t.B = 10) ≡ t.A ≠ 5 ∨ t.B ≠ 10
¬(t.A = 5 ∨ t.B = 10) ≡ t.A ≠ 5 ∧ t.B ≠ 10
NULL-aware Equivalences:
t.A = t.A -- NOT always TRUE! UNKNOWN if t.A is NULL
¬(t.A IS NULL) ≡ t.A IS NOT NULL
Range Predicate Optimizations:
t.A >= 5 ∧ t.A > 3 ≡ t.A >= 5 -- The >= 5 is stronger
t.A < 10 ∧ t.A <= 10 ≡ t.A < 10 -- The < 10 is stronger
Original: R(t) ∧ t.salary > 40000 ∧ t.salary > 50000 ∧ t.salary >= 50000
Step 1: t.salary > 50000 implies t.salary > 40000 (remove weaker)
Step 2: t.salary > 50000 implies t.salary >= 50000 (remove weaker)R(t) ∧ t.salary > 50000The strongest constraint subsumes the weaker ones. Only t.salary > 50000 is needed.
Query Optimizer Transformations:
Optimizers apply these equivalences automatically:
t.A > 3+2 → t.A > 5t.A = 5 AND t.A = 10 → return empty immediatelyThese transformations preserve semantics while improving execution efficiency.
What's Next:
With atomic formulas thoroughly covered, we'll now explore Complex Formulas—how atoms combine through connectives and quantifiers to express sophisticated queries. We'll see the full power of TRC emerge from these primitive building blocks.
You now understand atomic formulas at the deepest level—from formal semantics to practical SQL, from NULL handling to optimization patterns. This foundational knowledge empowers you to analyze, write, and optimize queries with precision.