Database Management SystemsRelations and Tables

Relations and Tables: The Foundation of Relational Databases

LevelBeginner

Duration60 mins

TopicRelations and Tables

5 / 5

Relation Instance: Data in Motion

The Living Data of Databases

If the relation schema is the architectural blueprint, the relation instance is the building itself—the actual structure that exists at a particular moment. Instances are what databases are ultimately about: storing, retrieving, and manipulating real data that represents real-world facts.

While schemas define what's possible, instances capture what's true. The same schema can have infinitely many valid instances, ranging from the empty relation (no tuples) to massive collections of millions of records. Understanding instances—their properties, their operations, and their constraints—is essential for effective database work.

This page completes our exploration of relations and tables by examining the dynamic, ever-changing content that gives databases their value.

Learning Objectives

By the end of this page, you will: (1) Formally define and understand relation instances, (2) Recognize the properties instances must satisfy, (3) Understand instance operations (insert, update, delete) and their effects, (4) Grasp how constraints govern valid instances, (5) Appreciate the temporal nature of instance states.

Formal Definition of Relation Instance

A relation instance (also called a relation state or relation extension) is the set of tuples that a relation contains at a particular moment in time.

Formal Definition:

Given a relation schema R(A₁:D₁, A₂:D₂, ..., Aₙ:Dₙ), a relation instance r of R is:

r ⊆ D₁ × D₂ × ... × Dₙ

such that:

r is a finite subset (instances are always finite)
Every tuple t ∈ r satisfies all constraints in R's constraint set
Each tuple t is an n-tuple (t₁, t₂, ..., tₙ) where tᵢ ∈ Dᵢ ∪ {NULL} for each i

Notation:

We write r(R) to denote an instance r of schema R. At different times, we might have r₁(R), r₂(R), etc., representing different states of the same relation.

Instance ExampleConsider the schema Employee(emp_id, emp_name, dept, salary) at two different times:

Input

Employee schema with instances at time t₁ and t₂

Output

Instance at t₁ (Monday):
r₁(Employee) = {
    (1001, 'Alice', 'Eng', 90000),
    (1002, 'Bob', 'Sales', 75000)
}
Cardinality: 2

Instance at t₂ (Tuesday, after changes):
r₂(Employee) = {
    (1001, 'Alice', 'Eng', 95000),  -- salary updated
    (1002, 'Bob', 'Sales', 75000),
    (1003, 'Carol', 'Eng', 85000)   -- new employee added
}
Cardinality: 3

Explanation

Same schema, different instances. The schema's structure didn't change (still 4 attributes), but the content evolved. Both instances are valid because each tuple conforms to the schema and satisfies all constraints.

Key Instance Properties:

Property	Meaning	Implication
Finite	Always a finite set of tuples	No infinite relations in practice
Subset	r ⊆ Cartesian product of domains	Only valid combinations included
Dynamic	Changes over time	Today's truth may differ from tomorrow's
Constrained	Must satisfy schema constraints	Not all possible tuples are allowed
Homogeneous	All tuples have same degree n	Uniform structure within a relation

Instance as Database State

The complete database instance (all relation instances together) represents the database STATE at a moment—a snapshot of all facts we believe to be true. Database operations are state transitions: they transform one valid instance into another valid instance.

Cardinality and Degree in Practice

Two fundamental metrics characterize every relation instance: cardinality (number of tuples) and degree (number of attributes). Understanding these metrics is crucial for query optimization, capacity planning, and understanding query results.

Cardinality (|r|):

The cardinality of instance r is the number of tuples it contains. Cardinality is:

Dynamic: Changes with every INSERT, DELETE, or some UPDATEs (if they change key values)
Non-negative: Minimum is 0 (empty relation)
Bounded in practice: Limited by storage, though no theoretical upper limit

Degree (deg(R)):

The degree of a relation (also called arity) is the number of attributes. Degree is:

Static: Fixed by the schema (changes only via ALTER TABLE)
Structural: Defines how 'wide' each tuple is

Real-World Scale Examples
Relation	Typical Degree	Typical Cardinality	Storage Implications
User accounts	15-30 columns	Millions to billions	Core table, heavily indexed
Product catalog	20-50 columns	Thousands to millions	Read-heavy, often cached
Order items	5-10 columns	Billions over time	Massive, often partitioned
Config settings	3-5 columns	Hundreds	Small, frequently read
Audit log	10-15 columns	Billions, append-only	Time-partitioned, archived

Impact on Query Performance:

Both cardinality and degree affect query execution:

High Cardinality (Many Rows):

Full table scans become expensive
Index selection becomes critical
Aggregations require more computation
Join order matters significantly

High Degree (Many Columns):

Rows are wider, fewer fit per disk page
SELECT * pulls more data than needed
Vertical partitioning may help
Covering indexes become larger

Cardinality Estimation:

Query optimizers estimate result cardinalities to choose execution plans. If the optimizer estimates a join will produce 100 rows but it actually produces 1,000,000, the chosen plan may be catastrophically wrong. Understanding cardinality helps you:

Write better queries (filter early)
Create appropriate indexes
Analyze explain plans
Diagnose performance issues

Selectivity

Selectivity is the fraction of tuples that satisfy a predicate: selectivity = |σ(r)| / |r|. High selectivity (close to 1) means most rows match; low selectivity (close to 0) means few match. Index efficiency depends heavily on selectivity—indexes are most valuable for low-selectivity predicates.

Instance Operations: The DML Trio

Relations instances change through three fundamental operations: INSERT, UPDATE, and DELETE. These Data Manipulation Language (DML) operations are state transitions—they transform one valid instance into another.

INSERT: Adding Tuples

INSERT adds one or more new tuples to a relation instance.

-- Single tuple insertion
INSERT INTO employees (emp_id, emp_name, dept, salary)
VALUES (1004, 'Diana', 'Marketing', 72000);

-- Multi-tuple insertion
INSERT INTO employees (emp_id, emp_name, dept, salary)
VALUES 
    (1005, 'Edward', 'Eng', 88000),
    (1006, 'Fiona', 'Sales', 76000);

-- Insertion from query
INSERT INTO archived_employees
SELECT * FROM employees WHERE termination_date < '2020-01-01';

INSERT Constraints:

New tuple must not violate PRIMARY KEY (no duplicate)
Foreign key values must exist in referenced table
CHECK constraints must be satisfied
NOT NULL columns must have values

UPDATE: Modifying Tuples

UPDATE changes attribute values in existing tuples.

-- Update specific tuple
UPDATE employees 
SET salary = 98000 
WHERE emp_id = 1001;

-- Update multiple tuples
UPDATE employees 
SET salary = salary * 1.05 
WHERE dept = 'Engineering';

-- Update with subquery
UPDATE employees 
SET dept = 'Research' 
WHERE dept_id IN (SELECT dept_id FROM departments WHERE is_research = true);

UPDATE Constraints:

New values must satisfy domain constraints
Primary key updates must not create duplicates
Foreign key updates must maintain referential integrity
All CHECK constraints must be satisfied post-update

DELETE: Removing Tuples

DELETE removes tuples from a relation instance.

-- Delete specific tuple
DELETE FROM employees 
WHERE emp_id = 1002;

-- Delete matching tuples
DELETE FROM employees 
WHERE termination_date IS NOT NULL;

-- Delete all tuples (careful!)
DELETE FROM employees;  -- Removes all rows, table still exists

DELETE Constraints:

Cannot delete if other tuples reference this tuple via foreign key (unless CASCADE)
Some systems support soft delete (marking deleted rather than removing)
Triggers may execute on delete

Atomic Operations

DML operations are atomic at the statement level by default. An INSERT of 1000 rows either succeeds completely or fails completely—you won't get partial inserts. Transactions extend this atomicity across multiple statements, ensuring consistent state transitions.

Instance State Transitions

Every DML operation is a state transition: it takes the database from one valid state to another. Understanding state transitions helps you reason about data consistency and transaction correctness.

Formal Model:

Let S be the set of all valid database states (states satisfying all constraints). A DML operation O is a function:

O: S → S ∪ {FAILURE}

If the operation succeeds, we get a new valid state. If it would create an invalid state (constraint violation), the operation fails and the original state is preserved.

State Transition Visualized:

State S₀                    State S₁                    State S₂
┌─────────────┐   INSERT    ┌─────────────┐   UPDATE    ┌─────────────┐
│ employees   │────────────▶│ employees   │────────────▶│ employees   │
│  (1001,...)│             │  (1001,...) │             │  (1001,...')│
│             │             │  (1002,...) │             │  (1002,...) │
└─────────────┘             └─────────────┘             └─────────────┘
   Cardinality: 1              Cardinality: 2              Cardinality: 2

Transition Classification:

Operation	Effect on Cardinality	Effect on Data
INSERT	Increases by n (rows inserted)	New tuples added
DELETE	Decreases by n (rows deleted)	Tuples removed
UPDATE	Unchanged (usually)	Existing tuples modified

Note: UPDATE can change cardinality if it modifies a primary key to duplicate an existing key (which typically fails) or in special MERGE/UPSERT scenarios.

Constraint Checking During Transitions:

When a DML operation executes, the DBMS checks constraints:

Constraint Type	When Checked	What Happens on Violation
NOT NULL	During INSERT/UPDATE	Statement fails
UNIQUE/PRIMARY KEY	During INSERT/UPDATE	Statement fails
CHECK	During INSERT/UPDATE	Statement fails
FOREIGN KEY (referenced)	During DELETE/UPDATE of parent	Fail or CASCADE
FOREIGN KEY (referencing)	During INSERT/UPDATE of child	Statement fails if parent missing

Deferred vs. Immediate Checking:

Some constraints can be deferred until transaction commit:

-- PostgreSQL example: defer foreign key checking
SET CONSTRAINTS my_fk_constraint DEFERRED;

-- Now can insert child before parent in same transaction
INSERT INTO orders (order_id, customer_id) VALUES (1, 999);
INSERT INTO customers (customer_id, name) VALUES (999, 'New Customer');

COMMIT;  -- Constraint checked here

Deferred checking allows flexible insertion order but delays error detection.

Think in State Transitions

When debugging data issues, think: 'What sequence of operations transformed the database from expected state to current state?' This forensic thinking, combined with audit logs or temporal tables, helps track down when and how data became incorrect.

Valid vs. Invalid Instances

Constraints divide the space of all possible instances into valid instances (those satisfying all constraints) and invalid instances (those violating at least one constraint).

The Constraint Satisfaction Perspective:

An instance r is valid for schema R if and only if:

Every tuple has the correct degree (n attributes)
Each attribute value comes from its domain (or is NULL if allowed)
No two tuples are duplicates (set property)
All schema constraints are satisfied

What Makes Instances Valid

•All tuples have same degree
•Values within domain bounds
•No NULL where NOT NULL specified
•Primary key values unique
•Foreign key values exist in referenced table
•All CHECK predicates evaluate TRUE
•No duplicate tuples (if enforced)

What Makes Instances Invalid

•Tuples with wrong number of values
•Values outside domain (wrong type)
•NULL in NOT NULL column
•Duplicate primary key values
•Orphan foreign key references
•CHECK constraint evaluates FALSE
•Duplicate tuples (if forbidden)

Detecting Invalid States:

The DBMS prevents invalid states through constraint enforcement. But sometimes invalid data enters through:

Constraints added after data exists
ETL/import processes that bypass constraints
Application bugs with direct database access
Disabled constraints during bulk loads

Finding Constraint Violations:

-- Find orphan foreign keys (child without parent)
SELECT o.*
FROM orders o
LEFT JOIN customers c ON o.customer_id = c.customer_id
WHERE c.customer_id IS NULL;

-- Find duplicates (would violate uniqueness)
SELECT email, COUNT(*)
FROM users
GROUP BY email
HAVING COUNT(*) > 1;

-- Find check constraint violations
SELECT * FROM products WHERE price < 0;

-- Find NULL violations (assuming column should be NOT NULL)
SELECT * FROM employees WHERE hire_date IS NULL;

Data Quality Is an Ongoing Concern

Constraints prevent new violations but don't fix existing ones. If you add a constraint to a table with dirty data, the constraint may fail to create. Always audit data before adding constraints, fix issues, then add the constraint. Consider data quality checks as part of regular maintenance.

Empty and Non-Empty Instances

A special case worth understanding is the empty relation instance—a relation with zero tuples.

The Empty Instance:

An empty instance r = {} (or r = ∅) satisfies:

|r| = 0 (cardinality is zero)
The schema is fully defined (columns, types, constraints exist)
The relation legally exists; it just contains no facts

When Empty Instances Occur:

Common Scenarios for Empty Relations

•After CREATE TABLE: Every new table starts empty
•After DELETE all: DELETE FROM table; removes all rows
•Query results: Queries can return empty result sets
•Filter no matches: WHERE clause matches nothing
•Empty join: Join conditions match no pairs
•Future data: Tables for events that haven't happened yet

Algebraic Properties of Empty Relations:

Empty relations have interesting algebraic behaviors:

Operation	With Empty Relation	Result
R ∪ ∅	Union with empty	R (unchanged)
R ∩ ∅	Intersection with empty	∅ (empty)
R − ∅	Difference with empty	R (unchanged)
∅ − R	Empty minus anything	∅ (empty)
R × ∅	Cartesian product with empty	∅ (empty)
R ⋈ ∅	Join with empty	∅ (empty)
σ(∅)	Selection on empty	∅ (empty)
π(∅)	Projection on empty	∅ (empty)

These properties are important for query optimization—the optimizer can short-circuit operations involving empty relations.

Handling Empty Results in Applications:

Empty results are legitimate and common. Applications must handle them gracefully:

// Bad: assumes results exist
const user = await db.query('SELECT * FROM users WHERE id = ?', [userId]);
console.log(user.name);  // ERROR if user not found!

// Good: check for empty result
const users = await db.query('SELECT * FROM users WHERE id = ?', [userId]);
if (users.length === 0) {
    throw new NotFoundError('User not found');
}
const user = users[0];

Distinguishing Causes:

An empty result can mean:

No data exists (legitimate empty)
Filter is wrong (bug in query)
Data exists but doesn't match (understanding issue)

Always verify: if you expect data and get none, investigate before assuming correctness.

Temporal Aspects of Instances

Relation instances change over time. Understanding temporal aspects is crucial for auditing, debugging, and applications that need historical data.

Current vs. Historical State:

Current Instance: What the relation contains right now
Historical Instance: What the relation contained at some past time
Temporal Instance: Relations that explicitly model time in their schema

Why Historical Data Matters:

Use Cases for Historical Data

•Auditing: Who changed what, when? Required for compliance (SOX, GDPR).
•Debugging: What was the state when the bug occurred?
•Analytics: How have metrics trended over time?
•Recovery: Restore to a known good state after corruption.
•Business Logic: Prices are effective between dates; historical lookups needed.

Approaches to Historical Data:

1. Audit Logging: Separate tables record changes:

CREATE TABLE employee_audit (
    audit_id SERIAL PRIMARY KEY,
    emp_id INTEGER NOT NULL,
    action VARCHAR(10) NOT NULL,  -- INSERT, UPDATE, DELETE
    old_values JSONB,
    new_values JSONB,
    changed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    changed_by VARCHAR(100)
);

2. Temporal Tables (SQL:2011 Standard): System-versioned tables automatically track history:

-- PostgreSQL syntax
CREATE TABLE employees (
    emp_id INTEGER PRIMARY KEY,
    emp_name VARCHAR(100),
    salary DECIMAL(12,2),
    valid_from TIMESTAMP GENERATED ALWAYS AS ROW START,
    valid_to TIMESTAMP GENERATED ALWAYS AS ROW END,
    PERIOD FOR SYSTEM_TIME (valid_from, valid_to)
) WITH SYSTEM VERSIONING;

-- Query historical state
SELECT * FROM employees 
FOR SYSTEM_TIME AS OF '2023-06-15 10:00:00';

3. Soft Deletes: Mark records as deleted rather than removing:

ALTER TABLE employees ADD COLUMN deleted_at TIMESTAMP NULL;

-- "Delete" by marking
UPDATE employees SET deleted_at = CURRENT_TIMESTAMP WHERE emp_id = 1001;

-- Query active records only
SELECT * FROM employees WHERE deleted_at IS NULL;

Choosing a Temporal Strategy

The right approach depends on requirements: Audit logging adds complexity but captures who/why. Temporal tables are elegant but not universally supported. Soft deletes are simple but don't track changes, only deletions. Consider compliance requirements, query patterns, and storage costs.

Summary: Instances Complete the Picture

We've now completed our exploration of relations and tables, from abstract mathematical foundations to concrete instances. Let's synthesize what we've learned:

Key Concepts Mastered

•Instance Definition: A relation instance is a finite subset of the Cartesian product of domains, representing the facts currently asserted as true.
•Cardinality and Degree: Cardinality (row count) is dynamic; degree (column count) is structural. Both impact query performance significantly.
•DML Operations: INSERT, UPDATE, and DELETE transform instances. Each operation is a state transition that must preserve constraint satisfaction.
•State Transitions: Every operation moves the database from one valid state to another. Failed operations leave the state unchanged.
•Valid Instances: Instances must satisfy all schema constraints—domain compatibility, key uniqueness, referential integrity, and CHECK conditions.
•Empty Relations: The empty instance (∅) is valid and has algebraic properties important for query optimization.
•Temporal Dimension: Instances change over time. Auditing, temporal tables, and soft deletes capture historical states for compliance and analysis.

The Complete Picture:

Concept	Role	Key Points
Relation (mathematical)	Formal foundation	Set of tuples, subset of Cartesian product
Relation Schema	Structure definition	Attributes, domains, constraints
Relation Instance	Current data	Dynamic, constrained, finite
Table	Implementation	SQL approximation of relation
DML Operations	State transformation	INSERT/UPDATE/DELETE create new valid states

Module Completion:

You now understand relations and tables from first principles through practical implementation. This foundation supports everything that follows in relational database study: keys, normalization, algebra, SQL queries, and optimization all build on these concepts.

Module Complete: Relations and Tables

Congratulations! You've mastered the foundational concepts of the relational model—relations as mathematical sets, schemas as structural blueprints, tables as implementations, and instances as living data. This knowledge forms the bedrock for all database work: design, querying, optimization, and maintenance.

5 / 5

Loading learning content...

Database Management SystemsRelations and Tables

Relations and Tables: The Foundation of Relational Databases

LevelBeginner

Duration60 mins

TopicRelations and Tables

5 / 5

Relation Instance: Data in Motion

The Living Data of Databases

This page completes our exploration of relations and tables by examining the dynamic, ever-changing content that gives databases their value.

Learning Objectives

Formal Definition of Relation Instance

A relation instance (also called a relation state or relation extension) is the set of tuples that a relation contains at a particular moment in time.

Formal Definition:

Given a relation schema R(A₁:D₁, A₂:D₂, ..., Aₙ:Dₙ), a relation instance r of R is:

r ⊆ D₁ × D₂ × ... × Dₙ

such that:

r is a finite subset (instances are always finite)
Every tuple t ∈ r satisfies all constraints in R's constraint set
Each tuple t is an n-tuple (t₁, t₂, ..., tₙ) where tᵢ ∈ Dᵢ ∪ {NULL} for each i

Notation:

We write r(R) to denote an instance r of schema R. At different times, we might have r₁(R), r₂(R), etc., representing different states of the same relation.

Instance ExampleConsider the schema Employee(emp_id, emp_name, dept, salary) at two different times:

Input

Employee schema with instances at time t₁ and t₂

Output

Instance at t₁ (Monday):
r₁(Employee) = {
    (1001, 'Alice', 'Eng', 90000),
    (1002, 'Bob', 'Sales', 75000)
}
Cardinality: 2

Instance at t₂ (Tuesday, after changes):
r₂(Employee) = {
    (1001, 'Alice', 'Eng', 95000),  -- salary updated
    (1002, 'Bob', 'Sales', 75000),
    (1003, 'Carol', 'Eng', 85000)   -- new employee added
}
Cardinality: 3

Explanation

Key Instance Properties:

Property	Meaning	Implication
Finite	Always a finite set of tuples	No infinite relations in practice
Subset	r ⊆ Cartesian product of domains	Only valid combinations included
Dynamic	Changes over time	Today's truth may differ from tomorrow's
Constrained	Must satisfy schema constraints	Not all possible tuples are allowed
Homogeneous	All tuples have same degree n	Uniform structure within a relation

Instance as Database State

Cardinality and Degree in Practice

Cardinality (|r|):

The cardinality of instance r is the number of tuples it contains. Cardinality is:

Dynamic: Changes with every INSERT, DELETE, or some UPDATEs (if they change key values)
Non-negative: Minimum is 0 (empty relation)
Bounded in practice: Limited by storage, though no theoretical upper limit

Degree (deg(R)):

The degree of a relation (also called arity) is the number of attributes. Degree is:

Static: Fixed by the schema (changes only via ALTER TABLE)
Structural: Defines how 'wide' each tuple is

Real-World Scale Examples
Relation	Typical Degree	Typical Cardinality	Storage Implications
User accounts	15-30 columns	Millions to billions	Core table, heavily indexed
Product catalog	20-50 columns	Thousands to millions	Read-heavy, often cached
Order items	5-10 columns	Billions over time	Massive, often partitioned
Config settings	3-5 columns	Hundreds	Small, frequently read
Audit log	10-15 columns	Billions, append-only	Time-partitioned, archived

Impact on Query Performance:

Both cardinality and degree affect query execution:

High Cardinality (Many Rows):

Full table scans become expensive
Index selection becomes critical
Aggregations require more computation
Join order matters significantly

High Degree (Many Columns):

Rows are wider, fewer fit per disk page
SELECT * pulls more data than needed
Vertical partitioning may help
Covering indexes become larger

Cardinality Estimation:

Write better queries (filter early)
Create appropriate indexes
Analyze explain plans
Diagnose performance issues

Selectivity

Instance Operations: The DML Trio

INSERT: Adding Tuples

INSERT adds one or more new tuples to a relation instance.

-- Single tuple insertion
INSERT INTO employees (emp_id, emp_name, dept, salary)
VALUES (1004, 'Diana', 'Marketing', 72000);

-- Multi-tuple insertion
INSERT INTO employees (emp_id, emp_name, dept, salary)
VALUES 
    (1005, 'Edward', 'Eng', 88000),
    (1006, 'Fiona', 'Sales', 76000);

-- Insertion from query
INSERT INTO archived_employees
SELECT * FROM employees WHERE termination_date < '2020-01-01';

INSERT Constraints:

New tuple must not violate PRIMARY KEY (no duplicate)
Foreign key values must exist in referenced table
CHECK constraints must be satisfied
NOT NULL columns must have values

UPDATE: Modifying Tuples

UPDATE changes attribute values in existing tuples.

-- Update specific tuple
UPDATE employees 
SET salary = 98000 
WHERE emp_id = 1001;

-- Update multiple tuples
UPDATE employees 
SET salary = salary * 1.05 
WHERE dept = 'Engineering';

-- Update with subquery
UPDATE employees 
SET dept = 'Research' 
WHERE dept_id IN (SELECT dept_id FROM departments WHERE is_research = true);

UPDATE Constraints:

New values must satisfy domain constraints
Primary key updates must not create duplicates
Foreign key updates must maintain referential integrity
All CHECK constraints must be satisfied post-update

DELETE: Removing Tuples

DELETE removes tuples from a relation instance.

-- Delete specific tuple
DELETE FROM employees 
WHERE emp_id = 1002;

-- Delete matching tuples
DELETE FROM employees 
WHERE termination_date IS NOT NULL;

-- Delete all tuples (careful!)
DELETE FROM employees;  -- Removes all rows, table still exists

DELETE Constraints:

Cannot delete if other tuples reference this tuple via foreign key (unless CASCADE)
Some systems support soft delete (marking deleted rather than removing)
Triggers may execute on delete

Atomic Operations

Instance State Transitions

Formal Model:

Let S be the set of all valid database states (states satisfying all constraints). A DML operation O is a function:

O: S → S ∪ {FAILURE}

If the operation succeeds, we get a new valid state. If it would create an invalid state (constraint violation), the operation fails and the original state is preserved.

State Transition Visualized:

State S₀                    State S₁                    State S₂
┌─────────────┐   INSERT    ┌─────────────┐   UPDATE    ┌─────────────┐
│ employees   │────────────▶│ employees   │────────────▶│ employees   │
│  (1001,...)│             │  (1001,...) │             │  (1001,...')│
│             │             │  (1002,...) │             │  (1002,...) │
└─────────────┘             └─────────────┘             └─────────────┘
   Cardinality: 1              Cardinality: 2              Cardinality: 2

Transition Classification:

Operation	Effect on Cardinality	Effect on Data
INSERT	Increases by n (rows inserted)	New tuples added
DELETE	Decreases by n (rows deleted)	Tuples removed
UPDATE	Unchanged (usually)	Existing tuples modified

Note: UPDATE can change cardinality if it modifies a primary key to duplicate an existing key (which typically fails) or in special MERGE/UPSERT scenarios.

Constraint Checking During Transitions:

When a DML operation executes, the DBMS checks constraints:

Constraint Type	When Checked	What Happens on Violation
NOT NULL	During INSERT/UPDATE	Statement fails
UNIQUE/PRIMARY KEY	During INSERT/UPDATE	Statement fails
CHECK	During INSERT/UPDATE	Statement fails
FOREIGN KEY (referenced)	During DELETE/UPDATE of parent	Fail or CASCADE
FOREIGN KEY (referencing)	During INSERT/UPDATE of child	Statement fails if parent missing

Deferred vs. Immediate Checking:

Some constraints can be deferred until transaction commit:

-- PostgreSQL example: defer foreign key checking
SET CONSTRAINTS my_fk_constraint DEFERRED;

-- Now can insert child before parent in same transaction
INSERT INTO orders (order_id, customer_id) VALUES (1, 999);
INSERT INTO customers (customer_id, name) VALUES (999, 'New Customer');

COMMIT;  -- Constraint checked here

Deferred checking allows flexible insertion order but delays error detection.

Think in State Transitions

Valid vs. Invalid Instances

Constraints divide the space of all possible instances into valid instances (those satisfying all constraints) and invalid instances (those violating at least one constraint).

The Constraint Satisfaction Perspective:

An instance r is valid for schema R if and only if:

Every tuple has the correct degree (n attributes)
Each attribute value comes from its domain (or is NULL if allowed)
No two tuples are duplicates (set property)
All schema constraints are satisfied

What Makes Instances Valid

•All tuples have same degree
•Values within domain bounds
•No NULL where NOT NULL specified
•Primary key values unique
•Foreign key values exist in referenced table
•All CHECK predicates evaluate TRUE
•No duplicate tuples (if enforced)

What Makes Instances Invalid

•Tuples with wrong number of values
•Values outside domain (wrong type)
•NULL in NOT NULL column
•Duplicate primary key values
•Orphan foreign key references
•CHECK constraint evaluates FALSE
•Duplicate tuples (if forbidden)

Detecting Invalid States:

The DBMS prevents invalid states through constraint enforcement. But sometimes invalid data enters through:

Constraints added after data exists
ETL/import processes that bypass constraints
Application bugs with direct database access
Disabled constraints during bulk loads

Finding Constraint Violations:

-- Find orphan foreign keys (child without parent)
SELECT o.*
FROM orders o
LEFT JOIN customers c ON o.customer_id = c.customer_id
WHERE c.customer_id IS NULL;

-- Find duplicates (would violate uniqueness)
SELECT email, COUNT(*)
FROM users
GROUP BY email
HAVING COUNT(*) > 1;

-- Find check constraint violations
SELECT * FROM products WHERE price < 0;

-- Find NULL violations (assuming column should be NOT NULL)
SELECT * FROM employees WHERE hire_date IS NULL;

Data Quality Is an Ongoing Concern

Empty and Non-Empty Instances

A special case worth understanding is the empty relation instance—a relation with zero tuples.

The Empty Instance:

An empty instance r = {} (or r = ∅) satisfies:

|r| = 0 (cardinality is zero)
The schema is fully defined (columns, types, constraints exist)
The relation legally exists; it just contains no facts

When Empty Instances Occur:

Common Scenarios for Empty Relations

•After CREATE TABLE: Every new table starts empty
•After DELETE all: DELETE FROM table; removes all rows
•Query results: Queries can return empty result sets
•Filter no matches: WHERE clause matches nothing
•Empty join: Join conditions match no pairs
•Future data: Tables for events that haven't happened yet

Algebraic Properties of Empty Relations:

Empty relations have interesting algebraic behaviors:

Operation	With Empty Relation	Result
R ∪ ∅	Union with empty	R (unchanged)
R ∩ ∅	Intersection with empty	∅ (empty)
R − ∅	Difference with empty	R (unchanged)
∅ − R	Empty minus anything	∅ (empty)
R × ∅	Cartesian product with empty	∅ (empty)
R ⋈ ∅	Join with empty	∅ (empty)
σ(∅)	Selection on empty	∅ (empty)
π(∅)	Projection on empty	∅ (empty)

These properties are important for query optimization—the optimizer can short-circuit operations involving empty relations.

Handling Empty Results in Applications:

Empty results are legitimate and common. Applications must handle them gracefully:

// Bad: assumes results exist
const user = await db.query('SELECT * FROM users WHERE id = ?', [userId]);
console.log(user.name);  // ERROR if user not found!

// Good: check for empty result
const users = await db.query('SELECT * FROM users WHERE id = ?', [userId]);
if (users.length === 0) {
    throw new NotFoundError('User not found');
}
const user = users[0];

Distinguishing Causes:

An empty result can mean:

No data exists (legitimate empty)
Filter is wrong (bug in query)
Data exists but doesn't match (understanding issue)

Always verify: if you expect data and get none, investigate before assuming correctness.

Temporal Aspects of Instances

Relation instances change over time. Understanding temporal aspects is crucial for auditing, debugging, and applications that need historical data.

Current vs. Historical State:

Current Instance: What the relation contains right now
Historical Instance: What the relation contained at some past time
Temporal Instance: Relations that explicitly model time in their schema

Why Historical Data Matters:

Use Cases for Historical Data

•Auditing: Who changed what, when? Required for compliance (SOX, GDPR).
•Debugging: What was the state when the bug occurred?
•Analytics: How have metrics trended over time?
•Recovery: Restore to a known good state after corruption.
•Business Logic: Prices are effective between dates; historical lookups needed.

Approaches to Historical Data:

1. Audit Logging: Separate tables record changes:

CREATE TABLE employee_audit (
    audit_id SERIAL PRIMARY KEY,
    emp_id INTEGER NOT NULL,
    action VARCHAR(10) NOT NULL,  -- INSERT, UPDATE, DELETE
    old_values JSONB,
    new_values JSONB,
    changed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    changed_by VARCHAR(100)
);

2. Temporal Tables (SQL:2011 Standard): System-versioned tables automatically track history:

-- PostgreSQL syntax
CREATE TABLE employees (
    emp_id INTEGER PRIMARY KEY,
    emp_name VARCHAR(100),
    salary DECIMAL(12,2),
    valid_from TIMESTAMP GENERATED ALWAYS AS ROW START,
    valid_to TIMESTAMP GENERATED ALWAYS AS ROW END,
    PERIOD FOR SYSTEM_TIME (valid_from, valid_to)
) WITH SYSTEM VERSIONING;

-- Query historical state
SELECT * FROM employees 
FOR SYSTEM_TIME AS OF '2023-06-15 10:00:00';

3. Soft Deletes: Mark records as deleted rather than removing:

ALTER TABLE employees ADD COLUMN deleted_at TIMESTAMP NULL;

-- "Delete" by marking
UPDATE employees SET deleted_at = CURRENT_TIMESTAMP WHERE emp_id = 1001;

-- Query active records only
SELECT * FROM employees WHERE deleted_at IS NULL;

Choosing a Temporal Strategy

Summary: Instances Complete the Picture

We've now completed our exploration of relations and tables, from abstract mathematical foundations to concrete instances. Let's synthesize what we've learned:

Key Concepts Mastered

•Instance Definition: A relation instance is a finite subset of the Cartesian product of domains, representing the facts currently asserted as true.
•Cardinality and Degree: Cardinality (row count) is dynamic; degree (column count) is structural. Both impact query performance significantly.
•DML Operations: INSERT, UPDATE, and DELETE transform instances. Each operation is a state transition that must preserve constraint satisfaction.
•State Transitions: Every operation moves the database from one valid state to another. Failed operations leave the state unchanged.
•Valid Instances: Instances must satisfy all schema constraints—domain compatibility, key uniqueness, referential integrity, and CHECK conditions.
•Empty Relations: The empty instance (∅) is valid and has algebraic properties important for query optimization.
•Temporal Dimension: Instances change over time. Auditing, temporal tables, and soft deletes capture historical states for compliance and analysis.

The Complete Picture:

Concept	Role	Key Points
Relation (mathematical)	Formal foundation	Set of tuples, subset of Cartesian product
Relation Schema	Structure definition	Attributes, domains, constraints
Relation Instance	Current data	Dynamic, constrained, finite
Table	Implementation	SQL approximation of relation
DML Operations	State transformation	INSERT/UPDATE/DELETE create new valid states

Module Completion:

Module Complete: Relations and Tables

5 / 5