Loading learning content...
If the relation schema is the architectural blueprint, the relation instance is the building itself—the actual structure that exists at a particular moment. Instances are what databases are ultimately about: storing, retrieving, and manipulating real data that represents real-world facts.
While schemas define what's possible, instances capture what's true. The same schema can have infinitely many valid instances, ranging from the empty relation (no tuples) to massive collections of millions of records. Understanding instances—their properties, their operations, and their constraints—is essential for effective database work.
This page completes our exploration of relations and tables by examining the dynamic, ever-changing content that gives databases their value.
By the end of this page, you will: (1) Formally define and understand relation instances, (2) Recognize the properties instances must satisfy, (3) Understand instance operations (insert, update, delete) and their effects, (4) Grasp how constraints govern valid instances, (5) Appreciate the temporal nature of instance states.
A relation instance (also called a relation state or relation extension) is the set of tuples that a relation contains at a particular moment in time.
Formal Definition:
Given a relation schema R(A₁:D₁, A₂:D₂, ..., Aₙ:Dₙ), a relation instance r of R is:
r ⊆ D₁ × D₂ × ... × Dₙ
such that:
Notation:
We write r(R) to denote an instance r of schema R. At different times, we might have r₁(R), r₂(R), etc., representing different states of the same relation.
Employee schema with instances at time t₁ and t₂Instance at t₁ (Monday):
r₁(Employee) = {
(1001, 'Alice', 'Eng', 90000),
(1002, 'Bob', 'Sales', 75000)
}
Cardinality: 2
Instance at t₂ (Tuesday, after changes):
r₂(Employee) = {
(1001, 'Alice', 'Eng', 95000), -- salary updated
(1002, 'Bob', 'Sales', 75000),
(1003, 'Carol', 'Eng', 85000) -- new employee added
}
Cardinality: 3Same schema, different instances. The schema's structure didn't change (still 4 attributes), but the content evolved. Both instances are valid because each tuple conforms to the schema and satisfies all constraints.
Key Instance Properties:
| Property | Meaning | Implication |
|---|---|---|
| Finite | Always a finite set of tuples | No infinite relations in practice |
| Subset | r ⊆ Cartesian product of domains | Only valid combinations included |
| Dynamic | Changes over time | Today's truth may differ from tomorrow's |
| Constrained | Must satisfy schema constraints | Not all possible tuples are allowed |
| Homogeneous | All tuples have same degree n | Uniform structure within a relation |
The complete database instance (all relation instances together) represents the database STATE at a moment—a snapshot of all facts we believe to be true. Database operations are state transitions: they transform one valid instance into another valid instance.
Two fundamental metrics characterize every relation instance: cardinality (number of tuples) and degree (number of attributes). Understanding these metrics is crucial for query optimization, capacity planning, and understanding query results.
Cardinality (|r|):
The cardinality of instance r is the number of tuples it contains. Cardinality is:
Degree (deg(R)):
The degree of a relation (also called arity) is the number of attributes. Degree is:
| Relation | Typical Degree | Typical Cardinality | Storage Implications |
|---|---|---|---|
| User accounts | 15-30 columns | Millions to billions | Core table, heavily indexed |
| Product catalog | 20-50 columns | Thousands to millions | Read-heavy, often cached |
| Order items | 5-10 columns | Billions over time | Massive, often partitioned |
| Config settings | 3-5 columns | Hundreds | Small, frequently read |
| Audit log | 10-15 columns | Billions, append-only | Time-partitioned, archived |
Impact on Query Performance:
Both cardinality and degree affect query execution:
High Cardinality (Many Rows):
High Degree (Many Columns):
Cardinality Estimation:
Query optimizers estimate result cardinalities to choose execution plans. If the optimizer estimates a join will produce 100 rows but it actually produces 1,000,000, the chosen plan may be catastrophically wrong. Understanding cardinality helps you:
Selectivity is the fraction of tuples that satisfy a predicate: selectivity = |σ(r)| / |r|. High selectivity (close to 1) means most rows match; low selectivity (close to 0) means few match. Index efficiency depends heavily on selectivity—indexes are most valuable for low-selectivity predicates.
Relations instances change through three fundamental operations: INSERT, UPDATE, and DELETE. These Data Manipulation Language (DML) operations are state transitions—they transform one valid instance into another.
INSERT: Adding Tuples
INSERT adds one or more new tuples to a relation instance.
-- Single tuple insertion
INSERT INTO employees (emp_id, emp_name, dept, salary)
VALUES (1004, 'Diana', 'Marketing', 72000);
-- Multi-tuple insertion
INSERT INTO employees (emp_id, emp_name, dept, salary)
VALUES
(1005, 'Edward', 'Eng', 88000),
(1006, 'Fiona', 'Sales', 76000);
-- Insertion from query
INSERT INTO archived_employees
SELECT * FROM employees WHERE termination_date < '2020-01-01';
INSERT Constraints:
UPDATE: Modifying Tuples
UPDATE changes attribute values in existing tuples.
-- Update specific tuple
UPDATE employees
SET salary = 98000
WHERE emp_id = 1001;
-- Update multiple tuples
UPDATE employees
SET salary = salary * 1.05
WHERE dept = 'Engineering';
-- Update with subquery
UPDATE employees
SET dept = 'Research'
WHERE dept_id IN (SELECT dept_id FROM departments WHERE is_research = true);
UPDATE Constraints:
DELETE: Removing Tuples
DELETE removes tuples from a relation instance.
-- Delete specific tuple
DELETE FROM employees
WHERE emp_id = 1002;
-- Delete matching tuples
DELETE FROM employees
WHERE termination_date IS NOT NULL;
-- Delete all tuples (careful!)
DELETE FROM employees; -- Removes all rows, table still exists
DELETE Constraints:
DML operations are atomic at the statement level by default. An INSERT of 1000 rows either succeeds completely or fails completely—you won't get partial inserts. Transactions extend this atomicity across multiple statements, ensuring consistent state transitions.
Every DML operation is a state transition: it takes the database from one valid state to another. Understanding state transitions helps you reason about data consistency and transaction correctness.
Formal Model:
Let S be the set of all valid database states (states satisfying all constraints). A DML operation O is a function:
O: S → S ∪ {FAILURE}
If the operation succeeds, we get a new valid state. If it would create an invalid state (constraint violation), the operation fails and the original state is preserved.
State Transition Visualized:
State S₀ State S₁ State S₂
┌─────────────┐ INSERT ┌─────────────┐ UPDATE ┌─────────────┐
│ employees │────────────▶│ employees │────────────▶│ employees │
│ (1001,...)│ │ (1001,...) │ │ (1001,...')│
│ │ │ (1002,...) │ │ (1002,...) │
└─────────────┘ └─────────────┘ └─────────────┘
Cardinality: 1 Cardinality: 2 Cardinality: 2
Transition Classification:
| Operation | Effect on Cardinality | Effect on Data |
|---|---|---|
| INSERT | Increases by n (rows inserted) | New tuples added |
| DELETE | Decreases by n (rows deleted) | Tuples removed |
| UPDATE | Unchanged (usually) | Existing tuples modified |
Note: UPDATE can change cardinality if it modifies a primary key to duplicate an existing key (which typically fails) or in special MERGE/UPSERT scenarios.
Constraint Checking During Transitions:
When a DML operation executes, the DBMS checks constraints:
| Constraint Type | When Checked | What Happens on Violation |
|---|---|---|
| NOT NULL | During INSERT/UPDATE | Statement fails |
| UNIQUE/PRIMARY KEY | During INSERT/UPDATE | Statement fails |
| CHECK | During INSERT/UPDATE | Statement fails |
| FOREIGN KEY (referenced) | During DELETE/UPDATE of parent | Fail or CASCADE |
| FOREIGN KEY (referencing) | During INSERT/UPDATE of child | Statement fails if parent missing |
Deferred vs. Immediate Checking:
Some constraints can be deferred until transaction commit:
-- PostgreSQL example: defer foreign key checking
SET CONSTRAINTS my_fk_constraint DEFERRED;
-- Now can insert child before parent in same transaction
INSERT INTO orders (order_id, customer_id) VALUES (1, 999);
INSERT INTO customers (customer_id, name) VALUES (999, 'New Customer');
COMMIT; -- Constraint checked here
Deferred checking allows flexible insertion order but delays error detection.
When debugging data issues, think: 'What sequence of operations transformed the database from expected state to current state?' This forensic thinking, combined with audit logs or temporal tables, helps track down when and how data became incorrect.
Constraints divide the space of all possible instances into valid instances (those satisfying all constraints) and invalid instances (those violating at least one constraint).
The Constraint Satisfaction Perspective:
An instance r is valid for schema R if and only if:
Detecting Invalid States:
The DBMS prevents invalid states through constraint enforcement. But sometimes invalid data enters through:
Finding Constraint Violations:
-- Find orphan foreign keys (child without parent)
SELECT o.*
FROM orders o
LEFT JOIN customers c ON o.customer_id = c.customer_id
WHERE c.customer_id IS NULL;
-- Find duplicates (would violate uniqueness)
SELECT email, COUNT(*)
FROM users
GROUP BY email
HAVING COUNT(*) > 1;
-- Find check constraint violations
SELECT * FROM products WHERE price < 0;
-- Find NULL violations (assuming column should be NOT NULL)
SELECT * FROM employees WHERE hire_date IS NULL;
Constraints prevent new violations but don't fix existing ones. If you add a constraint to a table with dirty data, the constraint may fail to create. Always audit data before adding constraints, fix issues, then add the constraint. Consider data quality checks as part of regular maintenance.
A special case worth understanding is the empty relation instance—a relation with zero tuples.
The Empty Instance:
An empty instance r = {} (or r = ∅) satisfies:
When Empty Instances Occur:
Algebraic Properties of Empty Relations:
Empty relations have interesting algebraic behaviors:
| Operation | With Empty Relation | Result |
|---|---|---|
| R ∪ ∅ | Union with empty | R (unchanged) |
| R ∩ ∅ | Intersection with empty | ∅ (empty) |
| R − ∅ | Difference with empty | R (unchanged) |
| ∅ − R | Empty minus anything | ∅ (empty) |
| R × ∅ | Cartesian product with empty | ∅ (empty) |
| R ⋈ ∅ | Join with empty | ∅ (empty) |
| σ(∅) | Selection on empty | ∅ (empty) |
| π(∅) | Projection on empty | ∅ (empty) |
These properties are important for query optimization—the optimizer can short-circuit operations involving empty relations.
Handling Empty Results in Applications:
Empty results are legitimate and common. Applications must handle them gracefully:
// Bad: assumes results exist
const user = await db.query('SELECT * FROM users WHERE id = ?', [userId]);
console.log(user.name); // ERROR if user not found!
// Good: check for empty result
const users = await db.query('SELECT * FROM users WHERE id = ?', [userId]);
if (users.length === 0) {
throw new NotFoundError('User not found');
}
const user = users[0];
Distinguishing Causes:
An empty result can mean:
Always verify: if you expect data and get none, investigate before assuming correctness.
Relation instances change over time. Understanding temporal aspects is crucial for auditing, debugging, and applications that need historical data.
Current vs. Historical State:
Why Historical Data Matters:
Approaches to Historical Data:
1. Audit Logging: Separate tables record changes:
CREATE TABLE employee_audit (
audit_id SERIAL PRIMARY KEY,
emp_id INTEGER NOT NULL,
action VARCHAR(10) NOT NULL, -- INSERT, UPDATE, DELETE
old_values JSONB,
new_values JSONB,
changed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
changed_by VARCHAR(100)
);
2. Temporal Tables (SQL:2011 Standard): System-versioned tables automatically track history:
-- PostgreSQL syntax
CREATE TABLE employees (
emp_id INTEGER PRIMARY KEY,
emp_name VARCHAR(100),
salary DECIMAL(12,2),
valid_from TIMESTAMP GENERATED ALWAYS AS ROW START,
valid_to TIMESTAMP GENERATED ALWAYS AS ROW END,
PERIOD FOR SYSTEM_TIME (valid_from, valid_to)
) WITH SYSTEM VERSIONING;
-- Query historical state
SELECT * FROM employees
FOR SYSTEM_TIME AS OF '2023-06-15 10:00:00';
3. Soft Deletes: Mark records as deleted rather than removing:
ALTER TABLE employees ADD COLUMN deleted_at TIMESTAMP NULL;
-- "Delete" by marking
UPDATE employees SET deleted_at = CURRENT_TIMESTAMP WHERE emp_id = 1001;
-- Query active records only
SELECT * FROM employees WHERE deleted_at IS NULL;
The right approach depends on requirements: Audit logging adds complexity but captures who/why. Temporal tables are elegant but not universally supported. Soft deletes are simple but don't track changes, only deletions. Consider compliance requirements, query patterns, and storage costs.
We've now completed our exploration of relations and tables, from abstract mathematical foundations to concrete instances. Let's synthesize what we've learned:
The Complete Picture:
| Concept | Role | Key Points |
|---|---|---|
| Relation (mathematical) | Formal foundation | Set of tuples, subset of Cartesian product |
| Relation Schema | Structure definition | Attributes, domains, constraints |
| Relation Instance | Current data | Dynamic, constrained, finite |
| Table | Implementation | SQL approximation of relation |
| DML Operations | State transformation | INSERT/UPDATE/DELETE create new valid states |
Module Completion:
You now understand relations and tables from first principles through practical implementation. This foundation supports everything that follows in relational database study: keys, normalization, algebra, SQL queries, and optimization all build on these concepts.
Congratulations! You've mastered the foundational concepts of the relational model—relations as mathematical sets, schemas as structural blueprints, tables as implementations, and instances as living data. This knowledge forms the bedrock for all database work: design, querying, optimization, and maintenance.