Loading learning content...
In any relational database with foreign keys, a fundamental problem lurks: orphan records. What happens when an employee references department 42, but department 42 has been deleted? The employee record now points to nothing—a dangling reference that corrupts data integrity and breaks application logic.
Referential integrity is the mechanism that prevents this corruption. It ensures that every foreign key value either references an existing parent row or is NULL (if permitted). Without referential integrity, databases would silently accumulate inconsistencies, reports would show impossible data, and applications would crash on missing references.
In this page, we explore referential integrity in depth: what it guarantees, how databases enforce it, what happens during insert/update/delete operations, and how to design for integrity while maintaining performance.
By the end of this page, you will understand the formal definition of referential integrity, how databases enforce it on INSERT, UPDATE, and DELETE operations, the five referential actions (RESTRICT, CASCADE, SET NULL, SET DEFAULT, NO ACTION), and best practices for choosing the right action for different scenarios.
Referential integrity is a constraint that ensures consistency between related tables. It is defined as follows:
Formal Definition:
Let R₁ (the referencing or child relation) have a foreign key FK that references the primary key PK of R₂ (the referenced or parent relation). The database satisfies referential integrity if and only if:
∀ t ∈ R₁: t[FK] is NULL OR ∃ s ∈ R₂ such that s[PK] = t[FK]
In plain language: For every row in the child table, the foreign key value is either NULL (if allowed) or matches a primary key value in the parent table.
| Term | Definition | Also Called |
|---|---|---|
| Referencing Table | Table containing the foreign key | Child table, dependent table |
| Referenced Table | Table containing the primary key being referenced | Parent table, master table |
| Foreign Key | Column(s) in child that reference parent's PK | FK, referencing column |
| Referenced Key | Primary (or unique) key in parent table | PK, target key |
| Dangling Reference | FK value with no matching parent row | Orphan reference |
The Two-Way Guarantee:
Referential integrity provides protection in both directions:
Against Invalid Inserts/Updates in Child:
Against Destructive Deletes/Updates in Parent:
Violating either direction would create a dangling reference.
While foreign keys typically reference primary keys, they can reference any column with a UNIQUE constraint. This enables scenarios like referencing a 'code' column instead of an auto-increment ID. The referential integrity guarantee is the same.
The database enforces referential integrity automatically during INSERT, UPDATE, and DELETE operations. Let's examine what happens for each operation and when violations occur.
INSERT:
-- Scenario: Employee with FK to Department
-- ✓ SUCCESS: dept_id 10 exists in Department
INSERT INTO Employee (name, dept_id) VALUES ('Alice', 10);
-- ✗ FAILURE: dept_id 999 doesn't exist
INSERT INTO Employee (name, dept_id) VALUES ('Bob', 999);
-- ERROR: insert or update on table "employee" violates foreign key constraint
-- Key (dept_id)=(999) is not present in table "department"
UPDATE:
-- ✓ SUCCESS: transfer employee to existing department
UPDATE Employee SET dept_id = 20 WHERE emp_id = 1;
-- ✗ FAILURE: 999 doesn't exist
UPDATE Employee SET dept_id = 999 WHERE emp_id = 1;
DELETE:
INSERT:
UPDATE (on PK):
-- Scenario: Department(dept_id) referenced by Employee(dept_id)
-- On Update handling:
-- WITH NO ACTION/RESTRICT: fails if any employee references dept_id=10
UPDATE Department SET dept_id = 100 WHERE dept_id = 10;
-- ERROR: update or delete on table "department" violates foreign key constraint
DELETE:
-- WITH NO ACTION/RESTRICT: fails if employees reference this department
DELETE FROM Department WHERE dept_id = 10;
-- ERROR: update or delete on table "department" violates foreign key constraint
When a parent row is deleted or its PK is updated, the database must decide how to handle referencing children. SQL provides five referential actions to specify this behavior.
| Action | On DELETE | On UPDATE | FK Nullability |
|---|---|---|---|
| RESTRICT | Block deletion if children exist | Block update if children exist | N/A |
| CASCADE | Delete all children | Update FK in all children | N/A |
| SET NULL | Set FK to NULL in children | Set FK to NULL in children | Must allow NULL |
| SET DEFAULT | Set FK to default in children | Set FK to default in children | Default must be valid |
| NO ACTION | Similar to RESTRICT* | Similar to RESTRICT* | N/A |
These are nearly identical in most cases. The difference is timing: RESTRICT checks immediately; NO ACTION can be deferred to end of transaction (with deferrable constraints). In PostgreSQL, NO ACTION is the default and can work with deferred checks. In MySQL, they behave identically.
Detailed Behavior of Each Action:
**Behavior:** Completely blocks the operation if any children exist. **Example:** Deleting Department 10 with 50 employees fails entirely. **Use when:** - Parent is too important to delete casually - Children are independent entities that shouldn't auto-delete - You want explicit handling of dependencies **Example scenario:** RESTRICT deletion of a Customer who has Orders. Force explicit resolution of orders first.
**Behavior:** Automatically propagates the operation to all children. **ON DELETE CASCADE:** Deletes all referencing children. **ON UPDATE CASCADE:** Updates FK in all children to match new PK. **Use when:** - Children are existence-dependent on parent (weak entities) - Child data is meaningless without parent - You want atomic cleanup **Example scenario:** CASCADE deletion of Order to all OrderLines. Lines have no meaning without their order.
**Behavior:** Sets the FK column to NULL in all affected children. **Requirements:** FK column must allow NULL values. **Use when:** - Relationship is optional (partial participation on child) - Children can exist independently in 'orphaned' state - You want to preserve child data while severing the link **Example scenario:** SET NULL when a Manager is deleted, leaving employees without a manager until reassignment.
**Behavior:** Sets the FK column to its DEFAULT value in all affected children. **Requirements:** FK column must have a DEFAULT defined, and that default must be a valid reference (exist in parent). **Use when:** - You have a 'fallback' or 'unassigned' parent record - Children should be reassigned rather than orphaned - Business logic requires a default association **Example scenario:** SET DEFAULT to 'Uncategorized' when a Category is deleted, moving all products to the fallback category.
**Behavior:** Defers the check to end of statement (or transaction if deferred). **Difference from RESTRICT:** With deferrable constraints, you can: 1. Delete parent 2. Delete children (or re-assign) in same transaction 3. Constraint checked at COMMIT **Use when:** - You need flexibility in operation order - Complex multi-table operations require deferred checking - Default behavior is acceptable (same as RESTRICT for immediate)
Selecting the appropriate referential action is a critical design decision. The wrong choice can lead to data loss, integrity violations, or application errors. Use this decision framework:
| Relationship | Entity Type | Recommended ON DELETE | Reasoning |
|---|---|---|---|
| Order → OrderLine | Weak entity | CASCADE | Lines are meaningless without order |
| Blog → Comment | Weak entity | CASCADE | Comments belong to blog post |
| User → Post | Strong entity | RESTRICT or SET NULL | Posts might be preserved with 'deleted user' |
| Department → Employee | Strong entity | RESTRICT | Employees need explicit reassignment |
| Category → Product | Optional | SET DEFAULT | Products move to 'Uncategorized' |
| Manager → Subordinate | Self-reference | SET NULL | Employees become manager-less temporarily |
| Customer → Order | Strong entity | RESTRICT | Orders are important business records |
| Folder → File | Weak/Dependent | CASCADE | Files in folder are logically part of it |
CASCADE can delete vast amounts of data with a single DELETE statement. A single 'DELETE FROM Customer WHERE ...' could cascade to delete thousands of orders, each cascading to delete thousands of line items. Always consider the blast radius before using CASCADE on important data.
Let's examine complete implementations showing different referential actions in context.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101
-- =========================================-- Pattern 1: CASCADE for Weak Entities-- Order → OrderLine (existence-dependent)-- ========================================= CREATE TABLE "Order" ( order_id SERIAL PRIMARY KEY, customer_id INT NOT NULL, order_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, status VARCHAR(20) NOT NULL DEFAULT 'pending'); CREATE TABLE OrderLine ( line_id SERIAL PRIMARY KEY, order_id INT NOT NULL, product_id INT NOT NULL, quantity INT NOT NULL CHECK (quantity > 0), unit_price DECIMAL(10,2) NOT NULL, CONSTRAINT fk_orderline_order FOREIGN KEY (order_id) REFERENCES "Order"(order_id) ON DELETE CASCADE -- Delete order → delete all lines ON UPDATE CASCADE -- Update order_id → update in lines); -- Delete an order: all lines automatically deletedDELETE FROM "Order" WHERE order_id = 100;-- Lines with order_id = 100 are gone -- =========================================-- Pattern 2: RESTRICT for Important Data-- Customer → Order (independent entities)-- ========================================= CREATE TABLE Customer ( customer_id SERIAL PRIMARY KEY, name VARCHAR(200) NOT NULL, email VARCHAR(255) UNIQUE NOT NULL); CREATE TABLE "Order" ( order_id SERIAL PRIMARY KEY, customer_id INT NOT NULL, order_date TIMESTAMP NOT NULL, CONSTRAINT fk_order_customer FOREIGN KEY (customer_id) REFERENCES Customer(customer_id) ON DELETE RESTRICT -- Cannot delete customer with orders ON UPDATE CASCADE -- Customer ID changes propagate); -- Attempt to delete customer with orders: BLOCKEDDELETE FROM Customer WHERE customer_id = 42;-- ERROR: update or delete on table "customer" violates foreign key constraint-- Detail: Key (customer_id)=(42) is still referenced from table "Order" -- =========================================-- Pattern 3: SET NULL for Optional Relationships-- Employee → Manager (self-referential, optional)-- ========================================= CREATE TABLE Employee ( emp_id SERIAL PRIMARY KEY, name VARCHAR(100) NOT NULL, manager_id INT, -- Nullable: can be manager-less CONSTRAINT fk_employee_manager FOREIGN KEY (manager_id) REFERENCES Employee(emp_id) ON DELETE SET NULL -- Manager deleted → subordinates become manager-less ON UPDATE CASCADE); -- Delete a manager: subordinates' manager_id becomes NULLDELETE FROM Employee WHERE emp_id = 10; -- Was a manager-- Employees with manager_id = 10 now have manager_id = NULL -- =========================================-- Pattern 4: SET DEFAULT with Fallback Record-- Product → Category (default to 'Uncategorized')-- ========================================= -- First, create the fallback categoryINSERT INTO Category (category_id, name) VALUES (0, 'Uncategorized'); CREATE TABLE Product ( product_id SERIAL PRIMARY KEY, name VARCHAR(200) NOT NULL, category_id INT NOT NULL DEFAULT 0, -- Default to fallback CONSTRAINT fk_product_category FOREIGN KEY (category_id) REFERENCES Category(category_id) ON DELETE SET DEFAULT -- Category deleted → products become 'Uncategorized' ON UPDATE CASCADE); -- Delete a category: products move to 'Uncategorized' (id=0)DELETE FROM Category WHERE category_id = 5;-- Products with category_id = 5 now have category_id = 0By default, referential integrity is checked immediately after each statement. However, some databases support deferrable constraints that delay checking until the end of the transaction. This enables complex operations that would otherwise fail.
Deferrable constraints are supported in PostgreSQL, Oracle, and SQL Server (with some syntax differences). MySQL does not support deferrable constraints—checks are always immediate.
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677
-- =========================================-- Creating Deferrable Foreign Keys (PostgreSQL)-- ========================================= CREATE TABLE Parent ( parent_id INT PRIMARY KEY, child_ref INT -- Will reference Child later (circular ref)); CREATE TABLE Child ( child_id INT PRIMARY KEY, parent_id INT NOT NULL, CONSTRAINT fk_child_parent FOREIGN KEY (parent_id) REFERENCES Parent(parent_id) DEFERRABLE INITIALLY DEFERRED -- Check at transaction end, not immediately); -- Now Parent can reference Child, creating circular dependencyALTER TABLE ParentADD CONSTRAINT fk_parent_child FOREIGN KEY (child_ref) REFERENCES Child(child_id) DEFERRABLE INITIALLY DEFERRED; -- =========================================-- Using Deferred Checks-- ========================================= BEGIN; -- Insert child referencing non-existent parent (OK: deferred) INSERT INTO Child (child_id, parent_id) VALUES (1, 100); -- Insert parent referencing non-existent child (OK: deferred) INSERT INTO Parent (parent_id, child_ref) VALUES (100, 1); COMMIT; -- NOW constraints are checked - both references are valid! -- =========================================-- Controlling Deferral Within Transaction-- ========================================= BEGIN; -- Make all constraints immediate for this transaction SET CONSTRAINTS ALL IMMEDIATE; -- Or defer specific constraint SET CONSTRAINTS fk_child_parent DEFERRED; -- Operations... COMMIT; -- =========================================-- INITIALLY IMMEDIATE with DEFERRABLE option-- ========================================= CREATE TABLE Item ( item_id INT PRIMARY KEY, category_id INT NOT NULL, CONSTRAINT fk_item_category FOREIGN KEY (category_id) REFERENCES Category(category_id) DEFERRABLE INITIALLY IMMEDIATE -- Checks immediately by default, but CAN be deferred); -- In a transaction, you can defer it:BEGIN; SET CONSTRAINTS fk_item_category DEFERRED; -- Now checks at COMMIT INSERT INTO Item (item_id, category_id) VALUES (1, 999); -- 999 doesn't exist yet INSERT INTO Category (category_id, name) VALUES (999, 'New Category');COMMIT; -- Valid!Referential integrity enforcement isn't free. The database must perform additional work to validate constraints. Understanding the performance implications helps you design efficient schemas.
What the Database Does:
On INSERT/UPDATE (child table):
Cost: One index lookup per FK column. With an index on parent's PK (automatic), this is typically O(log n) and fast.
On DELETE/UPDATE (parent table with RESTRICT):
Cost: This is where indexes on FK columns matter. Without an index on the child's FK column, this is a full table scan—O(n) and potentially very slow.
On DELETE (parent table with CASCADE):
Cost: Potentially large I/O. Cascading through deep hierarchies can be expensive.
The single most important performance optimization for referential integrity is indexing FK columns. Without indexes, every parent DELETE or UPDATE triggers a full table scan on the child. With indexes, it's a quick lookup. PostgreSQL and Oracle do NOT auto-index FKs.
| Action | Operation | Performance Impact | Index Requirement |
|---|---|---|---|
| RESTRICT | Parent DELETE | Check for children | Index on child FK critical |
| CASCADE | Parent DELETE | Delete children (potentially many) | Index helps find children |
| SET NULL | Parent DELETE | Update children | Index helps; updates can be heavy |
| SET DEFAULT | Parent DELETE | Update children | Index helps; updates can be heavy |
| Any | Child INSERT | Lookup parent PK | Parent PK always indexed (auto) |
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253
-- =========================================-- Always Index Foreign Key Columns-- ========================================= CREATE TABLE Employee ( emp_id SERIAL PRIMARY KEY, name VARCHAR(100) NOT NULL, dept_id INT NOT NULL REFERENCES Department(dept_id)); -- THIS INDEX IS CRITICAL for parent table operationsCREATE INDEX idx_employee_dept_id ON Employee(dept_id); -- Without this index:-- DELETE FROM Department WHERE dept_id = 10;-- ↑ Full table scan of Employee to check for references! -- With this index:-- DELETE FROM Department WHERE dept_id = 10;-- ↑ Quick index lookup = fast check -- =========================================-- Bulk Loading: Temporarily Disable Checks-- ========================================= -- PostgreSQL: Session-level FK check toggleSET session_replication_role = 'replica'; -- Disables FK checks -- Load dataCOPY Employee FROM '/data/employees.csv'; SET session_replication_role = 'origin'; -- Re-enables FK checks -- WARNING: Data must be valid or you'll have violations! -- =========================================-- Analyzing Constraint Check Performance-- ========================================= -- PostgreSQL: Check if FK has supporting indexSELECT conname AS constraint_name, conrelid::regclass AS child_table, a.attname AS fk_column, (SELECT COUNT(*) FROM pg_index i JOIN pg_attribute ia ON ia.attrelid = i.indrelid AND ia.attnum = ANY(i.indkey) WHERE i.indrelid = c.conrelid AND ia.attname = a.attname) > 0 AS has_indexFROM pg_constraint cJOIN pg_attribute a ON a.attrelid = c.conrelid AND a.attnum = ANY(c.conkey)WHERE c.contype = 'f'; -- Foreign keys onlyWe've comprehensively explored referential integrity—the mechanism that ensures foreign key validity and prevents orphan records. Let's consolidate the key principles:
Looking Ahead:
With the theoretical and practical foundations of 1:N mapping complete, the final page presents comprehensive real-world examples—complete schemas, mapping walkthroughs, and common patterns that bring together all the concepts covered in this module.
You now have deep understanding of referential integrity in 1:N relationships—how it works, the five referential actions, when to use each, deferrable constraints, and performance optimization through indexing. This knowledge is essential for robust database design.