Imagine transferring $1,000 from your savings to checking account. The bank's database must subtract from savings and add to checking. If the system crashes after the subtraction but before the addition, where did the money go? If two simultaneous transfers read the same balance, will one overwrite the other?
These are not hypothetical concerns—they're the daily reality of database systems processing millions of concurrent operations. The component that prevents such catastrophes is the Transaction Manager, arguably the most critical guardian in the entire DBMS architecture.
The Transaction Manager ensures that databases remain consistent even under concurrent access and system failures. It implements the legendary ACID properties that distinguish real databases from mere file storage. This page takes you deep into transaction management—the mechanisms that make reliable data systems possible.
By the end of this page, you will understand ACID properties in depth, concurrency control through locking and MVCC, isolation levels and their trade-offs, deadlock detection and resolution, and the fundamentals of transaction logging for recovery.
ACID is the acronym that defines what it means for a transaction to be processed reliably. These four properties—Atomicity, Consistency, Isolation, and Durability—are non-negotiable guarantees that the Transaction Manager must provide.
Understanding ACID deeply reveals why databases are far more reliable than ad-hoc file manipulation, and why implementing these guarantees requires sophisticated machinery.
Atomicity means a transaction is an indivisible unit—either all its operations complete, or none of them do. There is no partial execution.
The classic example:
```sql
BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 1000 WHERE id = 'savings';
UPDATE accounts SET balance = balance + 1000 WHERE id = 'checking';
COMMIT;
```
If the system crashes after the first UPDATE but before the second, atomicity requires that the first UPDATE is undone—the savings account gets its $1,000 back.
How it's implemented:
Write-Ahead Logging (WAL): Before modifying data pages, the Transaction Manager writes log records describing the change. If a crash occurs, the log enables undoing incomplete transactions.
Undo Information: Each modification records the old value. During recovery, incomplete transactions are rolled back using this undo information.
Shadow Paging (alternative): Some systems maintain two versions of pages—the current version and a shadow. Commit makes the shadow official; crash reverts to the old version.
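A minimal sketch of atomicity from the application's point of view, using the same hypothetical `accounts` table as above: an explicit ROLLBACK discards the partial transfer, just as crash recovery would for an incomplete transaction.

```sql
-- Atomicity in action: a failed transfer leaves no trace.
BEGIN;
UPDATE accounts SET balance = balance - 1000 WHERE id = 'savings';
-- Suppose a business-rule check fails here (e.g., the balance went negative).
-- ROLLBACK discards the partial transfer; the savings row is unchanged:
ROLLBACK;

-- Reads as if the transaction never ran.
SELECT balance FROM accounts WHERE id = 'savings';
```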
Maintaining atomicity requires logging every modification before it happens. This 'write twice' pattern (once to log, once to data) is a fundamental overhead of transactional systems. SSDs have made this overhead more tolerable by reducing log write latency.
| Property | Guarantee | Implementation Mechanism | Cost |
|---|---|---|---|
| Atomicity | All or nothing execution | WAL with undo logging, rollback | Double writes to log + data |
| Consistency | Valid state to valid state | Constraint checking, triggers | Validation overhead |
| Isolation | Transactions don't interfere | Locking, MVCC, validation | Reduced concurrency |
| Durability | Committed = permanent | WAL with forced writes | Sync I/O latency |
Concurrency control is how the Transaction Manager enforces isolation. The classic approach is locking: transactions must acquire locks on data items before accessing them, preventing conflicting access by other transactions.
Lock modes:
- Shared (S): acquired for reads; many transactions may hold S locks on the same item at once.
- Exclusive (X): acquired for writes; only one transaction may hold an X lock on an item, and no S locks may coexist with it.
Lock compatibility matrix:
| Requesting \ Holding | None | Shared (S) | Exclusive (X) |
|---|---|---|---|
| Shared (S) | ✅ Grant | ✅ Grant | ❌ Block |
| Exclusive (X) | ✅ Grant | ❌ Block | ❌ Block |
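The matrix can be observed directly with row-level locks. A two-session sketch in PostgreSQL syntax, assuming the same hypothetical `accounts` table:

```sql
-- Session 1: acquire a shared (S) lock on one row.
BEGIN;
SELECT * FROM accounts WHERE id = 'A123' FOR SHARE;

-- Session 2: a second S lock on the same row is granted (S + S compatible).
BEGIN;
SELECT * FROM accounts WHERE id = 'A123' FOR SHARE;   -- returns immediately

-- Session 2: an exclusive (X) lock conflicts with session 1's S lock...
SELECT * FROM accounts WHERE id = 'A123' FOR UPDATE;  -- ...so this blocks
-- until session 1 commits or rolls back.
```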
Two-Phase Locking (2PL):
Simply locking before access isn't enough for serializability. The Two-Phase Locking protocol requires:
1. Growing phase: the transaction may acquire locks but must not release any.
2. Shrinking phase: once the transaction releases its first lock, it must not acquire any more.
In practice, most systems use Strict 2PL: hold all exclusive locks until transaction ends (commit/abort). This prevents cascading rollbacks and simplifies recovery.
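Strict 2PL is visible from the client: locks persist past the statement that acquired them. A sketch, again assuming the hypothetical `accounts` table:

```sql
-- Session 1: the X lock outlives the UPDATE statement.
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 'A123';
-- The UPDATE is finished, but under Strict 2PL the row lock is still held...

-- Session 2: ...so this blocks, even though session 1 is now idle.
UPDATE accounts SET balance = balance + 50 WHERE id = 'A123';

-- Session 1: only COMMIT (or ROLLBACK) releases the lock. Because no other
-- transaction could touch the uncommitted row, an abort never cascades.
COMMIT;
```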
Lock Granularity:
Locks can be acquired at different levels of the data hierarchy. Each level involves trade-offs:
| Granularity | Concurrency | Overhead | Use Case |
|---|---|---|---|
| Database lock | Minimal | Trivial | DDL operations, backup |
| Table lock | Low | Very low | Bulk operations, TRUNCATE |
| Page lock | Medium | Low | Some simpler DBMSs |
| Row lock | High | Medium | OLTP workloads (default) |
| Field lock | Maximum | High | Rarely implemented |
```sql
-- Explicit table locking (PostgreSQL)
BEGIN;
LOCK TABLE accounts IN ACCESS EXCLUSIVE MODE;
-- No other transaction can read or write this table
-- until we commit/rollback

-- Advisory locks (application-coordinated locking)
SELECT pg_advisory_lock(12345);    -- Acquire lock on key 12345
-- ... do work ...
SELECT pg_advisory_unlock(12345);

-- MySQL: SELECT ... FOR UPDATE (row-level X locks)
BEGIN;
SELECT * FROM accounts WHERE id = 'A123' FOR UPDATE;
-- Row is now exclusively locked until transaction ends
UPDATE accounts SET balance = balance - 100 WHERE id = 'A123';
COMMIT;

-- SELECT ... FOR SHARE (row-level S locks)
SELECT * FROM accounts WHERE id = 'A123' FOR SHARE;
-- Others can read but not modify this row

-- View current locks (PostgreSQL)
SELECT pid, mode, granted, relation::regclass
FROM pg_locks WHERE relation IS NOT NULL;
```

When a transaction holds many row locks, the DBMS may 'escalate' to a table lock to reduce overhead. This can cause unexpected blocking. SQL Server is notable for aggressive lock escalation; PostgreSQL does not escalate. Understanding your DBMS's escalation behavior is important for high-concurrency workloads.
Multi-Version Concurrency Control (MVCC) is an alternative to locking that dramatically improves read-write concurrency. Instead of blocking readers when a writer modifies data, MVCC maintains multiple versions of each row, allowing readers to see a consistent snapshot without blocking.
The key insight: Readers never block writers; writers never block readers.
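A two-session sketch of that property (PostgreSQL, hypothetical `accounts` row starting at balance 1000):

```sql
-- Session 1: start a write, but do not commit yet.
BEGIN;
UPDATE accounts SET balance = 900 WHERE id = 'A123';

-- Session 2: under MVCC this read does NOT block; it simply sees
-- the last committed version of the row.
SELECT balance FROM accounts WHERE id = 'A123';  -- returns 1000, not 900

-- Session 1 commits; snapshots taken after this point see 900.
COMMIT;
```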
How MVCC works:
1. Every write creates a new version of the row, tagged with the writing transaction's ID; the old version is kept.
2. Each transaction reads from a snapshot: it sees only versions committed before the snapshot was taken.
3. Versions no longer visible to any active transaction are eventually garbage-collected.
```sql
-- PostgreSQL: Each row has hidden system columns
-- xmin: Transaction ID that created this row version
-- xmax: Transaction ID that deleted/updated this row (0 if live)

SELECT xmin, xmax, * FROM accounts WHERE id = 'A123';
-- xmin | xmax | id   | balance
-- 1234 | 0    | A123 | 1000

-- After an UPDATE:
BEGIN;
UPDATE accounts SET balance = 900 WHERE id = 'A123';
-- Old version: xmin=1234, xmax=1240 (marked deleted by txn 1240)
-- New version: xmin=1240, xmax=0   (created by txn 1240)
COMMIT;

-- Visibility rule (simplified):
-- A row version is visible to transaction T if:
--   xmin is committed and xmin < T's snapshot, AND
--   xmax is either 0 OR uncommitted OR xmax >= T's snapshot

-- VACUUM removes versions no longer visible to any transaction
VACUUM ANALYZE accounts;

-- Check for bloat (too many dead versions)
SELECT relname, n_live_tup, n_dead_tup FROM pg_stat_user_tables;
```

PostgreSQL stores versions in the main table (heap), requiring VACUUM. MySQL/InnoDB uses a separate undo log for old versions, with automatic purge. Oracle stores old versions in rollback segments. The trade-offs affect performance characteristics: PostgreSQL updates are slower (each one writes a new version into the heap) but reads avoid indirection; InnoDB updates are faster, but long-running transactions can exhaust undo space.
The SQL standard defines four isolation levels, each offering different trade-offs between anomaly prevention and concurrency. Understanding these levels is critical for designing correct applications.
The isolation level spectrum, from weakest (most concurrent) to strongest (most consistent):
| Isolation Level | Dirty Read | Non-Repeatable Read | Phantom Read | Concurrency |
|---|---|---|---|---|
| READ UNCOMMITTED | Possible ⚠️ | Possible ⚠️ | Possible ⚠️ | Maximum |
| READ COMMITTED | Prevented ✅ | Possible ⚠️ | Possible ⚠️ | High |
| REPEATABLE READ | Prevented ✅ | Prevented ✅ | Possible ⚠️ | Medium |
| SERIALIZABLE | Prevented ✅ | Prevented ✅ | Prevented ✅ | Low |
READ COMMITTED is the default in PostgreSQL, Oracle, and SQL Server. It guarantees that you only see committed data—no dirty reads.
Behavior:
- Each statement sees a snapshot of the data as committed at the moment that statement begins.
- Two identical queries within one transaction can return different results if another transaction commits in between.
Practical implications:
- Non-repeatable reads and phantoms are possible, as shown below.
- Read-modify-write logic (read a value, compute, write it back) can silently lose updates; use SELECT ... FOR UPDATE or atomic updates (SET balance = balance - 100) instead.
```sql
-- Non-repeatable read at READ COMMITTED:

-- Transaction A:
BEGIN;
SELECT balance FROM accounts WHERE id = 'A123';  -- Returns: 1000

-- Transaction B (concurrently):
BEGIN;
UPDATE accounts SET balance = 500 WHERE id = 'A123';
COMMIT;

-- Transaction A, same query again:
SELECT balance FROM accounts WHERE id = 'A123';  -- Returns: 500 (different!)
COMMIT;

-- Transaction A saw different values for the same query.
-- This is allowed at READ COMMITTED.
```
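Phantom reads are also possible at READ COMMITTED: a re-executed range query can see rows that did not exist before. A sketch, assuming the same hypothetical `accounts` table:

```sql
-- Phantom read at READ COMMITTED:

-- Transaction A:
BEGIN;
SELECT COUNT(*) FROM accounts WHERE balance > 500;  -- Returns: 10

-- Transaction B (concurrently):
BEGIN;
INSERT INTO accounts (id, balance) VALUES ('A999', 1000);
COMMIT;

-- Transaction A, same range query again:
SELECT COUNT(*) FROM accounts WHERE balance > 500;  -- Returns: 11 (phantom row)
COMMIT;
```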
Setting the isolation level:

```sql
-- PostgreSQL: Set for transaction
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- ... operations ...
COMMIT;

-- PostgreSQL: Set session default
SET default_transaction_isolation = 'repeatable read';

-- MySQL: Session level
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Check current level
SELECT @@transaction_isolation;       -- MySQL
SHOW transaction_isolation;           -- PostgreSQL (current transaction)
SHOW default_transaction_isolation;   -- PostgreSQL (session default)
```

A deadlock occurs when two or more transactions are each waiting for locks held by the others, creating a cycle of dependencies. None can proceed; the system is stuck.
Classic example:
T1: Lock(A), then wants Lock(B)
T2: Lock(B), then wants Lock(A)
T1 holds A, waits for B. T2 holds B, waits for A. Neither can proceed.
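The cycle is easy to reproduce with two rows and two sessions. A sketch using hypothetical rows 'A' and 'B' in the `accounts` table:

```sql
-- Session 1:
BEGIN;
UPDATE accounts SET balance = balance - 10 WHERE id = 'A';  -- locks row A

-- Session 2:
BEGIN;
UPDATE accounts SET balance = balance - 10 WHERE id = 'B';  -- locks row B

-- Session 1 (now waits for row B, held by session 2):
UPDATE accounts SET balance = balance + 10 WHERE id = 'B';

-- Session 2 (now waits for row A, held by session 1 -> cycle):
UPDATE accounts SET balance = balance + 10 WHERE id = 'A';
-- ERROR:  deadlock detected
-- One session is aborted as the victim; the other's UPDATE then proceeds.
```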
Deadlock handling strategies:
- Detection: build a waits-for graph, check it for cycles, and abort one transaction (the victim) to break the cycle. This is what PostgreSQL and MySQL do, as shown below.
- Prevention: impose a global ordering on lock acquisition (or use schemes like wait-die/wound-wait) so cycles cannot form.
- Timeout: abort any transaction that waits too long, on the assumption that it may be deadlocked.
```sql
-- PostgreSQL deadlock detection is automatic
-- Default deadlock_timeout is 1 second
SHOW deadlock_timeout;
-- After waiting 1s, PostgreSQL checks for deadlock

-- When deadlock detected:
-- ERROR:  deadlock detected
-- DETAIL: Process 12345 waits for ShareLock on transaction 6789;
--         blocked by process 12346.
--         Process 12346 waits for ShareLock on transaction 6788;
--         blocked by process 12345.
-- HINT:   See server log for query details.

-- Application must catch and retry:
-- try:
--     execute_transaction()
-- except DeadlockError:
--     rollback()
--     retry_after_delay()

-- MySQL: innodb_deadlock_detect = ON (default)
-- MySQL: innodb_lock_wait_timeout = 50 (seconds)

-- Best practice: Acquire locks in consistent order
-- If both transactions lock accounts in id order:
--   T1: Lock(A), Lock(B)  -- A < B
--   T2: Lock(A), Lock(B)  -- same order
-- No deadlock possible!
```

The most effective deadlock prevention is consistent lock ordering: always access rows in the same order (e.g., by primary key). Keep transactions short; long transactions hold locks longer, increasing deadlock probability. Access the 'hottest' resource last when possible.
Write-Ahead Logging (WAL) is the backbone of ACID compliance. The principle is simple: before modifying any data page, write a log record describing the change. This log enables both atomicity (undo incomplete transactions) and durability (redo committed transactions after crash).
The WAL protocol:
1. Before a modified data page is written to disk, all log records describing its changes must already be on stable storage.
2. Before a transaction is acknowledged as committed, its commit record (and everything preceding it in the log) must be flushed to stable storage.
3. Data pages themselves can be written lazily; the log is the source of truth for recovery.
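The flush-before-commit rule is what makes commits durable, and it is tunable. A sketch of the relevant PostgreSQL settings (real parameters; the values shown are illustrative):

```sql
-- Durability vs. latency: how aggressively the commit record is flushed.

-- Default: COMMIT waits until its WAL records are flushed to disk.
SHOW synchronous_commit;      -- on

-- Trading durability for latency: COMMIT returns before the flush.
-- A crash may lose the last few hundred milliseconds of commits,
-- but the database is never corrupted (WAL ordering is preserved).
SET synchronous_commit = off;

-- How the flush itself is performed (platform-dependent).
SHOW wal_sync_method;         -- e.g., fdatasync
```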
Recovery Process (ARIES Algorithm):
When the system crashes and restarts, recovery proceeds in three phases:
1. Analysis Phase: Scan log from last checkpoint. Identify which transactions were active at crash time. Determine which pages may have uncommitted changes.
2. Redo Phase: Replay the log forward, applying all changes (committed and uncommitted). This brings the database to the exact state at crash time—ARIES calls this 'repeating history.'
3. Undo Phase: Roll back uncommitted transactions by applying undo information from log records in reverse order. Compensation log records (CLRs) are written so the undo work itself survives a crash during recovery.
After recovery, the database contains exactly the committed state—no more, no less.
```sql
-- PostgreSQL WAL settings
SHOW wal_level;           -- replica, logical (for replication)
SHOW wal_buffers;         -- Size of WAL buffer in memory
SHOW checkpoint_timeout;  -- Max time between checkpoints
SHOW max_wal_size;        -- Trigger checkpoint if WAL grows this large

-- View current WAL position
SELECT pg_current_wal_lsn();

-- View WAL statistics
SELECT * FROM pg_stat_wal;

-- MySQL InnoDB redo log
SHOW VARIABLES LIKE 'innodb_log%';
-- innodb_log_buffer_size:    Memory buffer
-- innodb_log_file_size:      Size of each log file
-- innodb_log_files_in_group: Number of log files
```

Without checkpoints, recovery would need to replay the entire log from database creation. Checkpoints record which pages are dirty and their log positions. Recovery starts from the last checkpoint, dramatically reducing recovery time. Frequent checkpoints speed recovery but consume I/O; tuning is workload-specific.
The Transaction Manager is the guardian of data integrity, implementing ACID properties through sophisticated concurrency control and logging mechanisms.
Key takeaways:
- ACID (Atomicity, Consistency, Isolation, Durability) defines reliable transaction processing; each property has a concrete implementation mechanism and a cost.
- Locking with Strict Two-Phase Locking provides serializability; lock granularity trades concurrency against overhead.
- MVCC keeps multiple row versions so readers and writers never block each other.
- Isolation levels trade anomaly prevention (dirty reads, non-repeatable reads, phantoms) against concurrency; READ COMMITTED is a common default.
- Deadlocks are detected as cycles in the waits-for graph and resolved by aborting a victim; consistent lock ordering prevents them entirely.
- Write-ahead logging plus ARIES-style recovery (analysis, redo, undo) delivers atomicity and durability across crashes.
What's next:
The Transaction Manager coordinates closely with the Buffer Manager to ensure durability—dirty pages must be written in a specific order relative to log records. The next page explores the Buffer Manager, the component that bridges the massive speed gap between memory and disk through intelligent caching.
You now understand how the Transaction Manager implements ACID properties through locking, MVCC, isolation levels, and write-ahead logging. You can appreciate the trade-offs between consistency and concurrency, and why transaction management is considered the heart of database reliability.