Imagine transferring $1,000 from your savings to checking account. The bank's database must subtract from savings and add to checking. If the system crashes after the subtraction but before the addition, where did the money go? If two simultaneous transfers read the same balance, will one overwrite the other?
These are not hypothetical concerns—they're the daily reality of database systems processing millions of concurrent operations. The component that prevents such catastrophes is the Transaction Manager, arguably the most critical guardian in the entire DBMS architecture.
The Transaction Manager ensures that databases remain consistent even under concurrent access and system failures. It implements the legendary ACID properties that distinguish real databases from mere file storage. This page takes you deep into transaction management—the mechanisms that make reliable data systems possible.
By the end of this page, you will understand ACID properties in depth, concurrency control through locking and MVCC, isolation levels and their trade-offs, deadlock detection and resolution, and the fundamentals of transaction logging for recovery.
ACID is the acronym that defines what it means for a transaction to be processed reliably. These four properties—Atomicity, Consistency, Isolation, and Durability—are non-negotiable guarantees that the Transaction Manager must provide.
Understanding ACID deeply reveals why databases are far more reliable than ad-hoc file manipulation, and why implementing these guarantees requires sophisticated machinery.
Atomicity means a transaction is an indivisible unit—either all its operations complete, or none of them do. There is no partial execution.
The classic example:
```sql
BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 1000 WHERE id = 'savings';
UPDATE accounts SET balance = balance + 1000 WHERE id = 'checking';
COMMIT;
```
If the system crashes after the first UPDATE but before the second, atomicity requires that the first UPDATE is undone—the savings account gets its $1,000 back.
How it's implemented:
Write-Ahead Logging (WAL): Before modifying data pages, the Transaction Manager writes log records describing the change. If a crash occurs, the log enables undoing incomplete transactions.
Undo Information: Each modification records the old value. During recovery, incomplete transactions are rolled back using this undo information.
Shadow Paging (alternative): Some systems maintain two versions of pages—the current version and a shadow. Commit makes the shadow official; crash reverts to the old version.
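A minimal sketch of atomicity from the application's point of view, using the same hypothetical `accounts` table as above: an explicit ROLLBACK discards the partial transfer, just as crash recovery would for an incomplete transaction.

```sql
-- Atomicity in action: a failed transfer leaves no trace.
BEGIN;
UPDATE accounts SET balance = balance - 1000 WHERE id = 'savings';
-- Suppose a business-rule check fails here (e.g., the balance went negative).
-- ROLLBACK discards the partial transfer; the savings row is unchanged:
ROLLBACK;

-- Reads as if the transaction never ran.
SELECT balance FROM accounts WHERE id = 'savings';
```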
Maintaining atomicity requires logging every modification before it happens. This 'write twice' pattern (once to log, once to data) is a fundamental overhead of transactional systems. SSDs have made this overhead more tolerable by reducing log write latency.
| Property | Guarantee | Implementation Mechanism | Cost |
|---|---|---|---|
| Atomicity | All or nothing execution | WAL with undo logging, rollback | Double writes to log + data |
| Consistency | Valid state to valid state | Constraint checking, triggers | Validation overhead |
| Isolation | Transactions don't interfere | Locking, MVCC, validation | Reduced concurrency |
| Durability | Committed = permanent | WAL with forced writes | Sync I/O latency |
Concurrency control is how the Transaction Manager enforces isolation. The classic approach is locking: transactions must acquire locks on data items before accessing them, preventing conflicting access by other transactions.
Lock modes:
- Shared (S): acquired for reads; many transactions may hold S locks on the same item at once.
- Exclusive (X): acquired for writes; only one transaction may hold an X lock on an item, and no S locks may coexist with it.
Lock compatibility matrix:
| Requesting \ Holding | None | Shared (S) | Exclusive (X) |
|---|---|---|---|
| Shared (S) | ✅ Grant | ✅ Grant | ❌ Block |
| Exclusive (X) | ✅ Grant | ❌ Block | ❌ Block |
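The matrix can be observed directly with row-level locks. A two-session sketch in PostgreSQL syntax, assuming the same hypothetical `accounts` table:

```sql
-- Session 1: acquire a shared (S) lock on one row.
BEGIN;
SELECT * FROM accounts WHERE id = 'A123' FOR SHARE;

-- Session 2: a second S lock on the same row is granted (S + S compatible).
BEGIN;
SELECT * FROM accounts WHERE id = 'A123' FOR SHARE;   -- returns immediately

-- Session 2: an exclusive (X) lock conflicts with session 1's S lock...
SELECT * FROM accounts WHERE id = 'A123' FOR UPDATE;  -- ...so this blocks
-- until session 1 commits or rolls back.
```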
Two-Phase Locking (2PL):
Simply locking before access isn't enough for serializability. The Two-Phase Locking protocol requires:
1. Growing phase: the transaction may acquire locks but must not release any.
2. Shrinking phase: once the transaction releases its first lock, it must not acquire any more.
In practice, most systems use Strict 2PL: hold all exclusive locks until transaction ends (commit/abort). This prevents cascading rollbacks and simplifies recovery.
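Strict 2PL is visible from the client: locks persist past the statement that acquired them. A sketch, again assuming the hypothetical `accounts` table:

```sql
-- Session 1: the X lock outlives the UPDATE statement.
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 'A123';
-- The UPDATE is finished, but under Strict 2PL the row lock is still held...

-- Session 2: ...so this blocks, even though session 1 is now idle.
UPDATE accounts SET balance = balance + 50 WHERE id = 'A123';

-- Session 1: only COMMIT (or ROLLBACK) releases the lock. Because no other
-- transaction could touch the uncommitted row, an abort never cascades.
COMMIT;
```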
Lock Granularity:
Locks can be acquired at different levels of the data hierarchy. Each level involves trade-offs:
| Granularity | Concurrency | Overhead | Use Case |
|---|---|---|---|
| Database lock | Minimal | Trivial | DDL operations, backup |
| Table lock | Low | Very low | Bulk operations, TRUNCATE |
| Page lock | Medium | Low | Some simpler DBMSs |
| Row lock | High | Medium | OLTP workloads (default) |
| Field lock | Maximum | High | Rarely implemented |
```sql
-- Explicit table locking (PostgreSQL)
BEGIN;
LOCK TABLE accounts IN ACCESS EXCLUSIVE MODE;
-- No other transaction can read or write this table
-- until we commit/rollback

-- Advisory locks (application-coordinated locking)
SELECT pg_advisory_lock(12345);    -- Acquire lock on key 12345
-- ... do work ...
SELECT pg_advisory_unlock(12345);

-- MySQL: SELECT ... FOR UPDATE (row-level X locks)
BEGIN;
SELECT * FROM accounts WHERE id = 'A123' FOR UPDATE;
-- Row is now exclusively locked until transaction ends
UPDATE accounts SET balance = balance - 100 WHERE id = 'A123';
COMMIT;

-- SELECT ... FOR SHARE (row-level S locks)
SELECT * FROM accounts WHERE id = 'A123' FOR SHARE;
-- Others can read but not modify this row

-- View current locks (PostgreSQL)
SELECT pid, mode, granted, relation::regclass
FROM pg_locks WHERE relation IS NOT NULL;
```

When a transaction holds many row locks, the DBMS may 'escalate' to a table lock to reduce overhead. This can cause unexpected blocking. SQL Server is notable for aggressive lock escalation; PostgreSQL does not escalate. Understanding your DBMS's escalation behavior is important for high-concurrency workloads.
Multi-Version Concurrency Control (MVCC) is an alternative to locking that dramatically improves read-write concurrency. Instead of blocking readers when a writer modifies data, MVCC maintains multiple versions of each row, allowing readers to see a consistent snapshot without blocking.
The key insight: Readers never block writers; writers never block readers.
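A two-session sketch of that property (PostgreSQL, hypothetical `accounts` row starting at balance 1000):

```sql
-- Session 1: start a write, but do not commit yet.
BEGIN;
UPDATE accounts SET balance = 900 WHERE id = 'A123';

-- Session 2: under MVCC this read does NOT block; it simply sees
-- the last committed version of the row.
SELECT balance FROM accounts WHERE id = 'A123';  -- returns 1000, not 900

-- Session 1 commits; snapshots taken after this point see 900.
COMMIT;
```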
How MVCC works:
1. Every write creates a new version of the row, tagged with the writing transaction's ID; the old version is kept.
2. Each transaction reads from a snapshot: it sees only versions committed before the snapshot was taken.
3. Versions no longer visible to any active transaction are eventually garbage-collected.
```sql
-- PostgreSQL: Each row has hidden system columns
-- xmin: Transaction ID that created this row version
-- xmax: Transaction ID that deleted/updated this row (0 if live)

SELECT xmin, xmax, * FROM accounts WHERE id = 'A123';
-- xmin | xmax | id   | balance
-- 1234 | 0    | A123 | 1000

-- After an UPDATE:
BEGIN;
UPDATE accounts SET balance = 900 WHERE id = 'A123';
-- Old version: xmin=1234, xmax=1240 (marked deleted by txn 1240)
-- New version: xmin=1240, xmax=0   (created by txn 1240)
COMMIT;

-- Visibility rule (simplified):
-- A row version is visible to transaction T if:
--   xmin is committed and xmin < T's snapshot, AND
--   xmax is either 0 OR uncommitted OR xmax >= T's snapshot

-- VACUUM removes versions no longer visible to any transaction
VACUUM ANALYZE accounts;

-- Check for bloat (too many dead versions)
SELECT relname, n_live_tup, n_dead_tup FROM pg_stat_user_tables;
```

PostgreSQL stores versions in the main table (heap), requiring VACUUM. MySQL/InnoDB uses a separate undo log for old versions, with automatic purge. Oracle stores old versions in rollback segments. The trade-offs affect performance characteristics: PostgreSQL updates are slower (each one writes a new version into the heap) but reads avoid indirection; InnoDB updates are faster, but long-running transactions can exhaust undo space.
The SQL standard defines four isolation levels, each offering different trade-offs between anomaly prevention and concurrency. Understanding these levels is critical for designing correct applications.
The isolation level spectrum, from weakest (most concurrent) to strongest (most consistent):
| Isolation Level | Dirty Read | Non-Repeatable Read | Phantom Read | Concurrency |
|---|---|---|---|---|
| READ UNCOMMITTED | Possible ⚠️ | Possible ⚠️ | Possible ⚠️ | Maximum |
| READ COMMITTED | Prevented ✅ | Possible ⚠️ | Possible ⚠️ | High |
| REPEATABLE READ | Prevented ✅ | Prevented ✅ | Possible ⚠️ | Medium |
| SERIALIZABLE | Prevented ✅ | Prevented ✅ | Prevented ✅ | Low |
READ COMMITTED is the default in PostgreSQL, Oracle, and SQL Server. It guarantees that you only see committed data—no dirty reads.
Behavior:
- Each statement sees a snapshot of the data as committed at the moment that statement begins.
- Two identical queries within one transaction can return different results if another transaction commits in between.
Practical implications:
- Non-repeatable reads and phantoms are possible, as shown below.
- Read-modify-write logic (read a value, compute, write it back) can silently lose updates; use SELECT ... FOR UPDATE or atomic updates (SET balance = balance - 100) instead.
```sql
-- Non-repeatable read at READ COMMITTED:

-- Transaction A:
BEGIN;
SELECT balance FROM accounts WHERE id = 'A123';  -- Returns: 1000

-- Transaction B (concurrently):
BEGIN;
UPDATE accounts SET balance = 500 WHERE id = 'A123';
COMMIT;

-- Transaction A, same query again:
SELECT balance FROM accounts WHERE id = 'A123';  -- Returns: 500 (different!)
COMMIT;

-- Transaction A saw different values for the same query.
-- This is allowed at READ COMMITTED.
```
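Phantom reads are also possible at READ COMMITTED: a re-executed range query can see rows that did not exist before. A sketch, assuming the same hypothetical `accounts` table:

```sql
-- Phantom read at READ COMMITTED:

-- Transaction A:
BEGIN;
SELECT COUNT(*) FROM accounts WHERE balance > 500;  -- Returns: 10

-- Transaction B (concurrently):
BEGIN;
INSERT INTO accounts (id, balance) VALUES ('A999', 1000);
COMMIT;

-- Transaction A, same range query again:
SELECT COUNT(*) FROM accounts WHERE balance > 500;  -- Returns: 11 (phantom row)
COMMIT;
```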
Setting the isolation level:

```sql
-- PostgreSQL: Set for transaction
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- ... operations ...
COMMIT;

-- PostgreSQL: Set session default
SET default_transaction_isolation = 'repeatable read';

-- MySQL: Session level
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Check current level
SELECT @@transaction_isolation;       -- MySQL
SHOW transaction_isolation;           -- PostgreSQL (current transaction)
SHOW default_transaction_isolation;   -- PostgreSQL (session default)
```

A deadlock occurs when two or more transactions are each waiting for locks held by the others, creating a cycle of dependencies. None can proceed; the system is stuck.
Classic example:
T1: Lock(A), then wants Lock(B)
T2: Lock(B), then wants Lock(A)
T1 holds A, waits for B. T2 holds B, waits for A. Neither can proceed.
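The cycle is easy to reproduce with two rows and two sessions. A sketch using hypothetical rows 'A' and 'B' in the `accounts` table:

```sql
-- Session 1:
BEGIN;
UPDATE accounts SET balance = balance - 10 WHERE id = 'A';  -- locks row A

-- Session 2:
BEGIN;
UPDATE accounts SET balance = balance - 10 WHERE id = 'B';  -- locks row B

-- Session 1 (now waits for row B, held by session 2):
UPDATE accounts SET balance = balance + 10 WHERE id = 'B';

-- Session 2 (now waits for row A, held by session 1 -> cycle):
UPDATE accounts SET balance = balance + 10 WHERE id = 'A';
-- ERROR:  deadlock detected
-- One session is aborted as the victim; the other's UPDATE then proceeds.
```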
Deadlock handling strategies:
- Detection: build a waits-for graph, check it for cycles, and abort one transaction (the victim) to break the cycle. This is what PostgreSQL and MySQL do, as shown below.
- Prevention: impose a global ordering on lock acquisition (or use schemes like wait-die/wound-wait) so cycles cannot form.
- Timeout: abort any transaction that waits too long, on the assumption that it may be deadlocked.
```sql
-- PostgreSQL deadlock detection is automatic
-- Default deadlock_timeout is 1 second
SHOW deadlock_timeout;
-- After waiting 1s, PostgreSQL checks for deadlock

-- When deadlock detected:
-- ERROR:  deadlock detected
-- DETAIL: Process 12345 waits for ShareLock on transaction 6789;
--         blocked by process 12346.
--         Process 12346 waits for ShareLock on transaction 6788;
--         blocked by process 12345.
-- HINT:   See server log for query details.

-- Application must catch and retry:
-- try:
--     execute_transaction()
-- except DeadlockError:
--     rollback()
--     retry_after_delay()

-- MySQL: innodb_deadlock_detect = ON (default)
-- MySQL: innodb_lock_wait_timeout = 50 (seconds)

-- Best practice: Acquire locks in consistent order
-- If both transactions lock accounts in id order:
--   T1: Lock(A), Lock(B)  -- A < B
--   T2: Lock(A), Lock(B)  -- same order
-- No deadlock possible!
```

The most effective deadlock prevention is consistent lock ordering: always access rows in the same order (e.g., by primary key). Keep transactions short; long transactions hold locks longer, increasing deadlock probability. Access the 'hottest' resource last when possible.
Write-Ahead Logging (WAL) is the backbone of ACID compliance. The principle is simple: before modifying any data page, write a log record describing the change. This log enables both atomicity (undo incomplete transactions) and durability (redo committed transactions after crash).
The WAL protocol:
1. Before a modified data page is written to disk, all log records describing its changes must already be on stable storage.
2. Before a transaction is acknowledged as committed, its commit record (and everything preceding it in the log) must be flushed to stable storage.
3. Data pages themselves can be written lazily; the log is the source of truth for recovery.
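The flush-before-commit rule is what makes commits durable, and it is tunable. A sketch of the relevant PostgreSQL settings (real parameters; the values shown are illustrative):

```sql
-- Durability vs. latency: how aggressively the commit record is flushed.

-- Default: COMMIT waits until its WAL records are flushed to disk.
SHOW synchronous_commit;      -- on

-- Trading durability for latency: COMMIT returns before the flush.
-- A crash may lose the last few hundred milliseconds of commits,
-- but the database is never corrupted (WAL ordering is preserved).
SET synchronous_commit = off;

-- How the flush itself is performed (platform-dependent).
SHOW wal_sync_method;         -- e.g., fdatasync
```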
Recovery Process (ARIES Algorithm):
When the system crashes and restarts, recovery proceeds in three phases:
1. Analysis Phase: Scan log from last checkpoint. Identify which transactions were active at crash time. Determine which pages may have uncommitted changes.
2. Redo Phase: Replay the log forward, applying all changes (committed and uncommitted). This brings the database to the exact state at crash time—ARIES calls this 'repeating history.'
3. Undo Phase: Roll back uncommitted transactions by applying undo information from log records in reverse order. Compensation log records (CLRs) are written so the undo work itself survives a crash during recovery.
After recovery, the database contains exactly the committed state—no more, no less.
```sql
-- PostgreSQL WAL settings
SHOW wal_level;           -- replica, logical (for replication)
SHOW wal_buffers;         -- Size of WAL buffer in memory
SHOW checkpoint_timeout;  -- Max time between checkpoints
SHOW max_wal_size;        -- Trigger checkpoint if WAL grows this large

-- View current WAL position
SELECT pg_current_wal_lsn();

-- View WAL statistics
SELECT * FROM pg_stat_wal;

-- MySQL InnoDB redo log
SHOW VARIABLES LIKE 'innodb_log%';
-- innodb_log_buffer_size:    Memory buffer
-- innodb_log_file_size:      Size of each log file
-- innodb_log_files_in_group: Number of log files
```

Without checkpoints, recovery would need to replay the entire log from database creation. Checkpoints record which pages are dirty and their log positions. Recovery starts from the last checkpoint, dramatically reducing recovery time. Frequent checkpoints speed recovery but consume I/O; tuning is workload-specific.
The Transaction Manager is the guardian of data integrity, implementing ACID properties through sophisticated concurrency control and logging mechanisms.
Key takeaways:
- ACID (Atomicity, Consistency, Isolation, Durability) defines reliable transaction processing; each property has a concrete implementation mechanism and a cost.
- Locking with Strict Two-Phase Locking provides serializability; lock granularity trades concurrency against overhead.
- MVCC keeps multiple row versions so readers and writers never block each other.
- Isolation levels trade anomaly prevention (dirty reads, non-repeatable reads, phantoms) against concurrency; READ COMMITTED is a common default.
- Deadlocks are detected as cycles in the waits-for graph and resolved by aborting a victim; consistent lock ordering prevents them entirely.
- Write-ahead logging plus ARIES-style recovery (analysis, redo, undo) delivers atomicity and durability across crashes.
What's next:
The Transaction Manager coordinates closely with the Buffer Manager to ensure durability—dirty pages must be written in a specific order relative to log records. The next page explores the Buffer Manager, the component that bridges the massive speed gap between memory and disk through intelligent caching.
You now understand how the Transaction Manager implements ACID properties through locking, MVCC, isolation levels, and write-ahead logging. You can appreciate the trade-offs between consistency and concurrency, and why transaction management is considered the heart of database reliability.