Imagine you're the only one using the database. Every query returns fresh, consistent data. Every update takes effect immediately. Life is simple.
Now add a thousand concurrent users. Ten thousand. A million. Suddenly, transactions overlap, race conditions emerge, and without careful management, chaos ensues. One user reads data that another user is actively modifying. Two users update the same record simultaneously. The database that seemed so simple becomes a minefield of inconsistency.
Isolation is the ACID property that maintains the illusion of solitude. It ensures that concurrent transactions execute as if they were running sequentially—one after another—even though they're actually interleaved for performance. Each transaction sees a consistent snapshot of the database, unaffected by the unfinished work of other transactions.
This page covers isolation comprehensively: the theoretical foundation, the concurrency anomalies that occur without proper isolation (dirty reads, non-repeatable reads, phantom reads, write skew), the four standard isolation levels defined by SQL, how databases implement isolation through locking and MVCC, and practical guidance for choosing appropriate isolation levels. By the end, you'll be able to reason precisely about concurrent database access.
Databases serve multiple clients simultaneously. Without isolation, the interleaving of operations from different transactions would cause serious problems.
The Serial Execution Baseline:
If we ran one transaction at a time—completely finishing T1 before starting T2—we'd have perfect isolation. Every transaction would see a stable database and could make changes without interference. But this would be catastrophically slow. A database serving one request at a time cannot scale.
The Concurrent Reality:
To achieve performance, databases execute transactions concurrently. Operations from T1 and T2 interleave. This creates potential for transactions to interfere with each other in subtle and dangerous ways.
Isolation's goal: achieve the performance of concurrent execution while maintaining the correctness of serial execution.
```text
SERIAL EXECUTION (Perfectly Safe, Terribly Slow):
──────────────────────────────────────────────────
Time │ T1          │ T2          │
─────┼─────────────┼─────────────┤
  1  │ BEGIN       │             │
  2  │ READ A      │             │
  3  │ WRITE A     │             │
  4  │ READ B      │             │
  5  │ WRITE B     │             │
  6  │ COMMIT      │             │
  7  │             │ BEGIN       │ ◄── T2 waits for T1
  8  │             │ READ A      │
  9  │             │ WRITE A     │
 10  │             │ COMMIT      │

CONCURRENT EXECUTION (Fast, But Dangerous Without Isolation):
──────────────────────────────────────────────────
Time │ T1          │ T2          │
─────┼─────────────┼─────────────┤
  1  │ BEGIN       │ BEGIN       │
  2  │ READ A      │             │
  3  │             │ READ A      │ ◄── Sees T1's uncommitted change?
  4  │ WRITE A     │             │
  5  │             │ WRITE A     │ ◄── Overwrites T1's change?
  6  │ READ B      │ COMMIT      │
  7  │ WRITE B     │             │
  8  │ COMMIT      │             │
```

A concurrent execution is 'serializable' if its outcome is equivalent to some serial execution of the same transactions. This means we get the speed of concurrency while ensuring correctness. Achieving true serializability efficiently is one of the hardest problems in database engineering.
Without proper isolation, several categories of problems—called anomalies or phenomena—can occur. Understanding these anomalies is essential for choosing appropriate isolation levels.
1. Dirty Read
A transaction reads data written by another transaction that hasn't yet committed. If that other transaction rolls back, the first transaction has read data that never officially existed.
```text
Time │ Transaction T1              │ Transaction T2
─────┼─────────────────────────────┼─────────────────────────────
  1  │ BEGIN                       │
  2  │ UPDATE accounts             │
     │   SET balance = 500         │
     │   WHERE id = 'A'            │
     │   (was 1000)                │
  3  │                             │ BEGIN
  4  │                             │ SELECT balance FROM accounts
     │                             │   WHERE id = 'A'
     │                             │ → Returns 500 (DIRTY READ! ⚠️)
  5  │ ROLLBACK                    │
     │ (balance restored to 1000)  │
  6  │                             │ -- T2 thinks balance is 500
     │                             │ -- but it's actually 1000!
     │                             │ COMMIT
```

T2 made a decision based on data that never existed.

2. Non-Repeatable Read (Fuzzy Read)
A transaction reads the same row twice and gets different values because another transaction modified and committed between the reads.
```text
Time │ Transaction T1               │ Transaction T2
─────┼──────────────────────────────┼─────────────────────────────
  1  │ BEGIN                        │
  2  │ SELECT balance FROM accounts │
     │   WHERE id = 'A'             │
     │   → Returns 1000             │
  3  │                              │ BEGIN
  4  │                              │ UPDATE accounts
     │                              │   SET balance = 500
     │                              │   WHERE id = 'A'
  5  │                              │ COMMIT
  6  │ SELECT balance FROM accounts │
     │   WHERE id = 'A'             │
     │   → Returns 500              │
     │   (DIFFERENT! ⚠️)            │
  7  │ -- T1 is confused            │
     │ -- Same query, different     │
     │ -- result within one tx      │
     │ COMMIT                       │
```

3. Phantom Read
A transaction re-executes a query and finds new rows that weren't there before (or missing rows that were), because another transaction inserted/deleted and committed between queries.
```text
Time │ Transaction T1               │ Transaction T2
─────┼──────────────────────────────┼─────────────────────────────
  1  │ BEGIN                        │
  2  │ SELECT COUNT(*) FROM orders  │
     │   WHERE status = 'pending'   │
     │   → Returns 5                │
  3  │                              │ BEGIN
  4  │                              │ INSERT INTO orders (status)
     │                              │   VALUES ('pending')
  5  │                              │ COMMIT
  6  │ SELECT COUNT(*) FROM orders  │
     │   WHERE status = 'pending'   │
     │   → Returns 6                │
     │   (PHANTOM ROW! ⚠️)          │
  7  │ -- T1 counted 5, then 6      │
     │ -- A "phantom" appeared      │
     │ COMMIT                       │
```

4. Lost Update
Two transactions both read a value, both compute an update based on it, and both write back. One transaction's update is silently lost.
```text
Time │ Transaction T1               │ Transaction T2
─────┼──────────────────────────────┼─────────────────────────────
  1  │ BEGIN                        │ BEGIN
  2  │ SELECT balance FROM accounts │
     │   WHERE id = 'A' → 1000      │
  3  │                              │ SELECT balance FROM accounts
     │                              │   WHERE id = 'A' → 1000
  4  │ -- T1 calculates: 1000 + 100 │
     │ UPDATE accounts              │
     │   SET balance = 1100         │
     │   WHERE id = 'A'             │
  5  │                              │ -- T2 calculates: 1000 + 200
     │                              │ UPDATE accounts
     │                              │   SET balance = 1200
     │                              │   WHERE id = 'A'
  6  │ COMMIT                       │ COMMIT

Final balance: 1200
Expected: 1000 + 100 + 200 = 1300
T1's update was LOST! 🔥
```

Lost updates are particularly dangerous because they don't cause errors—they just silently lose data. No exception is thrown. No constraint is violated. The database appears to work perfectly while quietly corrupting data. This is why understanding isolation levels matters so much.
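The same read-modify-write race is easy to reproduce in application code. A minimal in-memory sketch (plain TypeScript, no real database—`Account` stands in for the row, and `atomicIncrement` plays the role of an atomic SQL update like `SET balance = balance + 100`):

```typescript
// In-memory sketch of the lost-update race (not a real database).
type Account = { balance: number };

// Each "transaction" writes a value computed from what it read earlier,
// which may be stale by the time the write happens.
function writeComputedBalance(account: Account, staleRead: number, delta: number): void {
  account.balance = staleRead + delta;
}

const account: Account = { balance: 1000 };

// Both transactions read before either writes: both see 1000.
const t1Read = account.balance; // 1000
const t2Read = account.balance; // 1000

writeComputedBalance(account, t1Read, 100); // T1 writes 1100
writeComputedBalance(account, t2Read, 200); // T2 writes 1200, clobbering T1

console.log(account.balance); // 1200 — T1's +100 is silently lost

// The fix: make the read and write a single atomic step, as the database
// does for UPDATE accounts SET balance = balance + 100 WHERE id = 'A';
function atomicIncrement(acc: Account, delta: number): void {
  acc.balance += delta; // read and write happen together
}

const safe: Account = { balance: 1000 };
atomicIncrement(safe, 100);
atomicIncrement(safe, 200);
console.log(safe.balance); // 1300 — both updates preserved
```

Atomic updates (or `SELECT ... FOR UPDATE`, shown later for write skew) are the standard defenses against lost updates at READ COMMITTED.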
Write skew is a more subtle anomaly that's often overlooked. It occurs when two transactions read overlapping data, make decisions based on that data, then write to different records—but the combination of their writes violates an invariant.
Example: Doctor On-Call Scheduling
Invariant: At least one doctor must always be on call.
```text
Initial state: Alice and Bob are both on_call = true

Time │ Transaction T1 (Alice)      │ Transaction T2 (Bob)
─────┼─────────────────────────────┼─────────────────────────────
  1  │ BEGIN                       │ BEGIN
  2  │ SELECT COUNT(*) FROM doctors│
     │   WHERE on_call = true → 2  │
  3  │                             │ SELECT COUNT(*) FROM doctors
     │                             │   WHERE on_call = true → 2
  4  │ -- "2 on call, safe to      │
     │ --  take myself off"        │
     │ UPDATE doctors              │
     │   SET on_call = false       │
     │   WHERE name = 'Alice'      │
  5  │                             │ -- "2 on call, safe to take
     │                             │ --  myself off"
     │                             │ UPDATE doctors
     │                             │   SET on_call = false
     │                             │   WHERE name = 'Bob'
  6  │ COMMIT                      │ COMMIT

Final state: Nobody is on call! ⚠️
Invariant violated, but neither transaction did anything wrong individually.
```

Why Write Skew Is Tricky:

Each transaction reads the shared state but writes a different row, so there is no write-write conflict for the database to detect. Each write is consistent with what that transaction read—only the combination of the two commits violates the invariant.
The Fix: Serializable Isolation
Write skew can only be prevented by true serializable isolation or explicit locking. Lower isolation levels (even Repeatable Read) allow write skew because they don't prevent reads from returning different on-call counts.
```sql
-- Use SELECT FOR UPDATE to lock the rows we're basing decisions on
BEGIN;

-- Lock all on-call doctors
SELECT * FROM doctors
WHERE on_call = true
FOR UPDATE;  -- Acquires exclusive locks

-- Now the count is stable; other transactions block on our lock
SELECT COUNT(*) FROM doctors WHERE on_call = true;
-- Returns 2

-- If we proceed, no other transaction can change on_call
-- until we commit
UPDATE doctors SET on_call = false WHERE name = 'Alice';

COMMIT;
-- Now Bob's transaction can proceed, sees only 1 on-call
```

The SQL standard defines four isolation levels, each providing progressively stronger guarantees at the cost of reduced concurrency and performance. Every major database supports these levels (with some variations in implementation).
| Isolation Level | Dirty Read | Non-Repeatable Read | Phantom Read | Write Skew |
|---|---|---|---|---|
| READ UNCOMMITTED | Possible ⚠️ | Possible ⚠️ | Possible ⚠️ | Possible ⚠️ |
| READ COMMITTED | Prevented ✓ | Possible ⚠️ | Possible ⚠️ | Possible ⚠️ |
| REPEATABLE READ | Prevented ✓ | Prevented ✓ | Possible ⚠️ | Possible ⚠️ |
| SERIALIZABLE | Prevented ✓ | Prevented ✓ | Prevented ✓ | Prevented ✓ |
1. READ UNCOMMITTED
The weakest level. Transactions can see uncommitted changes from other transactions (dirty reads). Almost never used in practice except for rough analytics where accuracy doesn't matter.
2. READ COMMITTED
Transactions only see committed data. Each query sees a fresh snapshot as of query start time. Most databases default to this level (PostgreSQL, Oracle, SQL Server). Good balance of safety and performance for many workloads.
3. REPEATABLE READ
Within a transaction, reading the same row always returns the same value. Snapshot is taken at transaction start. MySQL/InnoDB defaults to this level. Prevents most anomalies but allows phantom reads.
4. SERIALIZABLE
The strongest level. Transactions behave as if executed serially. Prevents all standard anomalies including write skew. Highest correctness guarantee, but lowest concurrency.
```sql
-- PostgreSQL: Set for current transaction
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- Your queries here
COMMIT;

-- PostgreSQL: Set session default
SET default_transaction_isolation = 'repeatable read';

-- MySQL: Set for current transaction
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
-- Your queries here
COMMIT;

-- MySQL: Set global default
SET GLOBAL transaction_isolation = 'REPEATABLE-READ';

-- SQL Server: Set for current transaction
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
-- Your queries here
COMMIT TRANSACTION;
```

PostgreSQL, Oracle, and SQL Server default to READ COMMITTED. MySQL/InnoDB defaults to REPEATABLE READ. These differences matter! Code that works correctly on one database might have concurrency bugs on another due to different default isolation levels.
The traditional approach to isolation is pessimistic locking—assuming conflicts are likely and preventing them by acquiring locks before accessing data.
Lock Types:
Shared Lock (S-lock / Read Lock): acquired for reading. Multiple transactions can hold shared locks on the same item simultaneously, but a shared lock blocks any exclusive lock request.
Exclusive Lock (X-lock / Write Lock): acquired for writing. Only one transaction can hold it at a time, and it blocks all other lock requests—shared or exclusive—on that item.
| Existing Lock → | None | Shared (S) | Exclusive (X) |
|---|---|---|---|
| Request Shared (S) | ✓ Grant | ✓ Grant | ✗ Wait |
| Request Exclusive (X) | ✓ Grant | ✗ Wait | ✗ Wait |
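The matrix above is small enough to encode directly. A sketch of the grant check (the names `isCompatible` and `canGrant` are illustrative, not from any real lock manager):

```typescript
// Lock modes from the compatibility matrix: shared (S) and exclusive (X).
type LockMode = "S" | "X";

// Only shared + shared coexist; any pairing involving an exclusive
// lock conflicts.
function isCompatible(held: LockMode, requested: LockMode): boolean {
  return held === "S" && requested === "S";
}

// A request is granted only if it is compatible with every lock already
// held by other transactions on the same resource.
function canGrant(heldLocks: LockMode[], requested: LockMode): boolean {
  return heldLocks.every((held) => isCompatible(held, requested));
}

console.log(canGrant([], "X"));         // true  — no existing locks
console.log(canGrant(["S", "S"], "S")); // true  — readers share
console.log(canGrant(["S"], "X"));      // false — writer waits for readers
console.log(canGrant(["X"], "S"));      // false — readers wait for the writer
```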
Two-Phase Locking (2PL):
2PL is the classic protocol for achieving serializability through locking. It has two phases:
Growing Phase: Transaction acquires locks as needed. May acquire new locks but never releases any.
Shrinking Phase: Transaction releases locks. After releasing the first lock, it cannot acquire any new locks.
This prevents certain interleaving that would violate serializability.
```text
┌─────────────────────────────────────────────┐
│              TWO-PHASE LOCKING              │
└─────────────────────────────────────────────┘

Number of │
Locks     │         ┌────────────┐
Held      │        /              \
          │       /                \
          │      /                  \
          │     /                    \
          ├────/──────────────────────\─────────▶ Time
          │
          │◄─ Growing ─►│◄─ Shrinking ─►│
          │    Phase    │     Phase     │
          │             │               │
        BEGIN       Lock Point        COMMIT
                 (all locks held)

Strict 2PL:   Release all locks at commit (prevents cascading aborts)
Rigorous 2PL: Also holds read locks until commit
```

Locking introduces the possibility of deadlocks—where Transaction A holds a lock that B needs, while B holds a lock that A needs. Neither can proceed. Databases detect deadlocks and abort one transaction as the 'victim.' Design your access patterns to minimize deadlock risk (e.g., always lock resources in a consistent order).
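The two-phase rule itself is simple to enforce mechanically. A sketch (the `TwoPhaseTransaction` class is illustrative; it tracks only the phase, not lock compatibility or blocking):

```typescript
// Enforces the 2PL rule: once any lock is released, no new lock
// may be acquired for the rest of the transaction.
class TwoPhaseTransaction {
  private locks = new Set<string>();
  private shrinking = false; // flips to true at the first release

  acquire(resource: string): void {
    if (this.shrinking) {
      throw new Error("2PL violation: cannot acquire after first release");
    }
    this.locks.add(resource); // still in the growing phase
  }

  release(resource: string): void {
    this.shrinking = true; // the shrinking phase has begun
    this.locks.delete(resource);
  }
}

const tx = new TwoPhaseTransaction();
tx.acquire("A");
tx.acquire("B"); // growing phase: fine
tx.release("A"); // lock point passed; shrinking phase begins
// tx.acquire("C") would now throw — the growing phase is over
```

Strict 2PL sidesteps the bookkeeping entirely by releasing everything at commit.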
Modern databases predominantly use Multi-Version Concurrency Control (MVCC) instead of pure locking. MVCC keeps multiple versions of data, allowing readers to see a consistent snapshot without blocking writers.
How MVCC Works:
When a transaction modifies a row, it doesn't overwrite the existing data. Instead, it creates a new version of the row with a timestamp or transaction ID. Old versions are retained for transactions that need to see them.
Readers don't block writers, writers don't block readers. Only writer-writer conflicts require coordination.
```text
MVCC Example: Row with ID = 'user-1'

Version 1 (created by Tx 100, visible to Tx < 200):
┌─────────────────────────────────────────────┐
│ id: user-1 │ name: "Alice"  │ age: 30      │
│ created_by: Tx 100 │ deleted_by: Tx 200    │
└─────────────────────────────────────────────┘
                     │
                     │ (superseded by)
                     ▼
Version 2 (created by Tx 200, visible to Tx 200-300):
┌─────────────────────────────────────────────┐
│ id: user-1 │ name: "Alice"  │ age: 31      │
│ created_by: Tx 200 │ deleted_by: Tx 300    │
└─────────────────────────────────────────────┘
                     │
                     │ (superseded by)
                     ▼
Version 3 (created by Tx 300, current version):
┌─────────────────────────────────────────────┐
│ id: user-1 │ name: "Alicia" │ age: 32      │
│ created_by: Tx 300 │ deleted_by: NULL      │
└─────────────────────────────────────────────┘

When Tx 250 reads user-1 → sees Version 2 (name: "Alice", age: 31)
When Tx 350 reads user-1 → sees Version 3 (name: "Alicia", age: 32)
```

PostgreSQL's MVCC Implementation:
PostgreSQL stores version metadata in hidden columns:
- `xmin`: Transaction ID that created this version
- `xmax`: Transaction ID that deleted this version (or 0 if current)

Each transaction has a snapshot listing which transactions are 'visible' (committed before snapshot). Visibility rules determine which version a query sees.
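The visibility check can be sketched with a simplified model that assumes every transaction with a lower ID has already committed (real PostgreSQL also consults commit status and in-progress transaction lists):

```typescript
// A row version tagged with the transaction IDs that created and
// deleted it, mirroring PostgreSQL's hidden xmin/xmax columns.
type RowVersion = { xmin: number; xmax: number; data: string };

// Simplified visibility: created at or before the snapshot, and not
// deleted by anything the snapshot can see.
function isVisible(row: RowVersion, snapshotTxId: number): boolean {
  const createdBeforeSnapshot = row.xmin <= snapshotTxId;
  const notYetDeleted = row.xmax === 0 || row.xmax > snapshotTxId;
  return createdBeforeSnapshot && notYetDeleted;
}

// The version chain for user-1 from the diagram above.
const versions: RowVersion[] = [
  { xmin: 100, xmax: 200, data: "Alice, 30" },
  { xmin: 200, xmax: 300, data: "Alice, 31" },
  { xmin: 300, xmax: 0,   data: "Alicia, 32" },
];

function read(snapshotTxId: number): string | undefined {
  return versions.find((v) => isVisible(v, snapshotTxId))?.data;
}

console.log(read(250)); // "Alice, 31"  — Tx 250 sees Version 2
console.log(read(350)); // "Alicia, 32" — Tx 350 sees Version 3
```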
Garbage Collection (Vacuum):
Old versions that no transaction can see anymore must be cleaned up. PostgreSQL's VACUUM process reclaims this dead space. Failing to vacuum leads to table bloat—tables grow even without net data growth.
Snapshot Isolation (SI) is an isolation level that falls between Repeatable Read and Serializable in strength. It's the natural outcome of MVCC and offers excellent performance with strong guarantees.
Snapshot Isolation Guarantees:

- Every read in a transaction sees the database as of a single snapshot, taken at transaction start.
- The transaction never observes uncommitted data or data committed after its snapshot.
- Write-write conflicts are detected: if two concurrent transactions update the same row, the first committer wins and the other aborts.
Why It's Not Serializable:
Snapshot Isolation prevents dirty reads, non-repeatable reads, and phantom reads (for predicate reads within the snapshot). But it allows write skew, which means it doesn't guarantee true serializability.
```text
Transaction T1 starts at time 10    Transaction T2 starts at time 15
        │                                   │
        ▼                                   ▼
┌───────────────────────────────────────────────────────────────┐
│                       DATABASE HISTORY                        │
├───────────────────────────────────────────────────────────────┤
│ Time 5:  Row A = 100                                          │
│ Time 12: Row A = 200  (committed by T3)                       │
│ Time 18: Row A = 300  (committed by T4)                       │
└───────────────────────────────────────────────────────────────┘

What each transaction sees:
┌────────────────────────────────────────────────────────────────────┐
│ T1 (snapshot at time 10):                                          │
│   - Sees Row A = 100                                               │
│   - Does NOT see time 12 or 18 changes (after its snapshot)        │
│   - Consistent view throughout entire transaction                  │
├────────────────────────────────────────────────────────────────────┤
│ T2 (snapshot at time 15):                                          │
│   - Sees Row A = 200                                               │
│   - Does NOT see time 18 change (after its snapshot)               │
│   - Different view than T1, but also consistent                    │
└────────────────────────────────────────────────────────────────────┘
```

PostgreSQL's REPEATABLE READ is actually Snapshot Isolation—it provides the SI guarantees described here. This is stronger than the SQL standard requires for Repeatable Read but weaker than Serializable. MySQL's Repeatable Read is different and uses gap locking.
Serializable Snapshot Isolation (SSI) is a modern algorithm that achieves true serializability while maintaining MVCC's performance benefits. PostgreSQL 9.1+ implements this for its SERIALIZABLE level.
How SSI Works:
SSI starts with regular Snapshot Isolation but adds detection for potential serialization anomalies. It tracks:

- The reads each transaction performs (PostgreSQL records these with lightweight 'SIREAD' locks).
- rw-conflicts (antidependencies): cases where one transaction reads data that a concurrent transaction then writes.
If SSI detects a 'dangerous structure'—a cycle of rw-conflicts that could lead to a non-serializable outcome—it aborts one of the transactions.
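A highly simplified sketch of that check: a transaction with both an incoming and an outgoing rw-conflict edge is a "pivot" that signals a possible non-serializable outcome. (Real SSI also considers commit order and in-progress status; `findPivot` is an illustrative name, not a PostgreSQL API.)

```typescript
// An rw-conflict edge: `reader` read data that `writer` later wrote.
type RwConflict = { reader: string; writer: string };

// A transaction that is the writer of one rw-edge and the reader of
// another is a pivot — the dangerous structure SSI looks for.
function findPivot(conflicts: RwConflict[]): string | null {
  for (const c of conflicts) {
    const hasOutgoingEdge = conflicts.some((d) => d.reader === c.writer);
    if (hasOutgoingEdge) return c.writer;
  }
  return null;
}

// The write-skew scenario: each transaction read rows the other wrote.
const conflicts: RwConflict[] = [
  { reader: "T2", writer: "T1" }, // T2 read doctor rows that T1 updated
  { reader: "T1", writer: "T2" }, // T1 read doctor rows that T2 updated
];

console.log(findPivot(conflicts)); // "T1" — dangerous structure; abort one
```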
```text
Write Skew Detection with SSI:

Time │ T1 (Alice)                  │ T2 (Bob)
─────┼─────────────────────────────┼─────────────────────────────
  1  │ BEGIN (SERIALIZABLE)        │ BEGIN (SERIALIZABLE)
  2  │ SELECT * FROM doctors       │
     │   WHERE on_call = true      │
     │ [Reads Alice=true,Bob=true] │
  3  │                             │ SELECT * FROM doctors
     │                             │   WHERE on_call = true
     │                             │ [Reads Alice=true,Bob=true]
     │
     │ SSI records: T1 read doctor rows
     │ SSI records: T2 read doctor rows
  4  │ UPDATE doctors              │
     │   SET on_call = false       │
     │   WHERE name = 'Alice'      │
     │
     │ SSI records: T1 wrote a row that T2 read
     │   (rw-conflict: T2 → T1)
  5  │                             │ UPDATE doctors
     │                             │   SET on_call = false
     │                             │   WHERE name = 'Bob'
     │
     │ SSI records: T2 wrote a row that T1 read
     │   (rw-conflict: T1 → T2)
     │
     │ ⚠️ DANGEROUS STRUCTURE DETECTED!
     │   Cycle: T1 → T2 → T1
  6  │ COMMIT                      │ COMMIT fails!
     │ (first to commit wins)      │ ERROR: could not serialize access
     │                             │   due to read/write dependencies
```

Use SERIALIZABLE when correctness is critical and your application can handle retries. Financial transactions, inventory systems, and any case where write skew would cause serious problems. For read-heavy workloads with few writes, the performance impact is often minimal.
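Because a SERIALIZABLE transaction can abort with a serialization failure at any time, callers need a retry loop. A sketch, assuming a driver that exposes SQLSTATE `40001` (`serialization_failure`) on the error's `code` property, as node-postgres does; `withSerializableRetry` is an illustrative helper, not a library function:

```typescript
// Retry a SERIALIZABLE transaction when it aborts with a
// serialization failure, up to a fixed number of attempts.
async function withSerializableRetry<T>(
  run: () => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await run(); // run the transaction body
    } catch (err: any) {
      const retryable = err?.code === "40001" && attempt < maxAttempts;
      if (!retryable) throw err; // other errors, or out of attempts
      // A real implementation would back off (with jitter) here.
    }
  }
}
```

The key design point: the whole transaction body must be re-runnable from scratch, so keep side effects (emails, external API calls) out of it until the commit succeeds.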
Choosing an isolation level is a trade-off between correctness and performance. Here's a practical decision framework:
| Use Case | Recommended Level | Rationale |
|---|---|---|
| Analytics queries (approximations OK) | READ COMMITTED | Dirty reads blocked; occasional inconsistency acceptable |
| Standard OLTP operations | READ COMMITTED | Good balance; well-tested default for most apps |
| Financial calculations within transaction | REPEATABLE READ | Consistent read of amounts during calculation |
| Report generation (point-in-time snapshot) | REPEATABLE READ / SNAPSHOT | Consistent snapshot for entire report |
| Money transfers, inventory updates | SERIALIZABLE | Cannot tolerate lost updates or write skew |
| Multi-row checks before inserts | SERIALIZABLE | Prevent phantom rows from invalidating check |
```typescript
// Different isolation levels for different operations
import { Prisma } from '@prisma/client';

// Standard operations: use default (READ COMMITTED usually)
async function getUser(id: string) {
  return prisma.user.findUnique({ where: { id } });
}

// Financial operation: use SERIALIZABLE
async function transferFunds(fromId: string, toId: string, amount: number) {
  return prisma.$transaction(
    async (tx) => {
      const from = await tx.account.findUnique({ where: { id: fromId } });
      const to = await tx.account.findUnique({ where: { id: toId } });

      if (from.balance < amount) {
        throw new Error('Insufficient funds');
      }

      await tx.account.update({
        where: { id: fromId },
        data: { balance: { decrement: amount } },
      });
      await tx.account.update({
        where: { id: toId },
        data: { balance: { increment: amount } },
      });
    },
    {
      isolationLevel: Prisma.TransactionIsolationLevel.Serializable,
      maxWait: 5000,
      timeout: 10000,
    }
  );
}

// Report generation: use REPEATABLE READ for consistent snapshot
async function generateMonthlyReport(userId: string) {
  return prisma.$transaction(
    async (tx) => {
      const orders = await tx.order.findMany({ /* ... */ });
      const payments = await tx.payment.findMany({ /* ... */ });
      // All reads see the same snapshot
      return { orders, payments };
    },
    {
      isolationLevel: Prisma.TransactionIsolationLevel.RepeatableRead,
    }
  );
}
```

Isolation is what makes concurrent database access safe. It provides the illusion that your transaction is the only one running, even when thousands of transactions execute simultaneously.
You now understand Isolation comprehensively—from the anomalies that occur without it, through the SQL standard levels, to the implementation mechanisms of locking and MVCC. Next, we'll explore Durability: how databases ensure that committed data survives any kind of failure.