Snapshot Isolation - Learning Module

Loading content...

0/241

Snapshot Isolation vs Serializable

The Isolation Spectrum

We've explored snapshot isolation in depth—its elegant snapshot-based reads, its high concurrency, and its Achilles heel: write skew. We've also seen that SERIALIZABLE isolation can prevent write skew through mechanisms like SSI. But how do these isolation levels truly compare?\n\nThis page provides a comprehensive comparison between Snapshot Isolation (SI) and Serializable isolation, examining their theoretical foundations, anomaly prevention capabilities, performance characteristics, and practical trade-offs. Understanding this comparison is essential for making informed decisions about which isolation level to use in your applications.

What You Will Learn

By the end of this page, you will understand: the theoretical relationship between SI and serializability; exactly which anomalies each level prevents; how performance differs under various workloads; how different databases implement these levels; and guidelines for choosing the appropriate isolation level for your application.

Theoretical Foundations

To compare SI and Serializable, we must first understand their theoretical foundations and where they sit in the isolation hierarchy.\n\nSerializability: The Gold Standard\n\nSerializability is the strongest commonly-used isolation level. A schedule is serializable if its effects are equivalent to some serial execution of the same transactions. This means no anomalies are possible—every concurrent execution could have happened as a sequence of non-overlapping transactions.\n\nFormal Definition:\n\nA schedule $S$ is serializable if there exists a serial schedule $S'$ with the same transactions where: for all data items $x$, if transaction $T_i$ reads or writes $x$ in $S$, the final value and all intermediate values of $x$ are identical to those in $S'$.\n\nSnapshot Isolation: A Different Approach\n\nSnapshot Isolation provides a different guarantee: each transaction sees a consistent snapshot of the database as it existed at the transaction's start time. Writes are checked for conflicts using First-Committer-Wins. SI was developed as a practical alternative to serializability that provides high concurrency while preventing most common anomalies.

Converting Mermaid diagram...

Where SI Fits in the Hierarchy:\n\nSnapshot Isolation is not part of the original SQL standard's four isolation levels. It sits between REPEATABLE READ and SERIALIZABLE in terms of anomaly prevention:\n\n- SI prevents all anomalies that REPEATABLE READ prevents\n- SI also prevents phantom reads for read-only transactions\n- SI does NOT guarantee serializability (allows write skew)\n- SI uses a fundamentally different mechanism (snapshots vs locks)\n\nImportant Terminology Confusion:\n\nMany databases conflate these terms:\n- PostgreSQL calls SI "REPEATABLE READ" and offers true SERIALIZABLE via SSI\n- Oracle calls SI "SERIALIZABLE" (misleadingly)\n- MySQL InnoDB's "REPEATABLE READ" uses SI plus gap locking\n\nAlways verify what your specific database actually provides at each isolation level.

Naming Is Inconsistent Across Databases

Oracle's 'SERIALIZABLE' level is actually Snapshot Isolation and does NOT prevent write skew. PostgreSQL's 'SERIALIZABLE' does prevent write skew via SSI. Don't assume isolation level names mean the same thing across databases!

Anomaly Prevention: A Detailed Comparison

Let's systematically compare which anomalies are prevented by each isolation level.

Anomaly Prevention by Isolation Level
Anomaly	Snapshot Isolation	Serializable (2PL)	Serializable (SSI)	Notes
Dirty Read	✅ Prevented	✅ Prevented	✅ Prevented	Both only see committed data
Non-Repeatable Read	✅ Prevented	✅ Prevented	✅ Prevented	SI via snapshot; 2PL via locks
Phantom Read (read-only)	✅ Prevented	✅ Prevented	✅ Prevented	SI snapshot includes all rows
Phantom Read (write)	⚠️ Partial	✅ Prevented	✅ Prevented	SI may allow phantoms affecting writes
Lost Update	✅ Prevented	✅ Prevented	✅ Prevented	SI via FCW; 2PL via locks
Read Skew	✅ Prevented	✅ Prevented	✅ Prevented	SI snapshot is consistent
Write Skew	❌ Allowed	✅ Prevented	✅ Prevented	SI's key weakness
Serialization Anomaly	❌ Possible	✅ Prevented	✅ Prevented	By definition

Understanding Each Anomaly:\n\nDirty Read: Reading uncommitted data. SI prevents this because snapshots only include committed transactions.\n\nNon-Repeatable Read: Reading the same row twice and getting different values. SI prevents this because the snapshot is immutable.\n\nPhantom Read: A range query returns different rows on re-execution. For read-only transactions, SI's snapshot prevents this. For transactions that write based on range query results, phantoms can cause problems similar to write skew.\n\nLost Update: Two transactions read the same row, both modify it, one overwrites the other. SI's FCW ensures only the first committer's update survives; the second must retry.\n\nRead Skew: Reading two related items that are inconsistent with each other. SI's snapshot ensures all reads see the same consistent state.\n\nWrite Skew: Reading overlapping data, writing different items, violating a spanning constraint. This is the one SI does NOT prevent.

The Write Skew Difference

Write skew is the ONLY standard anomaly that separates SI from serializability. If your application has no constraints spanning multiple rows, SI is effectively as strong as serializable for your use case. The question is: do you have such constraints?

Implementation Mechanisms

SI and Serializable isolation use fundamentally different mechanisms, leading to different performance characteristics.\n\nSnapshot Isolation Mechanism:\n\n1. Snapshot on start: Transaction captures a point-in-time view\n2. Read from snapshot: All reads see the frozen snapshot state\n3. Write directly: Writes create new versions in current state\n4. FCW on commit: Check for write-write conflicts on same rows\n\n2PL Serializable Mechanism:\n\n1. Acquire locks: Get shared locks for reads, exclusive for writes\n2. Hold locks: Locks held until transaction completes (two-phase)\n3. Block on conflict: Transactions wait for conflicting locks\n4. Release at end: All locks released on commit/abort\n\nSSI Serializable Mechanism:\n\n1. Snapshot + tracking: Take snapshot AND track read/write sets\n2. SIREAD locks: Virtual locks record reads without blocking\n3. Detect cycles: Check for rw-antidependency cycles\n4. Abort on danger: Abort if dangerous structure detected

SI Characteristics

•Non-blocking reads — Readers never wait for writers
•Non-blocking writes to different rows — No lock waits for distinct data
•Version chains — Storage overhead for multiple versions
•FCW aborts — Write conflicts cause retries (not waits)
•Consistent reads — Always see point-in-time snapshot

2PL Characteristics

•Blocking reads/writes — Lock conflicts cause waits
•Deadlock possible — Circular wait detection needed
•Lock table overhead — Memory for lock management
•No aborts for conflict — Wait instead of retry
•Current value reads — See latest committed data

SSI Combines Both Approaches:\n\nSSI uses SI as its base (snapshot reads, FCW) but adds dependency tracking to detect serializability violations.\n\nSIREAD Locks:\n\nUnlike real locks, SIREAD locks don't block. They're record-keeping mechanisms that track:\n- What transactions read which data\n- What data was written after being read by another transaction\n\nConflict Detection:\n\nAt commit time, SSI checks if the committing transaction is part of a "dangerous structure":\n- A transaction with rw-antidependency from a committed transaction\n- AND rw-antidependency to another concurrent transaction\n\nIf detected, the transaction is aborted with a serialization failure. The application must retry.

SSI vs 2PL Trade-off

SSI trades blocking for aborts. 2PL makes transactions wait. SSI lets them run and aborts if there's a problem. For workloads with few conflicts, SSI is often faster. For workloads with frequent conflicts, waiting (2PL) may be more efficient than retrying (SSI).

Performance Characteristics

Performance differences between SI and Serializable depend heavily on workload characteristics. Let's analyze different scenarios.

Performance by Workload Type
Workload	SI Performance	2PL Serializable	SSI Serializable
Read-heavy, few conflicts	Excellent (no blocking)	Good (shared locks)	Excellent (no blocking)
Write-heavy, different rows	Excellent (no conflicts)	Poor (lock overhead)	Good (tracking overhead)
Write-heavy, same rows	Good (FCW retries)	Good (blocking)	Good (retries + tracking)
Mixed, high contention	Good (FCW handles)	Poor (blocking, deadlocks)	Fair (many retries)
Long transactions	Good (snapshots cheap)	Poor (lock duration)	Fair (SIREAD retention)
Short OLTP	Excellent	Good	Very Good

SI Performance Advantages:\n\n1. Readers never block writers and vice versa: This is the fundamental advantage. Long-running reports don't hold up short OLTP transactions.\n\n2. No deadlock between readers and writers: Deadlocks can still occur between writers, but the reader-writer deadlock class is eliminated.\n\n3. Predictable read latency: Reads always succeed immediately with snapshot data. No lock waits.\n\n4. High concurrency for multi-row reads: Reading multiple tables or rows for complex queries doesn't accumulate locks.\n\nSerializable Performance Costs:\n\n2PL Costs:\n- Lock acquisition and release overhead\n- Lock table memory\n- Blocking wait time\n- Deadlock detection and resolution\n- Lock escalation overhead\n\nSSI Costs:\n- SIREAD lock tracking overhead\n- Predicate lock management\n- Conflict detection at commit\n- Higher abort rate than SI\n- SIREAD lock retention for active transactions

performance_benchmark.pseudo
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
# Simplified benchmark comparison (illustrative numbers)
 
WORKLOAD: 1000 concurrent transactions
  - 80% read-only (10 rows each)
  - 20% read-write (read 5 rows, write 1 row)
  - 5% contention rate on hot rows
 
SNAPSHOT ISOLATION RESULTS:
  Throughput:      15,000 tx/sec
  Avg read latency:    2ms
  Avg write latency:   5ms
  Abort rate:          0.5% (FCW conflicts)
  Deadlocks:           0
 
2PL SERIALIZABLE RESULTS:
  Throughput:       8,000 tx/sec
  Avg read latency:    3ms (lock acquisition)
  Avg write latency:  12ms (blocking waits)
  Abort rate:          0.2% (deadlock victims)
  Deadlocks:          20/sec (detected and resolved)
 
SSI SERIALIZABLE RESULTS:
  Throughput:      12,000 tx/sec
  Avg read latency:    2.5ms (SIREAD tracking)
  Avg write latency:   6ms
  Abort rate:          2% (serialization failures + FCW)
  Deadlocks:           0
 
# Key observation: SSI is between SI and 2PL in overhead
# Trade-off: SSI has higher abort rate but lower latency

Abort Costs Matter

When comparing abort rates, remember that aborts have costs: the aborted work is wasted, and the retry consumes additional resources. High abort rates can negate SSI's latency advantages. Monitor and tune based on actual abort rates in your workload.

How Different Databases Implement These Levels

Understanding that isolation level names mean different things in different databases is crucial for portability and correctness.

Isolation Level Mapping Across Databases
Database	REPEATABLE READ	SERIALIZABLE	True Serializable?
PostgreSQL	Snapshot Isolation	SSI	Yes (SERIALIZABLE level)
Oracle	N/A (use SERIALIZABLE)	Snapshot Isolation	No (misleadingly named)
MySQL InnoDB	SI + Gap Locking	2PL	Yes (SERIALIZABLE level)
SQL Server	Lock-based RR	2PL or SI+SSI	Yes (both methods available)
CockroachDB	SI	SSI	Yes (SERIALIZABLE level)
TiDB	SI	Not available	No (optimistic SI only)

PostgreSQL's Approach:\n\n- REPEATABLE READ: Actual Snapshot Isolation. Prevents phantoms for reads but allows write skew.\n- SERIALIZABLE: SSI-based. Adds rw-antidependency tracking to detect write skew.\n\nUsage:\nsql\n-- Snapshot Isolation (allows write skew)\nBEGIN ISOLATION LEVEL REPEATABLE READ;\n\n-- True Serializable (prevents write skew)\nBEGIN ISOLATION LEVEL SERIALIZABLE;\n\n\nKey Points:\n- SSI requires no schema changes\n- Serialization failures must be handled with retries\n- SSI has ~5-10% overhead over SI for typical workloads\n- Long-running read transactions can accumulate SIREAD locks

Verify Your Database's Behavior

Never assume isolation level names have consistent meaning. Test your specific database version with write skew scenarios to verify whether your 'SERIALIZABLE' actually prevents write skew. Documentation can be outdated or unclear.

Choosing the Right Isolation Level

The choice between SI and Serializable depends on your application's requirements, workload characteristics, and tolerance for complexity.

Use Snapshot Isolation When...

•No multi-row constraints exist — If your application has no constraints spanning multiple rows, SI provides the same guarantees as serializable
•Constraints enforced by database — UNIQUE constraints, foreign keys, and triggers can catch violations regardless of isolation level
•Read-heavy workload — SI's non-blocking reads excel when most transactions are read-only
•Long-running reports — Reports can run without blocking or being blocked by OLTP workloads
•High concurrency required — SI's minimal blocking maximizes throughput
•Application handles conflicts — If retrying FCW failures is acceptable and rare

Use Serializable When...

•Application-level constraints span rows — Multi-row invariants (like 'at least one on-call') require serializable
•Correctness is paramount — Financial systems, inventory management where bugs are costly
•Complex business logic — When it's hard to reason about all possible interleavings
•Audit requirements — When you must be able to explain execution as a serial schedule
•Simpler application code — Serializable lets you reason sequentially without concurrency concerns
•Short transactions — SSI overhead is minimal for quick OLTP transactions

Decision Framework:\n\n\n1. Identify all constraints in your application\n - Schema constraints (UNIQUE, FK, CHECK)\n - Application constraints (business rules)\n\n2. Categorize application constraints\n - Single-row constraints → SI is sufficient\n - Multi-row constraints → Need Serializable OR explicit locking\n\n3. Evaluate constraint enforcement options\n - Can constraint be a DB trigger? → SI + trigger\n - Can we use SELECT FOR UPDATE? → SI + locking\n - Too complex? → Use Serializable\n\n4. Consider performance requirements\n - High read concurrency needed? → Prefer SI/SSI\n - Low contention? → SSI overhead is acceptable\n - High contention? → Consider 2PL or SI + explicit locking\n\n5. Choose and test\n - Test with production-like workloads\n - Measure abort rates and latency\n - Verify constraint enforcement\n

Default to Serializable, Optimize to SI

If in doubt, start with SERIALIZABLE. It's correct by default. Once you understand your workload and can prove that write skew isn't a concern for specific transactions, you can selectively use lower isolation levels for those transactions. Correctness first, performance second.

Migrating Between Isolation Levels

Changing isolation levels in an existing application requires careful planning. Let's examine strategies for both directions.\n\nMigrating SI → Serializable:\n\nThis is the 'safer' direction—you're adding protection, not removing it.\n\nSteps:\n1. Update connection configuration to use SERIALIZABLE\n2. Implement retry logic for serialization failures:\n python\n MAX_RETRIES = 3\n for attempt in range(MAX_RETRIES):\n try:\n with connection.begin():\n do_transaction_work()\n break # Success\n except SerializationFailure:\n if attempt == MAX_RETRIES - 1:\n raise\n time.sleep(random.uniform(0.01, 0.1))\n \n3. Monitor abort rates and adjust if too high\n4. Test under production-like load\n\nPotential Issues:\n- Higher abort rates require retry logic everywhere\n- Some transactions may retry repeatedly under high contention\n- SSI's SIREAD tracking may increase memory pressure

Migrating Serializable → SI:\n\nThis is the 'dangerous' direction—you're removing protection.\n\nSteps:\n1. Audit all constraints: Identify every application constraint\n2. Categorize by risk: Which constraints span multiple rows?\n3. Implement alternatives for risky constraints:\n - Add database triggers\n - Use SELECT FOR UPDATE\n - Add explicit lock rows\n - Keep SERIALIZABLE for these specific transactions\n4. Test exhaustively: Write tests that exercise concurrent scenarios\n5. Monitor for violations: Log and alert on constraint violations\n\nPotential Issues:\n- Risk of introducing write skew bugs\n- Need comprehensive test coverage\n- May need to mix isolation levels per transaction\n- Subtle bugs may only appear under high concurrency

Mixed Isolation Levels

Some applications use different isolation levels for different transactions. This is valid but adds complexity. Document clearly which transactions use which levels and why. Ensure developers understand the implications when adding new transactions.

Summary: SI vs Serializable

The choice between Snapshot Isolation and Serializable is one of the most important decisions in database application design. Each level offers different trade-offs between correctness guarantees, performance characteristics, and implementation complexity.

Key Takeaways

•SI prevents most anomalies but allows write skew — It sits between REPEATABLE READ and SERIALIZABLE in the anomaly hierarchy
•Serializable guarantees equivalence to serial execution — Either via 2PL (blocking) or SSI (abort on dangerous structure)
•SI offers non-blocking reads and high concurrency — Performance advantage for read-heavy or low-contention workloads
•SSI trades blocking for aborts — Good for many workloads but high abort rates can degrade performance
•Database naming is inconsistent — Oracle's SERIALIZABLE is SI; PostgreSQL's REPEATABLE READ is SI; always verify behavior
•Choose based on your constraints — If no multi-row constraints exist, SI may be sufficient; otherwise, use Serializable
•Default to correctness, optimize later — Start with SERIALIZABLE and relax only when you can prove safety

What's Next:\n\nThe final page explores Practical Usage of Snapshot Isolation, covering real-world patterns, optimization strategies, monitoring techniques, and best practices for deploying SI-based systems in production.

Page Complete

You now understand the comprehensive trade-offs between Snapshot Isolation and Serializable isolation—from theoretical foundations through practical implementation differences, performance characteristics, database-specific behaviors, and decision frameworks for choosing the right level for your applications.