Database Management SystemsSnapshot Isolation

Snapshot Isolation: Consistent Views for Concurrent Transactions

LevelAdvanced

Duration75 mins

TopicSnapshot Isolation

1 / 5

Snapshot Concept

The Illusion of Frozen Time

Imagine you're taking a photograph of a bustling marketplace. In that single instant, the camera captures every vendor, every customer, every product—frozen in time. Even as the marketplace continues to buzz with activity after the shutter clicks, your photograph preserves an exact, consistent view of that precise moment. This is the essence of a snapshot in database systems.\n\nA database snapshot provides a transaction with a logically consistent view of all data as it existed at a specific point in time. While other transactions continue to read and modify data concurrently, your transaction operates on its own private, immutable view—unaffected by changes happening elsewhere. This powerful abstraction enables remarkable levels of concurrency while maintaining the consistency guarantees that databases must provide.

What You Will Learn

By the end of this page, you will understand what database snapshots are, how they provide point-in-time consistency, the relationship between snapshots and transaction timestamps, and why snapshots form the foundation for modern multi-version concurrency control (MVCC) systems used in PostgreSQL, Oracle, MySQL InnoDB, and virtually every major database engine.

What Is a Database Snapshot?

A database snapshot is a logically consistent, read-only view of the database at a specific point in time, called the snapshot timestamp. When a transaction begins under snapshot isolation, it establishes a snapshot and all subsequent reads within that transaction will observe the database state as it existed at that timestamp—regardless of any concurrent modifications made by other transactions.\n\nFormal Definition:\n\nLet $T_i$ be a transaction that begins at time $t_s$ (the snapshot timestamp). The snapshot $S(T_i)$ is defined as:\n\n$$S(T_i) = \{ (x, v) \mid x \text{ is a data item with value } v \text{ at time } t_s \}$$\n\nFor every read operation $R_i(x)$ performed by $T_i$, the value returned is $v$ where $(x, v) \in S(T_i)$, irrespective of any writes $W_j(x)$ by concurrent transactions $T_j$ where $j \neq i$.

The Photography Analogy Extended

Think of each transaction as having its own personal camera. When the transaction starts, it takes a photograph of the entire database. Throughout its execution, it only looks at this photograph—never at the "live" database that continues to change. Other transactions have their own photographs, taken at their own start times. Everyone works from their own consistent view.

Key Properties of Snapshots:\n\n1. Point-in-Time Consistency\n\nA snapshot represents the database at exactly one moment. All data items in the snapshot are mutually consistent—they reflect the committed state of the database at the snapshot timestamp. There are no "torn reads" where some data reflects one moment and other data reflects another.\n\n2. Immutability\n\nOnce established, a snapshot never changes. Even if the underlying data is modified by other transactions, the snapshot remains frozen. This immutability is what enables non-blocking reads—readers never wait for writers.\n\n3. Isolation from Concurrent Modifications\n\nA transaction reading from its snapshot is completely isolated from uncommitted changes made by concurrent transactions. It sees only data that was committed before the snapshot timestamp.\n\n4. Logical, Not Physical\n\nSnapshots are logical constructs. The database doesn't actually copy all data at the snapshot moment. Instead, it uses version chains or other mechanisms to reconstruct the correct version of each data item when requested.

Snapshots vs Traditional Locking

To appreciate the power of snapshots, we must contrast them with traditional locking-based concurrency control. In lock-based systems, readers and writers can block each other—leading to reduced concurrency, potential deadlocks, and complex lock management. Snapshots fundamentally transform this equation.\n\nThe Lock-Based World:\n\nIn strict two-phase locking (2PL), a transaction reading data item $x$ must acquire a shared lock $S(x)$, and a transaction writing $x$ must acquire an exclusive lock $X(x)$. These locks conflict:\n\n- A writer holding $X(x)$ blocks all readers attempting to acquire $S(x)$\n- Readers holding $S(x)$ block any writer attempting to acquire $X(x)$\n\nThis means long-running read transactions can delay critical write operations, and write-heavy workloads can starve readers.\n\nThe Snapshot-Based World:\n\nWith snapshots, readers don't acquire locks on data items at all. Instead, they read from their snapshot—a frozen view of past state. Writers modify the "live" current state. Because readers and writers operate on different versions of data, they never conflict.

Traditional Locking

•Readers block writers (shared locks conflict with exclusive locks)
•Writers block readers (exclusive locks block shared lock requests)
•Long read transactions delay write operations
•Deadlocks possible between readers and writers
•Lock escalation can serialize entire workloads
•Complex lock management overhead

Snapshot-Based Access

•Readers never block writers (read from snapshot)
•Writers never block readers (write to current state)
•Long read transactions don't impact write throughput
•No reader-writer deadlocks possible
•Consistent read performance regardless of concurrency
•Minimal lock overhead for read operations

The Fundamental Shift

Snapshot-based systems transform database concurrency from a zero-sum game (readers and writers competing for locks) to a parallel world model (readers and writers operating on different time-views of data). This is why MVCC-based databases like PostgreSQL and Oracle can achieve such high concurrency levels.

Snapshot Timestamp Assignment

When Is the Snapshot Taken?\n\nA critical design decision in snapshot-based systems is determining when a transaction's snapshot timestamp is assigned. This choice affects both the consistency semantics and the visibility of concurrent updates.\n\nOption 1: Transaction Start Time\n\nThe snapshot is established when the transaction begins (issues its first statement or explicit BEGIN command). This is the most common approach, used in PostgreSQL's REPEATABLE READ and Oracle's READ COMMITTED (with modifications).\n\n\nT1 begins at time 100 → Snapshot timestamp = 100\nT1 sees all changes committed before time 100\nT1 does NOT see changes committed at time 101, 102, ... (even if they commit while T1 runs)\n\n\nOption 2: Statement Start Time\n\nA new snapshot is established for each SQL statement within the transaction. This is PostgreSQL's READ COMMITTED behavior.\n\n\nT1 begins at time 100\n Statement S1 at time 100 → Snapshot = 100\n Statement S2 at time 150 → Snapshot = 150 (sees more recent commits)\n Statement S3 at time 200 → Snapshot = 200 (sees even more recent commits)\n\n\nOption 3: First Read Time\n\nThe snapshot is established lazily when the transaction performs its first read operation. This can reduce overhead for write-only transactions.

Snapshot Timing Strategies Compared
Strategy	Snapshot Established	Consistency Level	Use Case
Transaction Start	At BEGIN/first statement	Repeatable Read	Reports, analytics, long transactions needing consistent view
Statement Start	At each SQL statement	Read Committed	OLTP workloads, short transactions, fresh data preferred
First Read	At first SELECT	Varies	Write-heavy workloads, lazy initialization

Timestamp Generation Mechanisms:\n\nThe system needs a reliable way to generate monotonically increasing timestamps. Common approaches include:\n\n1. System Clock Timestamps\n\nUse wall-clock time (e.g., microseconds since epoch). Simple but can have issues with clock skew in distributed systems.\n\n2. Logical Counter (Transaction ID)\n\nMaintain a global counter that increments for each new transaction. PostgreSQL uses 32-bit transaction IDs (XIDs). This ensures strict ordering but requires wraparound handling.\n\n3. Hybrid Logical Clocks\n\nCombine physical time with logical counters to get the benefits of both: approximate real-time ordering with guaranteed monotonicity. Used in distributed databases like CockroachDB.\n\n4. Lamport Timestamps\n\nPure logical clocks that maintain happened-before relationships. Useful in distributed systems but don't correlate with real time.

The Visibility Rule

A transaction T with snapshot timestamp ts can only see changes made by transactions that: (1) committed BEFORE ts, AND (2) had a transaction ID less than T's snapshot horizon. Changes from transactions that started before ts but committed after ts are NOT visible—they weren't yet committed at the snapshot moment.

Snapshot Visibility Rules

When a transaction reads data under snapshot isolation, it must determine which version of each data item is visible. This requires precise visibility rules that consider transaction states and commit timestamps.\n\nThe Core Visibility Question:\n\nGiven a transaction $T_{reader}$ with snapshot timestamp $ts$, and a data item $x$ with multiple versions $\{v_1, v_2, ..., v_n\}$ created by transactions $\{T_1, T_2, ..., T_n\}$, which version should $T_{reader}$ see?\n\nVisibility Rule:\n\nVersion $v_i$ created by $T_i$ is visible to $T_{reader}$ if and only if:\n\n1. $T_i$ committed before $T_{reader}$'s snapshot was taken\n2. $T_i$'s commit timestamp $\leq ts$ (the snapshot timestamp)\n3. $v_i$ is the most recent version satisfying conditions 1 and 2\n\nAmong all visible versions, the transaction sees the one with the highest commit timestamp.

visibility_check.pseudo
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
FUNCTION is_version_visible(version, reader_snapshot_ts, reader_xid):
    writer_xid = version.created_by_transaction
    writer_commit_ts = get_commit_timestamp(writer_xid)
    
    # Rule 1: Writer must have committed
    IF writer_xid is still active OR writer_xid is aborted:
        RETURN false
    
    # Rule 2: Writer must have committed before our snapshot
    IF writer_commit_ts > reader_snapshot_ts:
        RETURN false
    
    # Rule 3: Writer must not be in our snapshot's active list
    IF writer_xid IN reader_snapshot.active_transactions:
        RETURN false  # Was running when we took snapshot
    
    # Rule 4: Check for newer versions that might hide this one
    IF version.deleted_by IS NOT NULL:
        deleter_xid = version.deleted_by
        IF is_version_visible(deleter_xid, reader_snapshot_ts, reader_xid):
            RETURN false  # This version was deleted/updated
    
    RETURN true
 
FUNCTION get_visible_version(data_item, reader_snapshot):
    # Walk version chain from newest to oldest
    FOR version IN data_item.versions ORDER BY commit_ts DESC:
        IF is_version_visible(version, reader_snapshot.ts, reader_snapshot.xid):
            RETURN version
    
    # No visible version (item didn't exist at snapshot time)
    RETURN NULL

The Active Transaction List:\n\nA critical component of snapshot visibility is tracking which transactions were active (in-progress) when the snapshot was taken. Consider this scenario:\n\n\nTime 100: T1 begins\nTime 110: T2 begins (T2's snapshot sees T1 as active)\nTime 120: T1 writes X = 10\nTime 130: T1 commits\nTime 140: T2 reads X\n\n\nEven though T1 committed (at time 130) before T2 reads (at time 140), T2 should NOT see T1's changes. Why? Because when T2 took its snapshot at time 110, T1 was still active. T2's snapshot captured the state before T1 committed.\n\nThis is why snapshots must record the list of active transaction IDs at the snapshot moment. Any changes from these transactions are invisible, regardless of when they commit.

PostgreSQL's Snapshot Structure

PostgreSQL snapshots consist of: xmin (all XIDs < xmin are visible if committed), xmax (all XIDs >= xmax are invisible), and xip[] (list of in-progress XIDs between xmin and xmax). A transaction is visible only if its XID < xmin OR (xmin <= XID < xmax AND XID not in xip[] AND committed).

Snapshots and Multi-Version Concurrency Control

Snapshots are the logical abstraction; Multi-Version Concurrency Control (MVCC) is the implementation mechanism that makes snapshots practical. Without MVCC, providing each transaction with its own consistent view would require either:\n\n1. Copying the entire database at snapshot time (impossibly expensive)\n2. Blocking all writes during reads (defeats the purpose)\n\nMVCC solves this by maintaining multiple versions of each data item, allowing different transactions to see different versions simultaneously.\n\nHow MVCC Enables Snapshots:\n\n1. When a transaction writes data item $x$, it creates a new version rather than overwriting the old one\n2. Each version is tagged with the creating transaction's ID and timestamp\n3. The old version remains available for transactions with older snapshots\n4. A garbage collection process (VACUUM in PostgreSQL) eventually removes versions no longer needed by any active snapshot

Converting Mermaid diagram...

The Version Chain:\n\nMVCC maintains data as a chain of versions, typically ordered from newest to oldest. When a transaction reads:\n\n1. Start at the head of the version chain (newest version)\n2. Walk backward through versions\n3. Stop at the first version that passes visibility rules\n4. Return that version's value\n\nThis allows each reader to find its appropriate version without blocking any writer.\n\nCost of Versioning:\n\nMVCC's benefits come with costs:\n\n- Storage overhead: Multiple versions consume more space\n- Version chain traversal: Long chains slow down reads\n- Garbage collection: Old versions must be cleaned up\n- Index maintenance: Some implementations require index entries for each version\n\nThese costs are manageable with proper tuning (vacuum frequency, version chain limits) and are generally outweighed by the concurrency benefits.

The Symbiotic Relationship

Snapshots define WHAT each transaction should see (logical consistency). MVCC defines HOW to efficiently provide those different views (physical implementation). You cannot have practical snapshot isolation without MVCC, and MVCC exists primarily to enable snapshot-based isolation levels.

Snapshot Consistency Guarantees

What exactly does a snapshot guarantee? Understanding the precise consistency semantics is crucial for designing correct applications.\n\nGuarantees Provided by Snapshots:\n\n1. Point-in-Time Consistency (Snapshot Read)\n\nAll reads within a transaction observe the same database state. If you read table A and then table B, both reflect the committed state at your snapshot timestamp. There are no "torn reads" across tables.\n\n2. Repeatable Reads\n\nIf you read the same row twice within a transaction, you get the same value both times (assuming you use transaction-level snapshots, not statement-level). This eliminates the non-repeatable read anomaly.\n\n3. No Dirty Reads\n\nYou only see committed data. Uncommitted changes from other transactions are invisible, eliminating dirty read anomalies.\n\n4. Read Stability\n\nOnce you observe a data item's value, that value remains stable for the duration of your transaction. External updates don't affect your view.

Anomalies Prevented by Snapshots

•Dirty Read — Prevented. Only committed data at snapshot time is visible
•Non-Repeatable Read — Prevented. Same query returns same results within transaction
•Phantom Read (read-only) — Prevented. Range queries see consistent snapshot
•Read Skew — Prevented. Related data items observed consistently

Guarantees NOT Provided by Snapshots Alone:\n\nSnapshots provide excellent read consistency but do not, by themselves, prevent all anomalies:\n\n1. Lost Updates\n\nTwo transactions can both read the same value, compute updates, and write. One update overwrites the other. Snapshot isolation requires additional mechanisms (first-committer-wins) to prevent this.\n\n2. Write Skew\n\nTwo transactions can read overlapping data, make decisions based on that data, and write to non-overlapping items in ways that violate application constraints. This is the signature anomaly of snapshot isolation and will be covered in detail in a later page.\n\n3. Serialization Anomalies\n\nSnapshot isolation does not guarantee serializability. There exist schedules that are allowed under SI but are not equivalent to any serial execution.\n\nThese limitations are important—snapshot isolation is powerful but not a complete solution for all consistency requirements.

Know Your Boundaries

Snapshot isolation eliminates most read anomalies but allows certain write anomalies (notably write skew). For applications requiring full serializability, additional mechanisms—such as serializable snapshot isolation (SSI)—are needed. We'll explore these limitations and solutions in subsequent pages.

Snapshots in Action: A Practical Scenario

Let's trace through a concrete example to solidify understanding. Consider a banking system with an accounts table:\n\nInitial State (committed):\n\nAccount A: Balance = $1000\nAccount B: Balance = $500\nAccount C: Balance = $750\n\n\nTimeline of Transactions:

Transaction Timeline with Snapshots
Time	T1 (Report)	T2 (Transfer)	T3 (Deposit)	Database State
100	BEGIN\n(snapshot at 100)	—	—	A=$1000, B=$500, C=$750
110	—	BEGIN\n(snapshot at 110)	—	A=$1000, B=$500, C=$750
120	—	—	BEGIN\n(snapshot at 120)	A=$1000, B=$500, C=$750
130	—	Read A → $1000	—	A=$1000, B=$500, C=$750
140	—	Write A = $800	—	A=$800 (uncommitted)
150	Read A → $1000	—	—	T1 sees snapshot: A=$1000
160	—	Write B = $700	Write C = $850	A=$800, B=$700 (uncommitted)\nC=$850 (uncommitted)
170	—	COMMIT	—	A=$800, B=$700 committed
180	Read B → $500	—	—	T1 sees snapshot: B=$500
190	—	—	COMMIT	C=$850 committed
200	Read C → $750	—	—	T1 sees snapshot: C=$750
210	Total = $2250 ✓	—	—	T1 reports consistent snapshot

Analysis:\n\nT1 (Report Transaction): Started at time 100, generating a total balance report. Even though T2 and T3 both committed during T1's execution, T1 sees none of their changes. T1's snapshot captured A=$1000, B=$500, C=$750 at time 100, and it reports that consistent total: $2,250.\n\nT2 (Transfer Transaction): Transferred $200 from A to B. It read A (seeing $1000 from its snapshot at time 110), wrote A=$800, wrote B=$700, and committed. These changes became visible to transactions starting after time 170.\n\nT3 (Deposit Transaction): Deposited $100 to C. It read C from its snapshot (seeing $750), wrote C=$850, and committed. This change became visible after time 190.\n\nKey Observations:\n\n✓ T1's report is consistent—it shows the exact state at time 100\n✓ T1 was never blocked by T2 or T3's writes\n✓ T2 and T3 were never blocked by T1's reads\n✓ All three transactions completed successfully with maximum concurrency

The Power of Snapshots

Without snapshots, T1 would either see an inconsistent view (some pre-transfer, some post-transfer data) or would need to block T2 and T3 until it finished. Snapshots enable T1 to generate a consistent report while T2 and T3 process updates concurrently—maximizing both consistency and throughput.

Summary: The Snapshot Foundation

Database snapshots represent a paradigm shift in how we think about concurrent data access. Rather than transactions fighting over access to shared current state, each transaction operates on its own frozen view of the past.

Key Takeaways

•A snapshot is a point-in-time view — It captures the committed database state at a specific timestamp, providing isolation from concurrent modifications
•Snapshots eliminate reader-writer blocking — Readers operate on frozen past state while writers modify current state, enabling maximum concurrency
•Snapshot timestamps determine visibility — Only changes committed before the snapshot timestamp are visible; in-progress transactions at snapshot time remain invisible
•MVCC implements snapshots efficiently — Multiple versions of data items allow different transactions to see appropriate versions without blocking or copying
•Snapshots prevent most read anomalies — Dirty reads, non-repeatable reads, and read-only phantoms are all eliminated by proper snapshot implementation
•Snapshots alone don't guarantee serializability — Write anomalies like write skew can still occur; additional mechanisms may be needed for full isolation

What's Next:\n\nNow that we understand what snapshots are and how they provide consistency, the next page explores Snapshot Isolation Implementation—the specific mechanisms database systems use to establish snapshots, maintain version visibility, and handle the complexities of snapshot-based concurrency control.

Page Complete

You now understand the fundamental concept of database snapshots—the frozen, point-in-time views that enable high-concurrency, consistent database access. This foundation is essential for understanding snapshot isolation's implementation, limitations, and practical applications in modern database systems.

1 / 5

Loading learning content...

Database Management SystemsSnapshot Isolation

Snapshot Isolation: Consistent Views for Concurrent Transactions

LevelAdvanced

Duration75 mins

TopicSnapshot Isolation

1 / 5

Snapshot Concept

The Illusion of Frozen Time

What You Will Learn

What Is a Database Snapshot?

The Photography Analogy Extended

Snapshots vs Traditional Locking

Traditional Locking

•Readers block writers (shared locks conflict with exclusive locks)
•Writers block readers (exclusive locks block shared lock requests)
•Long read transactions delay write operations
•Deadlocks possible between readers and writers
•Lock escalation can serialize entire workloads
•Complex lock management overhead

Snapshot-Based Access

•Readers never block writers (read from snapshot)
•Writers never block readers (write to current state)
•Long read transactions don't impact write throughput
•No reader-writer deadlocks possible
•Consistent read performance regardless of concurrency
•Minimal lock overhead for read operations

The Fundamental Shift

Snapshot Timestamp Assignment

Snapshot Timing Strategies Compared
Strategy	Snapshot Established	Consistency Level	Use Case
Transaction Start	At BEGIN/first statement	Repeatable Read	Reports, analytics, long transactions needing consistent view
Statement Start	At each SQL statement	Read Committed	OLTP workloads, short transactions, fresh data preferred
First Read	At first SELECT	Varies	Write-heavy workloads, lazy initialization

The Visibility Rule

Snapshot Visibility Rules

visibility_check.pseudo
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
FUNCTION is_version_visible(version, reader_snapshot_ts, reader_xid):
    writer_xid = version.created_by_transaction
    writer_commit_ts = get_commit_timestamp(writer_xid)
    
    # Rule 1: Writer must have committed
    IF writer_xid is still active OR writer_xid is aborted:
        RETURN false
    
    # Rule 2: Writer must have committed before our snapshot
    IF writer_commit_ts > reader_snapshot_ts:
        RETURN false
    
    # Rule 3: Writer must not be in our snapshot's active list
    IF writer_xid IN reader_snapshot.active_transactions:
        RETURN false  # Was running when we took snapshot
    
    # Rule 4: Check for newer versions that might hide this one
    IF version.deleted_by IS NOT NULL:
        deleter_xid = version.deleted_by
        IF is_version_visible(deleter_xid, reader_snapshot_ts, reader_xid):
            RETURN false  # This version was deleted/updated
    
    RETURN true
 
FUNCTION get_visible_version(data_item, reader_snapshot):
    # Walk version chain from newest to oldest
    FOR version IN data_item.versions ORDER BY commit_ts DESC:
        IF is_version_visible(version, reader_snapshot.ts, reader_snapshot.xid):
            RETURN version
    
    # No visible version (item didn't exist at snapshot time)
    RETURN NULL

PostgreSQL's Snapshot Structure

Snapshots and Multi-Version Concurrency Control

Converting Mermaid diagram...

The Symbiotic Relationship

Snapshot Consistency Guarantees

Anomalies Prevented by Snapshots

•Dirty Read — Prevented. Only committed data at snapshot time is visible
•Non-Repeatable Read — Prevented. Same query returns same results within transaction
•Phantom Read (read-only) — Prevented. Range queries see consistent snapshot
•Read Skew — Prevented. Related data items observed consistently

Know Your Boundaries

Snapshots in Action: A Practical Scenario

Transaction Timeline with Snapshots
Time	T1 (Report)	T2 (Transfer)	T3 (Deposit)	Database State
100	BEGIN\n(snapshot at 100)	—	—	A=$1000, B=$500, C=$750
110	—	BEGIN\n(snapshot at 110)	—	A=$1000, B=$500, C=$750
120	—	—	BEGIN\n(snapshot at 120)	A=$1000, B=$500, C=$750
130	—	Read A → $1000	—	A=$1000, B=$500, C=$750
140	—	Write A = $800	—	A=$800 (uncommitted)
150	Read A → $1000	—	—	T1 sees snapshot: A=$1000
160	—	Write B = $700	Write C = $850	A=$800, B=$700 (uncommitted)\nC=$850 (uncommitted)
170	—	COMMIT	—	A=$800, B=$700 committed
180	Read B → $500	—	—	T1 sees snapshot: B=$500
190	—	—	COMMIT	C=$850 committed
200	Read C → $750	—	—	T1 sees snapshot: C=$750
210	Total = $2250 ✓	—	—	T1 reports consistent snapshot

The Power of Snapshots

Summary: The Snapshot Foundation

Key Takeaways

•A snapshot is a point-in-time view — It captures the committed database state at a specific timestamp, providing isolation from concurrent modifications
•Snapshots eliminate reader-writer blocking — Readers operate on frozen past state while writers modify current state, enabling maximum concurrency
•Snapshot timestamps determine visibility — Only changes committed before the snapshot timestamp are visible; in-progress transactions at snapshot time remain invisible
•MVCC implements snapshots efficiently — Multiple versions of data items allow different transactions to see appropriate versions without blocking or copying
•Snapshots prevent most read anomalies — Dirty reads, non-repeatable reads, and read-only phantoms are all eliminated by proper snapshot implementation
•Snapshots alone don't guarantee serializability — Write anomalies like write skew can still occur; additional mechanisms may be needed for full isolation

Page Complete

1 / 5