Loading learning content...
Imagine you're taking a photograph of a bustling marketplace. In that single instant, the camera captures every vendor, every customer, every product—frozen in time. Even as the marketplace continues to buzz with activity after the shutter clicks, your photograph preserves an exact, consistent view of that precise moment. This is the essence of a snapshot in database systems.\n\nA database snapshot provides a transaction with a logically consistent view of all data as it existed at a specific point in time. While other transactions continue to read and modify data concurrently, your transaction operates on its own private, immutable view—unaffected by changes happening elsewhere. This powerful abstraction enables remarkable levels of concurrency while maintaining the consistency guarantees that databases must provide.
By the end of this page, you will understand what database snapshots are, how they provide point-in-time consistency, the relationship between snapshots and transaction timestamps, and why snapshots form the foundation for modern multi-version concurrency control (MVCC) systems used in PostgreSQL, Oracle, MySQL InnoDB, and virtually every major database engine.
A database snapshot is a logically consistent, read-only view of the database at a specific point in time, called the snapshot timestamp. When a transaction begins under snapshot isolation, it establishes a snapshot and all subsequent reads within that transaction will observe the database state as it existed at that timestamp—regardless of any concurrent modifications made by other transactions.\n\nFormal Definition:\n\nLet $T_i$ be a transaction that begins at time $t_s$ (the snapshot timestamp). The snapshot $S(T_i)$ is defined as:\n\n$$S(T_i) = \{ (x, v) \mid x \text{ is a data item with value } v \text{ at time } t_s \}$$\n\nFor every read operation $R_i(x)$ performed by $T_i$, the value returned is $v$ where $(x, v) \in S(T_i)$, irrespective of any writes $W_j(x)$ by concurrent transactions $T_j$ where $j \neq i$.
Think of each transaction as having its own personal camera. When the transaction starts, it takes a photograph of the entire database. Throughout its execution, it only looks at this photograph—never at the "live" database that continues to change. Other transactions have their own photographs, taken at their own start times. Everyone works from their own consistent view.
Key Properties of Snapshots:\n\n1. Point-in-Time Consistency\n\nA snapshot represents the database at exactly one moment. All data items in the snapshot are mutually consistent—they reflect the committed state of the database at the snapshot timestamp. There are no "torn reads" where some data reflects one moment and other data reflects another.\n\n2. Immutability\n\nOnce established, a snapshot never changes. Even if the underlying data is modified by other transactions, the snapshot remains frozen. This immutability is what enables non-blocking reads—readers never wait for writers.\n\n3. Isolation from Concurrent Modifications\n\nA transaction reading from its snapshot is completely isolated from uncommitted changes made by concurrent transactions. It sees only data that was committed before the snapshot timestamp.\n\n4. Logical, Not Physical\n\nSnapshots are logical constructs. The database doesn't actually copy all data at the snapshot moment. Instead, it uses version chains or other mechanisms to reconstruct the correct version of each data item when requested.
To appreciate the power of snapshots, we must contrast them with traditional locking-based concurrency control. In lock-based systems, readers and writers can block each other—leading to reduced concurrency, potential deadlocks, and complex lock management. Snapshots fundamentally transform this equation.\n\nThe Lock-Based World:\n\nIn strict two-phase locking (2PL), a transaction reading data item $x$ must acquire a shared lock $S(x)$, and a transaction writing $x$ must acquire an exclusive lock $X(x)$. These locks conflict:\n\n- A writer holding $X(x)$ blocks all readers attempting to acquire $S(x)$\n- Readers holding $S(x)$ block any writer attempting to acquire $X(x)$\n\nThis means long-running read transactions can delay critical write operations, and write-heavy workloads can starve readers.\n\nThe Snapshot-Based World:\n\nWith snapshots, readers don't acquire locks on data items at all. Instead, they read from their snapshot—a frozen view of past state. Writers modify the "live" current state. Because readers and writers operate on different versions of data, they never conflict.
Snapshot-based systems transform database concurrency from a zero-sum game (readers and writers competing for locks) to a parallel world model (readers and writers operating on different time-views of data). This is why MVCC-based databases like PostgreSQL and Oracle can achieve such high concurrency levels.
When Is the Snapshot Taken?\n\nA critical design decision in snapshot-based systems is determining when a transaction's snapshot timestamp is assigned. This choice affects both the consistency semantics and the visibility of concurrent updates.\n\nOption 1: Transaction Start Time\n\nThe snapshot is established when the transaction begins (issues its first statement or explicit BEGIN command). This is the most common approach, used in PostgreSQL's REPEATABLE READ and Oracle's READ COMMITTED (with modifications).\n\n\nT1 begins at time 100 → Snapshot timestamp = 100\nT1 sees all changes committed before time 100\nT1 does NOT see changes committed at time 101, 102, ... (even if they commit while T1 runs)\n\n\nOption 2: Statement Start Time\n\nA new snapshot is established for each SQL statement within the transaction. This is PostgreSQL's READ COMMITTED behavior.\n\n\nT1 begins at time 100\n Statement S1 at time 100 → Snapshot = 100\n Statement S2 at time 150 → Snapshot = 150 (sees more recent commits)\n Statement S3 at time 200 → Snapshot = 200 (sees even more recent commits)\n\n\nOption 3: First Read Time\n\nThe snapshot is established lazily when the transaction performs its first read operation. This can reduce overhead for write-only transactions.
| Strategy | Snapshot Established | Consistency Level | Use Case |
|---|---|---|---|
| Transaction Start | At BEGIN/first statement | Repeatable Read | Reports, analytics, long transactions needing consistent view |
| Statement Start | At each SQL statement | Read Committed | OLTP workloads, short transactions, fresh data preferred |
| First Read | At first SELECT | Varies | Write-heavy workloads, lazy initialization |
Timestamp Generation Mechanisms:\n\nThe system needs a reliable way to generate monotonically increasing timestamps. Common approaches include:\n\n1. System Clock Timestamps\n\nUse wall-clock time (e.g., microseconds since epoch). Simple but can have issues with clock skew in distributed systems.\n\n2. Logical Counter (Transaction ID)\n\nMaintain a global counter that increments for each new transaction. PostgreSQL uses 32-bit transaction IDs (XIDs). This ensures strict ordering but requires wraparound handling.\n\n3. Hybrid Logical Clocks\n\nCombine physical time with logical counters to get the benefits of both: approximate real-time ordering with guaranteed monotonicity. Used in distributed databases like CockroachDB.\n\n4. Lamport Timestamps\n\nPure logical clocks that maintain happened-before relationships. Useful in distributed systems but don't correlate with real time.
A transaction T with snapshot timestamp ts can only see changes made by transactions that: (1) committed BEFORE ts, AND (2) had a transaction ID less than T's snapshot horizon. Changes from transactions that started before ts but committed after ts are NOT visible—they weren't yet committed at the snapshot moment.
When a transaction reads data under snapshot isolation, it must determine which version of each data item is visible. This requires precise visibility rules that consider transaction states and commit timestamps.\n\nThe Core Visibility Question:\n\nGiven a transaction $T_{reader}$ with snapshot timestamp $ts$, and a data item $x$ with multiple versions $\{v_1, v_2, ..., v_n\}$ created by transactions $\{T_1, T_2, ..., T_n\}$, which version should $T_{reader}$ see?\n\nVisibility Rule:\n\nVersion $v_i$ created by $T_i$ is visible to $T_{reader}$ if and only if:\n\n1. $T_i$ committed before $T_{reader}$'s snapshot was taken\n2. $T_i$'s commit timestamp $\leq ts$ (the snapshot timestamp)\n3. $v_i$ is the most recent version satisfying conditions 1 and 2\n\nAmong all visible versions, the transaction sees the one with the highest commit timestamp.
1234567891011121314151617181920212223242526272829303132
FUNCTION is_version_visible(version, reader_snapshot_ts, reader_xid): writer_xid = version.created_by_transaction writer_commit_ts = get_commit_timestamp(writer_xid) # Rule 1: Writer must have committed IF writer_xid is still active OR writer_xid is aborted: RETURN false # Rule 2: Writer must have committed before our snapshot IF writer_commit_ts > reader_snapshot_ts: RETURN false # Rule 3: Writer must not be in our snapshot's active list IF writer_xid IN reader_snapshot.active_transactions: RETURN false # Was running when we took snapshot # Rule 4: Check for newer versions that might hide this one IF version.deleted_by IS NOT NULL: deleter_xid = version.deleted_by IF is_version_visible(deleter_xid, reader_snapshot_ts, reader_xid): RETURN false # This version was deleted/updated RETURN true FUNCTION get_visible_version(data_item, reader_snapshot): # Walk version chain from newest to oldest FOR version IN data_item.versions ORDER BY commit_ts DESC: IF is_version_visible(version, reader_snapshot.ts, reader_snapshot.xid): RETURN version # No visible version (item didn't exist at snapshot time) RETURN NULLThe Active Transaction List:\n\nA critical component of snapshot visibility is tracking which transactions were active (in-progress) when the snapshot was taken. Consider this scenario:\n\n\nTime 100: T1 begins\nTime 110: T2 begins (T2's snapshot sees T1 as active)\nTime 120: T1 writes X = 10\nTime 130: T1 commits\nTime 140: T2 reads X\n\n\nEven though T1 committed (at time 130) before T2 reads (at time 140), T2 should NOT see T1's changes. Why? Because when T2 took its snapshot at time 110, T1 was still active. T2's snapshot captured the state before T1 committed.\n\nThis is why snapshots must record the list of active transaction IDs at the snapshot moment. Any changes from these transactions are invisible, regardless of when they commit.
PostgreSQL snapshots consist of: xmin (all XIDs < xmin are visible if committed), xmax (all XIDs >= xmax are invisible), and xip[] (list of in-progress XIDs between xmin and xmax). A transaction is visible only if its XID < xmin OR (xmin <= XID < xmax AND XID not in xip[] AND committed).
Snapshots are the logical abstraction; Multi-Version Concurrency Control (MVCC) is the implementation mechanism that makes snapshots practical. Without MVCC, providing each transaction with its own consistent view would require either:\n\n1. Copying the entire database at snapshot time (impossibly expensive)\n2. Blocking all writes during reads (defeats the purpose)\n\nMVCC solves this by maintaining multiple versions of each data item, allowing different transactions to see different versions simultaneously.\n\nHow MVCC Enables Snapshots:\n\n1. When a transaction writes data item $x$, it creates a new version rather than overwriting the old one\n2. Each version is tagged with the creating transaction's ID and timestamp\n3. The old version remains available for transactions with older snapshots\n4. A garbage collection process (VACUUM in PostgreSQL) eventually removes versions no longer needed by any active snapshot
The Version Chain:\n\nMVCC maintains data as a chain of versions, typically ordered from newest to oldest. When a transaction reads:\n\n1. Start at the head of the version chain (newest version)\n2. Walk backward through versions\n3. Stop at the first version that passes visibility rules\n4. Return that version's value\n\nThis allows each reader to find its appropriate version without blocking any writer.\n\nCost of Versioning:\n\nMVCC's benefits come with costs:\n\n- Storage overhead: Multiple versions consume more space\n- Version chain traversal: Long chains slow down reads\n- Garbage collection: Old versions must be cleaned up\n- Index maintenance: Some implementations require index entries for each version\n\nThese costs are manageable with proper tuning (vacuum frequency, version chain limits) and are generally outweighed by the concurrency benefits.
Snapshots define WHAT each transaction should see (logical consistency). MVCC defines HOW to efficiently provide those different views (physical implementation). You cannot have practical snapshot isolation without MVCC, and MVCC exists primarily to enable snapshot-based isolation levels.
What exactly does a snapshot guarantee? Understanding the precise consistency semantics is crucial for designing correct applications.\n\nGuarantees Provided by Snapshots:\n\n1. Point-in-Time Consistency (Snapshot Read)\n\nAll reads within a transaction observe the same database state. If you read table A and then table B, both reflect the committed state at your snapshot timestamp. There are no "torn reads" across tables.\n\n2. Repeatable Reads\n\nIf you read the same row twice within a transaction, you get the same value both times (assuming you use transaction-level snapshots, not statement-level). This eliminates the non-repeatable read anomaly.\n\n3. No Dirty Reads\n\nYou only see committed data. Uncommitted changes from other transactions are invisible, eliminating dirty read anomalies.\n\n4. Read Stability\n\nOnce you observe a data item's value, that value remains stable for the duration of your transaction. External updates don't affect your view.
Guarantees NOT Provided by Snapshots Alone:\n\nSnapshots provide excellent read consistency but do not, by themselves, prevent all anomalies:\n\n1. Lost Updates\n\nTwo transactions can both read the same value, compute updates, and write. One update overwrites the other. Snapshot isolation requires additional mechanisms (first-committer-wins) to prevent this.\n\n2. Write Skew\n\nTwo transactions can read overlapping data, make decisions based on that data, and write to non-overlapping items in ways that violate application constraints. This is the signature anomaly of snapshot isolation and will be covered in detail in a later page.\n\n3. Serialization Anomalies\n\nSnapshot isolation does not guarantee serializability. There exist schedules that are allowed under SI but are not equivalent to any serial execution.\n\nThese limitations are important—snapshot isolation is powerful but not a complete solution for all consistency requirements.
Snapshot isolation eliminates most read anomalies but allows certain write anomalies (notably write skew). For applications requiring full serializability, additional mechanisms—such as serializable snapshot isolation (SSI)—are needed. We'll explore these limitations and solutions in subsequent pages.
Let's trace through a concrete example to solidify understanding. Consider a banking system with an accounts table:\n\nInitial State (committed):\n\nAccount A: Balance = $1000\nAccount B: Balance = $500\nAccount C: Balance = $750\n\n\nTimeline of Transactions:
| Time | T1 (Report) | T2 (Transfer) | T3 (Deposit) | Database State |
|---|---|---|---|---|
| 100 | BEGIN\n(snapshot at 100) | — | — | A=$1000, B=$500, C=$750 |
| 110 | — | BEGIN\n(snapshot at 110) | — | A=$1000, B=$500, C=$750 |
| 120 | — | — | BEGIN\n(snapshot at 120) | A=$1000, B=$500, C=$750 |
| 130 | — | Read A → $1000 | — | A=$1000, B=$500, C=$750 |
| 140 | — | Write A = $800 | — | A=$800 (uncommitted) |
| 150 | Read A → $1000 | — | — | T1 sees snapshot: A=$1000 |
| 160 | — | Write B = $700 | Write C = $850 | A=$800, B=$700 (uncommitted)\nC=$850 (uncommitted) |
| 170 | — | COMMIT | — | A=$800, B=$700 committed |
| 180 | Read B → $500 | — | — | T1 sees snapshot: B=$500 |
| 190 | — | — | COMMIT | C=$850 committed |
| 200 | Read C → $750 | — | — | T1 sees snapshot: C=$750 |
| 210 | Total = $2250 ✓ | — | — | T1 reports consistent snapshot |
Analysis:\n\nT1 (Report Transaction): Started at time 100, generating a total balance report. Even though T2 and T3 both committed during T1's execution, T1 sees none of their changes. T1's snapshot captured A=$1000, B=$500, C=$750 at time 100, and it reports that consistent total: $2,250.\n\nT2 (Transfer Transaction): Transferred $200 from A to B. It read A (seeing $1000 from its snapshot at time 110), wrote A=$800, wrote B=$700, and committed. These changes became visible to transactions starting after time 170.\n\nT3 (Deposit Transaction): Deposited $100 to C. It read C from its snapshot (seeing $750), wrote C=$850, and committed. This change became visible after time 190.\n\nKey Observations:\n\n✓ T1's report is consistent—it shows the exact state at time 100\n✓ T1 was never blocked by T2 or T3's writes\n✓ T2 and T3 were never blocked by T1's reads\n✓ All three transactions completed successfully with maximum concurrency
Without snapshots, T1 would either see an inconsistent view (some pre-transfer, some post-transfer data) or would need to block T2 and T3 until it finished. Snapshots enable T1 to generate a consistent report while T2 and T3 process updates concurrently—maximizing both consistency and throughput.
Database snapshots represent a paradigm shift in how we think about concurrent data access. Rather than transactions fighting over access to shared current state, each transaction operates on its own frozen view of the past.
What's Next:\n\nNow that we understand what snapshots are and how they provide consistency, the next page explores Snapshot Isolation Implementation—the specific mechanisms database systems use to establish snapshots, maintain version visibility, and handle the complexities of snapshot-based concurrency control.
You now understand the fundamental concept of database snapshots—the frozen, point-in-time views that enable high-concurrency, consistent database access. This foundation is essential for understanding snapshot isolation's implementation, limitations, and practical applications in modern database systems.