In the history of database management, few innovations have had as profound an impact on practical system performance as Multi-Version Concurrency Control (MVCC). While locking protocols and timestamp ordering addressed the fundamental challenge of ensuring serializability, they introduced an inherent bottleneck: readers and writers blocked each other.

Consider a typical enterprise scenario: hundreds of analysts running complex reports while operational systems continuously update inventory, process orders, and record transactions. With traditional locking, every report query potentially blocks critical business operations. With timestamp ordering, many transactions abort and restart, wasting computational resources. The fundamental question emerged: can we design a concurrency control mechanism where readers never wait for writers, and writers never wait for readers?

The answer is MVCC, a paradigm that fundamentally reshapes how databases handle concurrent access by maintaining multiple versions of each data item.
By the end of this page, you will understand the fundamental concept of MVCC, why it was developed, how it differs from lock-based and timestamp-based protocols, and why it has become the dominant concurrency control mechanism in modern database systems. You'll grasp the revolutionary insight that reading consistent data doesn't require blocking concurrent writes.
Before we dive into MVCC, we must fully appreciate the limitations of traditional concurrency control mechanisms. Both locking and timestamp ordering solved the serializability problem but introduced performance penalties that became increasingly problematic as database workloads grew.

The Locking Bottleneck:

In Two-Phase Locking (2PL), transactions acquire shared locks for reads and exclusive locks for writes. The fundamental conflict arises because:
- A read operation requires a shared lock
- A write operation requires an exclusive lock
- Shared and exclusive locks are incompatible

This means that if transaction T₁ is reading a data item, transaction T₂ cannot write to it until T₁ releases its lock. Conversely, if T₂ is writing, T₁ cannot read. In high-throughput systems, this creates contention bottlenecks where hot data items become serialization points.
| Held \ Requested | Shared (S) | Exclusive (X) |
|---|---|---|
| Shared (S) | ✅ Compatible | ❌ Conflict |
| Exclusive (X) | ❌ Conflict | ❌ Conflict |
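To make the compatibility rules concrete, here is a minimal Python sketch (all names are illustrative, not from any particular database) showing how a lock manager decides whether a requested lock can be granted:

```python
from enum import Enum

class LockMode(Enum):
    SHARED = "S"
    EXCLUSIVE = "X"

# Compatibility matrix: (held, requested) -> can the request be granted?
COMPATIBLE = {
    (LockMode.SHARED, LockMode.SHARED): True,
    (LockMode.SHARED, LockMode.EXCLUSIVE): False,
    (LockMode.EXCLUSIVE, LockMode.SHARED): False,
    (LockMode.EXCLUSIVE, LockMode.EXCLUSIVE): False,
}

def can_grant(held_modes: list[LockMode], requested: LockMode) -> bool:
    """A request is granted only if it is compatible with every lock held."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)

# Two readers coexist, but a writer must wait for any reader:
assert can_grant([LockMode.SHARED, LockMode.SHARED], LockMode.SHARED)
assert not can_grant([LockMode.SHARED], LockMode.EXCLUSIVE)
```

The single `False` entries in the matrix are exactly the read-write interference the text describes: one reader is enough to stall a writer, and one writer stalls everyone.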
The Timestamp Ordering Overhead:

Timestamp ordering eliminates locks but introduces a different problem: transaction aborts. When a transaction attempts an operation that would violate the timestamp order, it must be aborted and restarted. In workloads with high contention, abort rates can become substantial.

Consider a scenario where multiple transactions attempt to update the same hot data item (sketched in code below):
- T1 (TS=100) reads item A
- T2 (TS=110) writes item A (W-TS(A) becomes 110)
- T1 attempts to write A → abort! (TS(T1) < W-TS(A))

Every aborted transaction represents wasted work: parsing, planning, and partial execution, all discarded.
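The abort rule in the example translates directly into the basic timestamp-ordering write check. This is a minimal Python sketch of the basic TO rule; the structure and names are illustrative:

```python
class AbortTransaction(Exception):
    pass

class Item:
    def __init__(self):
        self.read_ts = 0   # largest timestamp of any transaction that read the item
        self.write_ts = 0  # largest timestamp of any transaction that wrote the item
        self.value = None

def to_write(txn_ts: int, item: Item, new_value) -> None:
    """Basic timestamp-ordering write rule: a write by an 'old' transaction
    arriving after a 'younger' read or write must abort."""
    if txn_ts < item.read_ts or txn_ts < item.write_ts:
        raise AbortTransaction(f"write by TS={txn_ts} violates timestamp order")
    item.value = new_value
    item.write_ts = txn_ts

# The scenario from the text: T2 (TS=110) writes A, then T1 (TS=100) tries to write.
A = Item()
to_write(110, A, "new")       # W-TS(A) becomes 110
try:
    to_write(100, A, "late")  # TS(T1)=100 < W-TS(A)=110 -> abort
except AbortTransaction as e:
    print(e)
```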
In many real-world applications, read operations vastly outnumber write operations—often by ratios of 100:1 or 1000:1. Having reads block writes (or cause write aborts) is a costly design when reads dominate. The insight behind MVCC is that reads don't need to see the 'current' value; they need to see a 'consistent' value from their transaction's perspective.
The Long-Running Transaction Problem:

Both locking and basic timestamp ordering suffer when transactions have significantly different execution times. Consider:

- Analytical query T1: scans 10 million rows, takes 30 minutes
- Operational transaction T2: updates a single row, takes 10 ms

With locking, T1 holds shared locks across millions of rows for 30 minutes, blocking any updates to those rows. With timestamp ordering, T2 might be timestamped after T1 started but complete before T1 reads certain rows, causing complex ordering conflicts.

This read-write interference became the central challenge that MVCC was designed to solve.
Multi-Version Concurrency Control introduces a fundamentally different approach: instead of controlling access to a single version of each data item, the database maintains multiple versions of the same data item, each corresponding to a different point in the database's history.

The Core Insight:

When a transaction writes a new value, it doesn't overwrite the existing value; it creates a new version. The old version remains accessible to transactions that started before the write occurred. Each transaction sees a consistent snapshot of the database as it existed when the transaction began.

This simple insight has profound implications:
1. Readers never block writers: a read operation accesses an older version while a writer creates a new version
2. Writers never block readers: a write creates a new version without touching the version being read
3. Consistent reads without locks: transactions see a consistent snapshot without holding any locks
```
// Conceptual MVCC Model
// Each data item has multiple versions

DataItem {
    key: PrimaryKey
    versions: List<Version>  // Ordered by creation time
}

Version {
    value: Data
    created_by: TransactionID
    created_at: Timestamp
    deleted_by: TransactionID?  // NULL if not deleted
    deleted_at: Timestamp?
}

// Transaction maintains a consistent view
Transaction {
    id: TransactionID
    start_time: Timestamp
    // Sees only versions where:
    // 1. created_at <= start_time
    // 2. AND (deleted_at IS NULL OR deleted_at > start_time)
}

// Read Operation (never blocks)
function read(txn: Transaction, key: PrimaryKey) -> Data:
    item = database.get(key)
    // Find the correct version for this transaction
    for version in item.versions (ordered by created_at DESC):
        if version.created_at <= txn.start_time:
            if version.deleted_at IS NULL OR version.deleted_at > txn.start_time:
                return version.value
    return NOT_FOUND  // Item didn't exist at transaction start

// Write Operation (creates new version)
function write(txn: Transaction, key: PrimaryKey, new_value: Data):
    item = database.get_or_create(key)
    // Check for write-write conflicts (details vary by implementation)
    if concurrent_write_conflict(txn, item):
        abort(txn)
    // Create new version
    new_version = Version {
        value: new_value,
        created_by: txn.id,
        created_at: txn.commit_time,  // Set at commit
        deleted_by: NULL,
        deleted_at: NULL
    }
    // Mark previous version as superseded
    old_version = find_current_version(item)
    old_version.deleted_by = txn.id
    old_version.deleted_at = txn.commit_time
    item.versions.append(new_version)
```

A version is visible to a transaction if: (1) it was created by a transaction that committed before the reading transaction started, AND (2) it was not deleted by any transaction that committed before the reading transaction started. These rules ensure each transaction sees a consistent snapshot.
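The visibility rule is easy to exercise in real code. Below is a minimal runnable Python version of the read path from the pseudocode above; the data structures are illustrative, not those of any specific engine:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Version:
    value: object
    created_at: int                   # commit timestamp of the creating transaction
    deleted_at: Optional[int] = None  # commit timestamp of the deleting transaction

def visible_value(versions: list[Version], snapshot_ts: int):
    """Return the value visible to a transaction whose snapshot was taken
    at snapshot_ts, scanning newest-first as in the pseudocode above."""
    for v in sorted(versions, key=lambda v: v.created_at, reverse=True):
        created_before = v.created_at <= snapshot_ts
        not_yet_deleted = v.deleted_at is None or v.deleted_at > snapshot_ts
        if created_before and not_yet_deleted:
            return v.value
    return None  # Item did not exist at the snapshot point

# Item A: value 1 committed at ts=5, superseded by value 2 committed at ts=20.
chain = [Version(value=1, created_at=5, deleted_at=20),
         Version(value=2, created_at=20)]

print(visible_value(chain, snapshot_ts=10))  # -> 1 (old snapshot sees old version)
print(visible_value(chain, snapshot_ts=25))  # -> 2 (new snapshot sees new version)
```

Note that both reads succeed immediately, regardless of what writers are doing: each snapshot simply selects a different version from the chain.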
To fully appreciate MVCC, we must compare it systematically with lock-based concurrency control across multiple dimensions. This comparison reveals why MVCC became the preferred approach for modern database systems.

Conceptual Difference:

Locking operates on a pessimistic model: it assumes conflicts will occur and prevents them by restricting access. MVCC operates on an optimistic model for reads: it assumes reads can proceed without interference because they access historical versions.

For writes, MVCC systems typically employ some form of conflict detection (first-updater-wins locking, or first-committer-wins validation at commit), making them pessimistic for write-write conflicts while remaining optimistic for read-write interactions.
| Aspect | Lock-Based (2PL) | MVCC |
|---|---|---|
| Read-Write Isolation | Readers block writers; writers block readers | Readers and writers operate independently |
| Read Latency | May wait for write locks to release | Always immediate—reads from snapshot |
| Write Latency | May wait for read locks to release | No waiting for readers; may conflict with concurrent writers |
| Lock Overhead | Lock table management, lock escalation | Version chain traversal, garbage collection |
| Deadlock Risk | Yes—circular lock waits possible | Reduced—no lock waits for reads |
| Storage Overhead | Minimal—single version per item | Higher—multiple versions retained |
| Transaction Aborts | Lower (conflicts resolved by waiting) | Higher for write conflicts (first-committer-wins) |
| Long Transaction Impact | Blocks concurrent operations | Only affects version garbage collection |
| Suitable Workloads | Write-heavy, short transactions | Read-heavy, mixed transaction lengths |
When Locking Still Wins:

Despite MVCC's advantages, locking can be preferable in specific scenarios:

1. Write-heavy workloads: when most transactions modify data, MVCC's write-write conflict handling may cause more aborts than lock waits would
2. Memory-constrained systems: version storage overhead is significant when RAM is limited
3. Very short transactions: lock acquisition overhead may be lower than version management
4. Strict serializability needs: some MVCC implementations provide only snapshot isolation, not full serializability
Understanding MVCC's theoretical underpinnings helps explain both its power and its limitations. The key concept is the multiversion serialization graph.

Multiversion Serializability:

In traditional (single-version) databases, serializability is defined by equivalence to some serial schedule. With multiple versions, we extend this: a multiversion schedule is serializable if it is equivalent to some serial execution where each read operation receives the value written by the most recent write operation (in the serial order) on that data item.

MVCC achieves this by ensuring that each transaction reads from a consistent snapshot: all reads see values as of the same logical point in time.

The Snapshot Isolation Model:

Most practical MVCC implementations provide Snapshot Isolation (SI), which guarantees:

1. Snapshot Property: each transaction reads from a snapshot of committed data as of its start time
2. First-Committer-Wins: if two concurrent transactions both write the same item, only the first to commit succeeds; the other aborts
```
SNAPSHOT ISOLATION - Formal Definition

Let T be the set of transactions, and for each transaction Ti ∈ T:
  - Start(Ti)  = timestamp when Ti began
  - Commit(Ti) = timestamp when Ti committed (undefined if aborted)

For a data item X, let Versions(X) = {v1, v2, ..., vn} ordered by creation time.

SNAPSHOT PROPERTY:
A read operation by transaction Ti on item X returns version vj where:
  vj.commit_time = max{ v.commit_time : v ∈ Versions(X) ∧ v.commit_time < Start(Ti) }

In words: Ti sees the most recent version committed before Ti started.

FIRST-COMMITTER-WINS:
For concurrent transactions Ti and Tj where:
  - Ti writes X
  - Tj writes X
  - Their execution overlaps: Start(Ti) < Commit(Tj) AND Start(Tj) < Commit(Ti)

Then at most one of Ti or Tj may commit successfully.

WRITE-WRITE CONFLICT:
Transactions Ti and Tj conflict on write if:
  - Both write to some common data item X
  - Neither had committed when the other started
  - Conflict resolution: abort the later committer

NOTE: Snapshot Isolation does NOT guarantee full serializability.
The "write skew" anomaly is possible under SI.
```

Snapshot Isolation prevents many anomalies but allows 'write skew', where two transactions each read items, make decisions based on those reads, and write different items in ways that violate application invariants. Full serializability requires additional mechanisms (like predicate locks or serializable snapshot isolation).
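The overlap test in the definition translates directly into a commit-time check. Here is a minimal Python sketch (illustrative names and structure, not a real engine's API) that enforces first-committer-wins by validating a committing transaction against every commit that happened after its snapshot was taken:

```python
from dataclasses import dataclass, field

@dataclass
class Txn:
    start_ts: int
    writes: set = field(default_factory=set)  # keys written by this transaction

def commit(txn: Txn, committed: list, clock: list) -> int:
    """Commit under first-committer-wins: abort if any transaction that
    committed after our snapshot wrote a key we also wrote."""
    for other_commit_ts, other_writes in committed:
        if other_commit_ts > txn.start_ts and txn.writes & other_writes:
            raise RuntimeError("write-write conflict: first committer already won")
    clock[0] += 1                               # allocate the next commit timestamp
    committed.append((clock[0], set(txn.writes)))
    return clock[0]

# Two concurrent transactions both write key 'A'; the second commit aborts.
clock, committed = [0], []
t1 = Txn(start_ts=0, writes={"A"})
t2 = Txn(start_ts=0, writes={"A"})
commit(t1, committed, clock)        # succeeds, commit_ts = 1
try:
    commit(t2, committed, clock)    # overlaps t1 and writes the same key
except RuntimeError as e:
    print(e)
```

A production system would also check against in-progress writers (the first-updater-wins variant) rather than scanning a full commit history, but the validation condition is the same.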
Version Ordering:

MVCC systems must maintain a total order on versions. This is typically achieved through:

1. Commit timestamps: each transaction receives a unique commit timestamp when it successfully commits. Versions are ordered by the commit timestamp of the transaction that created them.

2. Transaction IDs: some systems use monotonically increasing transaction IDs that serve as logical timestamps.

3. Hybrid approaches: combining physical timestamps with logical counters to handle timestamp collisions, as the sketch below illustrates.

The ordering ensures that for any data item, we can always determine which version is 'newest' relative to any transaction's snapshot point.
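As a concrete illustration of the hybrid approach, here is a minimal timestamp generator in Python, loosely in the spirit of hybrid logical clocks; it is a sketch under simplified assumptions, not any particular system's implementation:

```python
import time
import threading

class HybridClock:
    """Combines wall-clock time with a logical counter so that two commits
    in the same physical instant still get distinct, totally ordered timestamps."""
    def __init__(self):
        self._lock = threading.Lock()
        self._last_physical = 0
        self._logical = 0

    def next_commit_ts(self) -> tuple[int, int]:
        with self._lock:
            now = time.time_ns() // 1_000_000  # wall clock, milliseconds
            if now > self._last_physical:
                self._last_physical, self._logical = now, 0
            else:
                self._logical += 1  # same millisecond: bump the tiebreaker
            return (self._last_physical, self._logical)

clock = HybridClock()
ts1, ts2 = clock.next_commit_ts(), clock.next_commit_ts()
assert ts1 < ts2  # tuples compare lexicographically: total order guaranteed
```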
MVCC didn't emerge fully formed; it evolved through decades of database research and industrial necessity. Understanding this history illuminates why MVCC took its current form.

Origins in the 1970s:

The concept of maintaining multiple versions of data for concurrency control first appeared in academic research in the late 1970s. David P. Reed's 1978 MIT dissertation, 'Naming and Synchronization in a Decentralized Computer System,' included ideas about versioned data.

Oracle's Innovation (1984):

Oracle became the first major commercial database to implement a form of MVCC. Oracle's approach used 'rollback segments' (now called 'undo tablespaces') to store old versions of data, allowing readers to construct consistent views without locking. This gave Oracle a significant competitive advantage for concurrent workloads.

The PostgreSQL Model (1996+):

PostgreSQL adopted a different MVCC architecture where multiple versions are stored in the main heap itself rather than in separate undo storage. This heap-based approach has distinct tradeoffs that we'll explore in depth.

Modern Ubiquity:

Today, virtually every major database system incorporates MVCC or MVCC-like mechanisms:
- PostgreSQL: heap-based MVCC
- MySQL InnoDB: undo log-based MVCC
- SQL Server: tempdb-based row versioning (optional)
- Oracle: undo tablespace-based MVCC
- MongoDB: WiredTiger storage engine uses MVCC
- CockroachDB, TiDB, YugabyteDB: distributed MVCC implementations
| Year | Development | Significance |
|---|---|---|
| 1978 | Reed's MIT dissertation on versioned data | Theoretical foundation established |
| 1981 | Multiversion timestamp ordering formalized | Academic algorithms developed |
| 1984 | Oracle introduces rollback segments | First major commercial MVCC |
| 1995 | PostgreSQL development begins with MVCC design | Heap-based MVCC pioneered |
| 2001 | InnoDB becomes default MySQL storage engine | MVCC enters open-source mainstream |
| 2005 | SQL Server 2005 adds snapshot isolation | Microsoft adopts MVCC concepts |
| 2010s | NewSQL databases: CockroachDB, TiDB, Spanner | Distributed MVCC at global scale |
| 2020s | Serializable Snapshot Isolation becomes common | Addressing SI's serializability gaps |
The widespread adoption of MVCC across virtually all modern database systems—from traditional RDBMS to NewSQL and NoSQL—validates its fundamental advantages. When Oracle, PostgreSQL, MySQL, SQL Server, and MongoDB all converge on similar approaches, it signals that MVCC addresses genuine, universal requirements for database concurrency.
While the MVCC concept is straightforward (maintain multiple versions, let transactions read appropriate versions), the implementation details vary significantly across database systems. Each approach makes different tradeoffs between storage efficiency, read performance, write performance, and garbage collection complexity.

Approach 1: Append-Only / Heap-Based (PostgreSQL Model)

New versions are appended to the main data file (heap). Old versions remain in place until vacuum (garbage collection) removes them.

- Write: create new tuple, update index pointers
- Read: follow chain to find visible version
- GC: background vacuum process reclaims dead tuples

Tradeoffs: simple write path, but table bloat and index maintenance overhead.

Approach 2: Undo Log-Based (Oracle, MySQL InnoDB Model)

The main table stores only the newest version. When a new version is written, the old version is moved to an undo log (rollback segment).

- Write: copy old value to undo, write new value in place
- Read: for old snapshots, reconstruct by applying undo records
- GC: undo segments can be reclaimed once no active transaction needs them

Tradeoffs: main table stays compact, but reading old versions requires reconstruction, as the sketch below illustrates.
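To make Approach 2 concrete, here is a minimal Python sketch of undo-based version reconstruction. The record layout and field names are assumptions for clarity; real engines store undo records in a far more compact, page-oriented form:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class UndoRecord:
    old_value: object
    old_commit_ts: int
    prev: Optional["UndoRecord"] = None  # chain back to even older versions

@dataclass
class Row:
    value: object
    commit_ts: int                       # when this (newest) version committed
    undo: Optional[UndoRecord] = None

def read_as_of(row: Row, snapshot_ts: int):
    """Walk the undo chain until we reach a version committed at or
    before the reader's snapshot timestamp."""
    if row.commit_ts <= snapshot_ts:
        return row.value                 # newest version is already visible
    rec = row.undo
    while rec is not None:
        if rec.old_commit_ts <= snapshot_ts:
            return rec.old_value         # reconstructed older version
        rec = rec.prev
    return None                          # row did not exist at snapshot time

# Value 'v1' committed at ts=10, then overwritten in place by 'v2' at ts=30.
row = Row(value="v2", commit_ts=30,
          undo=UndoRecord(old_value="v1", old_commit_ts=10))
print(read_as_of(row, snapshot_ts=20))  # -> 'v1' (reconstructed from undo)
print(read_as_of(row, snapshot_ts=40))  # -> 'v2' (current version, no undo walk)
```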
Approach 3: Delta Storage (Some Modern Systems)

Instead of storing complete versions, store only the differences (deltas) between versions. This minimizes storage overhead when updates modify only part of a row.

Tradeoffs: extremely space-efficient, but reading requires applying a delta chain (see the sketch after this list).

Approach 4: Hybrid Approaches

Modern systems often combine strategies. For example, storing deltas for recent versions (hot data) while maintaining full versions for older snapshots (cold data), or using different strategies for different table types.

The choice of implementation strategy significantly impacts system characteristics:
- Write-heavy workloads: favor undo-log or delta approaches for compact storage
- Read-heavy workloads: favor append-only for direct version access
- Mixed workloads: hybrid approaches may offer the best balance
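As an illustration of Approach 3, the following Python sketch stores each version as a partial-row delta and reconstructs a snapshot by replaying deltas up to the reader's timestamp. The column-level delta format is an assumption made for clarity; production systems encode deltas much more compactly:

```python
# Each delta records only the columns an update changed, plus its commit timestamp.
base_row = {"id": 1, "name": "widget", "qty": 100, "price": 9.99}
deltas = [
    {"commit_ts": 10, "changes": {"qty": 90}},                  # sold 10 units
    {"commit_ts": 20, "changes": {"price": 10.99}},             # price increase
    {"commit_ts": 30, "changes": {"qty": 75, "price": 8.49}},   # sale event
]

def reconstruct(base: dict, deltas: list, snapshot_ts: int) -> dict:
    """Replay deltas committed at or before snapshot_ts over the base row."""
    row = dict(base)
    for d in sorted(deltas, key=lambda d: d["commit_ts"]):
        if d["commit_ts"] > snapshot_ts:
            break                # later versions are invisible to this snapshot
        row.update(d["changes"])
    return row

print(reconstruct(base_row, deltas, snapshot_ts=15))  # qty=90, price=9.99
print(reconstruct(base_row, deltas, snapshot_ts=35))  # qty=75, price=8.49
```

The space savings are visible even in this toy example: each delta stores one or two columns instead of the full four-column row.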
MVCC represents one of the most significant paradigm shifts in database concurrency control. By maintaining multiple versions of data items, it fundamentally changes the relationship between readers and writers, enabling the high-concurrency systems that power modern applications.

Consolidating Our Understanding:

- MVCC keeps multiple versions of each data item so that readers never block writers and writers never block readers
- Each transaction reads from a consistent snapshot taken at its start time
- Most practical implementations provide Snapshot Isolation with first-committer-wins conflict resolution, which permits write skew unless additional mechanisms are used
- Implementation strategies (heap-based, undo log-based, delta, hybrid) trade storage overhead against read and write performance
Looking Ahead:

In the following pages, we'll explore the detailed mechanics of MVCC:

- Multiple Versions: how version chains are structured and managed
- Read Consistency: how transactions determine which version to read
- PostgreSQL Implementation: a deep dive into a real-world MVCC system
- MVCC Advantages: the full spectrum of benefits MVCC provides

Each page builds on this foundation to give you complete mastery of MVCC concepts and implementation.
You now understand the fundamental concept of MVCC, why it was developed, how it compares to traditional concurrency control, and its theoretical foundation. Next, we'll explore how databases actually manage multiple versions of data items.