The concept of maintaining multiple versions sounds straightforward, but the implementation details determine whether an MVCC system is practical for production use. This page explores the structural and operational aspects of version management: how versions are physically organized, what metadata they carry, how they form chains of history, and how the database eventually reclaims obsolete versions.

Every time you update a row in a PostgreSQL table, modify a document in MongoDB, or run an UPDATE statement through InnoDB, the database is creating, organizing, and managing multiple versions of your data. Understanding this mechanism is essential for:

- Performance tuning: knowing how version chains affect query performance
- Capacity planning: understanding storage amplification from version overhead
- Troubleshooting: diagnosing issues caused by version bloat or long-running transactions
- Architecture decisions: choosing between storage engines with different versioning strategies
By the end of this page, you will understand how version chains are structured, what metadata each version carries, how different storage models organize versions, and how garbage collection reclaims obsolete versions. You'll gain insight into the engineering tradeoffs that shape real-world MVCC implementations.
Every version of a data item must carry sufficient metadata for the database to answer two critical questions:

1. Who created this version? Which transaction wrote this value, and when did it commit?
2. Who invalidated this version? Which transaction superseded this value with a newer one?

These questions determine visibility: whether a given transaction should see this version or skip to the next in the chain.

Essential Version Metadata:

While specific implementations vary, most MVCC systems track similar information for each version:
| Metadata Field | Purpose | Example Values |
|---|---|---|
| Xmin / Created-By | Transaction ID that created this version | 10042 (PostgreSQL), TRX_ID (InnoDB) |
| Xmax / Deleted-By | Transaction ID that deleted/superseded this version (0 or NULL if still current) | 10057 or NULL |
| Commit Timestamp | Logical or physical time when creating transaction committed | 2024-01-15T14:30:00 or LSN 0x5F3A |
| Version Pointer | Link to previous or next version in the chain | Heap location or undo segment offset |
| Transaction Status | Whether creating transaction committed, aborted, or in-progress | Committed, Aborted, In-Progress |
| Row Content | The actual data values for this version | Column values, BLOB references, etc. |
The Visibility Challenge:

Determining whether a version is visible requires knowing the status of the transactions that created and potentially deleted it. This creates an interesting bootstrapping problem: the version exists in storage, but its visibility depends on transaction state that may be stored elsewhere.

Databases solve this in different ways:

- PostgreSQL: maintains pg_xact (formerly clog) to track transaction commit status. Visibility checks consult this structure.
- InnoDB: stores transaction state in the undo log headers. Read views compare transaction IDs against active transaction lists.
- Oracle: rollback segments contain transaction status. System change numbers (SCNs) provide ordering.
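As a toy illustration of the PostgreSQL-style approach, a clog-like commit log can be sketched as a packed status array: two bits of status per transaction ID. Everything here (class and constant names, the two-bit encoding shown) is illustrative, not PostgreSQL's actual internals.

```python
# Hypothetical sketch of a clog-style commit log: 2 bits of status per
# transaction ID, packed 4 transactions to a byte. Names are illustrative.

IN_PROGRESS, COMMITTED, ABORTED = 0, 1, 2  # 2-bit status codes

class CommitLog:
    def __init__(self, max_txns: int):
        # 4 statuses per byte (2 bits each), all initially IN_PROGRESS (0)
        self._bits = bytearray((max_txns + 3) // 4)

    def set_status(self, txn_id: int, status: int) -> None:
        byte, slot = divmod(txn_id, 4)
        shift = slot * 2
        self._bits[byte] &= ~(0b11 << shift)  # clear the old 2-bit slot
        self._bits[byte] |= status << shift   # write the new status

    def get_status(self, txn_id: int) -> int:
        byte, slot = divmod(txn_id, 4)
        return (self._bits[byte] >> (slot * 2)) & 0b11

clog = CommitLog(1000)
clog.set_status(100, COMMITTED)
clog.set_status(150, ABORTED)
```

The point of the packed encoding is that a visibility check needs only a tiny, cache-friendly lookup per transaction ID, which is why real systems keep this structure hot in memory.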
```
// Determining Version Visibility
// Given: Transaction T with snapshot timestamp S
//        Version V of a data item with metadata

function is_version_visible(T: Transaction, V: Version) -> Boolean:
    // Step 1: Check if the creator transaction has committed
    creator_status = get_transaction_status(V.xmin)

    if creator_status == ABORTED:
        return false  // Created by aborted transaction - never visible

    if creator_status == IN_PROGRESS:
        if V.xmin == T.id:
            return true   // Transaction sees its own uncommitted writes
        else:
            return false  // Created by another in-progress transaction

    // Creator committed - check if commit was before our snapshot
    if creator_status == COMMITTED:
        if V.xmin_commit_time > T.snapshot_time:
            return false  // Committed after our snapshot started

    // Step 2: Check if the version has been superseded
    if V.xmax == 0 or V.xmax == NULL:
        return true  // Current version - not yet superseded

    deleter_status = get_transaction_status(V.xmax)

    if deleter_status == ABORTED:
        return true  // Delete was rolled back - version still valid

    if deleter_status == IN_PROGRESS:
        if V.xmax == T.id:
            return false  // We deleted it ourselves
        else:
            return true   // Another transaction's uncommitted delete - still visible to us

    if deleter_status == COMMITTED:
        if V.xmax_commit_time > T.snapshot_time:
            return true   // Deleted after our snapshot - still visible to us
        else:
            return false  // Deleted before our snapshot - not visible

    return false  // Should not reach here

// Helper: Get transaction commit status
function get_transaction_status(txn_id: TransactionID) -> Status:
    // Different implementations:
    //   PostgreSQL: check pg_xact (clog)
    //   InnoDB: check undo log headers + active transaction list
    //   Oracle: check rollback segment transaction table
    return lookup_transaction_status(txn_id)
```

PostgreSQL optimization: rather than consulting pg_xact for every visibility check, PostgreSQL can set 'hint bits' directly in the tuple header once a transaction's fate is known. This trades a small amount of write I/O (marking the hint) for significant read savings on subsequent accesses.
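A minimal sketch of that hint-bit fast path, with hypothetical field and flag names (the real tuple header layout differs): once the creator's fate is known, it is cached on the tuple so later visibility checks skip the commit-log lookup entirely.

```python
# Illustrative sketch of hint-bit caching. Flag values, the Tuple class,
# and creator_committed() are hypothetical, not PostgreSQL's real layout.

from dataclasses import dataclass

HINT_XMIN_COMMITTED = 0x1
HINT_XMIN_ABORTED = 0x2

@dataclass
class Tuple:
    xmin: int
    infomask: int = 0  # hint bits are cached here

clog_lookups = 0  # counts how often we pay for a commit-log consultation

def creator_committed(t: Tuple, clog: dict) -> bool:
    # Fast path: hint bits already set - no commit-log access needed
    if t.infomask & HINT_XMIN_COMMITTED:
        return True
    if t.infomask & HINT_XMIN_ABORTED:
        return False
    # Slow path: consult the commit log, then cache the answer on the tuple
    global clog_lookups
    clog_lookups += 1
    committed = clog[t.xmin] == "committed"
    t.infomask |= HINT_XMIN_COMMITTED if committed else HINT_XMIN_ABORTED
    return committed

clog = {100: "committed"}
t = Tuple(xmin=100)
first = creator_committed(t, clog)   # consults the commit log, sets the hint
second = creator_committed(t, clog)  # served from the cached hint bits
```

Note the asymmetry this models: the first reader of a tuple pays the lookup and dirties the page by setting the hint; every subsequent reader gets the answer for free.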
When multiple versions of the same logical row exist, they must be connected in a way that allows efficient traversal. The version chain is the data structure that links these versions together. Different database systems organize version chains differently, with significant implications for performance.

Key Design Decisions:

1. Chain direction: does the chain go from newest to oldest (N2O) or oldest to newest (O2N)?
2. Primary location: is the newest version in the main table, or the oldest?
3. Version storage: are all versions in the main heap, or in separate storage?
4. Index behavior: do indexes point to the new version, the old version, or a version-independent location?
| Architecture | Chain Direction | Primary Table Content | Example Systems |
|---|---|---|---|
| Append-Only (O2N) | Oldest → Newest | All versions (newest is latest) | PostgreSQL |
| Append-Only (N2O) | Newest → Oldest | All versions (newest at head) | Some research systems |
| Undo Log / Delta (N2O) | Newest → Oldest | Only newest version | InnoDB, Oracle, SQL Server |
| Time-Travel Table | Separate index | Current only; history in separate table | Temporal table implementations |
Append-Only / Heap-Based (PostgreSQL Model):

In PostgreSQL's architecture, when a row is updated:

1. The original tuple remains in place but is marked with xmax = the updating transaction's ID
2. A new tuple is inserted (potentially on a different page)
3. The old tuple contains a forward pointer (ctid) to the new tuple
4. Indexes may point to any version in the chain

This creates a chain where traversal goes from old to new (O2N):
```
PostgreSQL Heap-Based Version Chain (O2N)

Initial state - row inserted by Txn 100:

  Heap Page 1
    Tuple @ (1,1)
      xmin: 100   xmax: 0   ctid: (1,1)
      data: {name: "Alice", balance: 1000}

After UPDATE by Txn 150 (balance = 900):

  Heap Page 1
    Tuple @ (1,1)  [OLD VERSION]
      xmin: 100   xmax: 150   ctid: (1,3)  --> points to new version
      data: {name: "Alice", balance: 1000}

    Tuple @ (1,3)  [NEW VERSION - CURRENT]
      xmin: 150   xmax: 0   ctid: (1,3)
      data: {name: "Alice", balance: 900}

After another UPDATE by Txn 200 (balance = 850):

  Heap Page 1                             Heap Page 2
    (1,1)  xmin: 100  xmax: 150
           ctid: (1,3)
           balance: 1000                    (2,1)  [CURRENT]
    (1,3)  xmin: 150  xmax: 200    ---->    xmin: 200  xmax: 0
           ctid: (2,1)                      balance: 850
           balance: 900

Version Chain:  (1,1) --> (1,3) --> (2,1)
                 100       150       200    (creating transaction IDs)
```

Undo Log-Based (InnoDB Model):

InnoDB takes a fundamentally different approach. The main table (clustered index) always contains the newest version. When an update occurs:

1. The current values are copied to an undo log record
2. The row in the main table is modified in place
3. The row header contains a pointer to the undo log record
4. Undo records chain backwards through history

This creates a Newest-to-Oldest (N2O) chain:

```
InnoDB Undo Log-Based Version Chain (N2O)

Initial state - row inserted by Trx 100:

  Clustered Index (Primary Key)
    Row PK=1
      DB_TRX_ID: 100
      DB_ROLL_PTR: NULL (no previous version)
      data: {name: "Alice", balance: 1000}

After UPDATE by Trx 150 (balance = 900):

  Clustered Index
    Row PK=1  [CURRENT VERSION]
      DB_TRX_ID: 150
      DB_ROLL_PTR: --> Undo Record #1
      data: {name: "Alice", balance: 900}

  Undo Log (Rollback Segment)
    Undo Record #1
      TRX_ID: 100 (creator of the saved version)
      Previous Undo: NULL
      Old values: {balance: 1000}

After another UPDATE by Trx 200 (balance = 850):

  Clustered Index
    Row PK=1  [CURRENT VERSION]
      DB_TRX_ID: 200
      DB_ROLL_PTR: --> Undo Record #2
      data: {name: "Alice", balance: 850}

  Undo Log
    Undo Record #2  (written by Trx 200)
      TRX_ID: 150 (creator of the saved version)
      Previous Undo: --> Undo Record #1
      Old values: {balance: 900}

    Undo Record #1  (written by Trx 150)
      TRX_ID: 100
      Previous Undo: NULL
      Old values: {balance: 1000}

Version Chain:  Current --> Undo #2 --> Undo #1 --> (original insert)
                  200         150         100
```

N2O chains (InnoDB) optimize for current-version reads: most queries want the newest data, which is found immediately in the main table. O2N chains (PostgreSQL) may require traversing the chain to find the current version, but index maintenance is more complex in N2O systems.
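The traversal-cost difference between the two chain directions can be sketched with two toy linked lists (purely illustrative structures, not either engine's actual layout). Reading the current version from an O2N chain walks every hop; an N2O chain answers immediately from the head.

```python
# Toy O2N vs N2O chains for a "read the current version" query.
# The Version class and both functions are illustrative sketches.

class Version:
    def __init__(self, txn_id, value, nxt=None):
        self.txn_id, self.value, self.next = txn_id, value, nxt

def read_current_o2n(head):
    """O2N (PostgreSQL-style): start at the oldest version and follow
    forward pointers until the version with no successor."""
    hops, v = 0, head
    while v.next is not None:
        v = v.next
        hops += 1
    return v.value, hops

def read_current_n2o(head):
    """N2O (InnoDB-style): the head IS the current version."""
    return head.value, 0

# Build the chain from the diagrams above: 1000 -> 900 -> 850
v3 = Version(200, 850)
v2 = Version(150, 900, v3)
v1 = Version(100, 1000, v2)

o2n = read_current_o2n(v1)  # must walk 2 hops to reach balance 850
n2o = read_current_n2o(v3)  # current version found immediately
```

Snapshot reads invert the comparison: an old snapshot finds its version near the head of an O2N chain but must walk deep into an N2O undo chain, which is why long version histories hurt InnoDB readers of old snapshots.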
Secondary indexes create significant complexity in MVCC systems. The fundamental challenge: an index entry must point to a row, but multiple versions of that row exist. Different architectures handle this differently, with major performance implications.

The PostgreSQL Approach: Indexes Point to Any Version

In PostgreSQL, secondary indexes point to heap tuple locations (TIDs: block number plus item offset). When a row is updated:

1. A new heap tuple is created
2. The index may need a new entry pointing to the new tuple
3. The old index entry still points to the old (now dead) tuple

This means:

- Index scans may find old versions and must follow the chain to find visible versions
- Index-only scans are complicated by visibility checks
- Index bloat occurs as entries for old versions accumulate

HOT (Heap-Only Tuples) Optimization:

PostgreSQL's HOT optimization addresses index bloat when the updated columns are not indexed:

- If the update doesn't change any indexed column values, the new version can be created in the same page without updating indexes
- Indexes continue pointing to the old version, which forwards to the new version
- This significantly reduces index maintenance overhead for many workloads
```
PostgreSQL HOT (Heap-Only Tuple) Update

Scenario: table users (id PK, name indexed, balance NOT indexed)
UPDATE users SET balance = 900 WHERE id = 1;

Without HOT (indexed column changed, or no room in the page):
  Index on 'name'
    "Alice" -> (page 1, item 1)  [OLD]
    "Alice" -> (page 2, item 5)  [NEW]  <-- new index entry needed!

With HOT (non-indexed column change, room in the same page):
  Index on 'name'
    "Alice" -> (page 1, item 1)         <-- index unchanged!
                     |
                     v
  Heap Page 1
    Item 1 (OLD - HOT chain head)
      xmax: 150
      t_ctid: (1, 3)  -->  Item 3
    Item 3 (NEW - CURRENT)
      xmax: 0
      HEAP_ONLY flag set               <-- not directly indexed

HOT chain pruning: when the page is accessed, old HOT versions
can be reclaimed without a full VACUUM.
```

The InnoDB Approach: Indexes Point to the Primary Key

InnoDB takes a different approach. Secondary indexes don't store pointers to physical row locations; they store the primary key value of the row.

When reading via a secondary index:

1. Find the primary key in the secondary index
2. Use the primary key to locate the row in the clustered index
3. Check visibility against the current version in the clustered index
4. If a historical version is needed, follow the undo chain

This has interesting implications:

- Secondary indexes don't need updating when non-indexed columns change
- Index entries don't point to dead versions directly
- But secondary index lookups always require a primary key lookup
- The clustered index becomes a potential bottleneck
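The InnoDB-style primary-key indirection can be sketched with two plain dictionaries standing in for the clustered and secondary indexes (an illustrative model only; `lookup_by_name` and the dict layout are invented for this sketch).

```python
# Sketch of InnoDB-style secondary index indirection:
#   secondary index: key -> primary key(s)
#   clustered index: primary key -> current row version
# Both structures and lookup_by_name() are illustrative, not InnoDB's.

clustered_index = {  # PK -> current row version
    1: {"name": "Alice", "balance": 850, "trx_id": 200},
}
name_index = {  # secondary index stores PKs, never row locations
    "Alice": [1],
}

def lookup_by_name(name):
    rows = []
    for pk in name_index.get(name, []):   # step 1: find the PK(s)
        rows.append(clustered_index[pk])  # step 2: PK -> current row
    return rows

# Updating a non-indexed column touches only the clustered index:
clustered_index[1] = {"name": "Alice", "balance": 700, "trx_id": 250}
# name_index is untouched - no secondary index maintenance needed
rows = lookup_by_name("Alice")
```

The sketch makes both implications above visible: the secondary index never goes stale when `balance` changes, but every secondary lookup pays an extra hop through the clustered index.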
Every version goes through a predictable lifecycle, from creation during a transaction to eventual garbage collection. Understanding this lifecycle is essential for both performance tuning and troubleshooting.

Phase 1: Creation

A version is born when a transaction performs an INSERT or UPDATE:

- INSERT: creates the first version of a new row
- UPDATE: creates a new version of an existing row (and marks the old version as superseded)
- DELETE: often implemented as creating a 'delete marker' version

At creation time, the version is visible only to the transaction that created it. The version's metadata reflects the in-progress status.

Phase 2: Visibility Window

After the creating transaction commits, the version becomes visible to new transactions. It remains visible until superseded by a newer version.

The version's visibility window is defined by:

- Start: the commit timestamp of the creating transaction
- End: the commit timestamp of the transaction that supersedes it (or infinity if current)

Phase 3: Obsolescence

A version becomes obsolete when it's no longer visible to any active or future transaction. This happens when:

- A newer version exists, AND
- The superseding transaction has committed, AND
- No active transaction has a snapshot old enough to need this version

Phase 4: Garbage Collection

Once obsolete, the version can be reclaimed. The timing and mechanism depend on the system:
| Phase | PostgreSQL | InnoDB/MySQL |
|---|---|---|
| Creation | Insert new heap tuple | Modify row, copy old to undo log |
| Visibility Start | Creating txn commits, hint bits set | Creating trx commits |
| Obsolescence Detection | xmax committed AND xmax < oldest active snapshot | Read view no longer references this version |
| GC Mechanism | VACUUM process scans heap | Purge background thread |
| GC Trigger | Manual, autovacuum thresholds, or wraparound prevention | Continuous background purge |
| Space Reclamation | Clears dead tuples, updates FSM | Undo log space reused, undo tablespace may shrink |
```
// Version Lifecycle State Machine

enum VersionState {
    IN_PROGRESS,  // Created by active transaction
    COMMITTED,    // Creating transaction committed - visible to new transactions
    OBSOLETE,     // Superseded and no longer needed by any transaction
    RECLAIMED     // Space has been reclaimed by GC
}

class Version {
    state: VersionState = IN_PROGRESS
    created_by: Transaction
    superseded_by: Transaction = null

    // Transition: IN_PROGRESS -> COMMITTED
    on_creator_commit():
        this.state = COMMITTED
        // Version now visible to new transactions starting after commit

    // Transition: COMMITTED -> OBSOLETE
    on_obsolescence_detected():
        // Happens when:
        //   1. superseded_by has committed, AND
        //   2. oldest_active_snapshot > max(created_by.commit_time,
        //                                   superseded_by.commit_time)
        this.state = OBSOLETE
        // Version can now be garbage collected

    // Transition: OBSOLETE -> RECLAIMED
    on_garbage_collection():
        // PostgreSQL: VACUUM removes tuple, updates visibility map, FSM
        // InnoDB: purge thread frees undo log record
        this.state = RECLAIMED
        free_storage(this)
}

// Determining if a version is obsolete
function is_version_obsolete(v: Version) -> Boolean:
    if v.state != COMMITTED:
        return false
    if v.superseded_by == null:
        return false  // Still the current version
    if not v.superseded_by.is_committed():
        return false  // Newer version not yet permanent

    oldest_snapshot = get_oldest_active_snapshot_timestamp()
    // If the oldest active snapshot started after the superseder committed,
    // no one can ever need this version again
    return oldest_snapshot > v.superseded_by.commit_time
```

A critical implication: if a transaction holds a snapshot for a long time (e.g., a 3-hour analytical query), ALL versions created after that snapshot started cannot be garbage collected, even if they've been superseded millions of times. This is a common cause of table bloat and performance degradation in production systems.
Garbage collection (GC) in MVCC systems is a critical background process that determines long-term system health. Without effective GC, version storage grows without bound, leading to storage exhaustion, performance degradation, and eventual system failure.

PostgreSQL VACUUM:

PostgreSQL's VACUUM process is the primary garbage collection mechanism. It operates in several modes:

1. Standard VACUUM: scans tables for dead tuples, marks their space as reusable, updates statistics
2. VACUUM FULL: rewrites the entire table, compacting it to return space to the OS
3. Autovacuum: background daemon that triggers VACUUM based on table activity thresholds

VACUUM must also address transaction ID wraparound: PostgreSQL uses 32-bit transaction IDs that wrap around after roughly 4 billion transactions. VACUUM freezes old tuples to prevent visibility issues.
```sql
-- PostgreSQL VACUUM Operations

-- Standard VACUUM: reclaims dead tuple space within the table
VACUUM my_table;
-- Table size may not shrink - space is marked reusable
-- Fast, doesn't lock the table for writes

-- VACUUM ANALYZE: reclaims space AND updates statistics
VACUUM ANALYZE my_table;
-- Critical for the query planner to make good decisions

-- VACUUM FULL: rewrites the entire table, returns space to the OS
VACUUM FULL my_table;
-- Requires exclusive lock - blocks reads and writes!
-- Use only when significant space reclamation is needed

-- Check vacuum stats for a table
SELECT
    schemaname,
    relname,
    n_live_tup,
    n_dead_tup,
    n_dead_tup::float / NULLIF(n_live_tup + n_dead_tup, 0) AS dead_ratio,
    last_vacuum,
    last_autovacuum,
    last_analyze,
    vacuum_count,
    autovacuum_count
FROM pg_stat_user_tables
WHERE relname = 'my_table';

-- Autovacuum configuration parameters
--   autovacuum_vacuum_threshold = 50      (minimum dead tuples before vacuum)
--   autovacuum_vacuum_scale_factor = 0.2  (fraction of table that must be dead)
--   autovacuum_naptime = 1min             (sleep time between runs)

-- A table with 100,000 rows triggers autovacuum when:
--   dead_tuples > 50 + (0.2 * 100,000) = 20,050 dead tuples

-- Monitoring for bloat
SELECT
    current_setting('autovacuum_vacuum_scale_factor') AS scale_factor,
    current_setting('autovacuum_vacuum_threshold') AS threshold;

-- Check for tables needing vacuum
SELECT
    schemaname || '.' || relname AS table_name,
    n_dead_tup AS dead_tuples,
    n_live_tup AS live_tuples,
    n_dead_tup::float / NULLIF(n_live_tup, 0) * 100 AS dead_percentage,
    last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;
```

InnoDB Purge:

InnoDB uses a different approach. The purge system runs continuously as a background thread (or multiple threads in recent versions):

1. Purge thread: scans the undo log for records no longer needed by any read view
2. History list: tracks undo logs awaiting purge in chronological order
3. Purge lag: if purge falls behind, the history list grows, consuming undo tablespace

Unlike PostgreSQL's VACUUM, InnoDB's purge doesn't affect the main tablespace; it only reclaims undo log space. The main table stays compact because only the current version is stored there.
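The gating rule behind the history list can be sketched in a few lines: an undo record may be freed only once its superseding commit predates every active read view. The list layout and `purge` function here are illustrative, not InnoDB's actual data structures.

```python
# Sketch of purge gated by the oldest active read view. An undo record
# is purgeable only when no read view could still need the version it
# preserves. All structures and names are illustrative.

history_list = [
    # (undo_record_id, commit time of the superseding transaction),
    # oldest first, mirroring InnoDB's chronological history list
    (1, 105),
    (2, 130),
    (3, 180),
]

def purge(history, oldest_active_view_time):
    """Split the history list into purgeable and still-pinned records."""
    purged = [rec for rec, t in history if t < oldest_active_view_time]
    remaining = [(rec, t) for rec, t in history
                 if t >= oldest_active_view_time]
    return purged, remaining

# A long-running read view opened at time 120 pins records 2 and 3:
purged, remaining = purge(history_list, oldest_active_view_time=120)
```

This is exactly the purge-lag failure mode: one long-lived read view keeps `remaining` growing without bound, no matter how fast the purge threads run.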
| Aspect | PostgreSQL VACUUM | InnoDB Purge |
|---|---|---|
| What is cleaned | Dead heap tuples + index entries | Undo log records |
| Main table impact | Creates free space within table pages | No impact (main table always has current only) |
| Triggering | Manual, autovacuum thresholds | Continuous background |
| Blocking | Standard: none; FULL: exclusive lock | None (background) |
| Monitoring | pg_stat_user_tables, pg_stat_progress_vacuum | SHOW ENGINE INNODB STATUS (history length) |
| Failure mode | Table bloat, txid wraparound | Undo tablespace growth, long row lookups |
In PostgreSQL: Monitor n_dead_tup and table size growth. In InnoDB: Watch 'History list length' in SHOW ENGINE INNODB STATUS—values over 1 million indicate purge lag. In both: Identify long-running transactions that prevent GC progress.
MVCC's version storage creates measurable overhead compared to single-version systems. Understanding this overhead is crucial for capacity planning and architecture decisions.

Per-Version Overhead:

Each version carries metadata beyond the row data itself:
| System | Metadata per Tuple | Overhead Estimate |
|---|---|---|
| PostgreSQL | xmin (4B) + xmax (4B) + cmin/cmax (4B) + ctid (6B) + infomask (4B) + natts (2B) | ~23 bytes + alignment padding |
| InnoDB | DB_TRX_ID (6B) + DB_ROLL_PTR (7B) + implicit row ID if no PK (6B) | ~13-19 bytes |
| Oracle | ITL (interested transaction list) + row header | Varies by block size |
Version Amplification:

The more significant overhead comes from version amplification: storing multiple complete versions of rows.

Scenario: a table has 1 million rows, each 100 bytes. A batch process updates every row once per day.

PostgreSQL (append-only):

- After the update: 2 million tuples exist (1M dead, 1M live)
- Before VACUUM: ~200MB table (doubled)
- VACUUM removes dead tuples: the table returns to ~100MB of usable space
- If vacuum is slow: temporary 2x-10x table bloat

InnoDB (undo log):

- Main table: constant ~100MB (only live rows)
- Undo log: 1M undo records × cost per record
- Undo overhead: ~10-30MB for this case
- Purge continuously reclaims space

For update-heavy workloads with delays in GC, PostgreSQL can experience significant temporary bloat that InnoDB avoids.
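The scenario's arithmetic can be checked directly. The 20-byte average undo record size below is an assumption chosen to fall inside the ~10-30MB range stated above; everything else comes from the scenario.

```python
# Worked numbers for the daily-batch-update scenario: 1M rows of
# 100 bytes each, every row updated once. undo_record_bytes is an
# assumed average, not a measured InnoDB figure.

rows = 1_000_000
row_bytes = 100
base_mb = rows * row_bytes / 1_000_000  # 100 MB of live data

# PostgreSQL append-only: each update leaves a dead tuple behind
pg_peak_mb = 2 * base_mb      # before VACUUM: live + dead copies
pg_after_vacuum_mb = base_mb  # dead-tuple space marked reusable

# InnoDB undo-log: the main table stays constant; history goes to undo
undo_record_bytes = 20        # assumed average undo record size
innodb_main_mb = base_mb
innodb_undo_mb = rows * undo_record_bytes / 1_000_000
```

The asymmetry is the takeaway: PostgreSQL's transient footprint doubles (and worse if VACUUM lags across several update cycles), while InnoDB's extra cost is proportional to the much smaller undo record, paid in the undo tablespace rather than the table.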
```sql
-- PostgreSQL: Analyzing version storage overhead

-- Calculate a table bloat estimate
SELECT
    schemaname,
    relname,
    pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
    pg_size_pretty(pg_relation_size(relid)) AS table_size,
    pg_size_pretty(pg_indexes_size(relid)) AS indexes_size,
    n_live_tup,
    n_dead_tup,
    ROUND(
        n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100, 2
    ) AS dead_pct,
    -- Estimate actual data size vs allocated
    ROUND(
        (n_live_tup * 100)::numeric
            / (pg_relation_size(relid) / 8192.0 * 200),
        2  -- Assume 200 rows/page at fill factor
    ) AS fill_efficiency_pct
FROM pg_stat_user_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(relid) DESC;

-- More accurate bloat calculation using pgstattuple
CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT
    table_len,
    tuple_count,
    tuple_len,
    dead_tuple_count,
    dead_tuple_len,
    free_space,
    free_percent
FROM pgstattuple('my_table');

-- InnoDB: Check undo tablespace usage
-- (Run in MySQL/MariaDB)
-- SHOW ENGINE INNODB STATUS;  -- look for "History list length"
--
-- SELECT
--     TOTAL_EXTENTS * 1048576 / 1024 / 1024 AS undo_mb
-- FROM INFORMATION_SCHEMA.FILES
-- WHERE FILE_TYPE = 'UNDO LOG';
```

For PostgreSQL: plan for 2-3x table size under normal operations to account for version bloat between vacuums. For InnoDB: size the undo tablespace for peak transaction length × peak concurrent transactions × average row size. For both: always monitor actual usage patterns in production.
We've explored the intricate mechanics of how MVCC systems create, organize, and eventually reclaim multiple versions of data. This knowledge is essential for understanding system behavior and making informed operational decisions.
Looking Ahead:

With a solid understanding of how multiple versions are managed, we're ready to explore read consistency: how transactions determine exactly which version to read to achieve consistent, repeatable results. This is where MVCC's theoretical guarantees become practical reality.
You now understand the structural foundations of MVCC: version metadata, chain architectures, index interactions, lifecycle phases, garbage collection mechanisms, and storage overhead. Next, we'll see how these versions enable consistent reads across transactions.