Consider an airline booking system where a transaction needs to verify seat availability and passenger counts. The transaction reads 150 passengers booked and 180 total seats, confirming 30 seats remain available. Later in the same transaction, it reads the seat count again to finalize a booking—but now sees 175 total seats because a concurrent maintenance operation updated the aircraft configuration.
The transaction now holds two contradictory facts: the aircraft has 180 total seats (first read) and 175 total seats (second read).
If it proceeds with the original calculation (30 available), it would allow overbooking—the actual availability is now 175 - 150 = 25 seats. Yet the transaction believes 30 seats are free.
This is read inconsistency: a transaction operating on data that presents a logically impossible view of the database—facts that couldn't coexist at any single point in time. Unrepeatable reads are the mechanism; inconsistent data views are the consequence.
By the end of this page, you will understand how unrepeatable reads create inconsistent data views, the concept of snapshot consistency, how multi-item inconsistencies emerge from single-item anomalies, and the relationship between read consistency and transaction correctness.
Read inconsistency occurs when a transaction's reads don't form a coherent snapshot of the database. The transaction observes a state that never actually existed—a Frankenstein's monster assembled from fragments of different database states.
Types of Read Inconsistency:
1. Single-Item Inconsistency (Basic Unrepeatable Read) The same data item is read twice with different values. The transaction has contradictory information about one specific piece of data.
2. Multi-Item Inconsistency (Read Skew) Related data items are read at different times, producing values that couldn't coexist. The inconsistency spans multiple data items that should maintain certain relationships.
3. Aggregate Inconsistency Reading the same aggregate query (SUM, COUNT, AVG) multiple times yields different results due to intervening modifications.
| Type | Data Scope | Example | Detection Difficulty |
|---|---|---|---|
| Single-Item | One row/field | Balance reads as $100, then $80 | Easy (compare values) |
| Multi-Item | Related rows | Account A + Account B should sum to $1000, but transaction sees $1050 | Medium (requires invariant knowledge) |
| Aggregate | Set of rows | COUNT(*) returns 100, then 103 | Easy (compare aggregates) |
| Derived | Computed values | Cached calculation differs from re-calculation | Hard (requires re-computation) |
The Fundamental Problem:
Databases evolve through a sequence of consistent states. Each committed transaction moves the database from one consistent state to another. For a transaction to operate correctly, it should perceive a single, coherent state.
When reads are inconsistent, the transaction perceives a synthetic state—one that never existed in the database's history:
Database States:    S₀ → S₁ → S₂ → S₃ → S₄
                          ↑         ↑
Transaction reads:   R₁ from S₁    R₂ from S₃

R₁ ∪ R₂ ≠ any Sᵢ
The transaction's working set {R₁, R₂} corresponds to no actual database state. Decisions based on this synthetic state may violate real-world constraints.
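The diagram above can be made concrete with a short sketch. The states and item names here are hypothetical, chosen only to mirror the R₁/R₂ notation:

```python
# The database moves through a sequence of consistent states S0 -> S3.
S = [
    {"X": 1, "Y": 1},  # S0
    {"X": 2, "Y": 2},  # S1  <- transaction reads R1 = X here
    {"X": 3, "Y": 3},  # S2
    {"X": 4, "Y": 4},  # S3  <- transaction reads R2 = Y here
]

# The working set combines a value from S1 with a value from S3.
working_set = {"X": S[1]["X"], "Y": S[3]["Y"]}
print(working_set)  # {'X': 2, 'Y': 4}

# R1 ∪ R2 corresponds to no state the database ever occupied:
assert all(working_set != s for s in S)
```

Every individual read returned committed data, yet the combined view fails to match any historical state, which is exactly the synthetic-state problem described above.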
Read inconsistency is particularly dangerous because each individual read returns valid, committed data. There's no error, no exception, no indication that anything is wrong. The database is functioning correctly—it's the transaction's perception that's fragmented across time.
The ideal solution to read inconsistency is snapshot consistency: every read in a transaction returns values from the same logical point in time—a consistent snapshot of the database.
Snapshot Consistency Definition:
A transaction exhibits snapshot consistency if all its reads observe values from a single consistent database state, regardless of when the reads occur or what concurrent transactions do.
With snapshot consistency, even if a transaction takes minutes to complete and reads the same data multiple times, every read returns the value as it was at the snapshot point—typically the transaction's start time.
Snapshot Properties:
How Databases Implement Snapshots:
Multi-Version Concurrency Control (MVCC):
MVCC maintains multiple versions of each data item, each tagged with a transaction timestamp or version number. When a transaction reads:
-- PostgreSQL example: READ COMMITTED takes a fresh snapshot at each statement;
-- REPEATABLE READ/SERIALIZABLE take one snapshot at the transaction's first query
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT balance FROM accounts WHERE id = 1; -- Returns 1000
-- Another session: UPDATE accounts SET balance = 500 WHERE id = 1; COMMIT;
SELECT balance FROM accounts WHERE id = 1; -- Still returns 1000!
COMMIT;
The second SELECT returns 1000 because the transaction's snapshot was taken before the concurrent update. PostgreSQL's REPEATABLE READ provides true snapshot consistency.
The snapshot timestamp (when the snapshot was taken) is different from when reads occur. A transaction might take its snapshot at T=100, then perform a read at T=150, but see data as of T=100. This temporal decoupling is what provides consistency.
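This temporal decoupling can be sketched as a toy multi-version store. This is a deliberate simplification: real MVCC implementations track transaction visibility (e.g. active-transaction lists), not just commit timestamps:

```python
from bisect import bisect_right

class MVCCStore:
    """Toy MVCC: each item keeps (commit_ts, value) versions in timestamp order."""

    def __init__(self):
        self.versions = {}  # item -> list of (commit_ts, value), sorted by commit_ts

    def write(self, item, value, commit_ts):
        self.versions.setdefault(item, []).append((commit_ts, value))

    def read(self, item, snapshot_ts):
        # Return the newest version committed at or before the snapshot time.
        vs = self.versions[item]
        i = bisect_right(vs, (snapshot_ts, float("inf")))
        return vs[i - 1][1]

store = MVCCStore()
store.write("balance", 1000, commit_ts=50)

snapshot_ts = 100                            # transaction takes its snapshot at T=100
store.write("balance", 500, commit_ts=120)   # concurrent update commits at T=120

# A read performed at T=150 still sees the value as of the snapshot:
print(store.read("balance", snapshot_ts))    # 1000
```

The read at T=150 ignores the version committed at T=120 because version selection is driven by the snapshot timestamp, not by when the read physically executes.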
Read skew is a specific form of read inconsistency where a transaction reads multiple related data items and observes an inconsistent combination due to intervening modifications.
The Classic Example: The Constraint Violation
Consider a database invariant: Account_A.balance + Account_B.balance = 1000 (total funds are conserved).
Initial State:
Account A: 500
Account B: 500
Sum: 1000 ✓
T₁ (Read):                T₂ (Transfer):
────────────────────      ────────────────────
R(A) → 500
                          W(A) ← 600 (add 100)
                          W(B) ← 400 (subtract 100)
                          COMMIT
R(B) → 400
T₁ sees: A=500, B=400
Sum: 900 ≠ 1000 ✗
T₁ reads A before T₂'s update and B after. It observes a state where 100 dollars have vanished—a state that never existed in reality. T₂'s transfer was atomic (A+100 and B-100 happened together), but T₁ saw a fragmented view.
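The interleaving above can be replayed step by step in a short sketch, with an in-memory dictionary standing in for committed rows:

```python
db = {"A": 500, "B": 500}  # committed state; invariant: A + B == 1000

t1_view = {}

# T1: R(A) -> 500 (before the transfer)
t1_view["A"] = db["A"]

# T2: atomic transfer of 100 from B to A, then COMMIT
db["A"] += 100
db["B"] -= 100
assert db["A"] + db["B"] == 1000   # the invariant holds in every committed state

# T1: R(B) -> 400 (after the transfer)
t1_view["B"] = db["B"]

print(t1_view)                       # {'A': 500, 'B': 400}
print(t1_view["A"] + t1_view["B"])   # 900, a state that never existed
```

Note that the invariant is never violated in the database itself; only T₁'s fragmented view breaks it.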
Read Skew in Real Systems:
Read skew creates serious problems in applications that need to enforce cross-item constraints:
| Domain | Invariant | Read Skew Consequence |
|---|---|---|
| Banking | Account balances must reconcile | Audit reports show discrepancies |
| Inventory | Stock count = received - sold | Inventory appears to have losses |
| Bookings | Reserved + Available = Total | System allows overbooking |
| Accounting | Debits = Credits | Books fail to balance |
| HR | Headcount by department sums to total | Reporting inconsistencies |
The Relationship to Unrepeatable Reads:
Read skew is conceptually related to unrepeatable reads: both result from reading at different points in the database's evolution. Read skew can even occur without reading any single item twice; the inconsistency emerges from the combination of reads across different items.
Preventing read skew requires the same mechanisms as preventing unrepeatable reads: snapshot isolation or serializable transactions. If related data items must be consistent, they need to be read from the same snapshot.
Aggregate queries (SUM, COUNT, AVG, etc.) are particularly susceptible to read inconsistency because they implicitly read many rows. If concurrent transactions modify the underlying data between when the query starts and completes, the aggregate may reflect a mix of old and new values.
The Aggregate Inconsistency Problem:
-- Transaction T₁: Generating a balance report
BEGIN;
-- Query 1: Total deposits
SELECT SUM(amount) FROM transactions WHERE type = 'DEPOSIT';
-- Returns: 100,000
-- Meanwhile, T₂: INSERT INTO transactions VALUES (..., 'DEPOSIT', 5000); COMMIT;
-- Query 2: Total withdrawals
SELECT SUM(amount) FROM transactions WHERE type = 'WITHDRAWAL';
-- Returns: 45,000
-- Query 3: Net calculation (done in app, or could be SQL)
SELECT SUM(amount) FROM transactions; -- assumes withdrawals are stored as negative amounts
-- If T₂'s deposit is included: 60,000 (not 55,000!)
COMMIT;
The report shows deposits of 100,000 and withdrawals of 45,000, yet a net of 60,000 rather than the expected 55,000. The totals don't reconcile because they were computed at different points in time.
-- Demonstration of aggregate inconsistency
-- Run in two separate sessions to see the effect

-- Session 1: Create test table
CREATE TABLE IF NOT EXISTS inventory (
    product_id INT PRIMARY KEY,
    category   VARCHAR(50),
    quantity   INT
);

-- Insert test data
INSERT INTO inventory VALUES (1, 'Electronics', 100);
INSERT INTO inventory VALUES (2, 'Electronics', 150);
INSERT INTO inventory VALUES (3, 'Clothing', 200);
INSERT INTO inventory VALUES (4, 'Clothing', 75);

-- ============================================
-- Session 1: Read Committed (allows inconsistency)
-- ============================================
BEGIN;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- First aggregate: Electronics total
SELECT SUM(quantity) AS electronics_total FROM inventory WHERE category = 'Electronics';
-- Returns: 250

-- [Session 2: UPDATE inventory SET quantity = 200 WHERE product_id = 1; COMMIT;]
-- Electronics item 1: 100 -> 200

-- Second aggregate: Clothing total
SELECT SUM(quantity) AS clothing_total FROM inventory WHERE category = 'Clothing';
-- Returns: 275

-- Third aggregate: Overall total
SELECT SUM(quantity) AS grand_total FROM inventory;
-- Returns: 625 (includes the update!)

-- But 250 + 275 = 525 ≠ 625!
-- Inconsistency: electronics read old data, grand total read new data

COMMIT;

-- ============================================
-- Alternative: REPEATABLE READ (consistent)
-- ============================================
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

SELECT SUM(quantity) AS electronics_total FROM inventory WHERE category = 'Electronics';
-- Returns: 250

-- [Session 2: UPDATE inventory SET quantity = 200 WHERE product_id = 1; COMMIT;]
-- Change is invisible to this transaction

SELECT SUM(quantity) AS clothing_total FROM inventory WHERE category = 'Clothing';
-- Returns: 275

SELECT SUM(quantity) AS grand_total FROM inventory;
-- Returns: 525 (consistent with sub-totals!)

COMMIT;

Why Aggregates Are Especially Vulnerable:
Duration: Aggregate queries over large tables take time, increasing the window for concurrent modifications
Implicit scope: The query scans many rows—each is a potential inconsistency point
No explicit revisit: Unlike explicit repeated reads, aggregate inconsistency doesn't involve reading the same row twice—just different rows at different times
Invisible composition: Multiple partial aggregates combined in the application are particularly risky if computed in separate queries
Best Practice: Atomic Reporting:
-- Good: Single query, atomic read
SELECT
SUM(CASE WHEN type = 'DEPOSIT' THEN amount ELSE 0 END) AS deposits,
SUM(CASE WHEN type = 'WITHDRAWAL' THEN amount ELSE 0 END) AS withdrawals,
SUM(amount) AS net
FROM transactions;
-- This single query reads all rows in one atomic operation,
-- eliminating the window for inconsistency
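The single-query pattern is easy to try with SQLite from Python. The table and figures below are illustrative, with withdrawals stored as negative amounts so the net is a plain SUM:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (type TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [("DEPOSIT", 60000), ("DEPOSIT", 40000), ("WITHDRAWAL", -45000)],
)

# One statement reads every row exactly once, so all three aggregates are
# computed from the same set of rows: no window for skew between them.
deposits, withdrawals, net = conn.execute("""
    SELECT
        SUM(CASE WHEN type = 'DEPOSIT'    THEN amount ELSE 0 END),
        SUM(CASE WHEN type = 'WITHDRAWAL' THEN amount ELSE 0 END),
        SUM(amount)
    FROM transactions
""").fetchone()

print(deposits, withdrawals, net)  # 100000 -45000 55000
assert deposits + withdrawals == net
```

Because the sub-totals and the grand total come from a single scan, they always reconcile, regardless of what concurrent sessions commit before or after the statement.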
Financial and compliance reports are high-stakes areas where aggregate inconsistency can have serious consequences—audit failures, regulatory issues, or incorrect business decisions. Always use snapshot isolation or single-query approaches for critical reporting.
Detecting read inconsistency in running systems is challenging because each individual read returns correct data. Detection requires comparing reads or checking invariants—actions that applications may not naturally perform.
Detection Strategies:
1. Invariant Checking: If you know that certain values should maintain relationships, verify those relationships explicitly:
-- Check that sub-totals equal grand total
DO $$
DECLARE
electronics_sum INT;
clothing_sum INT;
grand_total INT;
BEGIN
SELECT SUM(quantity) INTO electronics_sum FROM inventory WHERE category = 'Electronics';
SELECT SUM(quantity) INTO clothing_sum FROM inventory WHERE category = 'Clothing';
SELECT SUM(quantity) INTO grand_total FROM inventory;
IF electronics_sum + clothing_sum != grand_total THEN
RAISE EXCEPTION 'Inconsistency detected: % + % != %',
electronics_sum, clothing_sum, grand_total;
END IF;
END $$;
2. Version Comparison: Some systems track row versions or timestamps that can be compared:
-- PostgreSQL: Check if version changed
SELECT xmin, balance FROM accounts WHERE id = 1; -- xmin = 1234
-- ... later ...
SELECT xmin, balance FROM accounts WHERE id = 1; -- xmin = 1236?
-- Different xmin = row was modified
from typing import Any, Callable, Dict, List, Optional
from dataclasses import dataclass, field
from collections import defaultdict
import hashlib


@dataclass
class ReadRecord:
    """Records information about a read operation."""
    item: str
    value: Any
    read_number: int  # 1st, 2nd, etc. read of this item
    timestamp: int


@dataclass
class InconsistencyDetector:
    """
    Detects read inconsistency patterns in transaction execution.

    Strategies:
    1. Track repeated reads of same item (unrepeatable reads)
    2. Track reads of related items (potential read skew)
    3. Verify invariants when specified
    """
    reads: Dict[str, List[ReadRecord]] = field(default_factory=lambda: defaultdict(list))
    invariants: List[dict] = field(default_factory=list)
    current_values: Dict[str, Any] = field(default_factory=dict)
    timestamp: int = 0

    def record_read(self, item: str, value: Any) -> Optional[str]:
        """
        Record a read and check for unrepeatable read.
        Returns warning message if inconsistency detected.
        """
        self.timestamp += 1
        item_reads = self.reads[item]
        read_number = len(item_reads) + 1
        record = ReadRecord(
            item=item,
            value=value,
            read_number=read_number,
            timestamp=self.timestamp
        )
        item_reads.append(record)
        self.current_values[item] = value

        # Check for unrepeatable read
        if read_number > 1:
            previous = item_reads[-2]
            if previous.value != value:
                return (f"UNREPEATABLE READ DETECTED: "
                        f"Item '{item}' changed from {previous.value} "
                        f"(read #{previous.read_number}) to {value} "
                        f"(read #{read_number})")
        return None

    def add_invariant(self, name: str,
                      check_fn: Callable[[Dict[str, Any]], bool],
                      involved_items: List[str]) -> None:
        """
        Add an invariant that should hold across related items.

        Args:
            name: Human-readable invariant name
            check_fn: Function(values_dict) -> bool
            involved_items: List of item names involved in invariant
        """
        self.invariants.append({
            'name': name,
            'check': check_fn,
            'items': set(involved_items)
        })

    def check_invariants(self) -> List[str]:
        """
        Check all invariants against current read values.
        Returns list of violated invariant messages.
        """
        violations = []
        for inv in self.invariants:
            # Check if we have values for all involved items
            if not all(item in self.current_values for item in inv['items']):
                continue  # Not enough data yet
            try:
                if not inv['check'](self.current_values):
                    violations.append(
                        f"INVARIANT VIOLATION: {inv['name']} failed with "
                        f"values {dict((k, self.current_values[k]) for k in inv['items'])}"
                    )
            except Exception as e:
                violations.append(f"INVARIANT ERROR: {inv['name']} raised {e}")
        return violations

    def generate_consistency_hash(self) -> str:
        """
        Generate a hash of all current values.
        Can be used to compare transaction snapshots.
        """
        sorted_items = sorted(self.current_values.items())
        content = str(sorted_items).encode()
        return hashlib.md5(content).hexdigest()[:12]

    def get_read_summary(self) -> Dict:
        """Get summary of all reads performed."""
        return {
            'items_read': len(self.reads),
            'total_reads': sum(len(v) for v in self.reads.values()),
            'items_read_multiple': sum(1 for v in self.reads.values() if len(v) > 1),
        }


def demonstrate_detection():
    """Demonstrate inconsistency detection."""
    print("=" * 60)
    print("Scenario: Bank Account Consistency Check")
    print("=" * 60)

    detector = InconsistencyDetector()

    # Define invariant: accounts sum to 1000
    detector.add_invariant(
        name="Total Balance Conservation",
        check_fn=lambda v: v.get('account_a', 0) + v.get('account_b', 0) == 1000,
        involved_items=['account_a', 'account_b']
    )

    # Simulate reads that observe inconsistent state
    print("\nTransaction reads:")

    # Read account A (old value)
    warning = detector.record_read('account_a', 500)
    print("  Read account_a = 500")
    if warning:
        print(f"  ⚠️ {warning}")

    # Simulate concurrent transfer: A goes to 600, B goes to 400
    # Transaction reads B (new value)
    warning = detector.record_read('account_b', 400)
    print("  Read account_b = 400")
    if warning:
        print(f"  ⚠️ {warning}")

    # Check invariants
    violations = detector.check_invariants()
    print("\nInvariant check:")
    if violations:
        for v in violations:
            print(f"  ❌ {v}")
        print("  Transaction observed an inconsistent state!")
    else:
        print("  ✓ All invariants satisfied")

    print("\n" + "=" * 60)
    print("Scenario: Unrepeatable Read Detection")
    print("=" * 60)

    detector2 = InconsistencyDetector()

    print("\nTransaction reads:")
    warning = detector2.record_read('balance', 1000)
    print("  Read balance = 1000")
    if warning:
        print(f"  ⚠️ {warning}")

    # Simulate concurrent modification + commit
    # Transaction reads again
    warning = detector2.record_read('balance', 800)
    print("  Read balance = 800")
    if warning:
        print(f"  ⚠️ {warning}")

    summary = detector2.get_read_summary()
    print(f"\nRead Summary: {summary}")


if __name__ == "__main__":
    demonstrate_detection()

Logging and Monitoring:
Production systems should log enough information to reconstruct whether inconsistency occurred: transaction start and commit times, the isolation level in effect, and the keys (and, where available, row versions) of the data each transaction read.
With this information, post-hoc analysis can identify transactions that may have experienced inconsistency.
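As a sketch of what such post-hoc analysis might look like, the following flags transactions whose read window overlaps a commit to an item they read. The event shapes and field names are hypothetical, standing in for whatever your logging pipeline records:

```python
from dataclasses import dataclass

@dataclass
class ReadEvent:
    txn: str
    item: str
    at: int  # logical or wall-clock time of the read

@dataclass
class CommitEvent:
    item: str
    at: int  # commit time of a write to this item

def possibly_inconsistent(reads, commits):
    """Return txn ids whose reads straddle a commit to an item they read."""
    by_txn = {}
    for r in reads:
        by_txn.setdefault(r.txn, []).append(r)

    flagged = set()
    for txn, rs in by_txn.items():
        start, end = min(r.at for r in rs), max(r.at for r in rs)
        items = {r.item for r in rs}
        # A commit strictly inside the read window may have split the view.
        if any(c.item in items and start < c.at < end for c in commits):
            flagged.add(txn)
    return flagged

reads = [ReadEvent("T1", "account_a", 10), ReadEvent("T1", "account_b", 30)]
commits = [CommitEvent("account_a", 20), CommitEvent("account_b", 20)]
print(possibly_inconsistent(reads, commits))  # {'T1'}
```

This is necessarily conservative: a flagged transaction only *may* have seen an inconsistent view, which is why such analysis suits auditing rather than real-time correction.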
In practice, preventing inconsistency (through proper isolation levels) is far easier than detecting it after the fact. Detection is primarily useful for auditing, testing, and understanding system behavior—not for fixing problems in real-time.
Read inconsistency manifests in applications in various ways, often without clear error messages. Understanding these manifestations helps developers recognize when inconsistency might be the root cause of bugs.
Common Manifestations:
Debugging Pattern:
When investigating potential read inconsistency: confirm the transaction's isolation level, map which values depend on each other, trace the order in which the transaction reads them, and identify which concurrent writers could interleave between those reads.
Example Investigation:
Bug report: "Inventory occasionally shows negative stock after sale"
1. Isolation level: READ COMMITTED
2. Dependencies: available_stock = total_stock - reserved
3. Read pattern:
- Read total_stock (100)
- Process sale logic...
- Read reserved (50)
- Calculate available = 100 - 50 = 50
- But total_stock was actually updated to 40 mid-transaction!
4. Concurrency: Multiple sales processing simultaneously
5. Fix: Use REPEATABLE READ or read all values in single query
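The race in this bug report can be reproduced deterministically, without threads, by interleaving the steps by hand (values taken from the investigation above):

```python
db = {"total_stock": 100, "reserved": 50}  # committed state

# T1 (READ COMMITTED): reads total_stock first
total_stock = db["total_stock"]   # 100

# ...sale-processing logic runs; meanwhile a concurrent transaction
# commits a stock adjustment:
db["total_stock"] = 40

# T1: reads reserved later, from the *new* committed state
reserved = db["reserved"]         # 50

available = total_stock - reserved
print(available)                  # 50, but the true availability is 40 - 50 = -10
```

T₁'s calculation mixes a pre-adjustment total with a post-adjustment reservation count, which is why the system later observes negative stock. Under a transaction-level snapshot, both reads would have come from the same state and the calculation would be internally consistent.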
Read inconsistency often causes 'heisenbugs'—bugs that disappear when you try to observe them. Adding logging can change timing enough to prevent the race. Running in a debugger serializes operations. The bug only manifests under specific, hard-to-reproduce concurrent conditions.
Different database systems and configurations provide different consistency guarantees. Understanding what your system actually provides is essential for writing correct applications.
PostgreSQL Consistency Levels:
| Level | Snapshot Timing | Consistency Guarantee |
|---|---|---|
| READ UNCOMMITTED | Same as READ COMMITTED | Per-statement snapshot (no dirty reads) |
| READ COMMITTED | Start of each statement | Statement-level consistency only |
| REPEATABLE READ | First query of the transaction | Transaction-level snapshot consistency |
| SERIALIZABLE | First query + SSI conflict detection | True serializability |
MySQL (InnoDB) Consistency Levels:
| Level | Locking Behavior | Read Consistency |
|---|---|---|
| READ UNCOMMITTED | No locks for reads | May see uncommitted changes |
| READ COMMITTED | Locks released after each read | Per-statement view |
| REPEATABLE READ | Locks held until transaction end | Transaction-level snapshot |
| SERIALIZABLE | Gap locks + next-key locks | Full serialization |
Oracle and SQL Server:
Both provide snapshot isolation explicitly:
-- Oracle
SET TRANSACTION READ ONLY; -- Snapshot at transaction start
-- SQL Server
SET TRANSACTION ISOLATION LEVEL SNAPSHOT; -- Requires DB configuration
Key Implementation Differences:
Default isolation levels vary by database and even by connection library. Always explicitly set the isolation level for transactions that require consistency guarantees. 'It worked in testing' doesn't mean it will work under production concurrency.
Inconsistent reads are the consequence of unrepeatable reads and related phenomena, creating transactions that observe synthetic database states: reads that are individually valid but collectively impossible.
What's next:
You now understand how unrepeatable reads lead to broader read inconsistency, the concept of snapshot consistency, and how to recognize consistency issues in applications. Next, we'll explore concrete real-world scenarios where these anomalies cause problems, helping you recognize and anticipate them in your own systems.