Having understood what lost updates are, seen them manifest in real scenarios, and examined the depth of corruption they cause, we arrive at a natural question: how do we prevent them?
Prevention of lost updates is not optional for any system that modifies shared data concurrently. The silent, cumulative, and trust-eroding nature of the problem means that "waiting until it becomes an issue" is not a viable strategy. By the time lost updates manifest visibly, significant damage has already occurred.
This page explores why prevention is essential, the layers at which prevention can occur, and the primary mechanisms used in production systems. We'll develop a framework for selecting appropriate prevention strategies based on specific application requirements.
Prevention and detection are complementary but not equivalent. Detection identifies corruption after it occurs—useful for forensics but the damage is done. Prevention stops corruption before it happens. For data integrity, prevention is always the primary strategy; detection is a safety net.
Some engineering problems allow "monitor and react" approaches—you observe the system, and when issues arise, you address them. Lost updates do not permit this approach for several fundamental reasons:
1. Corruption is Silent
Lost updates produce no errors, warnings, or alerts. You cannot "react" to something you cannot detect. By the time you realize data is wrong (through audits, customer complaints, or invariant violations), the corruption has already propagated.
2. Damage is Cumulative
Each lost update adds to the total corruption. A system running for months without prevention accumulates drift that becomes increasingly difficult to correct. There is no steady state where "some" lost updates are acceptable.
3. Correction is Often Impossible
Once a counter shows 9,800 instead of 10,000, how do you determine which 200 increments were lost? Which specific customers' transactions were affected? Often, the information needed to correct corruption was never recorded because the system "didn't know" updates were being lost.
4. Trust Damage is Irreversible
When users discover that a system produces incorrect data, their trust is damaged. Even after fixing the root cause, the memory of "that time the numbers were wrong" persists. Trust is easy to lose and hard to rebuild.
In reliability engineering, the axiom is: 'If it can fail silently, it will fail silently at the worst possible time.' Lost updates are the textbook example. Prevention is not gold-plating or over-engineering—it is the baseline for professional database application development.
Lost update prevention can be implemented at multiple layers of the system stack. Each layer offers different tradeoffs between control, performance, and complexity.
| Layer | Mechanism | Advantages | Disadvantages |
|---|---|---|---|
| Database Engine | Isolation levels, internal locking | Transparent to application; well-tested | May not prevent all cases; can reduce concurrency |
| SQL Statement | Atomic operations, locking hints | Precise control; clear semantics | Requires developer discipline; database-specific |
| Transaction | Explicit locking, serializable isolation | Guarantees correctness; composable | Can cause deadlocks; reduces throughput |
| Application | Optimistic locking, version checks | Full control; works across databases | Complex to implement correctly; easy to bypass |
| Architecture | Serializing queues, event sourcing | Eliminates concurrency at source | Major architectural change; may not suit all workloads |
Layer Selection Principles:
Prefer Lower Layers: When possible, rely on database-level prevention. The database engine is tested, optimized, and handles edge cases you might not anticipate.
Move Up When Necessary: Application-level prevention becomes necessary when the read-modify-write cycle spans multiple requests or user interactions (so no single database transaction can cover it), when the data lives in multiple databases or services, or when the checks you need cannot be expressed inside the database.
Combine Layers for Defense in Depth: Production systems often use multiple prevention layers—database serializable isolation with application-level version checking provides redundant protection.
Pessimistic locking assumes that conflicts are likely and prevents them by acquiring locks before accessing data. The philosophy is: "Assume the worst—lock early and hold until done."
How It Prevents Lost Updates:
By acquiring an exclusive lock on a row before reading it, a transaction ensures that no other transaction can read or modify that row until the lock is released. This serializes access and eliminates the read-modify-write race condition.
```sql
-- Pessimistic locking example: Bank withdrawal
-- Using SELECT ... FOR UPDATE to acquire an exclusive lock

BEGIN TRANSACTION;

-- Acquire exclusive lock on the row while reading
-- Other transactions attempting to SELECT FOR UPDATE, UPDATE, or DELETE
-- will block until this transaction commits or rolls back
SELECT balance FROM accounts WHERE account_id = 'ACC-7821' FOR UPDATE;

-- Now we have exclusive access - no other transaction can interfere
-- Read balance: $500

-- Check sufficient funds
-- (Application logic - if balance < withdrawal_amount, abort)

-- Perform the update safely
UPDATE accounts SET balance = balance - 100 WHERE account_id = 'ACC-7821';

COMMIT;
-- Lock is released on commit
-- Now other transactions can proceed
```

Timeline with Pessimistic Locking:
| Time | T₁ (Withdrawal $100) | T₂ (Withdrawal $400) | Locks Held | Balance |
|---|---|---|---|---|
| t₀ | BEGIN | BEGIN | None | $500 |
| t₁ | SELECT FOR UPDATE → $500 | — | T₁ holds X-lock on row | $500 |
| t₂ | — | SELECT FOR UPDATE → BLOCKED | T₁ holds X-lock | $500 |
| t₃ | UPDATE balance = 400 | WAITING... | T₁ holds X-lock | $400 |
| t₄ | COMMIT (releases lock) | WAITING... | None | $400 |
| t₅ | — | UNBLOCKED: SELECT → $400 | T₂ holds X-lock | $400 |
| t₆ | — | UPDATE balance = 0 | T₂ holds X-lock | $0 |
| t₇ | — | COMMIT | None | $0 ✓ |
T₂ was blocked until T₁ completed. When T₂ finally read the balance, it saw the correct post-T₁ value ($400) and computed the correct final value ($0). No update was lost.
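The same pattern is typically driven from application code, where the business check sits between the locking read and the write. The sketch below is a minimal illustration, not a specific driver's API: it assumes the same abstract `db` transaction helper and `Result` type used in the optimistic-locking example later on this page.

```python
from decimal import Decimal

def withdraw_with_pessimistic_locking(account_id: str, amount: Decimal) -> Result:
    """
    Withdrawal using SELECT ... FOR UPDATE.
    Sketch only: `db` and `Result` are assumed application helpers.
    """
    db.begin()
    try:
        # Acquire the exclusive row lock while reading.
        # Concurrent FOR UPDATE / UPDATE / DELETE on this row block here.
        account = db.query("""
            SELECT balance FROM accounts
            WHERE account_id = %s
            FOR UPDATE
        """, (account_id,))

        if account['balance'] < amount:
            db.rollback()
            return Result.failure("Insufficient funds")

        # Safe: no other transaction can touch the row until we commit.
        db.execute("""
            UPDATE accounts SET balance = balance - %s
            WHERE account_id = %s
        """, (amount, account_id))

        db.commit()  # Lock released here
        return Result.success(f"Withdrawn {amount}")
    except Exception:
        db.rollback()
        raise
```

Contending transactions simply queue on the row lock, so no retry logic is needed; the cost is reduced concurrency while the lock is held.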
Optimistic locking (also called optimistic concurrency control) assumes that conflicts are rare and checks for conflicts only at write time. The philosophy is: "Assume the best—check before committing."
How It Works: each row carries a version column (or timestamp). A transaction reads the value together with its version, computes the new value without holding any lock, then writes with a WHERE clause that requires the version to be unchanged and increments it on success. If another transaction slipped in between, the conditional UPDATE matches zero rows and the conflict is detected.
```sql
-- Optimistic locking using version number
-- Schema includes a version column
CREATE TABLE accounts (
    account_id VARCHAR(50) PRIMARY KEY,
    balance DECIMAL(15, 2) NOT NULL,
    version INT NOT NULL DEFAULT 1,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Transaction T1: Withdraw $100
BEGIN TRANSACTION;

-- Read balance AND version
SELECT balance, version FROM accounts WHERE account_id = 'ACC-7821';
-- Returns: balance = $500, version = 1

-- Application computes new balance (no lock held)
-- new_balance = 500 - 100 = 400

-- Attempt update with version check
UPDATE accounts
SET balance = 400,
    version = version + 1,
    updated_at = NOW()
WHERE account_id = 'ACC-7821'
  AND version = 1;  -- Only succeed if version unchanged

-- Check rows affected
-- If 1 row affected: Success! Version was still 1.
-- If 0 rows affected: CONFLICT! Someone else modified the row.

COMMIT;
```

Handling Optimistic Lock Failures:
When the UPDATE affects 0 rows, the application must handle the conflict:
```python
import time
from decimal import Decimal

# Assumes a module-level `db` transaction helper and a `Result` type
# defined elsewhere in the application.

def withdraw_with_optimistic_locking(account_id: str, amount: Decimal) -> Result:
    """
    Implements withdrawal with optimistic locking and retry logic
    """
    MAX_RETRIES = 3

    for attempt in range(MAX_RETRIES):
        try:
            # Start transaction
            db.begin()

            # Read current state including version
            account = db.query("""
                SELECT balance, version
                FROM accounts
                WHERE account_id = %s
            """, (account_id,))

            # Check business rule
            if account['balance'] < amount:
                db.rollback()
                return Result.failure("Insufficient funds")

            new_balance = account['balance'] - amount
            old_version = account['version']

            # Attempt conditional update
            rows_updated = db.execute("""
                UPDATE accounts
                SET balance = %s, version = version + 1
                WHERE account_id = %s AND version = %s
            """, (new_balance, account_id, old_version))

            if rows_updated == 1:
                # Success - no conflict
                db.commit()
                return Result.success(f"Withdrawn {amount}, new balance: {new_balance}")
            else:
                # Conflict detected - version changed
                db.rollback()
                # On conflict, retry with updated data
                # Exponential backoff to reduce collision probability
                time.sleep(0.01 * (2 ** attempt))
                continue

        except Exception:
            db.rollback()
            raise

    # Max retries exceeded
    return Result.failure("Transaction failed after maximum retries - high contention")
```

Use optimistic locking when: workloads are read-heavy, conflicting writes are improbable, the application is stateless, or the data spans distributed/cross-database scenarios. Use pessimistic locking when: write contention on the same records is high, transactions involve complex multi-statement logic, or conflicts are common enough that retry storms would be expensive.
The simplest and most efficient prevention mechanism is eliminating the read-modify-write pattern entirely by using atomic operations. Instead of reading a value, computing in the application, and writing back, perform the entire operation in a single database statement.
The Key Insight:
The lost update vulnerability exists because there's a gap between reading and writing. Atomic operations close this gap by combining read, modify, and write into a single indivisible unit.
```sql
-- VULNERABLE: Read-Modify-Write pattern
-- Step 1: Read
SELECT balance FROM accounts WHERE account_id = 'ACC-7821';
-- Gap: another transaction can modify balance here!
-- Step 2: Write computed value
UPDATE accounts SET balance = 400 WHERE account_id = 'ACC-7821';

-- SAFE: Atomic operation
-- Single statement combines read + modify + write
UPDATE accounts SET balance = balance - 100 WHERE account_id = 'ACC-7821';

-- The database engine ensures atomicity:
-- 1. Acquires lock on row
-- 2. Reads current balance
-- 3. Computes balance - 100
-- 4. Writes new value
-- 5. Releases lock
-- All in one atomic operation - no gap for interference
```

Common Atomic Operation Patterns:
```sql
-- Counter increment
-- Instead of: SELECT count; UPDATE count = count + 1
UPDATE page_views SET count = count + 1 WHERE page_id = 'P-123';

-- Balance modification
-- Instead of: SELECT balance; check; UPDATE balance = computed
UPDATE accounts SET balance = balance - 100
WHERE account_id = 'ACC-7821' AND balance >= 100;
-- Returns 1 row affected if successful, 0 if insufficient funds

-- Inventory decrement with constraint
UPDATE inventory SET quantity = quantity - 1
WHERE sku = 'WIDGET-X' AND quantity >= 1
RETURNING quantity;  -- PostgreSQL: return new value

-- Conditional state change (booking)
UPDATE rooms SET is_available = FALSE, booked_by = 'user-123'
WHERE room_id = 'CONF-A'
  AND time_slot = '2024-01-15 14:00'
  AND is_available = TRUE;
-- Returns 1 if booking succeeded, 0 if room was already booked

-- Append to JSON array (PostgreSQL)
UPDATE user_logs
SET events = events || '{"action": "login", "time": "..."}'::jsonb
WHERE user_id = 'U-456';

-- Increment with atomic fetch (PostgreSQL sequence-like)
UPDATE sequences SET next_val = next_val + 1
WHERE seq_name = 'order_number'
RETURNING next_val - 1 AS current_val;
```

Atomic operations are the gold standard for preventing lost updates. They're simpler, faster (no extra round-trips), and eliminate the problem at its source. Whenever operation logic can be expressed in a single SQL statement, prefer atomic operations over read-modify-write patterns.
Limitations of Atomic Operations:
Not all logic can be expressed atomically: operations that must call an external service between the read and the write, updates spanning multiple rows or tables whose new values depend on one another, and business rules that require application-side computation or user interaction all fall outside a single SQL statement.
In these cases, pessimistic or optimistic locking becomes necessary.
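As a concrete illustration, consider a withdrawal that must pass an external fraud check between reading the balance and writing it back. This is a sketch only: `fraud_service` is a hypothetical client, and `db` and `Result` are the same assumed helpers as in the other Python examples on this page.

```python
from decimal import Decimal

def withdraw_with_fraud_check(account_id: str, amount: Decimal) -> Result:
    """
    The fraud check is an external call, so the whole operation cannot be
    folded into one atomic UPDATE. A pessimistic lock keeps the row stable
    across the gap. `db`, `Result`, and `fraud_service` are assumed helpers.
    """
    db.begin()
    try:
        account = db.query(
            "SELECT balance FROM accounts WHERE account_id = %s FOR UPDATE",
            (account_id,))

        if account['balance'] < amount:
            db.rollback()
            return Result.failure("Insufficient funds")

        # External call between read and write: impossible in a single statement.
        # Note: holding a row lock across a network call hurts throughput;
        # optimistic locking may be preferable if this check is slow.
        if not fraud_service.approve(account_id, amount):
            db.rollback()
            return Result.failure("Blocked by fraud check")

        db.execute(
            "UPDATE accounts SET balance = balance - %s WHERE account_id = %s",
            (amount, account_id))
        db.commit()
        return Result.success(f"Withdrawn {amount}")
    except Exception:
        db.rollback()
        raise
```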
SQL defines four standard isolation levels, each providing different guarantees. However, only SERIALIZABLE fully prevents lost updates through the database engine alone. Understanding why lower isolation levels don't suffice is critical for system design.
| Isolation Level | Prevents Lost Updates? | How/Why |
|---|---|---|
| READ UNCOMMITTED | No | Allows dirty reads. No protection against R→R→W→W pattern. |
| READ COMMITTED | No | Prevents dirty reads but allows concurrent writes on same row. The R→R→W→W pattern is fully possible. |
| REPEATABLE READ | Depends on implementation | SQL standard: No. PostgreSQL: Yes (via snapshot isolation). MySQL InnoDB: No. |
| SERIALIZABLE | Yes | Guarantees equivalent to serial execution. Any interleaving that would cause lost updates is detected and one transaction aborted. |
REPEATABLE READ in the SQL standard does NOT prevent lost updates. It only guarantees that if you read a row twice in the same transaction, you'll see the same value. Two transactions can still both read the same row and write conflicting updates. PostgreSQL extends REPEATABLE READ to use snapshot isolation, which does detect conflicts—but this is implementation-specific, not standard-mandated.
Why Developers Get Confused:
The isolation level documentation often says REPEATABLE READ prevents "non-repeatable reads," and developers interpret this as preventing all concurrent modification issues. The confusion arises because:
Non-repeatable read (anomaly name): T₁ reads X, T₂ modifies X, T₁ reads X again and sees a different value.
Lost update (different anomaly): T₁ reads X, T₂ reads X, both modify and write back, one write is lost.
These are different problems! REPEATABLE READ addresses #1 by locking read rows against modification (or using snapshots). It does not address #2 because both transactions reading concurrently is allowed—the problem is on the write side, not the read side.
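To see the difference concretely, the following sketch opens two connections and replays the read-read-write-write interleaving. It assumes a MySQL/InnoDB test database reachable with the PyMySQL driver and an `accounts` row starting at $500; the connection parameters are placeholders. Under InnoDB's REPEATABLE READ the script runs to completion and the final balance is $100 instead of $0, so T₁'s withdrawal is silently lost; against PostgreSQL's REPEATABLE READ the second UPDATE would instead fail with a serialization error.

```python
# Sketch: reproduce a lost update under REPEATABLE READ on MySQL/InnoDB.
# Assumes PyMySQL and a local test database with accounts.ACC-7821 at $500;
# host/user/password/database values are placeholders.
import pymysql

def connect():
    return pymysql.connect(host="localhost", user="test", password="test",
                           database="bank", autocommit=False)

t1, t2 = connect(), connect()

for conn in (t1, t2):
    with conn.cursor() as cur:
        # REPEATABLE READ is InnoDB's default; set it explicitly for clarity.
        cur.execute("SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ")

read_sql = "SELECT balance FROM accounts WHERE account_id = 'ACC-7821'"
write_sql = "UPDATE accounts SET balance = %s WHERE account_id = 'ACC-7821'"

with t1.cursor() as c1, t2.cursor() as c2:
    c1.execute(read_sql); (b1,) = c1.fetchone()   # T1 reads $500
    c2.execute(read_sql); (b2,) = c2.fetchone()   # T2 also reads $500

    c1.execute(write_sql, (b1 - 100,))            # T1 writes $400
    t1.commit()

    c2.execute(write_sql, (b2 - 400,))            # T2 writes $100 from its stale read,
    t2.commit()                                   # silently overwriting T1's update

with t1.cursor() as cur:
    cur.execute(read_sql)
    print("final balance:", cur.fetchone()[0])    # $100, not $0: T1's update is lost
t1.commit()
```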
The SERIALIZABLE Solution:
SERIALIZABLE isolation truly prevents lost updates by ensuring that the final result is equivalent to some serial execution order. If two transactions would produce a lost update, the database detects the conflict (via predicate locking, snapshot isolation with conflict detection, or other mechanisms) and aborts one transaction.
```sql
-- Using SERIALIZABLE to prevent lost updates

-- Transaction T1
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

SELECT balance FROM accounts WHERE account_id = 'ACC-7821';
-- Returns $500

-- Meanwhile, T2 also reads $500 and attempts to update

UPDATE accounts SET balance = 400 WHERE account_id = 'ACC-7821';
COMMIT;
-- T1 commits successfully

-- Transaction T2 (concurrent)
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

SELECT balance FROM accounts WHERE account_id = 'ACC-7821';
-- Returns $500 (same snapshot or value)

UPDATE accounts SET balance = 100 WHERE account_id = 'ACC-7821';
-- ERROR: could not serialize access due to concurrent update
-- T1 committed a conflicting change, so T2's update is rejected

ROLLBACK;
-- T2 must retry

-- After retry, T2 reads $400 (T1's committed value)
-- Correctly computes $0
```

Given multiple prevention mechanisms, how do you choose the right one for your application? The decision depends on several factors: contention patterns, latency requirements, transaction complexity, and architectural constraints.
| Scenario | Recommended Strategy | Rationale |
|---|---|---|
| Simple counter increment | Atomic UPDATE | Single statement, no gap possible |
| Inventory decrement with check | Atomic UPDATE with WHERE condition | Combine check and decrement atomically |
| Multi-field update, low contention | Optimistic locking (version column) | No blocking, retries are rare |
| Bank transfer (debit + credit) | Pessimistic locking (SELECT FOR UPDATE) | Complex transaction, must not fail halfway |
| Booking system with high demand | Pessimistic locking or SERIALIZABLE | High contention on popular slots |
| Distributed microservices | Application-level optimistic locking | No single database to control; use version tokens |
| Legacy system, can't modify schema | SERIALIZABLE isolation level | Prevention without schema changes |
In critical systems, combine strategies: use atomic operations where possible, add version columns for application-level verification, and set database isolation to SERIALIZABLE for the most sensitive transactions. Defense in depth ensures that if one layer fails to prevent an issue, another catches it.
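A practical building block for the SERIALIZABLE-based strategies above is a retry wrapper: serialization failures are expected, transient outcomes, so the application should rerun the whole transaction rather than surface the error. The sketch below assumes the same abstract `db` helper as the earlier Python examples, with a hypothetical `begin(isolation_level=...)` option and a hypothetical `is_serialization_failure` predicate (in PostgreSQL this corresponds to SQLSTATE 40001).

```python
import time

def run_serializable(transaction_fn, max_retries: int = 5):
    """
    Run `transaction_fn` under SERIALIZABLE isolation, retrying on
    serialization failures. `db` and `is_serialization_failure` are
    assumed application helpers; SQLSTATE 40001 is the standard
    serialization_failure code.
    """
    for attempt in range(max_retries):
        db.begin(isolation_level="SERIALIZABLE")
        try:
            result = transaction_fn()
            db.commit()
            return result
        except Exception as exc:
            db.rollback()
            if not is_serialization_failure(exc) or attempt == max_retries - 1:
                raise
            # Back off briefly so the competing transaction can finish.
            time.sleep(0.01 * (2 ** attempt))

# Usage sketch: the withdrawal either completes or is retried as a whole.
# result = run_serializable(lambda: withdraw(account_id="ACC-7821", amount=100))
```

Because the whole transaction function is re-executed on each retry, it must be side-effect free outside the database, the same constraint that applies to optimistic-locking retries.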
We've established that lost update prevention is not optional—it's a fundamental requirement for any system with concurrent data modification. We've explored the primary prevention mechanisms and how to select among them.
What's Next:
With prevention strategies understood, the final piece is detection—how to identify when lost updates have occurred despite prevention efforts, for forensics and system validation. The next page explores detection mechanisms.
You now understand why lost update prevention is essential and have a framework for selecting appropriate prevention strategies. This knowledge enables you to design concurrent systems that maintain data integrity from the start.