Thomas Write Rule - Learning Module

Loading content...

0/252

Outdated Writes

When Writes Become Irrelevant

In the previous page, we introduced the Thomas Write Rule and its fundamental insight: some write operations are outdated before they even execute. An outdated write is a write operation that, if executed, would produce a value that is immediately superseded by a later write—making the operation's effect invisible to any subsequent transaction.

This concept is deceptively simple but has profound implications for database concurrency control. Understanding outdated writes deeply allows us to:

Optimize concurrency by avoiding unnecessary transaction aborts
Reason about schedule correctness with precision
Design systems that exploit semantic properties of operations

In this page, we'll dissect the concept of outdated writes, exploring their nature, detection mechanisms, and the conditions under which ignoring them preserves correctness.

What You Will Learn

By the end of this page, you will understand the precise definition of an outdated write, the formal conditions that identify one, how the timestamp ordering protocol detects outdated writes in O(1) time, and the subtle edge cases where writes that appear outdated actually aren't.

Formal Definition of Outdated Writes

Let's establish a precise, formal definition of what constitutes an outdated write.

Definition (Outdated Write):

A write operation write(X) by transaction T_i is outdated if and only if:

There exists a transaction T_j with TS(T_j) > TS(T_i) that has already executed write(X)
No transaction T_k with TS(T_i) < TS(T_k) has read X after T_i's logical position in the serialization order

In simpler terms: A write is outdated when a later transaction has already overwritten the data item, and no transaction "in between" has read the value that the outdated write would have produced.

The Temporal Analogy:

Imagine a timeline where transactions are ordered by their timestamps. T_i tries to write a value to X at position 10 on the timeline. But T_j, at position 20, has already written a different value. Any transaction at position 15 that reads X will see... what?

In the correct serial execution (T_i → T_j):

T_i writes at position 10
Transaction at position 15 reads T_i's value
T_j writes at position 20, overwriting T_i's value

But if T_j has already executed and no transaction has read X:

T_i's write would be invisible—it would be immediately overwritten
Skipping T_i's write produces the same final state

The Observability Principle

A write is outdated if its effect can never be observed. If no transaction can read the value produced by the write (because a later write supersedes it before any read occurs), the write is semantically meaningless. This is the theoretical foundation for safely ignoring outdated writes.

The Dual Condition:

Notice that our definition has two parts. Both must be satisfied for a write to be truly outdated:

Condition 1: A later write exists

W-TS(X) > TS(T_i)
This is necessary but NOT sufficient

Condition 2: No intervening reads

R-TS(X) < TS(T_i) (no transaction between T_i and T_j has read X)
This is also necessary but NOT sufficient alone

Only when both conditions hold can we safely ignore the write.

Detecting Outdated Writes in the Timestamp Protocol

The elegance of the timestamp ordering protocol is that detecting outdated writes is trivial—it requires only a simple comparison. Let's see how the timestamps maintained by the protocol enable this.

The Timestamp State:

For each data item X, the protocol maintains:

W-TS(X): The timestamp of the most recent transaction to write X
R-TS(X): The timestamp of the most recent transaction to read X

When transaction T attempts write(X):

// Detection Algorithm
if (TS(T) < R-TS(X)) {
    // A later transaction has read the old value
    // T's write would break serializability
    return ABORT;
} else if (TS(T) < W-TS(X)) {
    // A later transaction has already written to X
    // T's write is OUTDATED
    return OUTDATED_WRITE;
} else {
    // No conflict, proceed with write
    return EXECUTE_WRITE;
}

Time Complexity: O(1)

The detection requires exactly two comparisons:

Compare TS(T) with R-TS(X)
Compare TS(T) with W-TS(X)

No scanning of transaction lists, no graph analysis—just two integer comparisons. This efficiency is one of the key advantages of timestamp-based protocols.

Outdated Write Detection States
Condition	Meaning	Action
TS(T) ≥ R-TS(X) AND TS(T) ≥ W-TS(X)	No conflicts, T is the latest accessor	Execute write, update W-TS(X)
TS(T) < R-TS(X)	A later transaction has read X	Abort T (NOT an outdated write)
TS(T) ≥ R-TS(X) AND TS(T) < W-TS(X)	A later transaction has written X, no later read	Outdated write detected

The Order of Checks Matters:

Notice that we check R-TS(X) before W-TS(X). This is crucial because:

If TS(T) < R-TS(X), we must abort regardless of W-TS(X)
The outdated write optimization only applies when no read conflict exists

Proof of Correctness for Detection:

The check TS(T) < W-TS(X) precisely captures condition 1 of our definition (a later write exists). The check TS(T) ≥ R-TS(X) ensures condition 2 holds (no transaction with timestamp between T and the writer of W-TS(X) has read X).

Why does R-TS(X) capture "intervening reads"? Because:

R-TS(X) is the largest timestamp to have read X
If R-TS(X) > TS(T), some transaction "after" T in serial order has read X
That read would have seen the value T was supposed to write
Ignoring T's write would give a different observable state

Thus, the two-comparison detection perfectly identifies outdated writes.

Outdated Writes in Practice: Common Scenarios

Let's examine several practical scenarios where outdated writes occur naturally in database workloads.

Scenario 1: Concurrent Counter Increments

Consider a counter X that multiple transactions attempt to increment:

Initial state: X = 100
T₁ (timestamp 10): Read X = 100, compute 101, attempt write(X, 101)
T₂ (timestamp 20): Read X = 100, compute 101, attempt write(X, 101)
T₃ (timestamp 30): Read X = 100, compute 101, attempt write(X, 101)

If T₃ executes first:

T₃ writes X = 101, W-TS(X) = 30
T₂ attempts write(X, 101): TS(T₂) = 20 < W-TS(X) = 30
- If R-TS(X) < 20: Outdated write detected
T₁ attempts write(X, 101): TS(T₁) = 10 < W-TS(X) = 30
- If R-TS(X) < 10: Outdated write detected

Both T₁'s and T₂'s writes are outdated! The Thomas Write Rule allows both to complete without aborting, dramatically improving throughput.

Important Caveat

Notice that all three transactions in this scenario produce the wrong result! They all read 100 and try to write 101, but a correct serial execution would give X = 103 (three increments). The Thomas Write Rule doesn't fix this—it's a concurrency control mechanism, not a conflict resolution mechanism. Applications needing correct increments under concurrency must use techniques like read-modify-write atomicity or optimistic concurrency control.

Scenario 2: Last-Write-Wins Semantics

Many systems use "last-write-wins" semantics for updates:

User profile updates where the most recent save overwrites previous changes
Configuration changes where the latest setting takes precedence
Sensor readings where only the most recent measurement matters

In these cases, outdated writes are not just acceptable—they're expected! The Thomas Write Rule aligns perfectly with last-write-wins semantics.

Scenario 3: Cache Refresh Operations

Consider a cache that's refreshed by background jobs:

Job J₁ (timestamp 10) fetches data, attempts to update cache
Job J₂ (timestamp 20) fetches newer data, updates cache first
J₁'s update is now outdated—it would overwrite newer data with stale data

Ignoring J₁'s write is not just an optimization—it's necessary for correctness in this scenario. The Thomas Write Rule prevents stale cache entries.

Common Sources of Outdated Writes

•Periodic refresh jobs — Multiple jobs updating cached data with the latest version
•Idempotent operations — Repeated operations that set values to the same final state
•Convergent updates — Distributed systems where multiple replicas propagate the same update
•Overwrite-style updates — Applications that replace rather than modify values
•Logging systems — Where newer logs supersede older entries for the same event

When Writes Appear Outdated But Aren't

It's equally important to understand when writes should NOT be treated as outdated, even if they appear to be at first glance.

Case 1: Intervening Reads Exist

This is the most critical case. Consider:

T₁ (timestamp 10): Wants to write X
T₂ (timestamp 15): Has already read X
T₃ (timestamp 20): Has already written X

Even though T₃ has overwritten X (W-TS(X) = 20 > TS(T₁) = 10), T₂ has read X (R-TS(X) = 15 > TS(T₁) = 10). T₂'s read creates a dependency that cannot be ignored.

In the serial schedule T₁ → T₂ → T₃:

T₁ writes X = v₁
T₂ reads X and sees v₁
T₃ writes X = v₃

If we ignore T₁'s write:

T₂ reads some original value v₀ (not v₁)
T₃ writes X = v₃

T₂ saw a different value! This breaks serializability. T₁'s write is NOT outdated despite T₃'s later write.

Critical Distinction

A write is only outdated if no transaction with a timestamp between the writer and the overwriter has read the data item. The R-TS(X) check is essential—without it, we would incorrectly classify writes as outdated and break serializability.

Case 2: Semantic Dependencies

Some writes have semantic dependencies that timestamps don't capture:

T₁ (timestamp 10):
    write(X, 100)
    write(Y, 200)    // Y = 2 * X

T₂ (timestamp 20):
    write(X, 500)

If T₂'s write(X) executes first, then T₁ attempts write(X, 100), the Thomas Write Rule says ignore T₁'s write to X. But what about Y?

In the serial schedule T₁ → T₂:

X = 500, Y = 200

With Thomas Write Rule:

X = 500, Y = 200 (T₁'s write to X ignored, but write to Y executed)

The final state is the same! The rule works correctly here. But if T₁'s writes to X and Y were meant to be atomic pairs, the application might have other expectations.

Case 3: Write-Triggered Side Effects

Some database systems trigger side effects on writes:

Triggers that execute on INSERT/UPDATE
Audit logging that records every write
Replication that propagates writes to other nodes

If a write is ignored by the Thomas Write Rule, these side effects don't occur. This may or may not be acceptable depending on application requirements.

Summary: When Writes Can and Cannot Be Considered Outdated
Condition	Can Ignore Write?	Reason
TS(T) < W-TS(X) AND TS(T) ≥ R-TS(X)	Yes	No read has seen the value T would write
TS(T) < W-TS(X) AND TS(T) < R-TS(X)	No	A later transaction has read the old value
Write triggers side effects	Application-dependent	Side effects may be required regardless
Write is part of atomic pair	Check all writes	All related writes must be evaluated together

Visualizing Outdated Writes

Understanding outdated writes becomes clearer with visual representations. Let's examine several scenarios using timeline diagrams.

Timeline Representation:

We'll use a horizontal timeline where:

Position on the timeline represents the timestamp
W(X) represents a write to data item X
R(X) represents a read of data item X
Solid lines represent executed operations
Dashed lines represent the logical position of an outdated write

Scenario A: Simple Outdated Write (Safe to Ignore)

Timeline:    10         20         30
             |          |          |
T₁:         W(X)-----→ (ignored)   
T₂:                   W(X)--------→ (executed)

State:
- T₂ executes write(X) first: W-TS(X) = 20
- T₁ attempts write(X): TS(T₁) = 10 < W-TS(X) = 20
- R-TS(X) = 0 (no reads)
- T₁'s write is OUTDATED → Ignored
- Final value: T₂'s value ✓

Scenario B: Intervening Read (Must Abort)

Timeline:    10         15         20
             |          |          |
T₁:         W(X)-----→ (must execute or abort)
T₂:                   R(X)
T₃:                               W(X)

State:
- T₃ executes write(X) first: W-TS(X) = 20
- T₂ reads X: R-TS(X) = 15
- T₁ attempts write(X): TS(T₁) = 10 < R-TS(X) = 15
- T₂ has READ between T₁ and T₃!
- T₁ must ABORT (not outdated - has observable effect)

The Critical Difference

In Scenario A, no transaction exists between T₁ and T₂ that observed X. T₁'s write would be invisible. In Scenario B, T₂ read X between T₁ and T₃'s positions. T₂ would see different values depending on whether T₁'s write executed, making T₁'s write semantically meaningful (not outdated).

Scenario C: Multiple Outdated Writes

Timeline:    10    15    20    25    30
             |     |     |     |     |
T₁:         W(X)  (ignored)
T₂:               W(X)  (ignored)
T₃:                     W(X)  (ignored)
T₄:                           W(X)  (executed first)

State:
- T₄ executes first: W-TS(X) = 25
- T₃ attempts: TS(T₃) = 20 < W-TS(X) = 25 → Outdated
- T₂ attempts: TS(T₂) = 15 < W-TS(X) = 25 → Outdated
- T₁ attempts: TS(T₁) = 10 < W-TS(X) = 25 → Outdated
- All three earlier writes ignored!
- Final value: T₄'s value ✓

This scenario shows the power of the Thomas Write Rule: four transactions attempting to write the same item, but only one abort-free execution with correct final state. Without the rule, T₁, T₂, and T₃ would all abort and restart.

Formal Properties of Outdated Writes

Let's establish some formal properties of outdated writes that help in reasoning about their behavior in complex scenarios.

Property 1: Transitivity

If transaction T₁'s write is outdated by T₂'s write, and T₂'s write is outdated by T₃'s write, then T₁'s write is also outdated by T₃'s write (assuming no intervening reads).

Proof:

W-TS(X) after T₃ writes = TS(T₃)
TS(T₁) < TS(T₂) < TS(T₃)
Therefore TS(T₁) < TS(T₃) = W-TS(X)
T₁'s write satisfies the outdated condition with respect to T₃ ∎

Property 2: Independence

Outdatedness is determined per data item. A transaction's write to X being outdated says nothing about its writes to other items.

Example:

T₁ writes X and Y
T₂ writes only X with later timestamp
T₁'s write(X) is outdated; T₁'s write(Y) is not affected

Property 3: Non-Commutativity with Reads

Outdatedness is not symmetric with respect to read operations. A write becoming outdated by another write doesn't affect reads in the same way.

Property 4: Idempotence of Ignoring

Ignoring an outdated write is an idempotent operation. If we decide to ignore a write and later "reconsider," the outcome is the same—the write was never executed.

Property 5: Preservation of Final State

Ignoring outdated writes preserves the final state of the database. Formally:

Let S be a schedule, and S' be the schedule obtained by removing all outdated writes from S. Then:

FinalState(S) = FinalState(S')
For each data item X, the final value in S equals the final value in S'

Design Implication

Property 5 is the fundamental guarantee that makes the Thomas Write Rule correct. It ensures that ignoring outdated writes doesn't change what the database looks like at the end of all transactions. Combined with the R-TS check, it also ensures that no transaction observes a different value than it would in the equivalent serial schedule.

Implementation Considerations

Implementing outdated write detection efficiently requires careful attention to several aspects.

Timestamp Storage:

Each data item needs to store:

The current value
W-TS: Write timestamp (typically 64-bit integer)
R-TS: Read timestamp (typically 64-bit integer)

This adds 16 bytes of overhead per data item. For a table with 1 million rows, this is 16 MB—negligible for modern systems but worth considering for memory-constrained environments.

Atomic Timestamp Operations:

Checking and updating timestamps must be atomic to avoid race conditions:

// BAD: Race condition possible
if (TS(T) >= R-TS(X) && TS(T) >= W-TS(X)) {
    // Another transaction could modify W-TS(X) here!
    X = new_value;
    W-TS(X) = TS(T);
}

// GOOD: Atomic check-and-update
lock(X);
if (TS(T) >= R-TS(X) && TS(T) >= W-TS(X)) {
    X = new_value;
    W-TS(X) = TS(T);
}
unlock(X);

Implementation Best Practices

•Use compare-and-swap (CAS) — Modern CPUs provide CAS instructions that enable lock-free timestamp updates in many cases.
•Consider timestamp granularity — Microsecond timestamps are usually sufficient; nanosecond precision adds overhead without benefit.
•Handle timestamp wraparound — 64-bit timestamps won't wrap for millennia, but 32-bit timestamps require careful handling.
•Log outdated writes — Even if ignored, logging outdated writes can help with debugging and performance analysis.
•Monitor outdated write frequency — High frequency might indicate application design issues worth addressing.

Handling Outdated Write Returns:

When a write is detected as outdated, the system must communicate this to the calling code:

enum WriteResult {
    SUCCESS,          // Write executed normally
    ABORT_REQUIRED,   // Conflict with read, must abort
    OUTDATED_IGNORED  // Write was outdated, ignored
}

WriteResult attemptWrite(Transaction T, DataItem X, Value v) {
    if (TS(T) < R-TS(X)) {
        return ABORT_REQUIRED;
    } else if (TS(T) < W-TS(X)) {
        return OUTDATED_IGNORED;  // Thomas Write Rule
    } else {
        X.value = v;
        W-TS(X) = TS(T);
        return SUCCESS;
    }
}

The calling code can then continue execution knowing the write was handled appropriately.

Summary: Outdated Writes

We've comprehensively explored the concept of outdated writes. Let's consolidate the key takeaways:

Key Takeaways

•Outdated writes are writes whose effects are invisible — Their values would be immediately overwritten before any transaction could observe them.
•Detection requires two conditions — TS(T) < W-TS(X) (later write exists) AND TS(T) ≥ R-TS(X) (no intervening reads).
•Detection is O(1) — Two simple integer comparisons using stored timestamps.
•Intervening reads prevent outdatedness — If any transaction read the item, the write cannot be ignored.
•Ignoring outdated writes preserves correctness — The final database state is identical to the equivalent serial schedule.
•Implementation requires atomic operations — Race conditions must be avoided when checking and updating timestamps.

What's Next:

In the next page, we'll examine the mechanism of ignoring obsolete writes in greater detail. We'll explore the exact semantics of what "ignoring" means, how it affects transaction state, and the subtle implications for transaction processing and recovery.

Page Complete

You now have a deep understanding of outdated writes—what they are, how to detect them, when they occur, and when apparent outdatedness is actually a conflict requiring abort. This knowledge forms the foundation for understanding the Thomas Write Rule's performance benefits.