Loading learning content...
In the basic timestamp ordering protocol, a transaction is aborted whenever it attempts to write a value that is "too late"—when another transaction with a larger timestamp has already read the data item. This strict approach ensures serializability, but it can lead to unnecessary transaction aborts that waste computational resources and reduce system throughput.
Consider this scenario: Transaction T₁ (timestamp 10) wants to write to data item X, but transaction T₂ (timestamp 20) has already written to X. Under basic timestamp ordering, T₁ must be aborted and restarted. But think about this carefully: if T₂'s write has already overwritten X, does T₁'s write even matter anymore?
This is precisely the insight that led to one of the most elegant optimizations in database concurrency control—the Thomas Write Rule.
By the end of this page, you will understand the Thomas Write Rule, its motivation, formal definition, and why it represents a fundamental insight about the nature of write operations in database systems. You'll see how this simple rule can dramatically improve transaction throughput without sacrificing correctness.
The Thomas Write Rule was introduced by Robert H. Thomas in the late 1970s as part of his work on distributed database systems. Thomas observed that the basic timestamp ordering protocol was overly conservative in its handling of write operations, leading to transaction aborts that were unnecessary from the perspective of data correctness.
The Problem with Basic Timestamp Ordering:
In the basic timestamp ordering protocol, every write operation must adhere to a strict rule: a transaction T can write to data item X only if no transaction with a larger timestamp has already read or written X. The rationale is straightforward—if a later transaction has already accessed X, then allowing T to write would violate the timestamp order, potentially breaking serializability.
However, Thomas recognized a subtle but crucial distinction: reads and writes are fundamentally different operations with different implications for serializability.
When a transaction T₁ attempts to write a value that has already been overwritten by a later transaction T₂, T₁'s write is effectively invisible—it would be immediately superseded by T₂'s write. Rather than aborting T₁ and wasting all its work, we can simply ignore the obsolete write and let T₁ continue.
Why Reads Must Be Protected, But Some Writes Can Be Ignored:
Consider the difference:
Read-Write Conflict: If T₂ (timestamp 20) has already read X, and T₁ (timestamp 10) wants to write X, we have a real problem. T₂ has made decisions based on the old value of X. If T₁'s write were allowed, it would change history—T₂ should have seen T₁'s value, not the old one. This violates serializability.
Write-Write Conflict: If T₂ (timestamp 20) has already written to X, and T₁ (timestamp 10) wants to write X, the situation is different. T₁'s write would simply be overwritten by T₂'s write anyway. In any serial schedule where T₁ comes before T₂, T₁'s write would be superseded. T₁'s write is obsolete before it even happens.
This distinction is the foundation of the Thomas Write Rule.
The Thomas Write Rule modifies the write rule of the basic timestamp ordering protocol. Let's first recall the basic protocol, then see how Thomas's modification improves it.
Basic Timestamp Ordering Protocol (Write Rule):
When transaction T wants to execute write(X):
If TS(T) < R-TS(X): A transaction with a larger timestamp has already read the old value of X. T's write would change history. Abort T and restart with a new timestamp.
If TS(T) < W-TS(X): A transaction with a larger timestamp has already written to X. T's write would be overwritten. Abort T and restart with a new timestamp.
Otherwise: Execute the write and set W-TS(X) = TS(T).
TS(T) = Timestamp of transaction T R-TS(X) = Read timestamp of data item X (largest timestamp of any transaction that has read X) W-TS(X) = Write timestamp of data item X (largest timestamp of any transaction that has written X)
The Thomas Write Rule Modification:
The Thomas Write Rule changes only case 2. Instead of aborting T, we simply ignore the write operation and let T continue:
When transaction T wants to execute write(X):
If TS(T) < R-TS(X): A transaction with a larger timestamp has already read X. Abort T and restart.
If TS(T) < W-TS(X): A transaction with a larger timestamp has already written to X. T's write is obsolete. Ignore the write but do NOT abort T. Let T continue with its other operations.
Otherwise: Execute the write and set W-TS(X) = TS(T).
| Condition | Basic Timestamp Ordering | Thomas Write Rule |
|---|---|---|
| TS(T) < R-TS(X) | Abort T and restart | Abort T and restart (same) |
| TS(T) < W-TS(X) | Abort T and restart | Ignore write, T continues |
| TS(T) ≥ R-TS(X) and TS(T) ≥ W-TS(X) | Execute write, update W-TS(X) | Execute write, update W-TS(X) (same) |
The Elegance of the Rule:
Notice that the Thomas Write Rule doesn't add complexity—it actually simplifies the protocol's response to one scenario. Instead of the expensive operation of aborting and restarting a transaction, we simply skip the obsolete write and continue. The transaction loses nothing (its write was going to be invisible anyway), and the system gains the benefit of avoiding unnecessary work.
Let's walk through a concrete example to see the Thomas Write Rule in action. Consider two transactions, T₁ and T₂, with timestamps 10 and 20 respectively, both operating on data item X.
Initial State:
Schedule of Operations:
Analyzing the Outcome:
In the Thomas Write Rule scenario, the final value of X is 200 (written by T₂). This is exactly what we want! In any serial schedule where T₁ (timestamp 10) executes before T₂ (timestamp 20), T₁'s write to X would be followed by T₂'s write—meaning T₂'s value would be the final one.
The Thomas Write Rule produces the same final database state as the serial schedule T₁ → T₂, but without the overhead of aborting and restarting T₁. The ignored write is called an obsolete write or a blind write that would have been immediately overwritten.
The Intuition:
Think of it this way: T₁'s write to X is like writing a message on a whiteboard, only to have T₂ immediately erase it and write a different message. Why bother writing the first message at all? The Thomas Write Rule captures this intuition formally.
The Thomas Write Rule's correctness isn't immediately obvious. How can we simply ignore a write operation and still maintain correctness? Let's examine the reasoning carefully.
The Serialization Order Argument:
The timestamp ordering protocol produces schedules that are equivalent to a serial schedule where transactions execute in timestamp order. If T₁ has timestamp 10 and T₂ has timestamp 20, the equivalent serial schedule is T₁ → T₂.
Now consider what happens to data item X in the serial schedule T₁ → T₂:
T₁'s write exists in the serial schedule, but its effect is invisible in the final state. Any transaction that reads X after both T₁ and T₂ will see T₂'s value, not T₁'s.
A write operation whose value will be overwritten before any transaction reads it is semantically invisible. Ignoring such a write doesn't change the database state that any transaction can observe. The Thomas Write Rule exploits this principle.
Formal Correctness Argument:
Let T_i be a transaction with timestamp TS(T_i), attempting to write data item X, where W-TS(X) = TS(T_j) and TS(T_i) < TS(T_j).
Claim: Ignoring T_i's write preserves serializability.
Proof Sketch:
The equivalent serial schedule orders transactions by timestamp: T_i before T_j.
In this serial schedule:
By ignoring T_i's write and keeping T_j's write:
The resulting schedule is view-equivalent to the serial schedule T_i → T_j.
Important Caveat: This argument works only because W-TS(X) = TS(T_j) > TS(T_i). If a third transaction T_k with TS(T_i) < TS(T_k) < TS(T_j) reads X, T_k would see T_j's value in both cases (with or without the Thomas Write Rule), maintaining consistency.
To deeply understand the Thomas Write Rule, we must appreciate the different roles of read and write timestamps and why they're handled differently.
R-TS(X): The Read Timestamp
The read timestamp R-TS(X) records the largest timestamp of any transaction that has read data item X. This is critical because:
Reads establish dependencies: When a transaction reads X, it makes decisions based on that value. Future transactions' writes cannot change what a past transaction "saw."
Reads cannot be undone: Once a transaction has read a value and potentially used it in computations or decisions, that read has consequences that persist even if the transaction later aborts.
Read-write conflicts break serializability: If T₁ (timestamp 10) tries to write X after T₂ (timestamp 20) has read X, we cannot allow it. T₂ has already seen the old value; allowing T₁'s write would mean T₂ should have seen T₁'s value instead.
W-TS(X): The Write Timestamp
The write timestamp W-TS(X) records the largest timestamp of any transaction that has written to data item X. This is simpler because:
Writes are overwritable: Unlike reads, a write's effect can be "erased" by a subsequent write.
Write-write conflicts can sometimes be resolved: If T₁ (timestamp 10) tries to write X after T₂ (timestamp 20) has written X, T₁'s value would be immediately superseded. No transaction can observe T₁'s write.
The Thomas Write Rule exploits this: We can safely ignore T₁'s write, treating it as if it happened but was immediately overwritten.
| Aspect | R-TS(X) - Read Timestamp | W-TS(X) - Write Timestamp |
|---|---|---|
| Meaning | Largest timestamp that has read X | Largest timestamp that has written X |
| Dependency Type | Creates hard dependencies (decisions made) | Creates soft dependencies (can be overwritten) |
| Conflict Resolution | Must abort if TS(T) < R-TS(X) | Can ignore write if TS(T) < W-TS(X) |
| Why Different | Reads cannot be retroactively changed | Writes can be overwritten before observation |
| Thomas Write Rule Impact | Not affected (still causes abort) | Allows ignoring obsolete writes |
The Thomas Write Rule does NOT allow ignoring writes when TS(T) < R-TS(X). If a later transaction has read the data item, the obsolete write MUST cause an abort. The rule applies ONLY to write-write conflicts where no read has occurred.
The Thomas Write Rule produces schedules that are view serializable but not necessarily conflict serializable. Understanding this distinction is crucial for appreciating what the Thomas Write Rule achieves.
Conflict Serializability Review:
A schedule S is conflict serializable if we can transform it into a serial schedule by swapping the order of non-conflicting operations. Two operations conflict if:
View Serializability Review:
A schedule S is view serializable if:
The Thomas Write Rule and View Serializability:
When we ignore an obsolete write using the Thomas Write Rule, we may create a schedule that is not conflict serializable but is view serializable.
Example:
Consider transactions T₁ (timestamp 10) and T₂ (timestamp 20):
Schedule S:
T₂: write(X)
T₁: write(X) [IGNORED by Thomas Write Rule]
T₂: commit
T₁: commit
Conflict Analysis:
View Analysis:
The Key Insight:
The Thomas Write Rule sacrifices conflict serializability for better performance, but maintains the stronger guarantee of view serializability. This is acceptable because view serializability is the true correctness criterion—it ensures the schedule produces the same results as some serial schedule.
This makes the Thomas Write Rule protocol theoretically interesting: it's a practical, efficient protocol that produces view serializable schedules—something that is NP-complete to verify in general, but is achieved here through timestamp ordering constraints.
The Thomas Write Rule represents one of several strategies for handling write-write conflicts in concurrent database systems. Let's compare it with alternative approaches.
| Approach | Action on Conflict | Pros | Cons |
|---|---|---|---|
| Basic Timestamp Ordering | Abort younger transaction | Simple, conflict serializable | Many unnecessary aborts |
| Thomas Write Rule | Ignore obsolete write | Fewer aborts, higher throughput | Only view serializable |
| Two-Phase Locking (2PL) | Wait for lock release | Conflict serializable, no aborts | Potential deadlocks |
| MVCC (Multi-Version) | Create new version | High concurrency, no blocking | Storage overhead, GC complexity |
When to Use the Thomas Write Rule:
The Thomas Write Rule is most beneficial in systems where:
Write-write conflicts are common: Applications with many concurrent writes to shared data items benefit most.
Transactions are long-running: Aborting and restarting long transactions is expensive; ignoring obsolete writes saves significant work.
View serializability is acceptable: Applications that don't require strict conflict serializability can leverage this optimization.
Read-write conflicts are rare: The rule doesn't help with read-write conflicts, so systems with many reads may see less benefit.
When NOT to Use the Thomas Write Rule:
Strict conflict serializability required: Some applications need the stronger guarantee; auditing systems, for example.
Write operations have side effects: If writing triggers external actions (logging, notifications), ignoring writes could cause inconsistencies.
Recovery considerations: Ignoring writes complicates some recovery scenarios (discussed in later sections).
We've covered the foundational concepts of the Thomas Write Rule. Let's consolidate the key takeaways:
What's Next:
In the following pages, we'll explore the concept of outdated writes in greater depth, understanding exactly when and why writes become obsolete. We'll then examine how the Thomas Write Rule handles these cases and analyze the performance improvements it provides. Finally, we'll rigorously examine the correctness guarantees to ensure you have complete confidence in applying this optimization.
You now understand the Thomas Write Rule—its motivation, formal definition, and why it works. This optimization is a perfect example of how deep understanding of database semantics can lead to significant performance improvements without sacrificing correctness.