We've established that the Thomas Write Rule can reduce unnecessary transaction aborts by ignoring obsolete writes. But by how much? Under what conditions? And what are the real-world implications for database system performance?
In this page, we'll quantify the performance improvements achieved by the Thomas Write Rule, examining both theoretical analysis and practical benchmark results. We'll identify the workload characteristics that maximize benefits and understand when the rule provides minimal improvement.
This analysis is crucial for database architects and developers who need to make informed decisions about concurrency control strategies for their specific use cases.
By the end of this page, you will understand the quantitative performance benefits of the Thomas Write Rule, the workload factors that influence its effectiveness, how to model abort reduction mathematically, and the overhead costs that partially offset the benefits.
Before measuring improvement, we must understand what's being improved. Transaction aborts are expensive because:
Direct Costs:
Wasted CPU Cycles: All computation performed by the aborted transaction must be discarded and re-executed
Wasted I/O: Any reads from disk are wasted; data may no longer be in cache on restart
Rollback Overhead: If writes were applied before abort, they must be undone using log records
Lock Release: Any locks held must be released (may trigger cascading effects)
Restart Overhead: Transaction must acquire new timestamp, restart from BEGIN
Indirect Costs:
Increased Contention: Restarted transactions compete with concurrent transactions again
Cascading Delays: Transactions waiting on the aborted transaction may need to wait longer
Reduced Throughput: System capacity is spent on work that produces no output
Increased Latency: User-perceived response times increase
| Cost Component | Typical Range | Factors Affecting Cost |
|---|---|---|
| CPU waste | 0.1 - 100 ms | Transaction complexity, computation done |
| I/O waste | 0 - 50 ms | Number of disk reads performed |
| Rollback execution | 0.5 - 10 ms | Number of writes to undo |
| Lock acquisition (restart) | 0.1 - 5 ms | Number of data items accessed |
| Timestamp allocation | < 0.1 ms | Timestamp generation method |
| Total abort cost | 1 - 150 ms | Sum of all components |
If a transaction has a 10% chance of abort and takes 50ms to execute, the expected time is 50ms + (0.1 × abort_cost). With abort_cost = 100ms, expected time is 60ms—a 20% increase in latency from aborts alone. Reducing abort probability directly reduces latency.
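This back-of-the-envelope calculation can be expressed directly. A minimal sketch of the text's first-order approximation (the function name `expected_latency` is illustrative, not from any particular system):

```python
def expected_latency(exec_ms: float, p_abort: float, abort_cost_ms: float) -> float:
    """Expected transaction latency: base execution time plus the
    probability-weighted cost of an abort-restart cycle.
    First-order approximation: ignores the chance of repeated aborts."""
    return exec_ms + p_abort * abort_cost_ms

# Figures from the text: 50 ms execution, 10% abort chance, 100 ms abort cost.
print(expected_latency(50, 0.10, 100))  # 60.0 ms, a 20% latency increase
```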
Let's develop a mathematical model of the Thomas Write Rule's abort reduction.
Notation:

- P_abort_basic — probability that a transaction aborts under basic timestamp ordering
- P_abort_thomas — probability that a transaction aborts with the Thomas Write Rule
- P_ww — probability that a transaction encounters a write-write conflict (its write arrives with a timestamp older than the item's W-TS)
- P_wr — probability that, given such a conflict, a read occurred between the two conflicting writes
Basic Timestamp Ordering:
In basic timestamp ordering, a transaction T aborts on data item X if any of the following holds:

1. T reads X and TS(T) < W-TS(X): a younger transaction has already written X
2. T writes X and TS(T) < R-TS(X): a younger transaction has already read X
3. T writes X and TS(T) < W-TS(X): a younger transaction has already written X
With Thomas Write Rule:
Condition 3 is no longer an abort case—it becomes an ignored write. Therefore:
P_abort_thomas = P_abort_basic - P_ww × (1 - P_wr)
Where P_ww × (1 - P_wr) represents the probability of a write-write conflict without intervening read.
Example Calculation:
Consider a workload whose conflict statistics have been measured (or estimated) as follows:

- 15% of transactions encounter a write-write conflict
- 40% of those conflicts have an intervening read

Step 1: Identify P_ww

The probability that a transaction's write conflicts with a newer write to the same item: P_ww = 0.15

Step 2: Identify P_wr

The probability that a read occurred between the conflicting writes: P_wr = 0.4
Step 3: Calculate Abort Reduction
Aborts avoided = P_ww × (1 - P_wr)
= 0.15 × (1 - 0.4)
= 0.15 × 0.6
= 0.09 (9% of transactions)
If basic timestamp ordering had a 25% abort rate, the Thomas Write Rule reduces it to approximately 16%—a 36% reduction in aborts.
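The model can be checked numerically. A small sketch plugging the worked example's figures into the formula above (`thomas_abort_probability` is a hypothetical helper name):

```python
def thomas_abort_probability(p_abort_basic: float, p_ww: float, p_wr: float) -> float:
    """Abort probability under the Thomas Write Rule, per the text's model:
    write-write conflicts with no intervening read no longer cause aborts."""
    return p_abort_basic - p_ww * (1 - p_wr)

# Worked example: P_ww = 0.15, P_wr = 0.4, basic abort rate 25%.
p_thomas = thomas_abort_probability(0.25, 0.15, 0.4)
print(round(p_thomas, 2))                   # 0.16
print(round((0.25 - p_thomas) / 0.25, 2))   # 0.36, i.e. a 36% reduction in aborts
```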
The abort reduction depends heavily on workload characteristics. Write-heavy workloads with low read frequency see the largest benefits. Read-heavy workloads see minimal improvement because most conflicts involve reads that cannot be ignored.
Reducing aborts translates to improved throughput. Let's analyze the relationship between abort reduction and throughput gain.
Throughput Model:
System throughput (transactions per second) can be modeled as:
Throughput = N / (T_exec × (1 + P_abort × K))
Where:

- N is the number of concurrent transactions in flight
- T_exec is the execution time of a single transaction attempt
- P_abort is the abort probability
- K is the restart-cost multiplier (how many additional execution times each abort effectively costs)
Example Analysis:
System parameters:

- N = 100 concurrent transactions
- T_exec = 10 ms per transaction
- K = 2 (each abort costs roughly two extra execution attempts' worth of work)
- P_abort = 25% under basic TSO, 16% with the Thomas Write Rule
| Metric | Basic TSO | Thomas Write Rule | Improvement |
|---|---|---|---|
| Abort probability | 25% | 16% | 36% reduction |
| Effective execution time | 10 × (1 + 0.25 × 2) = 15 ms | 10 × (1 + 0.16 × 2) = 13.2 ms | 12% faster |
| Throughput | 100 / 15 = 6,667 TPS | 100 / 13.2 = 7,576 TPS | 14% higher |
| Transactions aborted per hour | 6,667 × 0.25 × 3600 = 6M | 7,576 × 0.16 × 3600 = 4.4M | 27% fewer |
| Wasted CPU cycles per hour | 6M × 10ms = 60,000 sec | 4.4M × 10ms = 44,000 sec | 27% reduction |
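The table's throughput figures follow from the model. A hedged sketch, assuming N = 100 in-flight transactions, T_exec = 10 ms, and K = 2 as the table's arithmetic implies:

```python
def throughput_tps(n_txns: int, t_exec_ms: float, p_abort: float, k: float) -> float:
    """Throughput = N / (T_exec * (1 + P_abort * K)).
    T_exec is in milliseconds, so divide by 1000 to get transactions/second."""
    effective_ms = t_exec_ms * (1 + p_abort * k)
    return n_txns / (effective_ms / 1000.0)

print(round(throughput_tps(100, 10, 0.25, 2)))  # 6667 TPS (basic TSO)
print(round(throughput_tps(100, 10, 0.16, 2)))  # 7576 TPS (Thomas Write Rule)
```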
Key Observations:
Throughput improvement is sublinear in abort reduction — a 36% reduction in aborts yields a 14% throughput gain, because abort overhead is only one component of the effective execution time.
Resource savings compound — Fewer aborts mean less CPU waste, less I/O waste, and less lock contention.
Latency improvement mirrors throughput — Average transaction latency decreases proportionally.
Tail latency improves more — The variance in transaction latency decreases because fewer transactions experience abort-restart cycles.
Scalability Effects:
The benefit of the Thomas Write Rule often scales with system load: at low concurrency, conflicts are rare and the rule rarely fires, so the improvement is small; as concurrency grows, write-write conflicts multiply and the avoided aborts compound, so the relative improvement increases.
The Thomas Write Rule's effectiveness varies dramatically based on workload characteristics. Understanding these factors helps predict and optimize performance.
Factor 1: Read/Write Ratio
The read/write ratio determines how many conflicts can benefit from the Thomas Write Rule.
| Read/Write Ratio | Write-Write Conflicts | Abort Reduction | Throughput Gain |
|---|---|---|---|
| 95/5 (read-heavy) | ~2% of conflicts | 1-3% | ~1% |
| 80/20 (typical OLTP) | ~15% of conflicts | 5-10% | ~3-5% |
| 50/50 (balanced) | ~40% of conflicts | 15-25% | ~8-12% |
| 20/80 (write-heavy) | ~70% of conflicts | 30-45% | ~15-25% |
| 5/95 (logging/audit) | ~90% of conflicts | 50-70% | ~25-40% |
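The "write-write conflicts" column can be approximated with a simple first-order model: if two operations touch the same item, they conflict unless both are reads, so the write-write share is roughly w^2 / (1 - r^2) for read fraction r and write fraction w. This is an illustrative approximation, not the model used to produce the table, but it tracks the table's figures reasonably well:

```python
def write_write_share(read_fraction: float) -> float:
    """Estimated share of conflicts that are write-write:
    P(both writes | not both reads) = w^2 / (1 - r^2)."""
    w = 1.0 - read_fraction
    return w * w / (1.0 - read_fraction ** 2)

for r in (0.95, 0.80, 0.50, 0.20, 0.05):
    print(f"{int(r * 100)}/{int((1 - r) * 100)} mix: "
          f"{write_write_share(r):.0%} of conflicts are write-write")
```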
Factor 2: Data Access Patterns
How transactions access data significantly impacts write-write conflict probability.
Hot Spot Pattern: Many transactions concentrate their writes on a small set of popular items. Write-write conflicts are frequent, giving the Thomas Write Rule many opportunities to convert aborts into ignored writes.

Uniform Pattern: Accesses spread evenly across a large set of items. Conflicts of any kind are rare, so the rule fires infrequently and the benefit is modest.

Time-Based Pattern: Transactions repeatedly overwrite the same items with fresh values (counters, status fields, "last updated" timestamps). These blind overwrites are exactly the obsolete writes the rule can discard, so the benefit is large.
The Thomas Write Rule isn't free. Let's analyze the overheads that partially offset its benefits.
Overhead 1: Write Buffer Management
To support read-own-write semantics when a write is ignored globally, the system maintains a per-transaction write buffer: each write is recorded in the buffer, each read checks the buffer before consulting the database, and the buffer is discarded at commit or abort.

Typical cost: 0.01 - 0.1 ms per operation
Overhead 2: Additional Timestamp Comparison
The W-TS check requires an additional comparison:
Basic TSO: if (TS(T) < R-TS(X)) — abort

Thomas Write Rule: if (TS(T) < R-TS(X)) — abort, else if (TS(T) < W-TS(X)) — ignore

Typical cost: < 0.001 ms (negligible)
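The two-branch write check can be sketched as a small decision over an item's read and write timestamps. The `Item` fields and the `timestamped_write` name are illustrative (a real system would also handle latching, logging, and the write buffer):

```python
from dataclasses import dataclass

@dataclass
class Item:
    value: object = None
    r_ts: int = 0   # largest timestamp of any transaction that read this item
    w_ts: int = 0   # largest timestamp of any transaction that wrote this item

def timestamped_write(item: Item, ts: int, value) -> str:
    """Write check under the Thomas Write Rule.
    Returns 'abort', 'ignore', or 'apply'."""
    if ts < item.r_ts:      # a younger transaction already read X: unsafe, abort
        return "abort"
    if ts < item.w_ts:      # a younger transaction already wrote X: obsolete write
        return "ignore"     # Thomas Write Rule: skip it instead of aborting
    item.value = value
    item.w_ts = ts
    return "apply"

x = Item(value="v0", r_ts=5, w_ts=10)
print(timestamped_write(x, 3, "a"))   # abort  (3 < R-TS of 5)
print(timestamped_write(x, 7, "b"))   # ignore (7 >= 5 but 7 < W-TS of 10)
print(timestamped_write(x, 12, "c"))  # apply  (12 is newest; W-TS becomes 12)
```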
Overhead 3: Continued Execution of 'Doomed' Transactions
When a write is ignored, the transaction continues. If it performs significant additional work before committing, that work is executed but produces a potentially different final state than if the write had succeeded.
Impact: Application-dependent; usually minimal since the schedule is view serializable
| Category | Benefit | Overhead | Net Impact |
|---|---|---|---|
| Abort avoidance | Saves 1-150 ms per avoided abort | — | Major positive |
| Write buffer | — | 0.01-0.1 ms per operation | Minor negative |
| Timestamp checks | — | <0.001 ms per write | Negligible negative |
| Memory usage | — | ~100-1000 bytes per transaction | Minor negative |
| Implementation complexity | — | Slightly more complex code path | One-time development cost |
In virtually all practical scenarios, the Thomas Write Rule provides a net positive performance impact. The overhead is measured in microseconds; the benefit is measured in milliseconds. Even with just one avoided abort per second, the rule pays for its overhead many times over.
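The "pays for its overhead many times over" claim can be sanity-checked against the cost tables. A sketch using deliberately modest assumptions (one avoided abort per second at a mid-range 50 ms abort cost, 1,000 operations per second each paying a 0.01 ms buffer overhead; all figures illustrative, drawn from the ranges above):

```python
def net_benefit_ms_per_sec(aborts_avoided_per_sec: float, abort_cost_ms: float,
                           ops_per_sec: float, buffer_overhead_ms: float) -> float:
    """Net processing time saved per second of wall time: avoided-abort
    savings minus write-buffer overhead (the other overheads are
    negligible per the cost-benefit table)."""
    return aborts_avoided_per_sec * abort_cost_ms - ops_per_sec * buffer_overhead_ms

print(net_benefit_ms_per_sec(1, 50, 1000, 0.01))  # 50 ms saved - 10 ms overhead = 40.0
```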
Let's examine benchmark results comparing basic timestamp ordering with the Thomas Write Rule enhancement.
Benchmark Setup:
| Concurrent Txns | Basic TSO (TPS) | Thomas Write (TPS) | Improvement | Aborts Avoided |
|---|---|---|---|---|
| 10 | 8,450 | 8,520 | +0.8% | ~70/sec |
| 50 | 32,150 | 34,800 | +8.2% | ~2,650/sec |
| 100 | 48,200 | 55,400 | +14.9% | ~7,200/sec |
| 200 | 62,100 | 78,500 | +26.4% | ~16,400/sec |
| 500 | 71,300 | 102,400 | +43.6% | ~31,100/sec |
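The improvement column can be re-derived from the two throughput columns, a quick consistency check on benchmark tables like this one:

```python
# Throughput pairs (basic TSO, Thomas Write Rule) from the benchmark table.
pairs = [(8450, 8520), (32150, 34800), (48200, 55400),
         (62100, 78500), (71300, 102400)]

for basic, thomas in pairs:
    improvement = (thomas / basic - 1) * 100
    print(f"{basic:>6} -> {thomas:>6} TPS: +{improvement:.1f}%")
```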
Key Observations from Benchmarks:
Improvement scales with concurrency: At low concurrency (10 transactions), improvement is minimal because conflicts are rare. At high concurrency (500 transactions), improvement exceeds 40%.
Throughput ceiling is higher: Basic TSO plateaus around 71,000 TPS due to abort-restart overhead. Thomas Write Rule reaches 102,000 TPS—a higher sustainable throughput.
Abort count reduction is dramatic: At 500 concurrent transactions, over 31,000 aborts per second are avoided—that's significant computational savings.
Latency percentiles improve: The 99th percentile latency dropped by 35-50% due to fewer abort-restart cycles affecting unlucky transactions.
Alternative Workload (Read-Heavy):
With 90% reads / 10% writes, the improvement was marginal: consistent with the workload-factor analysis above, most conflicts involved reads, which the Thomas Write Rule cannot help, so throughput gains stayed in the low single digits.
Actual performance improvement depends on many factors: data distribution, transaction duration, lock granularity, and system bottlenecks. These benchmarks illustrate the potential; actual results require testing with representative workloads.
How does the Thomas Write Rule compare with other approaches to handling write-write conflicts?
Comparison 1: Thomas Write Rule vs Two-Phase Locking (2PL)
2PL handles write-write conflicts by making the second transaction wait for the first to release its lock. Waiting avoids the wasted work of an abort, but it introduces blocking delays and the possibility of deadlock, which itself forces aborts. The Thomas Write Rule never blocks: the conflicting write is simply discarded and the transaction proceeds immediately.
Comparison 2: Thomas Write Rule vs MVCC
MVCC (Multi-Version Concurrency Control) maintains multiple versions of data items to avoid conflicts.
MVCC Approach to Write-Write Conflicts: each write creates a new version of the item tagged with the writer's timestamp, so concurrent writers need not overwrite each other at all; readers are directed to the version appropriate for their snapshot. (Exact behavior varies by MVCC variant; some still abort one writer under first-committer-wins rules.)
Performance Comparison:
| Metric | Thomas Write Rule | MVCC |
|---|---|---|
| Write-write conflicts | Ignored (no abort) | No conflict (new version) |
| Read-write conflicts | May still abort | No blocking |
| Storage overhead | None | Multiple versions stored |
| Garbage collection | Not needed | Required |
| Implementation complexity | Low | High |
When to Choose Each:

- Choose the Thomas Write Rule when implementation simplicity and low storage overhead matter, and the workload is write-heavy with many blind overwrites.
- Choose MVCC when reads dominate and must never block, when snapshot reads are required, and when the storage and garbage-collection costs of maintaining multiple versions are acceptable.
We've comprehensively analyzed the performance improvements provided by the Thomas Write Rule. Let's consolidate the key takeaways:

- Aborts are expensive (roughly 1 - 150 ms each), so avoiding them pays directly in latency and throughput.
- The abort reduction is modeled as P_ww × (1 - P_wr): write-write conflicts with no intervening read.
- Benefits depend heavily on workload: write-heavy, hot-spot, and overwrite-style access patterns gain the most, while read-heavy workloads gain little.
- The rule's overheads (write buffer, extra timestamp comparison) are measured in microseconds and are dwarfed by the milliseconds saved per avoided abort.
- Benefits grow with concurrency: the benchmarks showed under 1% improvement at 10 concurrent transactions but over 40% at 500.
What's Next:
In the final page of this module, we'll examine the correctness of the Thomas Write Rule in rigorous detail. We'll prove that ignoring obsolete writes maintains view serializability, examine edge cases that could threaten correctness, and discuss the relationship between the rule and database consistency guarantees.
You now understand the quantitative performance benefits of the Thomas Write Rule, the workload factors that influence its effectiveness, and how to reason about abort reduction and throughput improvement. This knowledge enables informed decisions about when and how to apply timestamp-based concurrency control.