A transaction's journey through a database system has many phases: it begins, reads data, performs computations, writes results, and commits. At which exact point should the system assign a timestamp? This seemingly simple question has profound implications for correctness, performance, and system behavior.
Assign too early, and the timestamp might not reflect the transaction's actual position in the serialization order. Assign too late, and you might miss opportunities to detect conflicts before work is wasted. The timestamp assignment policy is a fundamental design decision that shapes how the entire concurrency control mechanism behaves.
In this page, we'll examine when timestamps are assigned, what factors influence this decision, and how different policies affect system performance and correctness.
By the end of this page, you will understand the different timestamp assignment moments (begin, first operation, commit), the implications of each choice for serializability and abort rates, how read and write timestamps are maintained for data items, the interaction between assignment policy and conflict detection, and practical considerations in production systems.
The timestamp assignment moment defines a transaction's position in the logical serial order. There are three primary choices for when to assign:
1. At Transaction Start (BEGIN)
The most common approach: assign the timestamp immediately when the transaction begins, before any operations execute.
BEGIN TRANSACTION → Timestamp assigned here
READ(A)
WRITE(B, value)
COMMIT
Implications:
2. At First Operation
Defer assignment until the transaction actually accesses data:
BEGIN TRANSACTION → No timestamp yet
READ(A) → Timestamp assigned here (first access)
WRITE(B, value)
COMMIT
Implications:
3. At Commit Time
Wait until the transaction is ready to commit before assigning:
BEGIN TRANSACTION → No timestamp yet
READ(A) → Record reads/writes without timestamp
WRITE(B, value)
COMMIT → Timestamp assigned here
Implications:
| Strategy | When Assigned | Abort Risk | Validation Point | Use Case |
|---|---|---|---|---|
| Begin-time | Transaction start | Higher for long transactions | On each operation | Standard timestamp ordering |
| First-operation | First read/write | Moderate | On each operation | Reduced idle-time conflicts |
| Commit-time | At COMMIT | Detected late (at commit) | Commit validation | Optimistic concurrency control |
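To make the three policies concrete, here is a minimal sketch. The names (`AssignmentPolicy`, `_ensure_timestamp`) are illustrative rather than taken from any particular system; the point is simply *where* each policy triggers timestamp generation:

```python
import itertools
from enum import Enum

class AssignmentPolicy(Enum):
    BEGIN = "begin"            # assign at BEGIN TRANSACTION
    FIRST_OPERATION = "first"  # assign lazily on the first read/write
    COMMIT = "commit"          # assign during commit validation

_counter = itertools.count(1)  # simple monotonic timestamp source

class Transaction:
    def __init__(self, policy: AssignmentPolicy):
        self.policy = policy
        self.timestamp = None
        if policy is AssignmentPolicy.BEGIN:
            self.timestamp = next(_counter)   # begin-time assignment

    def _ensure_timestamp(self) -> None:
        # Lazy assignment for the first-operation policy
        if self.timestamp is None and self.policy is AssignmentPolicy.FIRST_OPERATION:
            self.timestamp = next(_counter)

    def read(self, key: str):
        self._ensure_timestamp()   # first data access may trigger assignment
        # ... conflict checks and the actual read would go here ...

    def commit(self) -> None:
        if self.timestamp is None:             # commit-time assignment
            self.timestamp = next(_counter)
        # ... validation and commit would go here ...
```

A transaction created under `BEGIN` has a timestamp immediately; under `FIRST_OPERATION` it gains one on its first `read`; under `COMMIT` it stays unstamped until `commit()`.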
The 'classic' timestamp ordering protocol we'll study in detail assumes begin-time assignment. Each transaction gets its timestamp at BEGIN, and all conflict checks compare this fixed timestamp against data item timestamps. This simplicity makes the protocol easier to analyze and implement correctly.
Let's examine the mechanics of timestamp assignment at transaction start, the most common approach in timestamp ordering protocols.
Step-by-Step Process:
Transaction Initiates: Client sends BEGIN TRANSACTION request
Allocate Transaction Context: System creates internal transaction state:
Generate Timestamp: Obtain unique, monotonically increasing value:
Record Assignment: Store timestamp in transaction context:
transaction.timestamp = generated_value
Acknowledge to Client: Return success; the transaction is now active
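Step 3 needs a unique, monotonically increasing source. A minimal thread-safe counter-based generator is sketched below (real systems may instead use hybrid logical clocks or hardware timestamps; the class name is illustrative):

```python
import threading

class TimestampGenerator:
    """Logical-clock generator: each call returns a strictly larger value."""

    def __init__(self, start: int = 0):
        self._last = start
        self._lock = threading.Lock()

    def get_timestamp(self) -> int:
        # The lock guarantees uniqueness and monotonicity across threads.
        with self._lock:
            self._last += 1
            return self._last
```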
```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set, Optional
from enum import Enum
import threading

class TransactionStatus(Enum):
    ACTIVE = "active"
    COMMITTED = "committed"
    ABORTED = "aborted"

@dataclass
class TransactionContext:
    """
    Internal representation of an active transaction.
    Timestamp is assigned at creation and never changes.
    """
    txn_id: int
    timestamp: int  # Assigned at begin, immutable
    status: TransactionStatus = TransactionStatus.ACTIVE
    read_set: Set[str] = field(default_factory=set)
    write_set: Dict[str, Any] = field(default_factory=dict)

class TransactionManager:
    """
    Manages transaction lifecycle including timestamp assignment.
    """

    def __init__(self, timestamp_generator):
        self._ts_generator = timestamp_generator
        self._txn_counter = 0
        self._active_transactions: Dict[int, TransactionContext] = {}
        self._lock = threading.Lock()

    def begin_transaction(self) -> TransactionContext:
        """
        Starts a new transaction with timestamp assignment.

        Returns:
            New TransactionContext with assigned timestamp
        """
        with self._lock:
            # Step 1: Generate unique transaction ID
            self._txn_counter += 1
            txn_id = self._txn_counter

            # Step 2: Assign timestamp (the critical moment!)
            timestamp = self._ts_generator.get_timestamp()

            # Step 3: Create transaction context
            txn = TransactionContext(txn_id=txn_id, timestamp=timestamp)

            # Step 4: Register as active
            self._active_transactions[txn_id] = txn
            return txn

    def get_transaction(self, txn_id: int) -> Optional[TransactionContext]:
        """Retrieves an active transaction by ID."""
        return self._active_transactions.get(txn_id)

    def abort_transaction(self, txn: TransactionContext) -> None:
        """Aborts the transaction and removes it from the active set."""
        with self._lock:
            txn.status = TransactionStatus.ABORTED
            self._active_transactions.pop(txn.txn_id, None)

    def commit_transaction(self, txn: TransactionContext) -> None:
        """Commits the transaction (after all validation passes)."""
        with self._lock:
            txn.status = TransactionStatus.COMMITTED
            self._active_transactions.pop(txn.txn_id, None)
```

Some systems use the same value for transaction ID and timestamp—they're both just incrementing counters. Others separate them: the transaction ID identifies the transaction internally (for logging, debugging), while the timestamp determines serialization order. Separating them provides flexibility but adds a small amount of complexity.
While transaction timestamps are assigned once at begin, data item timestamps are updated dynamically as operations execute. These data-level timestamps enable the protocol to detect conflicts.
Two Timestamps Per Data Item:
For each data item Q in the database:
W-timestamp(Q) — Write Timestamp
R-timestamp(Q) — Read Timestamp
Why Both Are Needed:
W-timestamp enables detecting write-write conflicts (later writer must have larger timestamp). R-timestamp enables detecting write-read conflicts (cannot write "before" a read that already happened).
Updating Data Item Timestamps:
On Successful Read of Q by Transaction T:
R-timestamp(Q) = max(R-timestamp(Q), TS(T))
The read timestamp advances to track the "latest" reader.
On Successful Write of Q by Transaction T:
W-timestamp(Q) = TS(T)
R-timestamp(Q) = TS(T) // Also update read timestamp
Both timestamps update—the write establishes both a new "written by" and a new "seen by" point. (The classic protocol updates only the W-timestamp on a write; advancing the R-timestamp as well is a conservative variant that changes no accept/reject decisions, since any operation blocked by the new R-timestamp would be blocked by the equal W-timestamp anyway.)
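The two update rules can be written down directly. This is a sketch that assumes the read or write has already passed its conflict checks (function and field names are illustrative):

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class DataItem:
    value: Any = None
    r_ts: int = 0  # R-timestamp(Q): largest timestamp of any successful reader
    w_ts: int = 0  # W-timestamp(Q): timestamp of the most recent successful writer

def on_successful_read(q: DataItem, ts: int) -> None:
    # R-timestamp(Q) = max(R-timestamp(Q), TS(T))
    q.r_ts = max(q.r_ts, ts)

def on_successful_write(q: DataItem, ts: int, value: Any) -> None:
    # W-timestamp(Q) = TS(T); this variant also advances R-timestamp(Q),
    # as described in the text above
    q.value = value
    q.w_ts = ts
    q.r_ts = ts
```

Replaying the worked example below with these functions reproduces the table: reads by TS 100 and 150 advance R-TS, the write by TS 200 sets both to 200, and the read by TS 250 advances R-TS to 250.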
On Aborted Transaction: No updates are needed—the transaction never "happened" from the database's perspective. If the R-timestamp was already advanced by a read that later aborted, it need not be rolled back; leaving it in place is merely conservative (it may cause unnecessary aborts but never incorrect results).
| Action | TS(T) | W-TS(Q) | R-TS(Q) | Notes |
|---|---|---|---|---|
| Initial state | — | 0 | 0 | Q has never been accessed |
| T₁ (TS=100) reads Q | 100 | 0 | 100 | R-TS updated to 100 |
| T₂ (TS=150) reads Q | 150 | 0 | 150 | R-TS updated to 150 |
| T₃ (TS=200) writes Q | 200 | 200 | 200 | Both timestamps updated |
| T₄ (TS=180) wants to write Q | 180 | 200 | 200 | ⚠️ REJECTED: TS(T₄) < W-TS(Q) |
| T₅ (TS=250) reads Q | 250 | 200 | 250 | R-TS updated to 250 |
Every data item needs two additional timestamp fields. For a table with 1 billion rows and 8-byte timestamps, that's 16 GB of overhead. In practice, this metadata is stored with the tuple in the buffer pool and may be compressed or approximated (e.g., stored at page level instead of row level) to reduce overhead.
The relationship between timestamp assignment and conflict detection is the core of timestamp ordering. Let's trace through how these interact.
The Fundamental Principle:
A transaction T with timestamp TS(T) should see the database as if all transactions with smaller timestamps have completed, and no transactions with larger timestamps have started. Any operation that violates this is rejected.
For Reads:
When T wants to read data item Q: if TS(T) < W-timestamp(Q), T would be reading a value written "in its future," so the read is rejected and T aborts. Otherwise the read proceeds and R-timestamp(Q) advances to max(R-timestamp(Q), TS(T)).
For Writes:
When T wants to write data item Q: if TS(T) < R-timestamp(Q), a "future" transaction has already read Q, so the write is rejected and T aborts. If TS(T) < W-timestamp(Q), a "future" transaction has already written Q, so the write is likewise rejected. Otherwise the write proceeds and the item's timestamps are updated.
Visualizing the Conflict:
Consider this timeline:
Timeline:
  TS=100: T1 begins
  TS=150: T2 begins
  TS=200: T3 begins

  T2 writes Q          → W-TS(Q) = 150
  T1 wants to write Q  → TS(T1) = 100 < W-TS(Q) = 150 → ABORT!
T1 "should have" written Q before T2. But T2 already wrote. If we let T1 write now, the final value of Q would be T1's value—as if T1 happened last. This violates the timestamp order.
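The read/write rules and the rejection above can be sketched as follows (a minimal illustration, assuming the per-item timestamps described earlier; the names `to_read`, `to_write`, and `Rejected` are our own):

```python
from typing import Any

class Rejected(Exception):
    """Raised when an operation would violate timestamp order; the transaction aborts."""

class Item:
    def __init__(self):
        self.value: Any = None
        self.r_ts = 0   # R-timestamp(Q)
        self.w_ts = 0   # W-timestamp(Q)

def to_read(q: Item, ts: int) -> Any:
    if ts < q.w_ts:                       # a "future" writer already wrote Q
        raise Rejected(f"read: TS={ts} < W-TS={q.w_ts}")
    q.r_ts = max(q.r_ts, ts)
    return q.value

def to_write(q: Item, ts: int, value: Any) -> None:
    if ts < q.r_ts:                       # a "future" reader already read Q
        raise Rejected(f"write: TS={ts} < R-TS={q.r_ts}")
    if ts < q.w_ts:                       # a "future" writer already wrote Q
        raise Rejected(f"write: TS={ts} < W-TS={q.w_ts}")
    q.value = value
    q.w_ts = ts
```

Replaying the timeline: `to_write(q, 150, ...)` succeeds for T2, after which T1's `to_write(q, 100, ...)` raises `Rejected`, exactly as shown above.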
The Abort-Restart Loop:
When a transaction is aborted due to a timestamp conflict, it is restarted with a fresh, larger timestamp and re-executes its operations from the beginning. This loop may repeat if another conflict occurs, leading to potential starvation (discussed later).
Conflicts are detected on each operation, not at commit time. This 'eager' validation means wasted work is minimized—a transaction that will eventually fail due to a conflict is aborted as soon as that conflict becomes apparent, not after doing all its work. This is fundamentally different from optimistic protocols that defer all validation to commit.
Different timestamp assignment policies offer different trade-offs. Let's analyze them systematically.
Begin-Time Assignment Trade-offs:
First-Operation Assignment:
Advantages:
Disadvantages:
Commit-Time Assignment (Optimistic):
Advantages:
Disadvantages:
High-contention workloads favor begin-time assignment with early abort—fail fast, don't waste work. Low-contention, read-heavy workloads may benefit from commit-time assignment—most transactions succeed, so optimism is rewarded. Analyze your workload's conflict rate to choose appropriately.
Begin-time timestamp assignment creates a particular challenge for long-running transactions: they receive an "old" timestamp and face increasing conflict probability as time passes.
The Long Transaction Problem:
Consider:
Result: Long transactions face extremely high abort rates, potentially never completing.
Mitigation Strategies:
1. Transaction Splitting: Break long transactions into shorter ones:
2. Priority Timestamps: Assign priority based on transaction age or importance:
3. Timestamp Advancement: Allow transactions to "update" their timestamp under specific conditions:
4. Separate Classes: Run long and short transactions in separate isolation contexts:
Databases serving both long analytical queries and short OLTP transactions face inherent tension in timestamp ordering. This is one reason MVCC (Multi-Version Concurrency Control) became popular: readers never block writers, and long reads see consistent snapshots without conflicting with concurrent writes. We'll explore MVCC later in this chapter.
Implementing timestamp assignment correctly requires attention to several practical details.
Atomicity of Assignment:
Timestamp generation and transaction context creation should be atomic or carefully ordered:
// WRONG: Gap between timestamp and registration
ts = generate_timestamp() // TS = 500
// <-- Another thread could see TS 500 is "used" but find no transaction
registration = create_txn(ts)
// RIGHT: Atomic registration
with global_lock:
ts = generate_timestamp()
registration = create_txn(ts)
active_transactions.add(registration)
Without atomicity, a window exists where the timestamp is "consumed" but the transaction isn't yet visible to conflict detection.
Timestamp Visibility:
Other transactions and the conflict detection mechanism must be able to:
Range Considerations:
Timestamp values must have sufficient range:
Wraparound handling is complex; prefer larger ranges.
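A quick back-of-the-envelope check shows why 64-bit timestamps sidestep wraparound in practice. Assuming an aggressive (hypothetical) allocation rate of 10 million timestamps per second:

```python
MAX_64 = 2**64 - 1                    # largest 64-bit unsigned value
rate = 10_000_000                     # assumed timestamps per second
seconds_to_exhaust = MAX_64 / rate
years = seconds_to_exhaust / (60 * 60 * 24 * 365)
print(f"{years:,.0f} years")          # tens of thousands of years
```

Even at this rate, the space lasts on the order of 58,000 years—far beyond any system's lifetime—whereas a 32-bit counter at the same rate would wrap in about seven minutes.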
Clock Granularity (if clock-based):
If using system time:
Real database timestamp systems have many additional concerns: recovery (timestamps must survive crashes), replication (timestamps must be consistent across replicas), and observability (logging which timestamps were assigned when). The basic concepts we've covered form the foundation, but production implementations layer significant engineering on top.
We've thoroughly examined timestamp assignment—when, where, and how transactions receive their ordering identifier. Let's consolidate the key insights:
What's Next:
With timestamps assigned to transactions and maintained on data items, we can now examine how these timestamps create a transaction ordering. The next page explores how timestamp ordering establishes a serialization order and how this order is enforced through the protocol's read and write rules.
You now understand timestamp assignment—the critical moment when a transaction's logical position is defined. From assignment policies through data item timestamps to conflict detection integration, you can analyze how assignment choices affect system behavior. Next, we'll see how these timestamps create and enforce transaction ordering.