Every computer has a clock. It ticks away constantly, tracking the passage of real-world seconds, milliseconds, and nanoseconds. When a database needs unique, monotonically increasing timestamps, the system clock presents an obvious solution: just ask the operating system what time it is.
This approach is intuitive—the clock is already there, already incrementing, already providing a universally understood ordering. When transaction T₁ starts at 10:00:00.000 and T₂ starts at 10:00:00.001, the timestamps directly reflect real-world causality.
But as we'll discover, using system clocks for database timestamps introduces subtle challenges that can undermine correctness. Understanding these challenges—and how to address them—is essential for building reliable timestamp-based systems.
By the end of this page, you will understand how system clock timestamps work, the precision and resolution requirements for database use, the challenges of clock skew and synchronization, clock adjustment problems (such as NTP stepping the clock), and when system clocks are appropriate versus when alternatives are needed.
Before using system clocks for timestamps, we need to understand how computers track time. This involves multiple layers of hardware and software working together.
Hardware Timekeeping:
At the lowest level, computers use hardware oscillators to generate periodic signals:
Real-Time Clock (RTC): A battery-backed chip that maintains time even when powered off. Typically uses a 32.768 kHz crystal oscillator. Drifts several seconds per day.
High-Precision Event Timer (HPET): A motherboard timer that ticks at 10 MHz or faster, giving the operating system a timing resolution of 100 nanoseconds or better.
Time Stamp Counter (TSC): A CPU register that counts at a fixed rate on modern processors (historically, once per core clock cycle). Provides the finest granularity (sub-nanosecond) but requires calibration to convert ticks into time.
Software Time Management:
The operating system uses hardware timers to maintain:
| OS/Language | Wall Clock API | Monotonic API | Typical Resolution |
|---|---|---|---|
| Linux C | gettimeofday(), clock_gettime(CLOCK_REALTIME) | clock_gettime(CLOCK_MONOTONIC) | Nanoseconds |
| Windows C++ | GetSystemTimeAsFileTime() | QueryPerformanceCounter() | 100 nanoseconds / varies |
| Java | System.currentTimeMillis() | System.nanoTime() | Milliseconds / Nanoseconds |
| Python | time.time() | time.monotonic() | Microseconds / Nanoseconds |
| PostgreSQL | now(), statement_timestamp(), clock_timestamp() | None at the SQL level (timeofday() is also a wall clock) | Microseconds |
For timestamp ordering, you might think the wall clock is ideal because it reflects 'real' time. But the wall clock can jump backward during NTP synchronization! Monotonic clocks never go backward, making them safer for ordering—though they don't correspond to human-readable times. Many databases use a combination: wall clock for human-visible timestamps, monotonic components for ordering guarantees.
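As a minimal sketch of the distinction, Python's standard `time` module exposes both kinds of clock. The snippet assumes Python 3.7 or later (for the `_ns` functions). The resolution values it prints are whatever the platform advertises, not a guarantee.

```python
import time

# Wall clock: nanoseconds since the Unix epoch; may jump if the system time is adjusted.
wall_ns = time.time_ns()

# Monotonic clock: arbitrary starting point, but never goes backward.
mono_start = time.monotonic_ns()
mono_end = time.monotonic_ns()
elapsed_ns = mono_end - mono_start  # always >= 0

# Ask the platform what each clock reports about itself.
for name in ("time", "monotonic"):
    info = time.get_clock_info(name)
    print(f"{name}: adjustable={info.adjustable}, "
          f"monotonic={info.monotonic}, resolution={info.resolution}")
```

The monotonic value is only meaningful as a difference between two readings on the same machine, which is why it cannot serve as a human-readable timestamp on its own.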
For timestamps to provide unique, ordered identifiers, the clock's resolution must be fine enough that no two transactions receive the same timestamp. This imposes strict requirements on clock precision.
Key Terminology:
The Uniqueness Challenge:
Consider a database handling 100,000 transactions per second. With millisecond-resolution timestamps:
With microsecond resolution:
Throughput vs Resolution Requirements:
Let's calculate the minimum required resolution for given transaction rates:
| Transactions/Second | Minimum Resolution for Uniqueness | Notes |
|---|---|---|
| 1,000 | 1 millisecond | Very low volume, any modern clock works |
| 10,000 | 100 microseconds | Standard OLTP workloads |
| 100,000 | 10 microseconds | High-performance systems |
| 1,000,000 | 1 microsecond | Extreme throughput |
| 10,000,000 | 100 nanoseconds | Requires specialized approaches |
Modern hardware easily provides microsecond resolution. Nanosecond resolution is available but less consistent across platforms. For transactions exceeding hardware clock resolution, we need tie-breaking mechanisms.
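The figures in the table follow from a simple reciprocal: at N transactions per second, the clock must tick at least N times per second. The sketch below reproduces that arithmetic. The function names are illustrative, and the calculation assumes evenly spaced arrivals, so real bursty workloads need finer resolution than this lower bound suggests.

```python
def min_resolution_seconds(tx_per_second: int) -> float:
    """Largest clock tick that still gives each transaction its own tick on average."""
    return 1.0 / tx_per_second


def avg_tx_per_tick(tx_per_second: int, tick_seconds: float) -> float:
    """Average number of transactions that land inside a single clock tick."""
    return tx_per_second * tick_seconds


for rate in (1_000, 10_000, 100_000, 1_000_000, 10_000_000):
    micros = min_resolution_seconds(rate) * 1e6
    print(f"{rate:>10,} tx/s -> {micros:g} microseconds")

# 100,000 tx/s against a millisecond clock versus a microsecond clock.
per_ms_tick = avg_tx_per_tick(100_000, 1e-3)
per_us_tick = avg_tx_per_tick(100_000, 1e-6)
print(per_ms_tick, per_us_tick)
```

Any result above 1 transaction per tick means collisions are certain; a result below 1 only makes them less likely, not impossible.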
Even with nanosecond resolution, concurrent calls to the clock API from multiple CPU cores can return the same value. The hardware might increment between reads, but there's no guarantee. Database systems must handle ties explicitly—typically with a secondary counter or by rejecting concurrent timestamp requests until the clock advances.
When two transactions request timestamps within the same clock tick—or when clock resolution is insufficient—we face timestamp collisions. Since timestamps must be unique by definition, we need systematic strategies to resolve these conflicts.
Strategy 1: Wait for Clock Advance
The simplest approach: if the current clock value equals the last assigned timestamp, wait until it changes.
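```python
import threading
import time

_lock = threading.Lock()
_last_timestamp = 0

def get_timestamp() -> int:
    """Return a unique, strictly increasing timestamp in nanoseconds."""
    global _last_timestamp
    with _lock:  # serialize callers so two threads never share a value
        current = time.time_ns()
        while current <= _last_timestamp:
            current = time.time_ns()  # busy-wait until the clock advances
        _last_timestamp = current
        return current
```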
Pros: Guarantees uniqueness with pure clock values
Cons: Limits throughput to clock resolution; can cause contention
Strategy 2: Sub-Clock Counter Extension
Append a secondary counter that increments within each clock tick:
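```python
import time

last_clock = 0
sub_counter = 0

def get_timestamp() -> tuple[int, int]:
    """Return a composite (clock_us, counter) timestamp. Not thread-safe as written."""
    global last_clock, sub_counter
    current_clock = time.time_ns() // 1_000  # microseconds since the epoch
    if current_clock <= last_clock:
        # Same tick, or the clock went backward: keep last_clock and bump the counter
        sub_counter += 1
    else:
        last_clock = current_clock
        sub_counter = 0
    return (last_clock, sub_counter)  # composite timestamp; tuples compare in order
```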
Pros: Higher throughput; no waiting
Cons: Timestamps become composite values; need handling if counter overflows
Strategy 3: Hybrid Logical Clocks (HLC)
A sophisticated approach combining physical time with logical components:
HLC timestamps look like (physical_time, logical_counter) and provide:
This approach is used by CockroachDB, MongoDB, and other distributed databases.
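To make the (physical_time, logical_counter) structure concrete, here is a minimal sketch of the commonly described HLC update rules. It is not the implementation used by any particular database: the class name, method names, and the injectable `physical_clock` parameter are assumptions for illustration, and the class is not thread-safe.

```python
import time
from typing import Callable, Tuple

Timestamp = Tuple[int, int]  # (physical_time, logical_counter)


class HybridLogicalClock:
    def __init__(self, physical_clock: Callable[[], int] = time.time_ns):
        self._now = physical_clock
        self._l = 0  # highest physical time seen so far
        self._c = 0  # logical counter that breaks ties within the same _l

    def now(self) -> Timestamp:
        """Timestamp a local event or an outgoing message."""
        pt = self._now()
        if pt > self._l:
            self._l, self._c = pt, 0
        else:
            self._c += 1  # clock stalled or went backward: advance logically
        return (self._l, self._c)

    def update(self, remote: Timestamp) -> Timestamp:
        """Merge a timestamp received from another node."""
        r_l, r_c = remote
        pt = self._now()
        new_l = max(self._l, r_l, pt)
        if new_l == self._l and new_l == r_l:
            self._c = max(self._c, r_c) + 1
        elif new_l == self._l:
            self._c += 1
        elif new_l == r_l:
            self._c = r_c + 1
        else:
            self._c = 0  # local physical clock is ahead of everything seen
        self._l = new_l
        return (self._l, self._c)


# Scripted physical clock: stalls at 100, then moves backward to 90 and 95.
ticks = iter([100, 100, 90, 95])
clock = HybridLogicalClock(physical_clock=lambda: next(ticks))
a = clock.now()               # (100, 0)
b = clock.now()               # (100, 1)
c = clock.now()               # (100, 2)
d = clock.update((120, 5))    # (120, 6)
print(a, b, c, d)
```

Even though the scripted physical clock stalls and then regresses, the tuples stay strictly increasing, and the received timestamp pulls the local clock forward.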
Pure system-clock timestamps are rare in high-performance databases. Most production systems use composite approaches: the clock provides the major component for real-time correlation, while counters or node IDs ensure uniqueness. The 'timestamp' becomes a structured value rather than a simple integer, though it still provides total ordering.
Computer clocks are imperfect—they drift relative to true physical time. A clock that runs 1 part per million (ppm) fast will gain about 86 milliseconds per day. This drift creates challenges for timestamp ordering.
Sources of Clock Drift:
| Hardware/Environment | Typical Drift | Error Per Day | Notes |
|---|---|---|---|
| PC RTC (cheap crystal) | 20-100 ppm | 1.7-8.6 seconds | Without NTP correction |
| Server-grade hardware | 50-100 ppm | 4.3-8.6 seconds | Better crystals, still drifts |
| GPS-disciplined clock | < 0.001 ppm | < 86 microseconds | Expensive, high-accuracy |
| Atomic clock (Cesium) | ~10⁻⁶ ppm (10⁻¹² fractional) | < 100 nanoseconds | Laboratory/infrastructure grade |
| VM guest clock | 100-1000 ppm | 8.6-86 seconds | Virtualization overhead |
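The "Error Per Day" column is the drift rate multiplied by the 86,400 seconds in a day. A short sketch of that conversion follows; the function name is illustrative.

```python
SECONDS_PER_DAY = 86_400


def drift_seconds_per_day(ppm: float) -> float:
    """Accumulated clock error over one day for a given drift rate in parts per million."""
    return ppm * 1e-6 * SECONDS_PER_DAY


for ppm in (1, 20, 100, 1000):
    print(f"{ppm:>5} ppm -> {drift_seconds_per_day(ppm):.4f} s/day")
```

The same function shows why uncorrected drift matters for ordering: at 100 ppm, two nodes that start perfectly aligned can disagree by a full millisecond after only 10 seconds.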
Network Time Protocol (NTP):
NTP corrects clock drift by synchronizing with reference time servers. Key characteristics:
The Stepping Problem:
When NTP determines the clock is significantly off (typically > 128ms), it may step the clock—instantly changing the time. If the clock moves backward:
Most production systems configure NTP to only slew (never step) after initial synchronization, accepting temporary drift in exchange for monotonicity.
If a database relies on wall-clock timestamps and the clock jumps backward 10 minutes, new transactions get timestamps from 10 minutes ago. They appear 'older' than transactions that already committed, potentially leading to lost updates or phantom reads. This is why monotonic clocks or logical timestamps are often preferred for correctness-critical ordering.
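One way to notice such a jump is to compare how far the wall clock and the monotonic clock each moved between two readings; a large disagreement suggests the wall clock was stepped. The sketch below is an illustration only: `ClockStepDetector`, its parameters, and the 50 ms tolerance are assumptions, not part of any database or NTP client.

```python
import time
from typing import Callable


class ClockStepDetector:
    """Flags wall-clock jumps by comparing wall and monotonic elapsed time."""

    def __init__(self,
                 wall: Callable[[], int] = time.time_ns,
                 mono: Callable[[], int] = time.monotonic_ns,
                 tolerance_ns: int = 50_000_000):  # assumed threshold: 50 ms
        self._wall, self._mono = wall, mono
        self._tolerance_ns = tolerance_ns
        self._last_wall = wall()
        self._last_mono = mono()

    def check(self) -> int:
        """Return the estimated step in ns (negative = backward), or 0 if within tolerance."""
        wall_now, mono_now = self._wall(), self._mono()
        step = (wall_now - self._last_wall) - (mono_now - self._last_mono)
        self._last_wall, self._last_mono = wall_now, mono_now
        return step if abs(step) > self._tolerance_ns else 0


# Scripted clocks: one real second passes, but the wall clock is set back 10 minutes.
TEN_MINUTES_NS = 600 * 1_000_000_000
wall_values = iter([1_000_000_000_000,
                    1_000_000_000_000 + 1_000_000_000 - TEN_MINUTES_NS])
mono_values = iter([0, 1_000_000_000])
detector = ClockStepDetector(wall=lambda: next(wall_values),
                             mono=lambda: next(mono_values))
step = detector.check()
print(f"wall clock stepped by {step / 1e9:.0f} s")
```

A timestamp generator could use a check like this to refuse wall-clock values after a backward step and fall back to its last issued timestamp instead.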
When multiple database nodes each assign timestamps using their local clocks, the challenges multiply. Different nodes have different clocks, and keeping them perfectly synchronized is physically impossible.
The Fundamental Problem:
Imagine two database nodes, A and B:
User 1 sends transaction T₁ to Node A at true time t=100ms:
User 2 sends transaction T₂ to Node B at true time t=101ms (1ms later):
Result: T₂ (actually later) has a smaller timestamp than T₁.
If T₁ and T₂ both access the same data, the timestamp ordering is inverted from real causality.
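The inversion can be reproduced with a few lines of arithmetic. In the sketch below the clock offsets (Node A 5 ms fast, Node B 5 ms slow) are assumed values chosen purely for illustration; only the true times of 100 ms and 101 ms come from the scenario above.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    offset_ms: int  # local clock minus true time (assumed value for illustration)

    def timestamp(self, true_time_ms: int) -> int:
        """Timestamp the node would assign at the given true time."""
        return true_time_ms + self.offset_ms


node_a = Node("A", offset_ms=+5)  # assumed: 5 ms fast
node_b = Node("B", offset_ms=-5)  # assumed: 5 ms slow

ts_t1 = node_a.timestamp(100)  # T1 arrives first in true time
ts_t2 = node_b.timestamp(101)  # T2 arrives 1 ms later in true time

inverted = ts_t2 < ts_t1
print(f"TS(T1)={ts_t1}, TS(T2)={ts_t2}, order inverted: {inverted}")
```

With these offsets the skew between the nodes is 10 ms, so any two events closer together than that can be ordered incorrectly.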
Why This Matters:
Consider a scenario where:
The read appears to have happened "after" the withdrawal in timestamp order, even though it happened before. If the system uses timestamps strictly, the withdrawal might see stale data or the read might miss the withdrawal—both are incorrect.
Approaches to Distributed Timestamps:
Centralized Timestamp Server: A single node assigns all timestamps
Clock Synchronization Bounds: Characterize maximum clock skew and build protocols around it
Logical Timestamps: Use vector clocks or Lamport clocks instead of physical time
Hybrid Logical Clocks: Combine physical and logical components
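Of the approaches listed, the Lamport clock is the simplest to show in code. The following is a minimal sketch with illustrative class and method names: each node keeps a counter, increments it on every local event, and on receipt of a message jumps past the sender's value.

```python
class LamportClock:
    """Logical clock: orders events without consulting physical time."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Advance for a local event or before sending a message."""
        self.time += 1
        return self.time

    def receive(self, remote_time: int) -> int:
        """Advance past the timestamp carried by an incoming message."""
        self.time = max(self.time, remote_time) + 1
        return self.time


node_a, node_b = LamportClock(), LamportClock()
t_send = node_a.tick()           # 1: A sends a message
t_recv = node_b.receive(t_send)  # 2: B receives it, ordered after the send
t_next = node_b.tick()           # 3: B's next local event
print(t_send, t_recv, t_next)
```

The receive is always timestamped after the send regardless of how the two nodes' physical clocks are skewed. The trade-off is that the values carry no relation to real time.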
Spanner uses GPS receivers and atomic clocks at each datacenter to bound clock uncertainty to a few milliseconds. The TrueTime API returns an interval [earliest, latest] rather than a point. Transactions wait for the uncertainty interval to pass before committing, ensuring that if T₁ commits before T₂ starts, then TS(T₁) < TS(T₂). This provides external consistency, which is stronger than serializability, at the cost of commit latency.
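The commit-wait idea can be sketched in a few lines. This is not Spanner's API: `TimeInterval`, `now_interval`, and `commit_with_wait` are hypothetical names, and the uncertainty bound `epsilon_ns` is simply passed in as an assumption, whereas a real system derives it from its time infrastructure.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeInterval:
    earliest: int  # nanoseconds since the epoch
    latest: int    # nanoseconds since the epoch


def now_interval(epsilon_ns: int) -> TimeInterval:
    """Hypothetical stand-in for an interval clock: local time +/- an assumed error bound."""
    t = time.time_ns()
    return TimeInterval(t - epsilon_ns, t + epsilon_ns)


def commit_with_wait(epsilon_ns: int) -> int:
    """Choose a commit timestamp, then wait until it is certainly in the past."""
    commit_ts = now_interval(epsilon_ns).latest
    # Commit wait: hold the result until even the earliest possible "now" exceeds commit_ts.
    while now_interval(epsilon_ns).earliest <= commit_ts:
        time.sleep(epsilon_ns / 1e9 / 4)
    return commit_ts


EPSILON_NS = 2_000_000  # assumed uncertainty of 2 ms
ts = commit_with_wait(EPSILON_NS)
print(f"committed at {ts}; waited roughly {2 * EPSILON_NS / 1e6:.0f} ms")
```

The wait is about twice the uncertainty bound, which is why tightening the bound translates directly into lower commit latency.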
Let's examine concrete implementation patterns used by real database systems to leverage system clocks while mitigating their limitations.
```python
import threading
import time


class MonotonicTimestampGenerator:
    """
    Generates monotonically increasing timestamps using system clock
    with tie-breaking.
    """

    def __init__(self):
        self._last_timestamp = 0
        self._lock = threading.Lock()

    def get_timestamp(self) -> int:
        """
        Returns a unique, monotonically increasing timestamp.
        Uses microseconds with logical counter for sub-microsecond ordering.

        Returns:
            Integer: microseconds since the epoch shifted left 16 bits,
            low 16 bits = counter. Epoch microseconds currently need about
            51 bits, so the result exceeds 64 bits (fine for Python ints).
        """
        with self._lock:
            # Get current time in microseconds (integer math avoids float rounding)
            current_us = time.time_ns() // 1_000

            # Extract microseconds and counter from last timestamp
            last_us = self._last_timestamp >> 16
            last_counter = self._last_timestamp & 0xFFFF

            if current_us > last_us:
                # Clock advanced - use new time, reset counter
                new_timestamp = (current_us << 16) | 0
            elif current_us == last_us:
                # Same microsecond - increment counter
                if last_counter >= 0xFFFF:
                    # Counter overflow - wait for clock to advance
                    while time.time_ns() // 1_000 <= current_us:
                        time.sleep(0.000001)  # 1 microsecond
                    current_us = time.time_ns() // 1_000
                    new_timestamp = (current_us << 16) | 0
                else:
                    new_timestamp = (current_us << 16) | (last_counter + 1)
            else:
                # Clock went backward! Use last time + 1 counter
                # This handles NTP adjustments gracefully
                new_timestamp = self._last_timestamp + 1

            self._last_timestamp = new_timestamp
            return new_timestamp


# Usage example
generator = MonotonicTimestampGenerator()

# Sequential timestamps are guaranteed unique and increasing
ts1 = generator.get_timestamp()  # e.g., 1704067200000000 << 16 | 0
ts2 = generator.get_timestamp()  # e.g., 1704067200000000 << 16 | 1
ts3 = generator.get_timestamp()  # e.g., 1704067200000001 << 16 | 0

print(f"ts1: {ts1}, ts2: {ts2}, ts3: {ts3}")
print(f"Ordering holds: {ts1 < ts2 < ts3}")  # True
```
Key Implementation Details:
This pattern provides:
System clock timestamps are appropriate in specific scenarios. Understanding these helps make informed architectural decisions.
Most production databases don't use pure system clock timestamps. PostgreSQL's transaction IDs are sequential counters. MySQL's InnoDB orders transactions by internal transaction IDs rather than wall-clock time. CockroachDB uses hybrid logical clocks, and Spanner uses TrueTime intervals. The 'timestamp' concept is adapted to each system's needs, often combining clock components with counters, node IDs, or other ordering guarantees.
We've thoroughly examined system clock-based timestamp generation. Let's consolidate the essential insights:
What's Next:
System clocks are one approach to timestamp generation. An alternative avoids clock complexity entirely: logical counters provide guaranteed uniqueness and monotonicity without any dependency on physical time. We'll explore this elegant alternative next.
You now understand the mechanics, challenges, and trade-offs of system clock-based timestamps. From hardware oscillators through NTP synchronization to distributed skew, you can analyze whether clock-based timestamps suit a given application. Next, we'll examine the simpler, more reliable alternative: logical counters.