Last-Write-Wins (LWW) has become the de facto standard for automatic conflict resolution in multi-leader systems. Apache Cassandra, Amazon DynamoDB (optionally), CouchDB, and countless custom implementations use LWW as their primary or default resolution strategy.
The appeal is obvious: LWW is simple to understand, trivial to implement, and guarantees convergence. Every node, applying the same timestamp comparison, arrives at the same result. No complex merge logic, no domain-specific code, no human intervention.
But this simplicity is deceptive. LWW carries subtle assumptions about time, causality, and data semantics that frequently break in production. Engineers who don't deeply understand LWW's failure modes discover them through data loss incidents.
This page dissects LWW: its guarantees, its mechanisms, its failure modes, and when it's genuinely appropriate versus when it's a dangerous convenience.
By the end of this page, you will understand: (1) The precise semantics of LWW and why it guarantees convergence, (2) Physical vs. logical timestamps and their trade-offs, (3) Clock synchronization challenges and their impact on LWW correctness, (4) Production implementations in Cassandra and DynamoDB, and (5) Tie-breaking mechanisms for equal timestamps.
At its core, LWW makes a simple assertion: when two writes conflict, the one with the higher timestamp wins. But the implications of this assertion are nuanced.
Formal Definition:
Given writes W₁ with timestamp T₁ and W₂ with timestamp T₂ to the same key: if T₂ > T₁, W₂ wins and W₁ is discarded; if T₁ > T₂, W₁ wins; if T₁ = T₂, a deterministic tie-breaker decides. Every replica applies this same comparison independently and therefore selects the same winner.
What LWW Guarantees: convergence (all replicas eventually reach the same final state), determinism (the same set of writes always resolves the same way, regardless of delivery order), and availability (no coordination or human intervention is needed to resolve a conflict).

What LWW Does NOT Guarantee: that the winning write is the one users would consider correct, that losing writes are preserved anywhere, or that causally related writes are ordered correctly when clocks are skewed.
LWW guarantees that all replicas converge to THE SAME state—not that they converge to the CORRECT state. If all replicas agree on a wrong value (because clock skew caused the wrong write to win), they're still converged. Convergence is a consistency property, not a correctness property.
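To make the convergence property concrete, here is a minimal sketch (types and names are illustrative, not from any particular database) showing that two replicas applying the same timestamp comparison end in the same state even when they receive the writes in opposite orders:

```typescript
type Write = { value: number; timestamp: number; nodeId: string };

// The core LWW rule: higher timestamp wins; node ID breaks ties
function lwwWins(incoming: Write, current: Write): boolean {
  if (incoming.timestamp !== current.timestamp) {
    return incoming.timestamp > current.timestamp;
  }
  return incoming.nodeId > current.nodeId;
}

// A replica is just "apply each write if it wins"
function applyAll(writes: Write[]): Write {
  return writes.reduce((current, incoming) =>
    lwwWins(incoming, current) ? incoming : current
  );
}

const w1: Write = { value: 100, timestamp: 1000, nodeId: 'A' };
const w2: Write = { value: 200, timestamp: 2000, nodeId: 'B' };

// Replica 1 sees w1 then w2; Replica 2 sees w2 then w1
const replica1 = applyAll([w1, w2]);
const replica2 = applyAll([w2, w1]);

// Both converge on w2 — but nothing here says 200 was the "correct" value
console.log(replica1.value, replica2.value); // both 200
```

Because `lwwWins` is a pure, deterministic function of the two writes, delivery order cannot affect the outcome; that is the entire convergence argument.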
The most intuitive approach to LWW uses physical timestamps—the actual wall-clock time when a write occurs. This is what most developers imagine when they think of LWW.
How Physical Timestamps Work:
Implementation:
```typescript
interface PhysicalTimestampedValue<T> {
  value: T;
  timestamp: number;    // milliseconds since epoch
  sourceNodeId: string; // for tie-breaking
}

class PhysicalLWWStore<T> {
  private data = new Map<string, PhysicalTimestampedValue<T>>();

  constructor(private readonly nodeId: string) {}

  // Local write: generate timestamp from wall clock
  write(key: string, value: T): PhysicalTimestampedValue<T> {
    const entry: PhysicalTimestampedValue<T> = {
      value,
      timestamp: Date.now(), // Wall-clock time
      sourceNodeId: this.nodeId
    };
    return this.applyWrite(key, entry);
  }

  // Replicated write: apply incoming entry if it wins
  applyReplicatedWrite(key: string, incoming: PhysicalTimestampedValue<T>): boolean {
    const current = this.data.get(key);
    if (!current) {
      this.data.set(key, incoming);
      return true; // Applied
    }
    if (this.isNewer(incoming, current)) {
      this.data.set(key, incoming);
      return true; // Applied
    }
    return false; // Rejected (current is newer)
  }

  private isNewer(
    a: PhysicalTimestampedValue<T>,
    b: PhysicalTimestampedValue<T>
  ): boolean {
    if (a.timestamp !== b.timestamp) {
      return a.timestamp > b.timestamp;
    }
    // Tie-breaker: lexicographic node ID comparison
    return a.sourceNodeId > b.sourceNodeId;
  }

  private applyWrite(key: string, entry: PhysicalTimestampedValue<T>): PhysicalTimestampedValue<T> {
    const current = this.data.get(key);
    if (!current || this.isNewer(entry, current)) {
      this.data.set(key, entry);
      return entry;
    }
    // Rare: our own write lost to existing entry (shouldn't happen in normal operation)
    return current;
  }
}
```

The Clock Synchronization Problem:
Physical timestamps assume clocks are synchronized. In practice, they're not:
NTP (Network Time Protocol):
Hardware Clock Drift:
| Synchronization Method | Typical Accuracy | Best Case | Failure Mode Accuracy |
|---|---|---|---|
| NTP (Internet) | 10-100ms | 1-10ms | Seconds to minutes (network issues) |
| NTP (LAN server) | 0.1-1ms | <0.1ms | 10-100ms (server load) |
| PTP (Precision Time Protocol) | < 1μs | < 100ns | μs to ms (network congestion) |
| GPS Time | < 100ns | < 10ns | seconds (GPS signal loss) |
| Google TrueTime | < 7ms (guaranteed) | < 1ms | Never exceeds bound (by design) |
Failure Scenario: Clock Skew Data Loss
Consider two leaders, Leader-A and Leader-B. Leader-A's clock is 5 seconds ahead of Leader-B's clock.
- Leader-B writes value = 100 (timestamp: 1000000000000)
- Leader-A writes value = 200 (timestamp: 1000000007000 — 5 seconds ahead)
- value = 200 wins everywhere

Seemingly correct! But now:
- Leader-B reads value = 200, decides to set value = 300 (timestamp: 1000000010000)
- Leader-A reads value = 200, decides to set value = 400 (timestamp: 1000000015000 — still 5 seconds ahead)
- value = 400 wins; Leader-B's update is silently discarded

Leader-A always wins due to systematic clock skew. Leader-B's users experience consistent data loss.
Random clock skew causes occasional data loss. Systematic skew—where one datacenter's clocks are consistently ahead—causes permanent bias. All writes from the 'slow' datacenter lose conflicts forever. This is particularly dangerous because it may not surface immediately in testing.
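The scenario above can be replayed in a few lines. This is an illustrative sketch (the timestamps and helper names are hypothetical) showing that with a 5-second skew, the skewed leader's write wins even when the other leader's write happened later in real time:

```typescript
type StampedWrite = { value: number; timestamp: number };

const SKEW_MS = 5000; // Leader-A's clock runs 5 seconds ahead

// Each leader stamps writes with its own (possibly skewed) clock
function stamp(realTimeMs: number, skewMs: number, value: number): StampedWrite {
  return { value, timestamp: realTimeMs + skewMs };
}

// Plain LWW: higher timestamp wins
function lwwResolve(a: StampedWrite, b: StampedWrite): StampedWrite {
  return a.timestamp >= b.timestamp ? a : b;
}

// Leader-A writes 400 at real time T, stamped T + 5000 by its fast clock
const fromA = stamp(1000000010000, SKEW_MS, 400);
// Leader-B writes 300 three real seconds LATER, with a correct clock
const fromB = stamp(1000000013000, 0, 300);

// B's genuinely-later write still loses: 1000000015000 > 1000000013000
const winner = lwwResolve(fromA, fromB);
console.log(winner.value); // 400
```

Any write Leader-B makes within 5 real seconds of a Leader-A write loses the comparison, which is exactly the permanent bias described above.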
To address physical clock limitations, distributed systems use logical timestamps or hybrid logical clocks (HLC) that combine physical and logical components.
Lamport Clocks (Pure Logical):
Lamport clocks provide a simple logical ordering without physical time:
- Each node maintains a local counter c
- Before each local event, the node increments c
- Messages carry the sender's current c; on receiving a message with timestamp t, set c = max(c, t) + 1
- Each event is timestamped with the value of c at that moment

Property: If event A causally precedes event B (A → B), then timestamp(A) < timestamp(B).
Limitation: Lamport clocks can diverge from wall-clock time, making timestamps unintuitive for debugging.
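The rules above fit in a few lines. A minimal Lamport clock sketch (illustrative, not a production implementation):

```typescript
class LamportClock {
  private counter = 0;

  // Rule: increment before each local event; the event is stamped with c
  tick(): number {
    return ++this.counter;
  }

  // Rule: on receiving a message with timestamp t, set c = max(c, t) + 1
  receive(t: number): number {
    this.counter = Math.max(this.counter, t) + 1;
    return this.counter;
  }
}

const nodeA = new LamportClock();
const nodeB = new LamportClock();

const a1 = nodeA.tick();      // A's local event: stamped 1
const b1 = nodeB.receive(a1); // B receives A's message: max(0, 1) + 1 = 2
const b2 = nodeB.tick();      // B's next local event: stamped 3

// Causality preserved: a1 → b1 → b2 implies a1 < b1 < b2
console.log(a1, b1, b2); // 1 2 3
```

Note that the counters bear no relation to wall-clock time, which is exactly the debugging limitation described above.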
```typescript
/**
 * Hybrid Logical Clock (HLC) combines physical time with a logical counter.
 * Properties:
 * - Maintains causality: if A → B, then HLC(A) < HLC(B)
 * - Stays close to physical time
 * - Monotonically increasing within a node
 */
interface HybridTimestamp {
  // Physical component: wall-clock time, used as primary ordering
  physical: number;
  // Logical component: breaks ties when physical times are equal
  logical: number;
  // Source node ID for cross-node tie-breaking
  nodeId: string;
}

class HybridLogicalClock {
  private lastPhysical: number = 0;
  private logical: number = 0;

  constructor(private readonly nodeId: string) {}

  // Generate timestamp for a local event
  now(): HybridTimestamp {
    const wallClock = Date.now();
    if (wallClock > this.lastPhysical) {
      // Wall clock has advanced: use it, reset logical
      this.lastPhysical = wallClock;
      this.logical = 0;
    } else {
      // Wall clock hasn't advanced (or went backwards!): increment logical
      this.logical++;
    }
    return { physical: this.lastPhysical, logical: this.logical, nodeId: this.nodeId };
  }

  // Update clock based on received timestamp (maintains causality)
  receive(remote: HybridTimestamp): HybridTimestamp {
    const wallClock = Date.now();
    if (wallClock > this.lastPhysical && wallClock > remote.physical) {
      // Wall clock is ahead of both: use it
      this.lastPhysical = wallClock;
      this.logical = 0;
    } else if (remote.physical > this.lastPhysical) {
      // Remote is ahead: adopt remote's physical, increment logical
      this.lastPhysical = remote.physical;
      this.logical = remote.logical + 1;
    } else if (this.lastPhysical > remote.physical) {
      // Local is ahead: keep local physical, increment logical
      this.logical++;
    } else {
      // Physical times equal: take max logical + 1
      this.logical = Math.max(this.logical, remote.logical) + 1;
    }
    return { physical: this.lastPhysical, logical: this.logical, nodeId: this.nodeId };
  }
}

// HLC comparison for LWW
function compareHLC(a: HybridTimestamp, b: HybridTimestamp): number {
  if (a.physical !== b.physical) return a.physical - b.physical;
  if (a.logical !== b.logical) return a.logical - b.logical;
  return a.nodeId.localeCompare(b.nodeId);
}
```

Why HLC Matters for LWW:
Causality preservation: If write A is read and then used to compute write B, HLC ensures A's timestamp < B's timestamp. LWW correctly orders them.
Wall-clock proximity: Unlike pure Lamport clocks, HLC stays within bounded skew of actual wall-clock time. Timestamps remain meaningful for debugging.
Monotonicity: HLC never goes backward, even if the physical clock goes backward (NTP correction). Prevents timestamp collision and ordering anomalies.
| Mechanism | Causality Preserved? | Close to Wall Clock? | Drift Bounded? |
|---|---|---|---|
| Physical clock only | No | Yes | No (arbitrary skew possible) |
| Lamport clock | Yes | No (can diverge) | No (monotonic but unbounded) |
| Hybrid Logical Clock | Yes | Yes (within skew) | Yes (bounded by sync protocol) |
| Google TrueTime | Yes | Yes (with uncertainty interval) | Yes (GPS-backed) |
For production multi-leader systems using LWW, use Hybrid Logical Clocks (HLC) rather than raw physical timestamps. HLC provides causality guarantees while remaining intuitive and bounded. Libraries exist for most languages (e.g., CockroachDB's HLC implementation).
Let's examine how industry-leading databases implement LWW in production.
Apache Cassandra's LWW Implementation:
Cassandra uses cell-level LWW (each column in a row has its own timestamp):
```sql
-- Cassandra cell-level LWW example
-- Each column update carries a timestamp
-- These can come from different coordinators at different times

-- Node 1 writes at T=1000
INSERT INTO users (user_id, name, email)
VALUES ('alice', 'Alice Smith', 'alice@old.com')
USING TIMESTAMP 1000000000000;

-- Node 2 writes at T=1200 (200μs later)
INSERT INTO users (user_id, name, email)
VALUES ('alice', 'Alice Williams', 'alice@new.com')
USING TIMESTAMP 1000000000200;

-- Result after replication:
-- user_id: 'alice'
-- name: 'Alice Williams' (T=1200 wins)
-- email: 'alice@new.com' (T=1200 wins)

-- If only email was updated at T=1200:
UPDATE users SET email = 'alice@new.com'
WHERE user_id = 'alice'
USING TIMESTAMP 1000000000200;

-- Result:
-- name: 'Alice Smith' (T=1000, never updated)
-- email: 'alice@new.com' (T=1200 wins for this cell)
```

Amazon DynamoDB:
DynamoDB offers both LWW and optimistic locking options:
| Aspect | Cassandra | DynamoDB Global Tables |
|---|---|---|
| Resolution granularity | Cell (column) level | Item (row) level |
| Timestamp source | Client or coordinator | Server (AWS controlled) |
| Clock sync dependency | High (client clocks) | Low (AWS infrastructure) |
| Customization | Can use USING TIMESTAMP | None (server-managed LWW only) |
| Timestamp precision | Microseconds | AWS-managed (opaque) |
Cassandra's cell-level LWW is more permissive: concurrent updates to different columns merge cleanly. DynamoDB's item-level LWW means any concurrent update to the same item triggers conflict resolution, even if different attributes changed. Consider this when choosing databases.
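The difference can be made concrete with a small sketch (the types and merge functions are hypothetical, modeling the two granularities rather than either database's actual code). Under cell-level resolution, concurrent updates to different columns both survive; under item-level resolution, one whole version wins:

```typescript
type Cell = { value: string; timestamp: number };
type CellRow = Map<string, Cell>; // column name -> timestamped cell
type ItemRow = { attrs: Record<string, string>; timestamp: number };

// Cell-level merge (Cassandra-style): each column resolved independently
function mergeCellLevel(a: CellRow, b: CellRow): CellRow {
  const result = new Map(a);
  b.forEach((cell, col) => {
    const existing = result.get(col);
    if (!existing || cell.timestamp > existing.timestamp) {
      result.set(col, cell);
    }
  });
  return result;
}

// Item-level merge (DynamoDB-style): the newer item replaces the older wholesale
function mergeItemLevel(a: ItemRow, b: ItemRow): ItemRow {
  return b.timestamp > a.timestamp ? b : a;
}

// Concurrent updates: replica 1 changes name at T=1000, replica 2 changes email at T=1200
const cellA: CellRow = new Map([['name', { value: 'Alice Smith', timestamp: 1000 }]]);
const cellB: CellRow = new Map([['email', { value: 'alice@new.com', timestamp: 1200 }]]);
const merged = mergeCellLevel(cellA, cellB);
// Both updates survive: name from replica 1, email from replica 2

const itemA: ItemRow = { attrs: { name: 'Alice Smith', email: 'alice@old.com' }, timestamp: 1000 };
const itemB: ItemRow = { attrs: { name: 'Alice', email: 'alice@new.com' }, timestamp: 1200 };
const itemWinner = mergeItemLevel(itemA, itemB);
// itemB wins wholesale: replica 1's name update is lost
```

The cell-level merge loses strictly less information for concurrent non-overlapping updates, at the cost of per-column timestamp bookkeeping.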
When two writes have identical timestamps, LWW needs a deterministic tie-breaker to maintain convergence. The choice of tie-breaker subtly affects system behavior.
Common Tie-Breaking Strategies:
| Strategy | Implementation | Trade-offs |
|---|---|---|
| Node ID ordering | Higher node ID wins | Systematic bias toward certain nodes; predictable but unfair |
| Value hash | Higher hash(value) wins | Content-dependent; same inputs always produce same output |
| Write ID / UUID | Compare unique write identifiers | No bias; requires generating UUIDs per write |
| Random (non-deterministic) | Randomly choose winner | Non-convergent! Different nodes may choose differently |
| Composite | Compare multiple fields in order | Flexible; can incorporate domain-specific ordering |
```typescript
interface WriteRecord<T> {
  value: T;
  timestamp: number;
  nodeId: string;
  writeId: string; // UUID generated per write
}

// Strategy 1: Node ID ordering (simple but biased)
function tieBreakByNodeId<T>(a: WriteRecord<T>, b: WriteRecord<T>): WriteRecord<T> {
  return a.nodeId > b.nodeId ? a : b;
}

// Strategy 2: Write ID ordering (no bias, requires UUIDs)
function tieBreakByWriteId<T>(a: WriteRecord<T>, b: WriteRecord<T>): WriteRecord<T> {
  return a.writeId > b.writeId ? a : b;
}

// Strategy 3: Value hash (content-dependent, deterministic)
function tieBreakByValueHash<T>(a: WriteRecord<T>, b: WriteRecord<T>): WriteRecord<T> {
  const hashA = computeHash(JSON.stringify(a.value));
  const hashB = computeHash(JSON.stringify(b.value));
  return hashA > hashB ? a : b;
}

// Strategy 4: Composite (combine multiple criteria)
function tieBreakComposite<T>(a: WriteRecord<T>, b: WriteRecord<T>): WriteRecord<T> {
  // First: compare timestamps (already equal, but future-proofing)
  if (a.timestamp !== b.timestamp) {
    return a.timestamp > b.timestamp ? a : b;
  }
  // Second: prefer certain node types (e.g., primary datacenter)
  const priorityA = getNodePriority(a.nodeId);
  const priorityB = getNodePriority(b.nodeId);
  if (priorityA !== priorityB) {
    return priorityA > priorityB ? a : b;
  }
  // Third: fall back to write ID
  return a.writeId > b.writeId ? a : b;
}

function getNodePriority(nodeId: string): number {
  // Example: primary datacenter has higher priority
  if (nodeId.startsWith('primary-')) return 100;
  if (nodeId.startsWith('secondary-')) return 50;
  return 10;
}

function computeHash(input: string): string {
  // Use a consistent hash function (e.g., SHA-256) in production
  // Simplified for illustration
  return input.split('').reduce((acc, char) => {
    return ((acc << 5) - acc) + char.charCodeAt(0);
  }, 0).toString(16);
}
```

Timestamp Collision Probability:
With millisecond timestamps, collision probability depends on write rate:
| Write Rate (per key) | Collision Probability (ms resolution) | With Microsecond Resolution |
|---|---|---|
| 1 write/second | ~0.1% | ~0.0001% |
| 10 writes/second | ~1% | ~0.001% |
| 100 writes/second | ~10% | ~0.01% |
| 1000 writes/second | ~65% | ~0.1% |
Recommendation: Use microsecond or nanosecond precision timestamps, plus a robust tie-breaker. Never rely on timestamp uniqueness alone.
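One way to sanity-check figures like those in the table is the birthday-problem approximation: with k writes landing in a window of m equally likely timestamp values, P(collision) ≈ 1 − e^(−k(k−1)/2m). The sketch below uses that approximation; the table's exact model isn't specified, so these numbers are illustrative rather than a reproduction of the table:

```typescript
// Birthday approximation: probability that at least two of k writes
// share one of m equally likely timestamp slots
function collisionProbability(writesPerWindow: number, slots: number): number {
  const k = writesPerWindow;
  return 1 - Math.exp(-(k * (k - 1)) / (2 * slots));
}

// 10 writes/second with millisecond resolution: 1,000 slots per second
const msResolution = collisionProbability(10, 1_000);
// Same rate with microsecond resolution: 1,000,000 slots per second
const usResolution = collisionProbability(10, 1_000_000);

console.log(msResolution.toFixed(4));  // ≈ 0.0440
console.log(usResolution.toFixed(6)); // ≈ 0.000045
```

Whatever the exact model, the qualitative conclusion matches the table: collision probability grows rapidly with write rate and drops by orders of magnitude with finer timestamp resolution, which is why a deterministic tie-breaker is still mandatory.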
Never use random or non-deterministic tie-breaking in production. If Node A and Node B make different tie-break decisions, they diverge permanently. The entire point of LWW—guaranteed convergence—is lost. Always use deterministic comparison functions.
Understanding LWW's failure modes enables us to mitigate them. Here are production-proven techniques to make LWW safer.
1. Application-Level Conflict Logging:
```typescript
interface ConflictEvent<T> {
  key: string;
  winner: WriteRecord<T>;
  loser: WriteRecord<T>;
  timestampDelta: number; // How close were they?
  detectedAt: Date;
  resolvedBy: 'lww' | 'tie-breaker';
}

class LWWWithLogging<T> {
  private conflictLog: ConflictEvent<T>[] = [];

  resolve(key: string, local: WriteRecord<T>, incoming: WriteRecord<T>): WriteRecord<T> {
    if (local.timestamp === incoming.timestamp) {
      // Tie: log and use tie-breaker
      const winner = tieBreakByWriteId(local, incoming);
      const loser = winner === local ? incoming : local;
      this.logConflict(key, winner, loser, 'tie-breaker');
      return winner;
    }
    const winner = incoming.timestamp > local.timestamp ? incoming : local;
    const loser = winner === local ? incoming : local;
    // Only log if timestamps were close (potential clock skew issue)
    const delta = Math.abs(incoming.timestamp - local.timestamp);
    if (delta < 1000) { // Within 1 second
      this.logConflict(key, winner, loser, 'lww');
    }
    return winner;
  }

  private logConflict(
    key: string,
    winner: WriteRecord<T>,
    loser: WriteRecord<T>,
    resolvedBy: 'lww' | 'tie-breaker'
  ) {
    this.conflictLog.push({
      key,
      winner,
      loser,
      timestampDelta: Math.abs(winner.timestamp - loser.timestamp),
      detectedAt: new Date(),
      resolvedBy
    });
    // Alert if conflict rate is high
    this.checkConflictRate();
  }

  private checkConflictRate() {
    const recentConflicts = this.conflictLog.filter(
      c => c.detectedAt.getTime() > Date.now() - 60000 // Last minute
    );
    if (recentConflicts.length > 100) {
      console.warn('High conflict rate detected:', recentConflicts.length, 'in last minute');
      // Trigger alert, investigation
    }
  }
}
```

No single mitigation makes LWW safe. Combine multiple strategies: use HLC timestamps, log all conflicts, monitor clock sync, alert on high conflict rates, and avoid LWW for critical data. Each layer catches failures the others miss.
We've dissected Last-Write-Wins—the most common automatic conflict resolution strategy. Let's consolidate the key insights:

- LWW guarantees convergence, not correctness: all replicas agree, but possibly on a value that clock skew chose wrongly.
- Physical timestamps depend on clock synchronization; random skew causes occasional data loss, while systematic skew permanently biases conflicts toward one datacenter.
- Hybrid Logical Clocks preserve causality while staying close to wall-clock time, making them the better default timestamp source for LWW.
- Resolution granularity matters: Cassandra's cell-level LWW merges concurrent updates to different columns; DynamoDB's item-level LWW resolves whole items.
- Equal timestamps require a deterministic tie-breaker; non-deterministic tie-breaking destroys convergence.
- No single mitigation suffices: layer HLC timestamps, conflict logging, clock monitoring, and alerting.
What's Next:
LWW is just one approach to conflict resolution. The next page explores custom conflict resolution—when LWW's simplicity isn't sufficient and applications need domain-specific merge logic that preserves more information across conflicting writes.
You now understand LWW's mechanics, failure modes, and production implementations in depth. Next, we'll explore how to move beyond LWW with custom conflict resolution strategies that preserve domain semantics.