We've established that causal consistency requires preserving the happens-before relationship across distributed operations. We've seen how session guarantees manifest these properties for users. But how do we actually implement this? How does a replica know whether it has all the causal dependencies of an incoming write? How do we track what a session has observed?
The answer lies in logical clocks and dependency tracking—elegant mathematical constructs that let us reason about causal order without relying on unreliable physical time. These mechanisms are the foundation of virtually every causally consistent system in production.
By the end of this page, you will understand Lamport timestamps, vector clocks, version vectors, and hybrid logical clocks—the fundamental building blocks for implementing causal consistency. You'll grasp when to use each mechanism and the practical trade-offs involved in real system implementations.
Lamport timestamps (also called Lamport clocks) were introduced by Leslie Lamport in his 1978 paper alongside the happens-before relation. They provide a simple way to assign a logical timestamp to every event such that causally related events have correctly ordered timestamps.
The algorithm:
Each process maintains a counter (the Lamport clock). The rules are:
```typescript
class LamportClock {
  private timestamp: number = 0;

  // Rule 1: Tick before any local operation
  tick(): number {
    return ++this.timestamp;
  }

  // Rule 2: Get timestamp to send with message
  send(): number {
    return this.tick();
  }

  // Rule 3: Update clock on message receive
  receive(messageTimestamp: number): number {
    this.timestamp = Math.max(this.timestamp, messageTimestamp) + 1;
    return this.timestamp;
  }

  // Current value (read-only)
  current(): number {
    return this.timestamp;
  }
}

// Example usage
const clockA = new LamportClock();
const clockB = new LamportClock();

// Process A performs local operation
const a1 = clockA.tick();           // a1 = 1

// Process A sends message to B
const msgTs = clockA.send();        // msgTs = 2
const b1 = clockB.receive(msgTs);   // b1 = 3

// Process B sends reply to A
const replyTs = clockB.send();      // replyTs = 4
const a2 = clockA.receive(replyTs); // a2 = 5

// Causal chain: a1(1) → msg(2) → b1(3) → reply(4) → a2(5)
```

The key property:
If event A happens-before event B, then the Lamport timestamp of A is less than the Lamport timestamp of B:
A → B ⟹ L(A) < L(B)
This is the clock consistency condition—causality implies timestamp ordering.
The critical limitation:
The converse is not true! If L(A) < L(B), we cannot conclude that A happened before B. They might be concurrent events that happen to have different timestamps.
L(A) < L(B) ⟹ A → B ∨ A ∥ B
This means Lamport timestamps can tell us 'A might have happened before B' but cannot distinguish between 'A definitely happened before B' and 'A and B are concurrent'. This limitation is fundamental—more sophisticated mechanisms are needed to distinguish these cases.
Two events with Lamport timestamps 5 and 10 might be causally related (5 happened before 10), or they might be concurrent (neither influenced the other). Lamport timestamps alone can't tell the difference. For full causal consistency, we need vector clocks.
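To make the limitation concrete, here is a minimal sketch reusing the LamportClock class above: the two processes never exchange messages, so their events are concurrent, yet their timestamps still come out ordered.

```typescript
const clockX = new LamportClock();
const clockY = new LamportClock();

const x1 = clockX.tick(); // x1 = 1
const x2 = clockX.tick(); // x2 = 2
const y1 = clockY.tick(); // y1 = 1

// y1(1) < x2(2), but the events are concurrent: no message ever
// flowed between the processes, so neither influenced the other.
console.log(y1 < x2); // true, yet it implies nothing about causality
```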
Vector clocks extend Lamport timestamps to overcome their fundamental limitation. Instead of a single counter, a vector clock maintains one counter for each process (or replica) in the system. This allows us to precisely determine whether two events are causally related or concurrent.
The structure:
A vector clock V is an array of N integers, where N is the number of processes. V[i] represents process i's logical time—how many events process i has 'witnessed'.
```typescript
class VectorClock {
  private clock: Map<string, number>;
  private processId: string;

  constructor(processId: string) {
    this.processId = processId;
    this.clock = new Map();
  }

  // Increment own counter before local operation
  tick(): VectorClock {
    const current = this.clock.get(this.processId) || 0;
    this.clock.set(this.processId, current + 1);
    return this.clone();
  }

  // Component-wise max with another clock, without ticking
  merge(other: VectorClock): void {
    for (const [id, value] of other.clock) {
      const current = this.clock.get(id) || 0;
      this.clock.set(id, Math.max(current, value));
    }
  }

  // Merge with received clock: component-wise max, then tick
  receive(other: VectorClock): void {
    this.merge(other);
    this.tick();
  }

  // Read a single component (0 if never seen); used by causal broadcast later
  get(id: string): number {
    return this.clock.get(id) || 0;
  }

  // Iterate over all components; used by causal broadcast later
  entries(): IterableIterator<[string, number]> {
    return this.clock.entries();
  }

  // Compare two vector clocks
  compare(other: VectorClock): 'before' | 'after' | 'concurrent' | 'equal' {
    let hasLess = false;
    let hasGreater = false;
    const allKeys = new Set([...this.clock.keys(), ...other.clock.keys()]);

    for (const key of allKeys) {
      const thisVal = this.clock.get(key) || 0;
      const otherVal = other.clock.get(key) || 0;
      if (thisVal < otherVal) hasLess = true;
      if (thisVal > otherVal) hasGreater = true;
    }

    if (!hasLess && !hasGreater) return 'equal';
    if (hasLess && !hasGreater) return 'before';
    if (!hasLess && hasGreater) return 'after';
    return 'concurrent'; // Both hasLess and hasGreater are true
  }

  clone(): VectorClock {
    const cloned = new VectorClock(this.processId);
    for (const [id, value] of this.clock) {
      cloned.clock.set(id, value);
    }
    return cloned;
  }
}
```

Comparison rules:
Given vector clocks V and W:
| V | W | Comparison | Interpretation |
|---|---|---|---|
| [2, 3, 1] | [2, 3, 1] | V = W | Same event or equivalent state |
| [1, 2, 1] | [2, 3, 1] | V < W | V happened before W |
| [3, 4, 2] | [2, 3, 1] | V > W | V happened after W |
| [2, 1, 3] | [1, 3, 2] | V ∥ W | V and W are concurrent |
Vector clocks provide a complete characterization of causality: V < W if and only if V happened-before W. Unlike Lamport timestamps, vector clocks can definitively identify concurrent events—events that have no causal relationship.
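A quick usage sketch of the compare method above (the replica IDs 'A' and 'B' are illustrative):

```typescript
const va = new VectorClock('A');
const vb = new VectorClock('B');

va.tick(); // A's clock: {A:1}
vb.tick(); // B's clock: {B:1}

console.log(va.compare(vb)); // 'concurrent': neither has seen the other

vb.receive(va); // B merges A's clock and ticks: {A:1, B:2}
console.log(va.compare(vb)); // 'before': A's state is in B's history
```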
Version vectors are closely related to vector clocks but are used for a slightly different purpose. While vector clocks track the causal history of individual events, version vectors track the state of data at replicas.
The key difference:

A vector clock is incremented on every event (local operations, message sends, message receives) and orders events; a version vector is incremented only when a replica updates the data item, so it summarizes the update history of a piece of state rather than of a process.

In practice, version vectors are what you'll encounter in distributed databases like Riak, Dynamo-style systems, and many eventually consistent stores.
```typescript
interface VersionedValue<T> {
  value: T;
  version: VersionVector;
}

class VersionVector {
  private versions: Map<string, number>; // replicaId -> version number

  constructor() {
    this.versions = new Map();
  }

  // Increment version for a specific replica (when that replica writes)
  increment(replicaId: string): void {
    const current = this.versions.get(replicaId) || 0;
    this.versions.set(replicaId, current + 1);
  }

  // Merge two version vectors (after reconciliation)
  merge(other: VersionVector): VersionVector {
    const merged = new VersionVector();
    const allReplicas = new Set([
      ...this.versions.keys(),
      ...other.versions.keys()
    ]);

    for (const replicaId of allReplicas) {
      merged.versions.set(
        replicaId,
        Math.max(
          this.versions.get(replicaId) || 0,
          other.versions.get(replicaId) || 0
        )
      );
    }
    return merged;
  }

  // Determine relationship
  descends(ancestor: VersionVector): boolean {
    // Returns true if this version is a descendant of ancestor
    for (const [replicaId, version] of ancestor.versions) {
      if ((this.versions.get(replicaId) || 0) < version) {
        return false;
      }
    }
    return true;
  }

  // Detect conflicts (concurrent updates)
  conflicts(other: VersionVector): boolean {
    const thisDescends = this.descends(other);
    const otherDescends = other.descends(this);
    // Conflict if neither descends from the other
    return !thisDescends && !otherDescends;
  }
}
```

Conflict detection with version vectors:
When a client reads data from one replica and writes to another (or when replicas synchronize), version vectors allow the system to detect conflicts:
No conflict: If the incoming version descends from the current version (or vice versa), one is simply newer—take the descending version.
Conflict (concurrent updates): If neither version descends from the other, both were modified independently since a common ancestor. The system must resolve this—either through application logic, last-write-wins, or by keeping both versions (siblings) for the client to resolve.
```
Scenario: Alice and Bob both edit a document offline

Initial state at all replicas:
  value   = "Hello"
  version = {A: 1, B: 0}   (originated at replica A)

Alice's replica (A):
  - Alice edits: "Hello World"
  - Version becomes: {A: 2, B: 0}

Bob's replica (B):
  - Bob edits (from same original): "Hello Everyone"
  - Version becomes: {A: 1, B: 1}

When replicas synchronize:
  Alice's version: {A: 2, B: 0}
  Bob's version:   {A: 1, B: 1}

  Comparison:
  - Alice's A=2 > Bob's A=1 (Alice is ahead on A)
  - Alice's B=0 < Bob's B=1 (Bob is ahead on B)
  - Neither descends from the other → CONFLICT DETECTED

Resolution options:
  1. Last-write-wins (using wall clock, may lose Bob's data)
  2. Merge the changes (application-specific)
  3. Create siblings: keep both, let next client resolve
```

Amazon's Dynamo paper and systems like Riak use version vectors (often attached to objects) to detect concurrent updates. When conflicts occur, they either apply a conflict resolution strategy or return multiple conflicting values (siblings) to the client.
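The same outcome falls out of the VersionVector class above. This sketch replays the scenario, using merge with an empty vector as a cheap way to copy the common ancestor (replica IDs 'A' and 'B' are illustrative):

```typescript
const ancestor = new VersionVector();
ancestor.increment('A'); // common ancestor: {A:1}

const aliceVersion = ancestor.merge(new VersionVector()); // copy: {A:1}
const bobVersion = ancestor.merge(new VersionVector());   // copy: {A:1}

aliceVersion.increment('A'); // Alice edits: {A:2}
bobVersion.increment('B');   // Bob edits:   {A:1, B:1}

console.log(aliceVersion.conflicts(bobVersion)); // true: concurrent updates
```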
Hybrid Logical Clocks (HLC) combine the best aspects of physical clocks and Lamport clocks. They were introduced by Kulkarni et al. in 2014 to address a practical limitation: vector clocks don't scale well when you have many processes, and Lamport clocks don't give you any relation to real time.
The insight:
Physical clocks are useful for humans and for ordering events that are far apart in time, but they can't be perfectly synchronized across distributed nodes. Logical clocks are perfect for causality but give no information about real time.
HLC provides both: a timestamp that respects causality (if A → B, then HLC(A) < HLC(B)) while staying close to physical time (HLC is bounded by actual wall clock time).
```typescript
class HybridLogicalClock {
  // l: physical time portion (coarse-grained wall clock)
  // c: logical counter (for events in the same physical time)
  private l: number;
  private c: number;

  constructor() {
    this.l = Date.now(); // Physical time in milliseconds
    this.c = 0;
  }

  // Get current HLC timestamp
  now(): { l: number; c: number } {
    const pt = Date.now(); // Physical time
    if (pt > this.l) {
      // Physical time has advanced - reset counter
      this.l = pt;
      this.c = 0;
    } else {
      // Physical time hasn't advanced - increment counter
      this.c++;
    }
    return { l: this.l, c: this.c };
  }

  // Receive a timestamp from another node
  receive(remote: { l: number; c: number }): { l: number; c: number } {
    const pt = Date.now();

    if (pt > this.l && pt > remote.l) {
      // Physical time is ahead of both - use it
      this.l = pt;
      this.c = 0;
    } else if (this.l === remote.l) {
      // Same physical time as remote - take max counter + 1
      this.c = Math.max(this.c, remote.c) + 1;
    } else if (this.l > remote.l) {
      // We're ahead - just increment our counter
      this.c++;
    } else {
      // Remote is ahead - adopt remote time, increment counter
      this.l = remote.l;
      this.c = remote.c + 1;
    }
    return { l: this.l, c: this.c };
  }

  // Compare two HLC timestamps
  static compare(a: { l: number; c: number }, b: { l: number; c: number }): number {
    if (a.l !== b.l) return a.l - b.l;
    return a.c - b.c;
  }
}
```

Key properties of HLC:
HLC is excellent for systems that need causality guarantees with reasonable connection to wall-clock time—distributed databases, event logs, streaming systems. CockroachDB and MongoDB use HLC or similar mechanisms. HLC doesn't distinguish concurrent events (like vector clocks do), but the fixed-size property makes it scalable.
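As a quick sketch of the HybridLogicalClock class above in action (both "nodes" live in one process here, so they share a wall clock; on real hardware the same ordering guarantee holds even when clocks are skewed):

```typescript
const nodeA = new HybridLogicalClock();
const nodeB = new HybridLogicalClock();

// Node A stamps an event and ships the timestamp to node B
const sendTs = nodeA.now();

// Node B folds the remote timestamp into its own clock; the result
// always compares after sendTs, even if B's wall clock lags A's
const recvTs = nodeB.receive(sendTs);

console.log(HybridLogicalClock.compare(sendTs, recvTs) < 0); // true
```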
Beyond logical clocks, implementing causal consistency requires dependency tracking—mechanisms to record what an operation depends on and to ensure those dependencies are satisfied before the operation becomes visible.
The core challenge:
When a client writes to a replica, that write might depend on data the client previously read from other replicas. The receiving replica must: (1) learn which versions the write depends on, (2) check whether those dependencies have already been applied locally, and (3) delay the write's visibility until they have. A sketch of this logic:
```typescript
interface CausalWrite {
  key: string;
  value: any;
  // The write's unique identifier
  writeId: string;
  // The replica where the write originated (an assumed field, used to
  // advance the version vector when the write is applied)
  sourceReplica: string;
  // Dependencies: what this write causally depends on
  // Could be version vectors, specific write IDs, or other markers
  dependencies: VersionVector;
  // Timestamp for ordering among independent writes
  timestamp: { l: number; c: number }; // HLC timestamp (see above)
}

class CausalReplica {
  private data = new Map<string, VersionedValue<any>>();
  private pendingWrites: CausalWrite[] = []; // Writes waiting for deps
  private appliedVersions = new VersionVector(); // What we've applied

  // Receive a write from replication
  receiveWrite(write: CausalWrite): void {
    if (this.dependenciesSatisfied(write)) {
      this.applyWrite(write);
      this.tryApplyPending(); // Some pending writes might now be ready
    } else {
      // Buffer until dependencies arrive
      this.pendingWrites.push(write);
    }
  }

  private dependenciesSatisfied(write: CausalWrite): boolean {
    // Check if we've seen everything this write depends on
    return this.appliedVersions.descends(write.dependencies);
  }

  private applyWrite(write: CausalWrite): void {
    // Apply the write to our local state
    this.data.set(write.key, {
      value: write.value,
      version: this.appliedVersions.merge(write.dependencies)
    });
    // Update our version vector to reflect this write
    this.appliedVersions.increment(write.sourceReplica);
  }

  private tryApplyPending(): void {
    let applied = true;
    while (applied) {
      applied = false;
      for (let i = this.pendingWrites.length - 1; i >= 0; i--) {
        if (this.dependenciesSatisfied(this.pendingWrites[i])) {
          this.applyWrite(this.pendingWrites[i]);
          this.pendingWrites.splice(i, 1);
          applied = true;
        }
      }
    }
  }
}
```

Dependency tracking strategies:
When a write's dependencies aren't satisfied, the replica must decide: buffer the write (delay visibility) or block replication (backpressure the sender). Buffering can lead to unbounded memory usage; blocking can reduce throughput. Most systems buffer with limits and provide mechanisms to 'force' delivery when buffers overflow.
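A rough sketch of the "buffer with limits" policy; the MAX_PENDING cap and the boolean backpressure signal are illustrative choices, not taken from any particular system:

```typescript
const MAX_PENDING = 10_000; // illustrative cap on buffered writes

// Returns false to signal backpressure instead of letting the
// pending buffer grow without bound.
function tryBuffer(pending: CausalWrite[], write: CausalWrite): boolean {
  if (pending.length >= MAX_PENDING) {
    return false; // sender should slow down, retry, or force delivery
  }
  pending.push(write);
  return true;
}
```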
Causal broadcast is a communication primitive that ensures messages are delivered to all replicas in an order consistent with causality. If message A causally precedes message B, every recipient will receive A before B.
Formal specification:
A broadcast protocol is causally ordered if, whenever the broadcast of message m1 happens-before the broadcast of message m2, no process delivers m2 before it has delivered m1. Concurrent messages carry no such constraint and may be delivered in different orders at different nodes.
```typescript
// Message format and pending-queue entry for causal broadcast
interface BroadcastMessage {
  id: string;
  sender: string;
  payload: any;
  vectorClock: VectorClock;
}

interface PendingMessage {
  msg: BroadcastMessage;
}

// Causal broadcast implementation using vector clocks
class CausalBroadcast {
  private nodeId: string;
  private clock: VectorClock;
  private pending: Map<string, PendingMessage[]>; // Per-sender pending queues
  private delivered: Set<string>; // Message IDs we've delivered

  constructor(nodeId: string) {
    this.nodeId = nodeId;
    this.clock = new VectorClock(nodeId);
    this.pending = new Map();
    this.delivered = new Set();
  }

  // Broadcast a message to all nodes
  broadcast(message: any): BroadcastMessage {
    const timestamp = this.clock.tick();
    const broadcastMsg: BroadcastMessage = {
      id: `${this.nodeId}-${timestamp.get(this.nodeId)}`,
      sender: this.nodeId,
      payload: message,
      vectorClock: timestamp.clone()
    };

    // Send to all other nodes (implementation depends on network layer)
    this.sendToAll(broadcastMsg);

    // Deliver locally immediately (our clock already reflects this message)
    this.delivered.add(broadcastMsg.id);
    this.deliverToApplication(broadcastMsg.payload);

    return broadcastMsg;
  }

  // Receive a broadcast message from another node
  receive(msg: BroadcastMessage): void {
    if (this.delivered.has(msg.id)) {
      return; // Already delivered
    }
    if (this.canDeliver(msg)) {
      this.deliver(msg);
    } else {
      // Buffer until we can deliver
      const queue = this.pending.get(msg.sender) || [];
      queue.push({ msg });
      this.pending.set(msg.sender, queue);
    }
  }

  private canDeliver(msg: BroadcastMessage): boolean {
    // Can deliver if we've seen all messages this one depends on:
    // the sender's component must be exactly one more than ours,
    // and all other components must be <= ours
    const senderComponent = msg.vectorClock.get(msg.sender);
    const ourSenderProgress = this.clock.get(msg.sender);

    if (senderComponent !== ourSenderProgress + 1) {
      return false; // Missing earlier messages from this sender
    }

    for (const [nodeId, value] of msg.vectorClock.entries()) {
      if (nodeId !== msg.sender && value > this.clock.get(nodeId)) {
        return false; // Missing messages from another sender
      }
    }
    return true;
  }

  private deliver(msg: BroadcastMessage): void {
    // Merge the message's clock: component-wise max, without ticking
    // our own component (ticking here would create gaps that other
    // nodes could never fill)
    this.clock.merge(msg.vectorClock);
    this.delivered.add(msg.id);

    // Deliver to application
    this.deliverToApplication(msg.payload);

    // Check if any pending messages can now be delivered
    this.processPending();
  }

  private processPending(): void {
    let delivered = true;
    while (delivered) {
      delivered = false;
      for (const [, messages] of this.pending) {
        for (let i = 0; i < messages.length; i++) {
          if (this.canDeliver(messages[i].msg)) {
            this.deliver(messages[i].msg);
            messages.splice(i, 1);
            delivered = true;
            break;
          }
        }
      }
    }
  }

  // Network and application hooks (stubs; wire these to your transport)
  private sendToAll(msg: BroadcastMessage): void { /* network layer */ }
  private deliverToApplication(payload: any): void {
    console.log(`[${this.nodeId}] deliver:`, payload); // for illustration
  }
}
```

Causal broadcast underlies many distributed systems: replicated state machines, collaborative editing (CRDTs often use causal broadcast), distributed databases with causal consistency. The vector clock overhead can be significant in systems with many nodes, leading to optimizations like causal barriers and epoch-based delivery.
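To see the buffering in action, here is a sketch that drives receive() by hand instead of through a real network: node C hears B's reply before A's original message, so the reply is held until its dependency arrives.

```typescript
const a = new CausalBroadcast('A');
const b = new CausalBroadcast('B');
const c = new CausalBroadcast('C');

const m1 = a.broadcast('question'); // m1 carries {A:1}
b.receive(m1);                      // B delivers m1
const m2 = b.broadcast('answer');   // m2 carries {A:1, B:1}

// C receives the reply first:
c.receive(m2); // buffered: m2 depends on {A:1}, which C hasn't seen
c.receive(m1); // delivers m1, then the buffered m2 right after
```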
Each causal consistency mechanism involves trade-offs. Understanding these helps choose the right approach for your system.
| Mechanism | Space Overhead | Concurrency Detection | Physical Time Relation | Scalability |
|---|---|---|---|---|
| Lamport Timestamps | O(1) - single integer | No | None | Excellent |
| Vector Clocks | O(N) - one integer per node | Yes | None | Poor for many nodes |
| Version Vectors | O(N) - one integer per replica | Yes (conflict detection) | None | Moderate |
| Hybrid Logical Clocks | O(1) - two integers | No | Bounded offset from wall clock | Excellent |
| Dotted Version Vectors | O(writes) - can be pruned | Yes, per-key | None | Good with pruning |
MongoDB uses HLC-style timestamps. CockroachDB uses HLC with uncertainty intervals. Riak uses dotted version vectors. Amazon's Dynamo used what its paper called vector clocks, sparking debate over whether they were really version vectors. Most systems have adapted these primitives to their specific needs.
We've explored the fundamental mechanisms for implementing causal consistency. Let's consolidate:

- Lamport timestamps order causally related events with a single counter, but cannot distinguish concurrency from causality.
- Vector clocks characterize causality completely, at O(N) space per timestamp.
- Version vectors apply the same comparison to replica state, enabling conflict detection.
- Hybrid logical clocks preserve causal ordering in O(1) space while staying close to wall-clock time.
- Dependency tracking and causal broadcast use these clocks to delay a write's visibility until its causal dependencies are satisfied.
What's next:
We've seen how to implement causal consistency. The next page examines the trade-offs with strong consistency—when is causal consistency sufficient, when do you need stronger guarantees, and what are the performance and availability implications of each choice?
You now understand the core mechanisms for implementing causal consistency: Lamport timestamps, vector clocks, version vectors, hybrid logical clocks, dependency tracking, and causal broadcast. This knowledge enables you to evaluate and design causally consistent distributed systems.