Lamport clocks gave us a powerful tool for ordering events consistently, but they left a critical gap: we cannot determine if two events are concurrent. When L(a) < L(b), it might mean a → b (causally ordered) or a || b (concurrent). For many distributed systems problems—conflict detection, causally consistent replication, debugging—this ambiguity is unacceptable.
Vector clocks solve this problem definitively. Introduced independently by Colin Fidge and Friedemann Mattern in 1988, vector clocks extend Lamport's scalar timestamp to a vector of timestamps, one per process. This seemingly simple change—from a single number to a list of numbers—yields remarkable power: vector clocks capture the complete happened-before relation, enabling precise detection of both causality and concurrency.
If Lamport clocks are photographs (a snapshot at one point in time), vector clocks are movies—they capture the entire causal history of each event.
By the end of this page, you will master: how vector clocks represent causal history; the update rules and their intuition; how to determine if events are causally related or concurrent; real-world applications in databases and distributed systems; and the tradeoffs between expressiveness and overhead.
To understand why we need vectors, let's revisit the limitation of Lamport clocks.
The Concurrency Detection Problem:
Consider two processes, P1 and P2, that have never communicated:
P1: a(1) -------- b(2)
P2: c(1) -------- d(2)
Both processes have Lamport timestamps 1 and 2 for their events. We have L(a) = L(c) = 1 and L(b) = L(d) = 2. From these timestamps:
We know a || c and b || d (they're concurrent—no messages exchanged), but the Lamport timestamps don't tell us this. We could break ties with process IDs, but that would incorrectly suggest one happened before the other.
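To make the ambiguity concrete, here is a minimal sketch (a bare Lamport counter per process; names are illustrative): two processes that never communicate produce timestamps indistinguishable from a causally ordered pair.

```typescript
// Minimal Lamport counter: one integer per process (illustrative sketch)
class LamportClock {
  private time = 0;
  tick(): number { return ++this.time; } // local event
  recv(msgTime: number): number {        // on message receipt (unused here)
    this.time = Math.max(this.time, msgTime) + 1;
    return this.time;
  }
}

const p1 = new LamportClock();
const p2 = new LamportClock();

// No messages are ever exchanged, so every cross-process pair is concurrent
const a = p1.tick(); // L(a) = 1
const b = p1.tick(); // L(b) = 2
const c = p2.tick(); // L(c) = 1
const d = p2.tick(); // L(d) = 2

// The scalar timestamps alone cannot reveal the concurrency:
console.log(a === c, b === d); // true true — yet a || c and b || d
```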
The Key Insight: Per-Process Knowledge
The solution is to track what each process knows about each other process. Instead of a single counter, each process maintains a vector of counters—one for each process in the system.
V_i[j] = k means: "Process i knows that process j has executed k events."
Each process's vector represents its causal history—the set of events that could have influenced it.
Vector Comparison:
For vectors V1 and V2 of length n: V1 ≤ V2 iff V1[i] ≤ V2[i] for every component i, and V1 < V2 iff V1 ≤ V2 with at least one strict inequality. If neither V1 ≤ V2 nor V2 ≤ V1, the vectors are incomparable.
The partial order on vectors corresponds exactly to the happened-before relation:
a → b ⟺ V(a) < V(b)
This is the key property. Unlike Lamport clocks, the implication goes both ways. If V(a) < V(b), we know definitively that a happened before b. If neither V(a) < V(b) nor V(b) < V(a), we know definitively that a and b are concurrent.
Lamport clocks create a total order (any two events can be compared). Vector clocks create a partial order (some events are incomparable—they're concurrent). This partial order is exactly the happened-before relation, which is itself a partial order. The math aligns perfectly.
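As a quick sketch of the comparison rule on plain number arrays (helper names are illustrative, not from any library):

```typescript
type VC = number[];

// v1 ≤ v2 component-wise?
const leq = (v1: VC, v2: VC): boolean => v1.every((x, i) => x <= v2[i]);

// a → b iff V(a) < V(b): component-wise ≤ with at least one strict inequality
const happenedBefore = (a: VC, b: VC): boolean => leq(a, b) && !leq(b, a);

// Concurrent iff neither vector dominates the other
const concurrent = (a: VC, b: VC): boolean => !leq(a, b) && !leq(b, a);

console.log(happenedBefore([1, 0, 0], [2, 1, 0])); // true  — causally ordered
console.log(concurrent([2, 0, 0], [0, 1, 0]));     // true  — incomparable
console.log(happenedBefore([2, 1, 0], [1, 0, 0])); // false
```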
The vector clock algorithm is a natural extension of Lamport clocks. Each of the n processes in the system maintains a vector of n counters.
Initialization:
Each process i initializes its vector clock:
V_i = [0, 0, 0, ..., 0] // n zeros
Three Update Rules:
Rule 1: Before a local event (internal event):
V_i[i] = V_i[i] + 1
Increment own component.
Rule 2: Before sending a message:
V_i[i] = V_i[i] + 1
message.timestamp = V_i // Attach entire vector
Increment own component and attach the vector to the message.
Rule 3: Upon receiving a message with vector V_msg:
For each j in 1..n:
V_i[j] = max(V_i[j], V_msg[j])
V_i[i] = V_i[i] + 1
Merge the received vector (component-wise max) then increment own component.
The merge operation captures all events that the sender knew about (which are now in our causal history). The increment marks the receive event itself.
```typescript
/**
 * Vector Clock Implementation
 *
 * A complete implementation of vector clocks with full causality
 * detection and comparison operations.
 */

type VectorTimestamp = Map<string, number>;

class VectorClock {
  private vector: Map<string, number>;
  private processId: string;

  constructor(processId: string, allProcessIds: string[]) {
    this.processId = processId;
    this.vector = new Map();
    // Initialize all components to 0
    for (const pid of allProcessIds) {
      this.vector.set(pid, 0);
    }
  }

  /** Rule 1: Increment own component for local event */
  tick(): VectorTimestamp {
    const current = this.vector.get(this.processId) || 0;
    this.vector.set(this.processId, current + 1);
    return this.getTimestamp();
  }

  /** Rule 2: Increment and return timestamp for sending */
  send(): VectorTimestamp {
    return this.tick(); // Same as tick - increment and return
  }

  /** Rule 3: Merge received vector, then increment own component */
  receive(messageVector: VectorTimestamp): VectorTimestamp {
    // Merge: component-wise maximum
    for (const [pid, value] of messageVector) {
      const current = this.vector.get(pid) || 0;
      this.vector.set(pid, Math.max(current, value));
    }
    // Increment own component for the receive event
    const current = this.vector.get(this.processId) || 0;
    this.vector.set(this.processId, current + 1);
    return this.getTimestamp();
  }

  /** Get a copy of the current vector timestamp */
  getTimestamp(): VectorTimestamp {
    return new Map(this.vector);
  }

  /**
   * Compare two vector timestamps
   * Returns: -1 if v1 < v2, 1 if v1 > v2, 0 if equal, null if concurrent
   */
  static compare(v1: VectorTimestamp, v2: VectorTimestamp): -1 | 0 | 1 | null {
    let v1LessOrEqual = true;
    let v2LessOrEqual = true;
    let equal = true;

    // Get all unique keys
    const allKeys = new Set([...v1.keys(), ...v2.keys()]);

    for (const key of allKeys) {
      const val1 = v1.get(key) || 0;
      const val2 = v2.get(key) || 0;
      if (val1 > val2) { v1LessOrEqual = false; equal = false; }
      if (val2 > val1) { v2LessOrEqual = false; equal = false; }
    }

    if (equal) return 0;
    if (v1LessOrEqual) return -1; // v1 < v2: v1 happened before v2
    if (v2LessOrEqual) return 1;  // v1 > v2: v2 happened before v1
    return null;                  // Concurrent
  }

  /** Check if v1 happened before v2 (v1 → v2) */
  static happenedBefore(v1: VectorTimestamp, v2: VectorTimestamp): boolean {
    return VectorClock.compare(v1, v2) === -1;
  }

  /** Check if two events are concurrent */
  static areConcurrent(v1: VectorTimestamp, v2: VectorTimestamp): boolean {
    return VectorClock.compare(v1, v2) === null;
  }

  /** Merge two vectors (used for determining combined causal history) */
  static merge(v1: VectorTimestamp, v2: VectorTimestamp): VectorTimestamp {
    const result = new Map(v1);
    for (const [key, val2] of v2) {
      const val1 = result.get(key) || 0;
      result.set(key, Math.max(val1, val2));
    }
    return result;
  }
}

// Demonstration
function demonstrateVectorClocks() {
  const processes = ["P1", "P2", "P3"];
  const p1 = new VectorClock("P1", processes);
  const p2 = new VectorClock("P2", processes);
  const p3 = new VectorClock("P3", processes);

  // P1 does a local event
  const e1 = p1.tick(); // [1, 0, 0]
  console.log("P1 local event:", Object.fromEntries(e1));

  // P1 sends to P2
  const msg1 = p1.send(); // [2, 0, 0]
  console.log("P1 sends message:", Object.fromEntries(msg1));

  // P2 receives from P1
  const e2 = p2.receive(msg1); // merge([0,1,0], [2,0,0]) then increment = [2, 1, 0]
  console.log("P2 receives:", Object.fromEntries(e2));

  // P3 does an independent local event
  const e3 = p3.tick(); // [0, 0, 1]
  console.log("P3 local event:", Object.fromEntries(e3));

  // Compare events
  console.log("Comparisons:");
  console.log("e1 → e2?", VectorClock.happenedBefore(e1, e2)); // true
  console.log("e1 || e3?", VectorClock.areConcurrent(e1, e3)); // true
  console.log("e2 || e3?", VectorClock.areConcurrent(e2, e3)); // true
}

demonstrateVectorClocks();
```

When you receive a message, you're learning about all the events that the sender knew about. The merge (component-wise max) combines your knowledge with theirs.
If they knew about 5 events from P3 and you only knew about 3, now you know about 5. This is how causal information propagates through the system.
Let's trace through a detailed example to see how vector clocks capture causality.
Scenario: Three processes, P1, P2, P3
We'll label events by process and sequence number: e11 is the first event on P1, e21 the first event on P2, and so on.
Timeline (physical time flows left to right):

```
P1: e11[1,0,0] ── e12[2,0,0] ─────────────────────────── e13[3,2,3]
                      \                                      ↑
P2: e21[0,1,0] ─────── e22[2,2,0] ──┐                        │
                                    \                        │
P3: e31[0,0,1] ───────────────────── e32[2,2,2] ── e33[2,2,3]
```

Step-by-step:

1. P1 local event e11: P1 increments → V1 = [1,0,0]
2. P2 local event e21: P2 increments → V2 = [0,1,0]
3. P3 local event e31: P3 increments → V3 = [0,0,1]
4. P1 sends to P2 (event e12): P1 increments → V1 = [2,0,0], message carries [2,0,0]
5. P2 receives from P1 (event e22): merge([0,1,0], [2,0,0]) = [2,1,0], then increment → V2 = [2,2,0]
6. P2 forwards to P3: for simplicity we treat e22 as both the receive and the send, so the message carries [2,2,0]
7. P3 receives from P2 (event e32): merge([0,0,1], [2,2,0]) = [2,2,1], then increment → V3 = [2,2,2]
8. P3 sends to P1 (event e33): P3 increments → V3 = [2,2,3], message carries [2,2,3]
9. P1 receives from P3 (event e13): merge([2,0,0], [2,2,3]) = [2,2,3], then increment P1's component → V1 = [3,2,3]

Analyzing the Relationships:
With the vector timestamps, we can definitively determine all causal relationships:
| Event A | V(A) | Event B | V(B) | Relationship |
|---|---|---|---|---|
| e11 | [1,0,0] | e12 | [2,0,0] | e11 → e12 (V(e11) < V(e12)) |
| e11 | [1,0,0] | e21 | [0,1,0] | e11 \|\| e21 (concurrent: neither vector ≤ the other) |
| e11 | [1,0,0] | e22 | [2,2,0] | e11 → e22 (all components ≤) |
| e21 | [0,1,0] | e22 | [2,2,0] | e21 → e22 (all components ≤) |
| e21 | [0,1,0] | e31 | [0,0,1] | e21 \|\| e31 (concurrent: neither vector ≤ the other) |
| e22 | [2,2,0] | e32 | [2,2,2] | e22 → e32 |
| e12 | [2,0,0] | e33 | [2,2,3] | e12 → e33 (via P2 and P3) |
| e31 | [0,0,1] | e32 | [2,2,2] | e31 → e32 |
The Power of Vector Clocks:
Notice that we can determine relationships even between events that didn't directly communicate:
e11 → e32: Even though P1 and P3 never directly communicated (up to this point), the vector clock reveals that information from e11 (via e12 → e22 → e32) reached P3.
e21 || e31: P2 and P3 hadn't communicated yet, and their events are correctly identified as concurrent.
This transitive causality detection is impossible with Lamport clocks alone.
To compare V1 = [a,b,c] and V2 = [x,y,z]: if a≤x AND b≤y AND c≤z, with at least one strict inequality, then V1 < V2. If neither V1 < V2 nor V2 < V1, they're concurrent. Practice this until it becomes intuitive!
The most important practical application of vector clocks is version vectors for conflict detection in distributed databases and replicated systems.
The Problem: Concurrent Updates
Consider a distributed key-value store with replicas on nodes A, B, and C. A client updates key "X" on node A, while another client simultaneously updates "X" on node B. When the replicas synchronize, how do we detect that these are conflicting updates rather than sequential ones?
Version Vectors to the Rescue:
Each value in the store carries a version vector. When a node updates a value, it increments its own component in that value's version vector.
When synchronizing, the receiving node compares version vectors: if the incoming vector dominates the local one, the incoming value is strictly newer and replaces the local value; if the local vector dominates, the incoming value is stale and is discarded; if neither dominates, the two updates are concurrent (||).
The concurrency case (||) is only detectable with vector clocks. This enables the system to keep both versions as "siblings," surface the conflict to the application, or apply an application-specific merge instead of silently losing one of the writes.
```typescript
/**
 * Distributed Key-Value Store with Version Vectors
 *
 * Demonstrates how version vectors detect conflicts between
 * concurrent updates in a replicated system.
 */

type NodeId = string;
type VersionVector = Map<NodeId, number>;

interface VersionedValue<T> {
  value: T;
  version: VersionVector;
}

// Siblings represent conflicting concurrent versions
interface ValueWithSiblings<T> {
  siblings: VersionedValue<T>[];
}

class VersionedKVStore<T> {
  private nodeId: NodeId;
  private store: Map<string, ValueWithSiblings<T>>;

  constructor(nodeId: NodeId) {
    this.nodeId = nodeId;
    this.store = new Map();
  }

  /**
   * Get the current value(s) for a key
   * May return multiple siblings if there are unresolved conflicts
   */
  get(key: string): ValueWithSiblings<T> | undefined {
    return this.store.get(key);
  }

  /**
   * Put a new value, providing the context from a previous read
   * The context helps us determine if this is an update or a new conflict
   */
  put(key: string, value: T, context?: VersionVector): void {
    const existing = this.store.get(key);

    // Create new version vector from the read context
    const newVersion: VersionVector = new Map(context || []);

    // Increment our node's component
    const ourCurrent = newVersion.get(this.nodeId) || 0;
    newVersion.set(this.nodeId, ourCurrent + 1);

    const newVersionedValue: VersionedValue<T> = { value, version: newVersion };

    if (!existing) {
      // First write to this key
      this.store.set(key, { siblings: [newVersionedValue] });
    } else {
      // Filter out any versions that the new one dominates
      const remainingSiblings = existing.siblings.filter(sibling => {
        const cmp = this.compareVersions(sibling.version, newVersion);
        // Keep if sibling is concurrent or newer
        return cmp === null || cmp === 1;
      });

      // Add our new version
      remainingSiblings.push(newVersionedValue);
      this.store.set(key, { siblings: remainingSiblings });
    }
  }

  /**
   * Receive an update from another node during synchronization
   */
  receiveUpdate(key: string, received: VersionedValue<T>): void {
    const existing = this.store.get(key);

    if (!existing) {
      // We don't have this key, accept it
      this.store.set(key, { siblings: [received] });
      return;
    }

    // Check against all existing siblings
    const newSiblings: VersionedValue<T>[] = [];
    let dominated = false;

    for (const sibling of existing.siblings) {
      const cmp = this.compareVersions(received.version, sibling.version);

      if (cmp === -1) {
        // Received is older than this sibling; received is dominated
        dominated = true;
        newSiblings.push(sibling);
      } else if (cmp === 1) {
        // Received is newer; this sibling is dominated, don't keep it
      } else if (cmp === 0) {
        // Same version, keep one
        newSiblings.push(sibling);
        dominated = true;
      } else {
        // Concurrent! Keep both - this is a conflict
        newSiblings.push(sibling);
      }
    }

    if (!dominated) {
      newSiblings.push(received);
    }

    this.store.set(key, { siblings: newSiblings });
  }

  /**
   * Resolve conflicts by providing a merged value
   * The new version vector dominates all siblings
   */
  resolveConflict(key: string, resolvedValue: T): void {
    const existing = this.store.get(key);
    if (!existing || existing.siblings.length <= 1) return;

    // Merge all sibling version vectors
    const mergedVersion: VersionVector = new Map();
    for (const sibling of existing.siblings) {
      for (const [node, count] of sibling.version) {
        mergedVersion.set(node, Math.max(mergedVersion.get(node) || 0, count));
      }
    }

    // Increment our component
    mergedVersion.set(this.nodeId, (mergedVersion.get(this.nodeId) || 0) + 1);

    this.store.set(key, {
      siblings: [{ value: resolvedValue, version: mergedVersion }]
    });
  }

  /**
   * Compare two version vectors
   * Returns: -1 (v1 < v2), 1 (v1 > v2), 0 (equal), null (concurrent)
   */
  private compareVersions(v1: VersionVector, v2: VersionVector): -1 | 0 | 1 | null {
    const allKeys = new Set([...v1.keys(), ...v2.keys()]);
    let v1LessOrEqual = true;
    let v2LessOrEqual = true;

    for (const key of allKeys) {
      const val1 = v1.get(key) || 0;
      const val2 = v2.get(key) || 0;
      if (val1 > val2) v1LessOrEqual = false;
      if (val2 > val1) v2LessOrEqual = false;
    }

    if (v1LessOrEqual && v2LessOrEqual) return 0; // Equal
    if (v1LessOrEqual) return -1;                 // v1 < v2
    if (v2LessOrEqual) return 1;                  // v1 > v2
    return null;                                  // Concurrent
  }

  /**
   * Check if there are unresolved conflicts for a key
   */
  hasConflict(key: string): boolean {
    const entry = this.store.get(key);
    return entry !== undefined && entry.siblings.length > 1;
  }
}

// Demonstration
function demonstrateConflictDetection() {
  const nodeA = new VersionedKVStore<string>("A");
  const nodeB = new VersionedKVStore<string>("B");

  // Node A writes initial value
  nodeA.put("user:123", "Alice");
  console.log("Node A writes 'Alice':", nodeA.get("user:123"));

  // Sync to Node B
  const value = nodeA.get("user:123")!.siblings[0];
  nodeB.receiveUpdate("user:123", value);
  console.log("Node B receives:", nodeB.get("user:123"));

  // Now both nodes update concurrently!
  nodeA.put("user:123", "Alice Smith", value.version); // Using previous context
  nodeB.put("user:123", "Alice Jones", value.version); // Same context = concurrent!
  console.log("Node A updates to 'Alice Smith':", nodeA.get("user:123"));
  console.log("Node B updates to 'Alice Jones':", nodeB.get("user:123"));

  // Sync B's update to A
  const bValue = nodeB.get("user:123")!.siblings[0];
  nodeA.receiveUpdate("user:123", bValue);
  console.log("After sync, Node A has conflict?", nodeA.hasConflict("user:123"));
  console.log("Node A siblings:", nodeA.get("user:123"));

  // Resolve the conflict
  nodeA.resolveConflict("user:123", "Alice Smith-Jones");
  console.log("After resolution:", nodeA.get("user:123"));
}

demonstrateConflictDetection();
```

Real-World Systems Using Version Vectors:
Amazon Dynamo / DynamoDB: Uses version vectors for conflict detection, famously allowing "shopping cart" conflicts to be merged (keeping all items from both carts).
Riak: Explicitly exposes version vectors (as "vclocks") to applications, letting them handle conflicts.
CouchDB: Uses revision trees with a similar principle for multi-master replication.
Git: While not using numeric vectors, Git's commit DAG serves a similar purpose—branches (concurrent development) create merge conflicts.
A subtle issue: if many clients update a key, the version vector grows with each new client ID. The Dynamo paper discusses pruning old entries, but pruning can cause updates to be incorrectly flagged as concurrent. Some systems use dotted version vectors or a client-to-server ID mapping to bound vector size.
Vector clocks aren't always the right choice. Let's compare them to alternatives so you can select the appropriate mechanism.
Last-Write-Wins (LWW):
The simplest approach: each write has a timestamp (wall clock or hybrid logical clock), and the latest timestamp wins. No conflict detection needed.
Advantages: Simple, constant-size timestamps, no siblings to manage. Disadvantages: Data loss! Concurrent writes just disappear. Relies on clock synchronization.
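A minimal LWW-register sketch (types and names are illustrative) makes the data-loss failure mode concrete:

```typescript
// Last-Write-Wins register: keeps only the entry with the largest timestamp.
// Ties broken by node ID so all replicas converge deterministically.
interface LWWEntry<T> { value: T; ts: number; node: string; }

class LWWRegister<T> {
  private entry?: LWWEntry<T>;

  write(value: T, ts: number, node: string): void {
    this.merge({ value, ts, node });
  }

  // The same merge is used locally and during replica synchronization
  merge(incoming: LWWEntry<T>): void {
    const cur = this.entry;
    if (!cur || incoming.ts > cur.ts ||
        (incoming.ts === cur.ts && incoming.node > cur.node)) {
      this.entry = incoming;
    }
  }

  read(): T | undefined { return this.entry?.value; }
}

const replica = new LWWRegister<string>();
replica.write("Alice Smith", 100, "A");                      // update on node A
replica.merge({ value: "Alice Jones", ts: 101, node: "B" }); // concurrent update on B

// Node B's wall clock happened to be 1ms ahead, so node A's write vanishes:
console.log(replica.read()); // "Alice Jones"
```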
Sequence Numbers / Log Ordering:
All writes go through a totally ordered log (like Kafka or a consensus leader). Each write has a sequence number; later numbers win.
Advantages: Simple, total ordering, no conflicts possible. Disadvantages: Requires centralized (or consensus-based) sequencer. Limits write throughput.
CRDTs (Conflict-free Replicated Data Types):
Data structures designed so that concurrent updates can always be merged automatically (e.g., sets, counters, LWW-registers).
Advantages: Automatic merge, no user-visible conflicts. Disadvantages: Limited to specific data types. Can't express all application semantics.
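As a concrete illustration, here is a sketch of a grow-only counter (G-Counter), one of the simplest CRDTs. Note that its internal state is essentially a version vector whose components are summed rather than compared:

```typescript
// G-Counter CRDT: each node increments only its own slot;
// merge is component-wise max, so concurrent increments never conflict.
class GCounter {
  private counts = new Map<string, number>();
  constructor(private nodeId: string) {}

  increment(by = 1): void {
    this.counts.set(this.nodeId, (this.counts.get(this.nodeId) || 0) + by);
  }

  merge(other: GCounter): void {
    for (const [node, n] of other.counts) {
      this.counts.set(node, Math.max(this.counts.get(node) || 0, n));
    }
  }

  // The counter's value is the sum over all per-node slots
  value(): number {
    let total = 0;
    for (const n of this.counts.values()) total += n;
    return total;
  }
}

const a = new GCounter("A");
const b = new GCounter("B");
a.increment(3); // concurrent with...
b.increment(2); // ...this increment
a.merge(b);
b.merge(a);
console.log(a.value(), b.value()); // 5 5 — replicas converge automatically
```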
| Approach | Detects Conflicts? | Overhead | Data Loss Risk | Best For |
|---|---|---|---|---|
| Lamport Clocks | No | O(1) per event | N/A (no conflict handling) | Total ordering, mutual exclusion |
| Vector Clocks | Yes | O(n) per event | No loss (siblings) | Multi-master replication |
| Last-Write-Wins | No | O(1) timestamp | Yes (lossy) | High-throughput, eventual consistency OK |
| Sequence Numbers | N/A (serialized) | O(1) + consensus cost | No | Linearizable systems |
| CRDTs | Auto-merged | Varies by CRDT | No (by design) | Specific data structure needs |
Practical Considerations:
1. Number of Participants:
Vector clocks scale with the number of processes (O(n) per timestamp). For 3 replicas, this is trivial. For 10,000 clients, it's prohibitive.
Solution: Use node-level vectors for server replicas (small n), and edge-level solutions (like version vectors per key) for clients.
2. Dynamic Membership:
Adding or removing processes requires careful coordination. If a new process joins, all vectors need a new component. If a process leaves, its component should eventually be pruned.
Solution: Use algorithms like Interval Tree Clocks that handle dynamic membership naturally.
3. Garbage Collection:
Version vectors attached to data persist. Old entries (from long-departed nodes) waste space.
Solution: Track which versions are "safe" (all replicas have seen them) and prune entries for departed nodes after all stale versions are resolved.
When to Use Vector Clocks:
Real systems often combine approaches. For example: use consensus for critical operations needing linearizability, vector clocks per-key for detecting conflicts in less critical data, and LWW for operational metadata where slight staleness is acceptable.
Vector clocks track what each process knows about each other process's progress. But what if we need to know what each process knows about what other processes know? This leads to higher-order constructs.
Matrix Clocks:
A matrix clock is an n×n matrix where entry M[i,j] represents: "Process i knows that process j has executed this many events."
Each process i maintains a matrix M_i: row i is process i's own vector clock, and row j is process i's most recent knowledge of process j's vector clock.
Update Rules:
Local event: Increment M_i[i,i]
Send message: Increment M_i[i,i], then attach the entire matrix to the message.
Receive message with matrix M_msg from process j:
```
// Update our vector (row i) with sender's vector (their row j)
for k in 1..n:
    M_i[i,k] = max(M_i[i,k], M_msg[j,k])

// Update our knowledge of what others know
for m in 1..n, k in 1..n:
    M_i[m,k] = max(M_i[m,k], M_msg[m,k])

// Increment for receive event
M_i[i,i] = M_i[i,i] + 1
```
What Matrix Clocks Enable:
The key insight: min(M_i[1,k], M_i[2,k], ..., M_i[n,k]) gives the number of events from process k that all processes have seen (as known by process i).
This enables:
Garbage Collection: If all processes have seen version V of a value, earlier versions can be safely deleted.
Stable Messages: A message is "stable" when all processes have received it—useful for atomic broadcast implementations.
Checkpoint Coordination: Determine when all processes have advanced past a certain point.
```
// Matrix Clock Implementation

CLASS MatrixClock:
    n: int             // Number of processes
    id: int            // This process's ID (0 to n-1)
    matrix: int[n][n]  // n x n matrix

    CONSTRUCTOR(processId, numProcesses):
        id = processId
        n = numProcesses
        matrix = new int[n][n]  // All zeros

    FUNCTION localEvent():
        matrix[id][id] += 1

    FUNCTION send():
        matrix[id][id] += 1
        RETURN copy(matrix)  // Attach entire matrix

    FUNCTION receive(senderId, msgMatrix):
        // Merge sender's knowledge about everyone with ours
        FOR k FROM 0 TO n-1:
            matrix[id][k] = max(matrix[id][k], msgMatrix[senderId][k])
        // Update our knowledge of what all processes know
        FOR m FROM 0 TO n-1:
            FOR k FROM 0 TO n-1:
                matrix[m][k] = max(matrix[m][k], msgMatrix[m][k])
        // Record receive event
        matrix[id][id] += 1

    // What is the minimum # of events from process k seen by all?
    // (As known by this process)
    FUNCTION minGlobalProgress(k):
        minimum = infinity
        FOR i FROM 0 TO n-1:
            minimum = min(minimum, matrix[i][k])
        RETURN minimum

    // Example: Garbage collection
    // Delete data versions with timestamp < minGlobalProgress for their origin
    FUNCTION canGarbageCollect(version):
        originProcess = version.origin
        versionCounter = version.counter
        RETURN versionCounter < minGlobalProgress(originProcess)
```

Tradeoffs: matrix clocks carry O(n²) state in every message and at every process, versus O(n) for vector clocks, so they are practical only for small, stable groups of processes.
Other Extensions:
Plausible Clocks: Probabilistic clocks that approximate vector clock behavior with bounded space. Accept some false positives in concurrency detection.
Bloom Clocks: Use Bloom filters to compactly represent causal history. Bounded space but probabilistic.
Interval Tree Clocks (ITCs): Handle dynamic membership elegantly. Processes can fork and join. The "interval" structure naturally handles process creation and merging.
Dotted Version Vectors: Refine version vectors to track specific "dots" (events) rather than just counts. Better handling of conflict resolution context.
Matrix clocks are specialized. Use them when you need to answer "have all processes seen X?" without additional communication. Common in: garbage collection, causal atomic broadcast, and distributed checkpointing. For most other purposes, vector clocks suffice.
Implementing vector clocks in production systems requires attention to practical details.
1. Serialization:
Vector clocks must be serialized for network transmission and storage. Options:
```
// JSON (readable, slightly larger)
{"P1": 5, "P2": 3, "P3": 7}

// Array with known ordering (compact)
[5, 3, 7]  // Requires agreed-upon process ID mapping

// Binary (most compact)
<varint count><varint id><varint value>...
```
For small, fixed process sets, the array format is often ideal. For dynamic membership, the map format handles unknown IDs gracefully.
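A sketch of the binary option above, assuming integer process IDs and emitting only non-zero components (helper names are illustrative):

```typescript
// Encode a vector clock as: <varint count> (<varint id> <varint value>)*
// Only non-zero components are emitted, so sparse vectors stay small.
function writeVarint(out: number[], n: number): void {
  // Base-128 varint: 7 data bits per byte, high bit = "more bytes follow"
  while (n >= 0x80) { out.push((n & 0x7f) | 0x80); n >>>= 7; }
  out.push(n);
}

function readVarint(buf: number[], pos: { i: number }): number {
  let shift = 0, result = 0;
  for (;;) {
    const byte = buf[pos.i++];
    result |= (byte & 0x7f) << shift;
    if ((byte & 0x80) === 0) return result;
    shift += 7;
  }
}

function encode(vc: Map<number, number>): number[] {
  const out: number[] = [];
  const entries = [...vc].filter(([, v]) => v > 0); // skip zero components
  writeVarint(out, entries.length);
  for (const [id, v] of entries) { writeVarint(out, id); writeVarint(out, v); }
  return out;
}

function decode(buf: number[]): Map<number, number> {
  const pos = { i: 0 };
  const count = readVarint(buf, pos);
  const vc = new Map<number, number>();
  for (let k = 0; k < count; k++) {
    const id = readVarint(buf, pos);
    vc.set(id, readVarint(buf, pos));
  }
  return vc;
}

const original = new Map([[0, 5], [1, 0], [2, 300]]);
const bytes = encode(original);
const roundTripped = decode(bytes);
console.log(bytes.length);        // 6 bytes: count(1) + (0,5)(2) + (2,300)(3)
console.log(roundTripped.get(2)); // 300
```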
2. Process ID Management:
Vector clocks require consistent process identification. Options:
3. Comparison Optimization:
Naive comparison is O(n). For frequent comparisons:
```typescript
/**
 * Optimized Vector Clock with Serialization and Efficient Comparison
 */
class OptimizedVectorClock {
  // Use array for O(1) index access when process IDs are integers
  private vector: number[];
  private processIndex: number;
  private idToIndex: Map<string, number>;
  private indexToId: string[];

  constructor(myId: string, allIds: string[]) {
    // Sort IDs for consistent ordering across all nodes
    this.indexToId = [...allIds].sort();
    this.idToIndex = new Map(this.indexToId.map((id, i) => [id, i]));
    this.processIndex = this.idToIndex.get(myId)!;
    this.vector = new Array(allIds.length).fill(0);
  }

  // Core operations remain the same...
  tick() {
    this.vector[this.processIndex]++;
    return this.serialize();
  }

  send() {
    return this.tick();
  }

  receive(serialized: string) {
    const other = this.deserialize(serialized);
    for (let i = 0; i < this.vector.length; i++) {
      this.vector[i] = Math.max(this.vector[i], other[i]);
    }
    this.vector[this.processIndex]++;
    return this.serialize();
  }

  /**
   * Compact binary serialization using variable-length integers
   */
  serialize(): string {
    // For demo, use JSON. Production would use binary encoding.
    return JSON.stringify(this.vector);
  }

  deserialize(data: string): number[] {
    return JSON.parse(data);
  }

  /**
   * Optimized comparison with early termination
   * Returns: { result: -1|0|1|null, comparisons: number }
   */
  static compareOptimized(
    v1: number[],
    v2: number[]
  ): { result: -1 | 0 | 1 | null; comparisons: number } {
    let comparisons = 0;
    let v1LessOrEqual = true;
    let v2LessOrEqual = true;

    for (let i = 0; i < v1.length; i++) {
      comparisons++;
      if (v1[i] > v2[i]) v1LessOrEqual = false;
      if (v2[i] > v1[i]) v2LessOrEqual = false;

      // Early termination: if both directions have strict inequality,
      // they're concurrent. No need to check remaining elements.
      if (!v1LessOrEqual && !v2LessOrEqual) {
        return { result: null, comparisons };
      }
    }

    if (v1LessOrEqual && v2LessOrEqual) return { result: 0, comparisons }; // Equal
    if (v1LessOrEqual) return { result: -1, comparisons };                 // v1 < v2
    return { result: 1, comparisons };                                     // v1 > v2
  }

  /**
   * Delta-based merge: only transmit/apply changes
   * Useful for incremental synchronization
   */
  static delta(before: number[], after: number[]): Map<number, number> {
    const delta = new Map<number, number>();
    for (let i = 0; i < after.length; i++) {
      if (after[i] > before[i]) delta.set(i, after[i]);
    }
    return delta;
  }

  applyDelta(delta: Map<number, number>) {
    for (const [index, value] of delta) {
      this.vector[index] = Math.max(this.vector[index], value);
    }
  }
}

// Demonstration of early termination benefit
function demonstrateOptimization() {
  // Two vectors that are concurrent (differ in first two positions)
  const v1 = [5, 3, 0, 0, 0, 0, 0, 0, 0, 0];
  const v2 = [3, 5, 0, 0, 0, 0, 0, 0, 0, 0];

  const result = OptimizedVectorClock.compareOptimized(v1, v2);
  console.log("Result:", result.result);           // null (concurrent)
  console.log("Comparisons:", result.comparisons); // 2 (early termination!)
  console.log("Vector length:", v1.length);        // 10
  console.log("Saved comparisons:", v1.length - result.comparisons); // 8
}
```

4. Memory Management:
For systems with many keys, each with its own version vector:
5. Testing Vector Clock Implementations:
Vector clocks are easy to implement incorrectly. Key test cases:
Vector clocks are ideal for property-based testing. Generate random event sequences and verify that clock conditions hold. This catches subtle bugs that hand-crafted test cases miss.
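Here is a self-contained sketch of that idea, with no property-testing library assumed: generate random event histories over a few simulated processes, track ground-truth causality with explicit ancestor sets, and assert the biconditional a → b ⟺ V(a) < V(b) on every pair of events.

```typescript
// Property test sketch: simulate random histories, track ground-truth causality
// via explicit causal-ancestor sets, and check the vector-clock rule on all pairs.
type VC = number[];
const vcLeq = (a: VC, b: VC) => a.every((x, i) => x <= b[i]);
const vcLess = (a: VC, b: VC) => vcLeq(a, b) && !vcLeq(b, a);

interface SimEvent { id: number; vc: VC; ancestors: Set<number>; }

function randomHistory(numProcs: number, numEvents: number, seed: number): SimEvent[] {
  // Tiny deterministic LCG so failures are reproducible from the seed
  let s = seed;
  const rand = () => ((s = (s * 1664525 + 1013904223) >>> 0) / 2 ** 32);

  const clocks: VC[] = Array.from({ length: numProcs }, () => new Array(numProcs).fill(0));
  const lastEvent: (SimEvent | null)[] = new Array(numProcs).fill(null);
  const events: SimEvent[] = [];

  for (let id = 0; id < numEvents; id++) {
    const p = Math.floor(rand() * numProcs);
    // Program order: inherit the previous event on this process, transitively
    const ancestors = new Set<number>(lastEvent[p]?.ancestors ?? []);
    if (lastEvent[p]) ancestors.add(lastEvent[p]!.id);

    if (rand() < 0.5 && events.length > 0) {
      // "Receive" from a random earlier event: merge its clock and its ancestry
      const src = events[Math.floor(rand() * events.length)];
      clocks[p] = clocks[p].map((x, i) => Math.max(x, src.vc[i]));
      for (const anc of src.ancestors) ancestors.add(anc);
      ancestors.add(src.id);
    }
    clocks[p][p]++; // increment own component (local or receive event)
    const ev = { id, vc: [...clocks[p]], ancestors };
    events.push(ev);
    lastEvent[p] = ev;
  }
  return events;
}

// The property: vector-clock order must equal the ground-truth causal order
function checkProperty(events: SimEvent[]): boolean {
  for (const a of events) for (const b of events) {
    if (a.id === b.id) continue;
    const causal = b.ancestors.has(a.id);
    if (causal !== vcLess(a.vc, b.vc)) return false;
  }
  return true;
}

for (let seed = 1; seed <= 20; seed++) {
  if (!checkProperty(randomHistory(3, 30, seed))) {
    throw new Error(`vector-clock property violated for seed ${seed}`);
  }
}
console.log("all random histories satisfied a → b ⟺ V(a) < V(b)");
```

Because the generator is seeded, any violating history can be replayed exactly, which is what makes this style of test practical for debugging clock implementations.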
Vector clocks extend Lamport clocks to capture the complete happened-before relation, enabling detection of both causality and concurrency. Let's consolidate the key insights:
The Bigger Picture:
Lamport clocks and vector clocks represent a profound insight: we can capture causality in distributed systems using only local information and message passing. No global clock or synchronization is needed. This forms the theoretical foundation for consistent distributed systems.
What's Next:
The next page covers NTP (Network Time Protocol), returning to physical clock synchronization but with the practical sophistication needed for real-world deployment. Understanding both logical and physical time approaches gives you the complete toolkit for distributed systems design.
You now understand vector clocks—the extension of Lamport clocks that captures complete causality. You can detect concurrent events, implement version vectors for conflict detection, and understand the tradeoffs between different clock mechanisms. Next, we'll explore NTP and how physical clocks are synchronized in practice.