You're architecting a new system, and you face a pivotal question: What level of consistency do we actually need?
The temptation is to reach for the strongest guarantee—linearizability. After all, it's the easiest to reason about. But linearizability comes with brutal costs: coordination overhead, latency spikes under contention, and unavailability during network partitions. Is it worth it?
On the other hand, eventual consistency is cheap and highly available—but the application logic to handle stale reads, out-of-order updates, and conflict resolution can be horrendously complex. Is the simplicity in the infrastructure worth the complexity in the application?
Causal consistency sits between these extremes. But when is it the right choice?
By the end of this page, you will understand the precise trade-offs between causal consistency and linearizability, including latency, availability, throughput, and programming complexity. You'll learn to identify which operations in your system need which level of consistency, and how to design hybrid systems that apply strong consistency surgically.
Before comparing trade-offs, let's ensure we have precise definitions of the consistency models we're comparing.
Linearizability (Strong Consistency):
Every operation appears to execute atomically at a single point in time between its invocation and response. All observers agree on a single global order of operations that respects real-time ordering—if operation A completes before operation B starts, A is ordered before B.
Key property: External observers cannot detect that the system is distributed. The system behaves as if there's a single copy of each data item.
Causal Consistency:
Operations that are causally related (connected by happens-before) are seen in the same order by all observers. Concurrent operations (not causally related) may be seen in different orders by different observers.
Key property: Cause precedes effect. You can't see a reply before the original message, a deletion before the creation, or any operation before the operations it depends on.
| Aspect | Linearizability | Causal Consistency |
|---|---|---|
| Causally related operations | Same order everywhere | Same order everywhere |
| Concurrent operations | Single global order | May differ by observer |
| Read staleness | Never stale | Can read stale (causally unrelated) data |
| Real-time ordering | Preserved | Not preserved |
| Single global history | Yes | No (multiple valid histories exist) |
| Requires coordination | Yes (consensus) | No (local operations) |
Linearizability cares about real-time ordering—if A finishes before B starts (according to wall clock time), A is ordered before B. Causal consistency only cares about logical ordering—A is before B only if there's a causal chain from A to B. Two operations that don't observe each other are concurrent, regardless of when they happened in real time.
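To make the concurrency check concrete, here is a minimal TypeScript sketch of the happens-before comparison that causally consistent systems typically perform with vector clocks. The types and names are illustrative, not taken from any particular database.

```typescript
// A vector clock maps replica IDs to logical counters.
type VectorClock = Record<string, number>;

// Returns true if `a` happened-before `b`: every entry of `a` is <= the
// corresponding entry of `b`, and at least one entry is strictly smaller.
function happenedBefore(a: VectorClock, b: VectorClock): boolean {
  const replicas = new Set([...Object.keys(a), ...Object.keys(b)]);
  let strictlyLess = false;
  for (const r of replicas) {
    const av = a[r] ?? 0;
    const bv = b[r] ?? 0;
    if (av > bv) return false; // a has seen something b hasn't
    if (av < bv) strictlyLess = true;
  }
  return strictlyLess;
}

// Concurrent means neither operation happened before the other.
function concurrent(a: VectorClock, b: VectorClock): boolean {
  return !happenedBefore(a, b) && !happenedBefore(b, a);
}

// writeA on replica r1 and writeB on replica r2, neither observed the other.
// Even if writeA finished minutes earlier in real time, causal consistency
// treats them as concurrent; linearizability would still order writeA first
// because it completed before writeB started.
const writeA: VectorClock = { r1: 1, r2: 0 };
const writeB: VectorClock = { r1: 0, r2: 1 };
console.log(concurrent(writeA, writeB)); // true
```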
The most dramatic difference between linearizability and causal consistency is in operation latency, particularly for writes and consistent reads.
Linearizability requires coordination:
To ensure a write is immediately visible to all subsequent reads (linearizability), the write must be synchronously replicated to a majority of nodes before acknowledging to the client. This typically involves routing the request to the current leader, replicating the write to a quorum of replicas, and waiting for their acknowledgments before responding.
For a geo-distributed system with nodes in different continents, this round of coordination can take 100-200ms or more. During contention (many concurrent writes to the same key), latency spikes further due to retry loops and conflict resolution.
```
Linearizable Write (geo-distributed, 3 regions):

  Client in NYC → Leader in NYC → Replicate to EU & Asia → Wait for quorum
       5ms            +80ms (EU), +150ms (Asia)

  Minimum latency: max(EU, Asia) ≈ 150ms for quorum
  P99 latency:     often 300-500ms with network jitter and contention

Causally Consistent Write (geo-distributed, 3 regions):

  Client in NYC → Local replica in NYC → Acknowledge immediately
       5ms
  (asynchronous replication to EU & Asia happens in the background)

  Minimum latency: 5ms (local write)
  P99 latency:     ~10-20ms (still just local I/O)

Cost: a linearizable write is 15-30x slower than a causal write
```

The latency tax is paid on every operation.
This isn't a one-time cost—every linearizable operation pays the coordination tax. For read-heavy workloads, you might be able to read from local replicas (if your linearizable system allows stale reads for some operations). But any operation that needs the linearizable guarantee—especially writes and consistent reads—incurs the cross-region latency.
For systems with high write throughput or low latency requirements, this tax can be prohibitive.
For geo-distributed systems, coordination latency is fundamentally bounded by the speed of light. NYC to London is ~70ms round-trip at light speed through fiber. No amount of engineering can make synchronous coordination faster than physics allows. Causal consistency avoids this limit by making coordination asynchronous.
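The difference between the two write paths can be sketched in a few lines of TypeScript. The `Replica` interface and the timings are assumptions for illustration; the point is that the quorum path waits on remote round trips while the causal path acknowledges after a local apply.

```typescript
interface Replica {
  name: string;
  // Resolves once this replica has durably applied the write.
  apply(key: string, value: string): Promise<void>;
}

// Linearizable-style write: do not acknowledge until a majority of replicas
// have applied the write. Latency is set by the slowest replica needed to
// reach quorum.
async function quorumWrite(replicas: Replica[], key: string, value: string): Promise<void> {
  const quorum = Math.floor(replicas.length / 2) + 1;
  let acked = 0;
  await new Promise<void>((resolve, reject) => {
    for (const r of replicas) {
      r.apply(key, value).then(() => {
        acked += 1;
        if (acked === quorum) resolve(); // return as soon as a majority acks
      }, reject);
    }
  });
}

// Causal-style write: apply locally, acknowledge, and replicate to the other
// replicas asynchronously in the background.
async function localWrite(local: Replica, remotes: Replica[], key: string, value: string): Promise<void> {
  await local.apply(key, value); // typically single-digit milliseconds
  for (const r of remotes) {
    void r.apply(key, value).catch(() => {
      /* a real system would queue and retry */
    });
  }
}
```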
The CAP theorem states that during a network partition, a distributed system must choose between consistency and availability. This creates stark differences between linearizable and causally consistent systems.
Linearizability during partitions:
When a partition occurs (some nodes can't communicate with others), a linearizable system has two options:
1. Block operations on the minority side. The minority partition cannot reach a quorum, so it refuses writes and potentially reads. The majority side can continue operating.

2. Refuse all operations. If the system can't determine which side holds the majority (e.g., in split-brain scenarios), it may refuse all operations until the partition heals.
Either way, some clients experience complete unavailability during the partition.
Causal consistency during partitions:
Causally consistent systems can remain fully available during partitions because every operation is served by the local replica: writes are accepted locally and propagated asynchronously, reads never wait on remote nodes, and pending replication simply queues until connectivity returns.
When the partition heals, the system reconciles writes from different partitions. Since causally consistent systems already expect concurrent operations (and have mechanisms to handle them), this reconciliation is a normal part of operation—not a crisis recovery scenario.
The trade-off:
During a partition, causally consistent systems remain available but may accept conflicting concurrent writes that must be reconciled later. Linearizable systems refuse some operations but guarantee that accepted operations are globally consistent.
In practice, the availability difference is significant. A linearizable system might have 99.9% availability (0.1% downtime = ~8.7 hours/year). A causally consistent system can approach 99.99% or higher, with 'downtime' limited to individual node failures rather than coordination failures.
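A rough sketch of why that availability follows from the model: the replica below (all names illustrative, not a real system's API) accepts writes against local state only and queues replication in an outbox, so a partition delays propagation but never blocks clients.

```typescript
interface ReplicationPeer {
  send(update: { key: string; value: string; clock: number }): Promise<void>;
}

class LocalFirstReplica {
  private store = new Map<string, { value: string; clock: number }>();
  private outbox: Array<{ key: string; value: string; clock: number }> = [];
  private clock = 0;

  // Always succeeds locally, partition or not.
  write(key: string, value: string): void {
    this.clock += 1;
    const update = { key, value, clock: this.clock };
    this.store.set(key, { value, clock: this.clock });
    this.outbox.push(update); // replicate later, preserving local order
  }

  read(key: string): string | undefined {
    return this.store.get(key)?.value;
  }

  // Called when the partition heals: drain queued updates to peers.
  async reconcile(peers: ReplicationPeer[]): Promise<void> {
    for (const update of this.outbox) {
      await Promise.all(peers.map((p) => p.send(update)));
    }
    this.outbox = [];
  }
}
```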
Beyond latency for individual operations, the choice of consistency model affects total system throughput—how many operations the system can handle per second.
Linearizability throughput bottleneck:
Linearizable writes to the same data item must be serialized. If 1000 clients all try to update the same counter, those updates must be processed one at a time (or in batches that are globally ordered). This creates a throughput ceiling determined by the coordination latency per operation and the amount of contention on each key:
```
Linearizable System (single leader, quorum writes):

  Maximum write throughput = 1 / coordination_latency
  If coordination takes 50ms: max ~20 writes/second per key

  With contention (100 concurrent writers to the same key):
    - Optimistic locking: ~20 succeed, ~80 retry
    - Each retry adds another 50ms
    - Effective throughput: < 20 writes/second total, most fail initially

Causally Consistent System (multi-master, async replication):

  Maximum write throughput = replicas × local_write_capacity
  If each replica handles 10,000 writes/second and we have 3 replicas:
    max ~30,000 writes/second total (if writes are distributed)

  With contention (100 concurrent writers to the same key):
    - All 100 writes accepted immediately (locally)
    - Conflicts resolved asynchronously
    - Effective throughput: 100 writes accepted in <10ms total

For write-heavy workloads: 100x-1000x throughput difference
```

Where linearizability throughput hurts most:
- Hot keys: any data item that many clients access concurrently (global counters, rate limiters, inventory counts) becomes a serialization bottleneck.
- Write-heavy workloads: high write-to-read ratios suffer because every write is expensive.
- Large clusters: more nodes mean more coordination overhead per operation.
- Geo-distribution: cross-region latency directly reduces throughput because each operation takes longer.
Causal consistency throughput characteristics:
Causally consistent systems scale writes almost linearly with the number of replicas because every replica accepts writes locally: there is no leader to route through, no quorum to wait for, and replication between replicas happens asynchronously, off the critical path.
The cost is moved to conflict resolution—if conflicts are frequent and expensive to resolve, this creates different bottlenecks.
Systems often use sharding to distribute load and avoid hot keys. But some hot keys are unavoidable (e.g., a trending post's like count, a flash sale's inventory). Linearizability makes these cases extremely expensive; causal consistency with appropriate conflict resolution (like CRDTs for counters) handles them gracefully.
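As a concrete example of conflict resolution for hot-key counters, here is a minimal PN-counter CRDT sketch in TypeScript (illustrative, not taken from a specific library). Each replica increments only its own slot, so concurrent updates never conflict and merging is an element-wise max.

```typescript
class PNCounter {
  // Per-replica increment and decrement tallies.
  private increments: Record<string, number> = {};
  private decrements: Record<string, number> = {};

  constructor(private readonly replicaId: string) {}

  increment(by = 1): void {
    this.increments[this.replicaId] = (this.increments[this.replicaId] ?? 0) + by;
  }

  decrement(by = 1): void {
    this.decrements[this.replicaId] = (this.decrements[this.replicaId] ?? 0) + by;
  }

  value(): number {
    const sum = (m: Record<string, number>) =>
      Object.values(m).reduce((a, b) => a + b, 0);
    return sum(this.increments) - sum(this.decrements);
  }

  // Merging is commutative, associative, and idempotent, so replicas can
  // exchange state in any order and still converge.
  merge(other: PNCounter): void {
    for (const [id, n] of Object.entries(other.increments)) {
      this.increments[id] = Math.max(this.increments[id] ?? 0, n);
    }
    for (const [id, n] of Object.entries(other.decrements)) {
      this.decrements[id] = Math.max(this.decrements[id] ?? 0, n);
    }
  }
}
```

A like count merges cleanly this way; an inventory count that must never go below zero still needs the stronger guarantees discussed in the next section.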
Beyond performance, there are semantic differences between linearizability and causal consistency that affect application correctness. Some operations fundamentally require linearizability; others work perfectly with causal consistency.
Operations that require linearizability:

- Enforcing uniqueness constraints (usernames, email addresses)
- Acquiring distributed locks or leases
- Compare-and-swap updates, such as reserving the last units of inventory at checkout
- Producing exact counts or balances at the moment of the operation

Operations that work well with causal consistency:

- Conversations, comments, and feeds, where replies must follow the messages they reference but concurrent posts can interleave
- Shopping carts and profile updates, where read-your-writes is what users actually notice
- Counters and metrics where approximate values are acceptable (often backed by CRDTs)
- Most read paths, where slightly stale but causally consistent data is fine
Ask: 'If two operations happen concurrently (neither knows about the other), is there exactly one correct outcome, or are multiple outcomes acceptable?' If exactly one outcome is required, you need linearizability. If multiple outcomes are acceptable (with eventual reconciliation), causal consistency works.
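The heuristic in code form, with hypothetical interfaces: claiming a username has exactly one correct outcome under concurrency, so it needs a linearizable compare-and-swap, while concurrently added tags can all survive, so a causal write plus a merge is enough.

```typescript
// Exactly one correct outcome: two users claim the same username concurrently,
// and only one may succeed. This needs a linearizable compare-and-swap.
interface LinearizableStore {
  // Atomically sets `key` to `value` only if the key is currently absent.
  putIfAbsent(key: string, value: string): Promise<boolean>;
}

async function claimUsername(
  store: LinearizableStore,
  username: string,
  userId: string,
): Promise<boolean> {
  return store.putIfAbsent(`username:${username}`, userId);
}

// Multiple acceptable outcomes: two users tag the same post concurrently.
// Both tags should survive, so a causally consistent write plus a
// merge-on-conflict union is enough.
function mergeTags(replicaA: Set<string>, replicaB: Set<string>): Set<string> {
  return new Set([...replicaA, ...replicaB]); // union: order doesn't matter
}
```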
The best real-world systems don't choose one consistency model for everything—they apply different consistency levels to different operations based on requirements. This hybrid approach captures benefits from both worlds.
Patterns for hybrid consistency:
```typescript
// E-commerce system with hybrid consistency

interface Product { id: string; name: string; price: number; }
interface CartItem { productId: string; quantity: number; }
interface Cart { items: CartItem[]; }
interface Order { id: string; items: CartItem[]; }

// Transaction handle exposed by the linearizable transaction manager.
interface LinearizableTxn {
  getInventory(productId: string): Promise<number>;
  decrementInventory(productId: string, quantity: number): Promise<void>;
}

interface TransactionManager {
  executeLinearizable<T>(body: (txn: LinearizableTxn) => Promise<T>): Promise<T>;
}

class InsufficientInventoryError extends Error {
  constructor(productId: string) {
    super(`Insufficient inventory for product ${productId}`);
  }
}

interface ProductCatalog {
  // Causal consistency - fine if product info is slightly stale
  getProduct(id: string, options?: { consistency: 'causal' }): Promise<Product>;

  // Strong consistency - must see latest inventory before checkout
  getInventory(productId: string, options: { consistency: 'strong' }): Promise<number>;
}

interface ShoppingCart {
  // Causal consistency - cart changes should be immediately visible to the user
  addItem(item: CartItem): Promise<void>; // Session guarantee: read-your-writes

  // Local operation, causal consistency sufficient
  getCart(): Promise<Cart>;
}

class CheckoutService {
  constructor(
    private readonly transactionManager: TransactionManager,
    private readonly createOrder: (cart: Cart) => Promise<Order>,
  ) {}

  // Linearizable - must reserve inventory atomically
  async checkout(cart: Cart): Promise<Order> {
    // This operation requires linearizability:
    // 1. Check inventory for all items
    // 2. Reserve inventory atomically
    // 3. If any item is unavailable, roll back and fail
    return await this.transactionManager.executeLinearizable(async (txn) => {
      for (const item of cart.items) {
        const inventory = await txn.getInventory(item.productId);
        if (inventory < item.quantity) {
          throw new InsufficientInventoryError(item.productId);
        }
        await txn.decrementInventory(item.productId, item.quantity);
      }
      return await this.createOrder(cart);
    });
  }
}

// Result: 99% of operations (browsing, carting) are fast (causal).
//         1% of operations (checkout) are slower but correct (linearizable).
```

Amazon's Dynamo uses this pattern extensively. Most operations use eventual consistency, but shopping cart and checkout use stronger guarantees. Google Spanner provides linearizability but also offers 'bounded staleness' reads for operations that don't need the latest data. CockroachDB offers both serializable and read committed isolation levels.
Given all these trade-offs, how do you actually decide what consistency level to use? Here's a decision framework:
| Requirement | Recommended Consistency |
|---|---|
| Must enforce unique constraints | Linearizability for that constraint |
| Need distributed locking | Linearizability |
| Compare-and-swap operations | Linearizability |
| Sub-50ms latency required globally | Causal or Eventual |
| High write throughput to hot keys | Causal with CRDTs |
| Must survive partitions without downtime | Causal or Eventual |
| User sees their own updates immediately | Causal (RYW guarantee) |
| Conversation/thread ordering | Causal |
| Approximate counts acceptable | Causal/Eventual |
| Exact counts required | Linearizability |
A good default is to build on causal consistency and add linearizability only for specific operations that provably need it. This approach gives you high performance and availability by default, paying the coordination cost only where necessary.
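One lightweight way to institutionalize that default, sketched here with illustrative names rather than a specific framework, is an explicit allowlist: every operation is causal unless it is registered with a documented reason for needing linearizability.

```typescript
type Consistency = 'causal' | 'linearizable';

// Operations must earn their place here with a documented justification.
const LINEARIZABLE_OPERATIONS: Record<string, string> = {
  'checkout.reserveInventory': 'must not oversell',
  'account.claimUsername': 'unique constraint',
  'locks.acquire': 'distributed mutual exclusion',
};

function consistencyFor(operation: string): Consistency {
  return operation in LINEARIZABLE_OPERATIONS ? 'linearizable' : 'causal';
}

console.log(consistencyFor('feed.appendPost'));           // 'causal'
console.log(consistencyFor('checkout.reserveInventory')); // 'linearizable'
```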
We've analyzed the multi-dimensional trade-offs between causal consistency and linearizability. Let's consolidate:

- Latency: linearizable operations pay a coordination round trip (often 100ms or more across regions); causal operations complete at local-replica speed.
- Availability: during partitions, linearizable systems must refuse some operations; causally consistent systems stay available and reconcile afterward.
- Throughput: serializing writes to hot keys caps linearizable throughput; causal systems accept writes on every replica and resolve conflicts asynchronously.
- Semantics: some operations (unique constraints, locks, compare-and-swap, exact counts) genuinely need linearizability; most others do not.
- Design: the strongest systems are hybrids that default to causal consistency and apply linearizability surgically.
What's next:
We've covered theory, session guarantees, implementation, and trade-offs. The final page explores practical applications—real systems that use causal consistency, design patterns for causally consistent applications, and lessons from production deployments.
You now understand the precise trade-offs between causal consistency and linearizability, and you have a framework for making consistency decisions. This knowledge enables you to design systems that balance correctness, performance, and availability based on real requirements.