If you've studied distributed systems for even a few weeks, you've encountered the CAP theorem—Eric Brewer's famous result stating that distributed systems can provide at most two of three guarantees: Consistency, Availability, and Partition Tolerance. CAP has become gospel in system design discussions, the foundation upon which architects justify their database choices and system trade-offs.
But here's a provocative question: What does the CAP theorem tell you about system behavior when the network is working perfectly?
The answer is: almost nothing.
This is the fundamental limitation that led Daniel Abadi at Yale University to propose an extension in 2012—a theorem that completes the picture CAP left unfinished. Welcome to PACELC, the theorem that explains what's actually happening in your distributed systems most of the time.
By the end of this page, you will understand why CAP theorem provides an incomplete model for distributed system trade-offs, how PACELC extends CAP to cover normal operation scenarios, and why this distinction is critical for making informed architectural decisions. You'll discover that the trade-offs you face most frequently aren't about partitions at all—they're about latency.
Before we can understand why PACELC matters, we need to revisit CAP with fresh eyes—examining not just what it says, but critically, what it leaves unsaid.
The CAP Theorem in Brief:
In 2000, Eric Brewer conjectured (and in 2002, Seth Gilbert and Nancy Lynch formally proved) that any distributed data store can provide at most two of the following three guarantees simultaneously:

- Consistency: every read receives the most recent write or an error
- Availability: every request receives a (non-error) response, even if it may not reflect the most recent write
- Partition Tolerance: the system continues to operate despite messages being dropped or delayed between nodes
The theorem states that during a network partition—when nodes cannot communicate with each other—a distributed system must choose between providing consistency or availability. You cannot have both.
Why Partition Tolerance Is Non-Negotiable:
In any realistic distributed system, network partitions will occur. Cables get cut. Switches fail. Cloud regions lose connectivity. The question isn't whether partitions happen—it's how frequently and for how long.
This means that in practice, the CAP choice reduces to CP vs. AP:
| System Type | During Partition | Example Systems |
|---|---|---|
| CP (Consistency + Partition Tolerance) | Rejects operations that can't be consistently processed | MongoDB (default), HBase, Spanner, Zookeeper |
| AP (Availability + Partition Tolerance) | Accepts operations, allows divergence | Cassandra, DynamoDB (default), CouchDB, Riak |
CAP tells us what happens during a partition. But consider this reality:
Network partitions are rare events.
In a well-engineered system with quality infrastructure, partitions might occur for minutes or hours per year—not per day. The overwhelming majority of the time, your distributed system operates without partitions. All nodes can communicate. The network is functioning normally.
So what does CAP tell us about system behavior during normal operation?
Absolutely nothing.
In the absence of partitions, CAP appears to suggest you can have both consistency and availability. But anyone who has operated a distributed database knows this isn't the full story. Even when the network is healthy, you still face fundamental trade-offs.
Here's the core insight CAP misses: during normal operation, you don't face a consistency vs. availability trade-off—you face a consistency vs. latency trade-off. This is the gap that PACELC fills, and it's the trade-off you'll encounter far more frequently in practice.
The Latency Reality:
Consider a distributed database with three replicas across different data centers:
Strong consistency requires synchronous replication — A write isn't acknowledged until all (or a majority of) replicas confirm it. This means waiting for round-trip communication to potentially distant nodes.
Lower latency allows asynchronous replication — A write is acknowledged immediately after the primary processes it. Replicas are updated asynchronously, reducing wait time but creating a window where different replicas have different data.
This trade-off exists all the time, not just during partitions. And for most applications, it matters far more than the partition behavior because it affects every single operation, every single millisecond of every single day.
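To see the shape of this trade-off in code, here is a minimal Python sketch contrasting the two acknowledgment strategies. It is not any particular database's replication protocol: `Replica`, `write_sync`, and `write_async` are illustrative names, and real systems pipeline and batch far more aggressively.

```python
import threading
import time

class Replica:
    """Hypothetical replica; apply() simulates one network round trip."""
    def __init__(self, name, rtt_ms):
        self.name, self.rtt_ms, self.data = name, rtt_ms, {}

    def apply(self, key, value):
        time.sleep(self.rtt_ms / 1000)   # stand-in for the real network delay
        self.data[key] = value

def write_sync(replicas, key, value, quorum):
    """Strong consistency: acknowledge only after `quorum` replicas confirm.
    The wait ends at roughly the RTT of the quorum-th fastest replica."""
    acks = threading.Semaphore(0)

    def replicate(replica):
        replica.apply(key, value)
        acks.release()

    for replica in replicas:
        threading.Thread(target=replicate, args=(replica,), daemon=True).start()
    for _ in range(quorum):
        acks.acquire()                   # block until enough replicas have acknowledged
    return "ack"

def write_async(replicas, key, value):
    """Low latency: acknowledge immediately, replicate in the background.
    Replicas converge later; a read may briefly return stale data."""
    for replica in replicas:
        threading.Thread(target=replica.apply, args=(key, value), daemon=True).start()
    return "ack"
```

With replicas at roughly 1 ms, 40 ms, and 80 ms round trips and quorum=2, write_sync returns after about 40 ms; write_async returns almost immediately and lets the slower replicas catch up in the background.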
In 2012, Daniel Abadi published a paper titled "Consistency Tradeoffs in Modern Distributed Database System Design" that introduced PACELC as an extension to CAP. The formulation elegantly captures both partition and normal operation behavior:
PACELC is pronounced "pass-elk" and stands for:
Partition → Availability vs Consistency; Else → Latency vs Consistency
Read it as: "If there is a Partition, choose between Availability and Consistency; Else (during normal operation), choose between Latency and Consistency."
This simple extension captures a profound insight: distributed systems make trade-offs along two dimensions, not one.
IF Partition → A or C
ELSE → L or C
A system is classified as PA/EL, PA/EC, PC/EL, or PC/EC based on its choices in both scenarios. This four-way classification provides much richer insight into system behavior than the simple CP/AP dichotomy of CAP.
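The classification itself is simple enough to write down directly. Here is a minimal sketch, with illustrative names not taken from Abadi's paper:

```python
from enum import Enum

class Priority(Enum):
    AVAILABILITY = "A"
    CONSISTENCY = "C"
    LATENCY = "L"

class PACELC:
    """A system's stance on both branches, e.g. PACELC(AVAILABILITY, LATENCY) is PA/EL."""
    def __init__(self, on_partition: Priority, on_else: Priority):
        self.on_partition = on_partition   # during a partition: A or C
        self.on_else = on_else             # during normal operation: L or C

    def priority(self, partitioned: bool) -> Priority:
        # "If Partition, trade Availability vs. Consistency; Else, trade Latency vs. Consistency."
        return self.on_partition if partitioned else self.on_else

    def label(self) -> str:
        return f"P{self.on_partition.value}/E{self.on_else.value}"

pa_el = PACELC(Priority.AVAILABILITY, Priority.LATENCY)
assert pa_el.label() == "PA/EL"
assert pa_el.priority(partitioned=False) is Priority.LATENCY   # the everyday trade-off
```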
The Four PACELC Classifications:
| Classification | During Partition | Normal Operation | Behavior Summary |
|---|---|---|---|
| PA/EL | Availability over Consistency | Latency over Consistency | Always prioritizes responsiveness; never blocks for consistency |
| PA/EC | Availability over Consistency | Consistency over Latency | Relaxes during failures but strict during normal operation |
| PC/EL | Consistency over Availability | Latency over Consistency | Strict during failures but relaxes during normal operation |
| PC/EC | Consistency over Availability | Consistency over Latency | Always prioritizes consistency; accepts higher latency and reduced availability |
Why Four Categories Matter:
The CAP theorem would classify Cassandra and DynamoDB both as "AP systems." But their behavior differs significantly during normal operation:
Cassandra with default settings is PA/EL — It prioritizes availability during partitions and prioritizes low latency during normal operation. Consistency is always secondary.
DynamoDB with strong consistency enabled is PA/EC — It prioritizes availability during partitions, but enforces strong consistency during normal operation, accepting higher latency.
This distinction is invisible under CAP but critically important for system architects choosing between these databases.
To deeply understand PACELC, we need to examine the mathematical relationships that make these trade-offs inescapable.
The Latency-Consistency Relationship:
Consider a distributed system with N replicas. For a write operation to be "strongly consistent," the system must ensure that any subsequent read—from any replica—returns that write or a later one. This requires synchronization.
The fundamental equation governing this trade-off is derived from quorum systems:
```
Strong Consistency Requirement:

    R + W > N

Where:
    R = Read quorum (number of replicas consulted for reads)
    W = Write quorum (number of replicas that must acknowledge writes)
    N = Total number of replicas

Latency Implication:
    Write Latency ≥ RTT to Wth fastest replica
    Read Latency  ≥ RTT to Rth fastest replica

For strong consistency with N=3:
    If W=2, R=2: Wait for 2 of 3 replicas (majority)
    If W=3, R=1: Wait for all 3 on writes, single replica reads
    If W=1, R=3: Immediate writes, but reads must query all
```

The Inescapable Trade-off:
To achieve strong consistency:

- Every write must wait for acknowledgments from W replicas before returning to the client
- Every read must consult R replicas, with R + W > N so that the read and write quorums overlap
- Each operation's latency is therefore bounded below by the round-trip time to the farthest replica in its quorum
In a geo-distributed system where replicas span continents, this is devastating. With W=2 and N=3 replicas spread across regions, every write must wait for at least one cross-region RTT. Strong consistency imposes a latency floor that cannot be optimized away—it's physics, not engineering.
Light in fiber travels at roughly 200,000 km/s. New York to London is ~5,500 km. The theoretical minimum RTT is ~55ms. With routing overhead and switching delays, real-world RTT is 80-150ms. No amount of optimization can beat physics—this is why the latency vs consistency trade-off is fundamental, not merely an implementation detail.
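To make the arithmetic concrete, here is a small worked example in Python. The quorum check and latency floor follow directly from the formulas above; the round-trip figures are illustrative, loosely based on the New York/London numbers.

```python
def is_strongly_consistent(r: int, w: int, n: int) -> bool:
    """Quorum condition R + W > N: read and write quorums overlap, so every
    read contacts at least one replica that holds the latest write."""
    return r + w > n

def write_latency_floor_ms(replica_rtts_ms, w: int) -> float:
    """A write waits for the w fastest acknowledgments, so its latency is
    bounded below by the round trip to the w-th fastest replica."""
    return sorted(replica_rtts_ms)[w - 1]

# Illustrative round trips: same region ~1 ms, cross-country ~40 ms, cross-Atlantic ~80 ms.
rtts = [1, 40, 80]

print(is_strongly_consistent(r=2, w=2, n=3))   # True: majority quorums overlap
print(write_latency_floor_ms(rtts, w=2))       # 40 -> every write pays a cross-region trip
print(write_latency_floor_ms(rtts, w=1))       # 1  -> the EL choice: local acknowledgment only...
print(is_strongly_consistent(r=1, w=1, n=3))   # False: ...but the quorums no longer overlap
```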
Relaxing Consistency for Latency:
If we choose eventual consistency (the EL choice in PACELC), we can:

- Acknowledge a write as soon as the local replica has accepted it (effectively W=1)
- Replicate to the remaining replicas asynchronously, in the background
- Serve reads from the nearest replica without waiting for cross-region coordination
This reduces write latency from 100ms+ to 1ms, a 100x improvement. But we sacrifice the guarantee that reads return the latest write—different replicas may temporarily have different data.
The key insight: this trade-off exists independently of partition behavior. Whether or not any partition occurs, you must choose between synchronous (consistent, slow) and asynchronous (inconsistent, fast) replication.
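That independence is easy to demonstrate. In this minimal sketch, two in-memory dictionaries stand in for replicas and an illustrative 80 ms delay stands in for cross-Atlantic replication; the network is perfectly healthy throughout, yet a read routed to the lagging replica still returns stale data.

```python
import threading
import time

primary, london = {}, {}               # two replicas' in-memory state

def replicate_async(key, value, lag_s=0.08):
    """Background replication with an illustrative ~80 ms cross-Atlantic lag."""
    time.sleep(lag_s)
    london[key] = value

# EL choice: the write is acknowledged as soon as the primary has it.
primary["balance"] = 100
threading.Thread(target=replicate_async, args=("balance", 100), daemon=True).start()
print("write acknowledged")            # client-observed latency: local, ~1 ms

print(london.get("balance"))           # None: a read served from London is momentarily stale
time.sleep(0.1)                        # wait out the replication lag
print(london.get("balance"))           # 100: the replicas converged, and no partition ever occurred
```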
CAP theorem, despite its importance, has been criticized and misinterpreted since its introduction. Understanding these limitations illuminates why PACELC was necessary.
Problem 1: The Binary Fallacy
CAP presents C and A as binary choices—you have them or you don't. Reality is far more nuanced:

- Consistency comes in degrees: linearizable, sequential, causal, read-your-writes, eventual
- Availability is measured in nines (99.9%, 99.99%), not as an all-or-nothing property
- Systems can make different choices for different operations rather than committing globally to one side
CAP's binary framing obscures these nuances. PACELC, by introducing latency as a continuous variable, implicitly acknowledges the spectrum nature of these trade-offs.
Problem 2: Partitions Are the Exception, Not the Rule
CAP's focus on partition behavior is somewhat like an automobile safety guide that only discusses what to do during a collision. Useful, certainly—but you spend 99.999% of your driving time not colliding. For that time, you need different guidance.
Similarly, systems spend most of their operational life in the 'Else' state of PACELC—no partition present. CAP provides no framework for the trade-offs during this vast majority of operational time.
Problem 3: The Consistency Confusion
CAP uses 'Consistency' to mean linearizability—the strongest consistency guarantee. But real distributed systems employ many consistency models:

- Linearizability: every operation appears to take effect atomically between its start and completion
- Sequential consistency: all nodes observe operations in the same order, though not necessarily in real time
- Causal consistency: causally related operations are observed in the same order everywhere
- Session guarantees such as read-your-writes and monotonic reads
- Eventual consistency: replicas converge once updates stop arriving
CAP's binary C/not-C framing ignores this rich spectrum. PACELC's latency dimension implicitly captures it—stronger consistency requires more coordination, thus higher latency.
Understanding the timeline of distributed systems theory helps contextualize PACELC's contribution:
The Evolution of Distributed Systems Understanding:
| Year | Contribution | Key Insight |
|---|---|---|
| 1978 | Lamport's Time, Clocks paper | Fundamental impossibility of global time in distributed systems |
| 1985 | FLP Impossibility Result | Consensus impossible with even one faulty process (async model) |
| 2000 | Brewer's CAP Conjecture | Cannot have consistency, availability, and partition tolerance simultaneously |
| 2002 | Gilbert & Lynch CAP Proof | Formal proof of CAP as impossibility result |
| 2008 | Amazon Dynamo Paper | Demonstrated practical eventual consistency at scale; influenced modern NoSQL |
| 2011 | Google Megastore Paper | Achieved strong consistency across regions with latency trade-offs |
| 2012 | Abadi's PACELC Paper | Extended CAP to address normal operation latency/consistency trade-off |
| 2012 | Google Spanner Paper | TrueTime API enabling strong consistency with bounded latency |
The Industry Context:
When Abadi proposed PACELC, the industry was grappling with a practical problem: CAP was being used to justify architectural decisions it didn't actually address. Engineers were saying:
"We're an AP system because we need availability"
But this left unanswered: what about the 99.9% of operations during normal network conditions? The AP label said nothing about whether those operations would be consistent or eventually consistent, fast or slow.
PACELC provided the vocabulary to distinguish between Cassandra (PA/EL—always prioritizing responsiveness) and a system like DynamoDB with strong reads (PA/EC—allowing inconsistency only during partitions).
The Spanner Influence:
Google's Spanner, announced the same year as PACELC, demonstrated that achieving PC/EC (strong consistency always) was possible at global scale—if you were willing to invest in specialized hardware (atomic clocks and GPS receivers) and accept bounded latency penalties. Spanner's existence validated PACELC's framing: the trade-off isn't impossibility, it's cost—in latency, complexity, or literal dollars.
In 2012, Eric Brewer himself published "CAP Twelve Years Later: How the 'Rules' Have Changed," acknowledging that the binary interpretation of CAP was too simplistic. He noted that the consistency/availability trade-off is continuous, and that systems can implement different strategies for different operations. PACELC formalizes much of what Brewer clarified.
Let's examine how popular distributed systems classify under PACELC, revealing the nuances invisible under CAP's simpler model:
| System | PACELC | Partition Behavior | Normal Operation Behavior |
|---|---|---|---|
| Cassandra | PA/EL | Accepts writes on both sides, reconciles later | Prioritizes low latency; tunable consistency levels but defaults favor speed |
| DynamoDB (eventual) | PA/EL | Remains available across partitions | Eventually consistent reads are fast; writes use local quorum |
| DynamoDB (strong) | PA/EC | Remains available across partitions | Strong reads wait for consensus; higher latency for consistency |
| MongoDB | PC/EC | Primary becomes unavailable during election | All writes go to primary; reads can be from primary for consistency |
| PostgreSQL (streaming) | PC/EC | Primary unavailable if standbys unreachable | Synchronous replication ensures consistency; adds latency |
| CockroachDB | PC/EC | Majority required for operations | Consensus-based writes; latency floor determined by quorum |
| Google Spanner | PC/EC | Operations blocked if synchrony lost | TrueTime enables consistency with bounded latency ~10ms |
| Riak | PA/EL | Accepts conflicting writes, uses CRDTs | Eventual consistency with low latency; vector clocks track causality |
Analysis of Classifications:
Why are there more PA/EL and PC/EC systems than mixed classifications?
Systems tend to cluster at the extremes because organizations have clear priorities:
PA/EL systems serve use cases where speed and availability trump correctness: social media feeds, caching layers, session stores, analytics aggregation. Temporary inconsistency is acceptable.
PC/EC systems serve use cases where correctness is paramount: financial transactions, inventory management, coordination services. Latency is acceptable.
The mixed classifications (PA/EC, PC/EL) represent more sophisticated approaches:
PA/EC says: "During normal times, we can afford to wait for consistency. But during a crisis, availability matters more." DynamoDB with strongly consistent reads exemplifies this.
PC/EL is rare and represents an unusual philosophy: "Enforce consistency when things are broken, but relax when healthy." This is counterintuitive and thus uncommon; Yahoo's PNUTS is the classic PC/EL example cited in Abadi's analysis.
Many modern systems are PACELC-configurable rather than fixed. Cassandra and DynamoDB allow per-operation consistency levels. MongoDB offers read concern and write concern settings. This configurability lets a single system behave as PA/EL for some operations and PC/EC for others—providing flexibility that neither CAP nor a fixed PACELC classification captures.
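As a concrete illustration, here is how that per-operation choice looks from client code. This is a sketch, not a recommended setup: the `orders` table, `shop` keyspace, and key values are hypothetical, while the knobs themselves (boto3's `ConsistentRead` flag and the Python Cassandra driver's per-statement `consistency_level`) are the standard per-request switches.

```python
import boto3
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

# DynamoDB: the same table can serve PA/EL-style and PA/EC-style reads.
orders = boto3.resource("dynamodb").Table("orders")            # hypothetical table
orders.get_item(Key={"order_id": "42"})                         # eventually consistent: default, lower latency
orders.get_item(Key={"order_id": "42"}, ConsistentRead=True)    # strongly consistent: higher latency

# Cassandra: the consistency level is chosen per statement.
session = Cluster(["127.0.0.1"]).connect("shop")                # hypothetical keyspace
query = "SELECT * FROM orders WHERE order_id = %s"
session.execute(SimpleStatement(query, consistency_level=ConsistencyLevel.ONE), ("42",))     # latency-first
session.execute(SimpleStatement(query, consistency_level=ConsistencyLevel.QUORUM), ("42",))  # consistency-first
```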
We've established why CAP provides an incomplete picture and how PACELC fills the gap. Let's consolidate the key insights:

- CAP only describes behavior during partitions, which are rare; it says nothing about normal operation
- During normal operation, the real trade-off is latency vs. consistency, and it applies to every single request
- PACELC's four classifications (PA/EL, PA/EC, PC/EL, PC/EC) distinguish systems that CAP would lump together as simply "AP" or "CP"
The Practical Impact:
Understanding PACELC changes how you approach system design:

- Ask two questions of every data store: what does it sacrifice during a partition, and what does it sacrifice the rest of the time?
- Treat latency as a first-class requirement that is traded explicitly against consistency, not discovered in production
- Recognize that an "AP" or "CP" label alone says almost nothing about the behavior of the vast majority of your operations
What's next:
In the following pages, we'll explore each side of these trade-offs in greater depth.
You now understand why PACELC extends CAP to provide a complete framework for distributed system trade-offs. The next page will explore what happens during normal operation—the 'Else' clause of PACELC—where your system spends the vast majority of its time.