You're the architect of a new distributed system. The product requirements seem simple: users need to read and write data, the system must handle millions of requests, and it shouldn't go down. But hidden in these innocent requirements is a profound decision that will shape every aspect of your system.
The Question: When your system is partitioned—and it will be—should it refuse some requests to maintain consistency, or continue serving requests that might return stale data?
This question has no universally correct answer. The right choice depends on your specific use case, your users' expectations, and the consequences of different failure modes. Banking systems can't show incorrect balances. Social media feeds can tolerate brief inconsistency. E-commerce carts might need to work both ways.
In this page, we'll develop a framework for making this decision—a structured approach to reasoning about CAP trade-offs that you can apply to any distributed system design.
By the end of this page, you will understand a decision framework for choosing between CP and AP, the business and technical factors that influence CAP trade-offs, how to apply different consistency levels to different data, and the nuanced reality that CAP is not a binary choice but a spectrum of possibilities.
Since partition tolerance is mandatory, the CAP choice reduces to:
CP (Consistency + Partition Tolerance): During a partition, the system may become unavailable but never returns incorrect data.
Characteristics:
- Refuses or delays requests when it cannot reach a quorum
- Never serves stale or conflicting data
- Higher write latency from synchronous replication and coordination
AP (Availability + Partition Tolerance): During a partition, the system remains available but may return stale or inconsistent data.
Characteristics:
- Every reachable node keeps accepting reads and writes
- Replicas may diverge; conflicts must be resolved after the partition heals
- Lower latency, since replication can be asynchronous
| Aspect | CP Systems | AP Systems |
|---|---|---|
| During Partition | Some requests fail | All requests succeed (from available nodes) |
| Data Consistency | Always consistent | Eventually consistent |
| Write Conflicts | Prevented by design | Must be resolved |
| User Experience | May see errors/timeouts | May see stale data |
| After Partition Heals | Resume normally | Merge/resolve conflicts |
| Implementation Complexity | Simpler semantics | Complex conflict resolution |
| Latency (normal ops) | Higher (sync replication) | Lower (async possible) |
| Throughput | Limited by coordination | Higher (parallel writes) |
The Core Question:
To choose between CP and AP, ask yourself:
"Which is worse for my users: seeing an error, or seeing incorrect data?"
This simple heuristic captures the essence of the CAP trade-off, but real systems require more nuanced analysis.
While CAP focuses on partition behavior, your CP/AP choice affects normal operations too. CP systems incur consistency overhead even without partitions (synchronous replication, quorum writes). AP systems enjoy lower latency always, but require conflict handling infrastructure. Consider the full operational profile, not just the partition edge case.
Making the CAP trade-off is not purely technical—it involves business, operational, and user experience considerations:
Business Factors:
- Revenue impact of downtime vs. errors: which costs more for your product, a request that fails or a response that is wrong? An abandoned checkout and an oversold item carry very different price tags.
- Regulatory requirements: finance and healthcare rules often mandate accurate, auditable records, which pushes toward CP.
- SLA commitments: a contractual uptime target (say, 99.99%) pushes toward AP, since refused requests count against availability.

User Experience Factors:
- User expectations: users generally expect to see their own writes immediately; read-your-writes often matters more than global consistency.
- Visibility of inconsistency: a stale view count goes unnoticed; a stale account balance does not.
- Recovery options: if a user can simply refresh or retry, brief inconsistency is cheap; if the action is irreversible, correctness wins.
| Use Case | Consistency Need | Availability Need | Recommendation |
|---|---|---|---|
| Bank account balances | Critical (money!) | High (but correctness > uptime) | CP |
| Inventory counts | High (don't oversell) | High | CP or CP with degradation |
| Shopping cart contents | Medium | Critical | AP with merge |
| Social media feed | Low | Critical | AP |
| User authentication | High (security) | High | CP with local cache |
| Real-time bidding | Critical (auctions) | Critical | CP (accept lower availability) |
| Analytics/metrics | Low | Medium | AP |
| Distributed locks | Critical | Lower | CP |
| Configuration data | High | High | CP (small data, rare changes) |
| Session data | Medium | High | AP with TTL |
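The table above can be compressed into a first-pass heuristic. The sketch below is illustrative only (the `recommend` function and its level names are our invention, not from any framework); it encodes "correctness first when consistency needs are high, availability first otherwise":

```python
def recommend(consistency_need: str, availability_need: str) -> str:
    """First-pass CAP recommendation from the two axes in the table.

    Levels: 'low', 'medium', 'high', 'critical'. A starting point
    for discussion, not a substitute for real analysis.
    """
    rank = {"low": 0, "medium": 1, "high": 2, "critical": 3}
    c, a = rank[consistency_need], rank[availability_need]
    if c >= 2:
        # High consistency need: correctness outranks uptime.
        # Recover availability with caches or degraded read-only modes.
        return "CP"
    if c == 1:
        # Availability wins, but divergent writes must be merged later.
        return "AP with conflict resolution"
    return "AP"

print(recommend("critical", "high"))    # bank balance  -> CP
print(recommend("medium", "critical"))  # shopping cart -> AP with conflict resolution
print(recommend("low", "critical"))     # social feed   -> AP
```

Note that the heuristic deliberately breaks ties toward consistency: when both needs are high (authentication, real-time bidding), you start from CP and buy back availability with caches and degraded modes, as the table suggests.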
Almost every real system has data with different consistency requirements. The question isn't 'Should my system be CP or AP?' but 'Which parts of my system should be CP, and which should be AP?' This leads to heterogeneous consistency, where different data stores or even different tables have different consistency levels.
Modern distributed databases recognize that CAP is not a binary choice. Instead of baking a fixed CP or AP decision into the system, they offer tunable consistency—the ability to choose consistency levels per operation.
Cassandra Consistency Levels:
Apache Cassandra exemplifies tunable consistency. For each read or write, you specify a consistency level:
| Level | Write Requirement | Read Requirement | Consistency | Availability |
|---|---|---|---|---|
| ANY | One node (including hinted handoff) | N/A | Lowest | Highest |
| ONE | One replica | One replica | Low | High |
| TWO | Two replicas | Two replicas | Medium-Low | Medium-High |
| THREE | Three replicas | Three replicas | Medium | Medium |
| QUORUM | Majority of replicas | Majority of replicas | High | Medium-Low |
| LOCAL_QUORUM | Majority in local DC | Majority in local DC | High (local) | Medium |
| EACH_QUORUM | Majority in each DC | N/A | Very High | Low |
| ALL | All replicas | All replicas | Highest | Lowest |
Using Tunable Consistency in Practice:
The power of tunable consistency is in mixing levels:
Strong Consistency When Needed (Quorum):
WRITE at QUORUM + READ at QUORUM → Strong consistency
(W + R > N guarantees the read and write quorums overlap, so reads see the latest acknowledged write; full linearizability in Cassandra additionally requires lightweight transactions)
Eventual Consistency for Speed (ONE):
WRITE at ONE + READ at ONE → Eventual
(Fast but may read stale data)
Durable Writes, Fast Reads:
WRITE at QUORUM + READ at ONE → Writer-heavy consistency
(Writes are durable, reads may be stale)
Fast Writes, Consistent Reads:
WRITE at ONE + READ at ALL → Reader-heavy consistency
(Writes are fast, reads get latest value)
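The overlap rule behind these combinations is plain arithmetic: with N replicas, a write acknowledged by W nodes and a read from R nodes must share at least one node whenever W + R > N. A minimal check (the `overlaps` helper is ours) for a replication factor of 3:

```python
def overlaps(w: int, r: int, n: int = 3) -> bool:
    """True if every R-node read set intersects every W-node write set,
    i.e. reads are guaranteed to see the latest acknowledged write."""
    return w + r > n

QUORUM = 3 // 2 + 1  # 2 of 3 replicas

print(overlaps(QUORUM, QUORUM))  # True:  QUORUM + QUORUM -> consistent
print(overlaps(1, 1))            # False: ONE + ONE       -> may read stale
print(overlaps(QUORUM, 1))       # False: QUORUM + ONE    -> durable writes, stale reads possible
print(overlaps(1, 3))            # True:  ONE + ALL       -> fast writes, consistent reads
```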
Here is how these choices look in application code, using the Python Cassandra driver:

```python
import json
import uuid

from cassandra.cluster import Cluster
from cassandra import ConsistencyLevel


class TunableConsistencyDemo:
    """
    Demonstrates using different consistency levels for different
    operations based on their requirements.
    """

    def __init__(self, hosts=['localhost']):
        self.cluster = Cluster(hosts)
        self.session = self.cluster.connect('myapp')

    def update_user_balance(self, user_id: str, new_balance: float):
        """
        Financial data: Use strong consistency.
        We cannot tolerate stale reads or lost writes.
        """
        statement = self.session.prepare(
            "UPDATE accounts SET balance = ? WHERE user_id = ?"
        )
        # QUORUM: Majority of replicas must acknowledge
        statement.consistency_level = ConsistencyLevel.QUORUM
        self.session.execute(statement, [new_balance, user_id])
        # Write is now durable on majority of nodes

    def get_user_balance(self, user_id: str) -> float:
        """
        Reading balance: Also use QUORUM to ensure we see the latest write.
        WRITE(QUORUM) + READ(QUORUM): read and write quorums overlap.
        """
        statement = self.session.prepare(
            "SELECT balance FROM accounts WHERE user_id = ?"
        )
        statement.consistency_level = ConsistencyLevel.QUORUM
        result = self.session.execute(statement, [user_id])
        return result.one().balance

    def post_social_update(self, user_id: str, content: str):
        """
        Social media post: Prioritize availability.
        It's okay if the post takes a moment to appear everywhere.
        """
        statement = self.session.prepare(
            "INSERT INTO posts (user_id, content, timestamp) "
            "VALUES (?, ?, toTimestamp(now()))"
        )
        # ONE: Just one replica is enough
        statement.consistency_level = ConsistencyLevel.ONE
        self.session.execute(statement, [user_id, content])
        # Fast response, eventual propagation to other replicas

    def get_user_feed(self, user_id: str, limit: int = 50):
        """
        Feed: Eventually consistent is fine.
        Missing a very recent post for a few seconds is acceptable.
        """
        statement = self.session.prepare(
            "SELECT * FROM posts WHERE user_id = ? LIMIT ?"
        )
        statement.consistency_level = ConsistencyLevel.ONE
        return self.session.execute(statement, [user_id, limit])

    def update_inventory(self, product_id: str, expected: int, delta: int) -> bool:
        """
        Inventory: Strong consistency to prevent overselling.
        Uses a lightweight transaction (LWT) as a compare-and-set:
        Cassandra cannot combine `count = count + ?` with an IF condition,
        so we read-modify-write with an IF guard on the expected value.
        """
        statement = self.session.prepare(
            "UPDATE inventory SET count = ? WHERE product_id = ? IF count = ?"
        )
        # Serial consistency ensures linearizability for the LWT
        statement.serial_consistency_level = ConsistencyLevel.SERIAL
        result = self.session.execute(
            statement, [expected + delta, product_id, expected]
        )
        return result.was_applied

    def log_analytics_event(self, event_data: dict):
        """
        Analytics: Availability is paramount, consistency doesn't matter.
        Even if we lose a few events, it's fine.
        """
        statement = self.session.prepare(
            "INSERT INTO analytics_events (event_id, data) VALUES (?, ?)"
        )
        # ANY: Even hinted handoff counts - maximum availability
        statement.consistency_level = ConsistencyLevel.ANY
        self.session.execute(statement, [uuid.uuid4(), json.dumps(event_data)])


# Summary of consistency choices:
#
# Data Type         | Write Level  | Read Level | Rationale
# -------------------------------------------------------------------------
# Account balance   | QUORUM       | QUORUM     | Financial accuracy
# Inventory counts  | SERIAL (LWT) | SERIAL     | Prevent overselling
# User profile      | QUORUM       | ONE        | Durable, fast reads
# Social posts     | ONE          | ONE        | Speed over freshness
# Analytics events  | ANY          | N/A        | Fire and forget
# Distributed locks | SERIAL       | SERIAL     | Must be linearizable
```

DynamoDB offers 'Eventually Consistent Reads' (default) and 'Strongly Consistent Reads' (2x cost). MongoDB has 'Write Concern' (how many replicas must ack) and 'Read Concern' (what data can be read). Most modern distributed databases provide some form of tunable consistency—it's the industry's pragmatic response to CAP.
As we explored briefly in the Availability page, the PACELC theorem extends CAP to address normal operation:
If there is a Partition (P), choose between Availability (A) and Consistency (C); Else (E), choose between Latency (L) and Consistency (C).
This is crucial because partitions are (hopefully) rare, but latency is constant. PACELC captures the reality that consistency has a cost even when everything is working.
The Full Trade-off Matrix:
PACELC gives us four possible system types:
| System | Partition Behavior | Normal Behavior | Classification | Best For |
|---|---|---|---|---|
| Cassandra | Available | Low Latency | PA/EL | High-write, globally distributed |
| DynamoDB | Available | Low Latency | PA/EL | Web applications, gaming |
| Riak | Available | Low Latency | PA/EL | IoT, session stores |
| MongoDB (default) | Available | Low Latency | PA/EL | General purpose |
| CockroachDB | Consistent | Consistent | PC/EC | OLTP needing SQL |
| Google Spanner | Consistent | Consistent | PC/EC | Global transactions |
| ZooKeeper | Consistent | Consistent | PC/EC | Coordination, config |
| etcd | Consistent | Consistent | PC/EC | Kubernetes state |
| Yahoo PNUTS | Consistent | Low Latency | PC/EL | Geo user data |
| VoltDB | Consistent | Low Latency | PC/EL | In-memory OLTP |
Why PACELC Matters for Decision Making:
CAP only helps you reason about partition scenarios. PACELC helps you understand the full trade-off:
Scenario: E-commerce product catalog
CAP analysis: during a partition, keep serving the catalog from any reachable node (choose A). A briefly stale price or description beats an error page.
PACELC analysis: even when there is no partition, catalog reads dominate traffic and shoppers are latency-sensitive, so also choose L over C: replicate asynchronously and serve reads from the nearest replica. The catalog is a PA/EL workload.
The PACELC lens reminds you that your normal-operation performance depends on your consistency choices too.
Partitions might occur 0.1% of the time. Normal operation is 99.9% of the time. PACELC encourages you to optimize for the common case while having a clear plan for partitions. A PA/EL system that's fast 99.9% of the time and eventually consistent during rare partitions is often the right trade-off for user-facing applications.
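The arithmetic behind this point can be made explicit. In the sketch below every unit cost is an invented illustration (not a measurement); the takeaway is that the 99.9% regime dominates the expected value:

```python
PARTITION_FRACTION = 0.001   # partitioned 0.1% of the time (assumption)

def expected_cost(latency_cost: float, partition_cost: float) -> float:
    """Expected per-request cost, weighting each regime by its frequency."""
    return (1 - PARTITION_FRACTION) * latency_cost + PARTITION_FRACTION * partition_cost

# Illustrative unit costs (pure assumptions):
# PC/EC pays synchronous-coordination latency on every request,
# and returns errors during partitions.
pc_ec = expected_cost(latency_cost=5.0, partition_cost=100.0)
# PA/EL is cheap on every request, and serves stale data during partitions.
pa_el = expected_cost(latency_cost=1.0, partition_cost=10.0)

print(round(pc_ec, 3))  # 5.095 -> dominated by the everyday latency term
print(round(pa_el, 3))  # 1.009 -> the rare partition term barely registers
```

Whatever numbers you plug in, the normal-operation term is weighted 999x more heavily than the partition term, which is exactly PACELC's argument for taking the "Else" branch seriously.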
Real applications don't have uniform consistency needs. Different types of data require different trade-offs. Polyglot consistency is the practice of using different consistency levels—and even different databases—for different parts of your data.
Example: E-commerce Platform Data Tiers:
| Data Type | Consistency Need | Why | Technology Choice |
|---|---|---|---|
| Customer accounts | Strong | Security, authentication | PostgreSQL (CP) |
| Order records | Strong | Financial, legal | PostgreSQL (CP) |
| Inventory counts | Strong (or safe) | Prevent overselling | PostgreSQL with row locks |
| Shopping cart | Session/Eventual | UX priority | Redis with replication |
| Product catalog | Eventual | Changes infrequent | Elasticsearch (AP) |
| Recommendations | Eventual | Approximate is fine | Redis or Cassandra |
| View counts | Eventual | Accuracy not critical | Cassandra (AP) |
| User sessions | Session-level | Sticky sessions help | Redis Cluster |
Implementing Polyglot Consistency:
Architecture Pattern 1: Multiple Databases
Use different databases for different consistency needs: for example, PostgreSQL for orders, Redis for carts, and Cassandra for analytics, as in the table above.
Architecture Pattern 2: Single Database, Tunable Levels
Use a database with tunable consistency for everything: for example, Cassandra with a per-query consistency level, as in the earlier tunable-consistency example.
Architecture Pattern 3: Primary + Replicas with Different Consistency
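A minimal sketch of Pattern 3, with toy in-memory `Node` objects standing in for a real primary and its replicas (all names here are ours): strong reads go to the primary, cheap reads go to a possibly lagging replica.

```python
class Node:
    """Stand-in for one database node holding key -> value."""
    def __init__(self):
        self.data = {}


class ReadRouter:
    """Primary + replicas: consistency-critical reads hit the primary,
    everything else hits a replica that may lag behind."""

    def __init__(self, primary: Node, replicas: list[Node]):
        self.primary = primary
        self.replicas = replicas

    def write(self, key, value):
        # All writes go to the primary; replication is asynchronous,
        # so replicas lag until replicate() runs.
        self.primary.data[key] = value

    def replicate(self):
        # Async replication catching up (simplified as a full copy).
        for r in self.replicas:
            r.data.update(self.primary.data)

    def read(self, key, strong: bool = False):
        if strong:
            return self.primary.data.get(key)   # read-your-writes
        return self.replicas[0].data.get(key)   # cheap, possibly stale


router = ReadRouter(Node(), [Node()])
router.write("price", 100)
print(router.read("price", strong=True))  # 100  (primary has it immediately)
print(router.read("price"))               # None (replica hasn't caught up)
router.replicate()
print(router.read("price"))               # 100  (eventually consistent)
```

The same shape appears in production as "read from primary" vs. "read from replica" options in PostgreSQL, MySQL, and MongoDB deployments; the router just makes the consistency choice explicit per read.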
Putting the pattern into code, a unified data layer can route each operation to the store and consistency level it needs (the client classes and domain models are assumed to be defined elsewhere):

```python
import json
from datetime import datetime
from enum import Enum

# Assumed to exist elsewhere: PostgreSQLClient, CassandraClient, RedisClient,
# ElasticsearchClient, and the domain models Account, Cart, CartItem,
# Session, Product.


class ConsistencyLevel(Enum):
    STRONG = "strong"      # Read-your-writes, linearizable
    SESSION = "session"    # Consistent within user session
    EVENTUAL = "eventual"  # May return stale data


class PolyglotDataLayer:
    """
    Unified data layer that routes to appropriate stores
    based on data type and consistency requirements.
    """

    def __init__(self):
        # Different stores for different consistency needs
        self.postgres = PostgreSQLClient()          # Strong consistency
        self.cassandra = CassandraClient()          # Tunable consistency
        self.redis = RedisClient()                  # Session/cache
        self.elasticsearch = ElasticsearchClient()  # Search (eventual)

    # ==========================================
    # STRONG CONSISTENCY OPERATIONS
    # ==========================================

    def get_account(self, user_id: str) -> Account:
        """User account: Strong consistency required."""
        return self.postgres.query(
            "SELECT * FROM accounts WHERE id = ?", [user_id]
        )

    def transfer_funds(self, from_id: str, to_id: str, amount: float) -> bool:
        """Financial transaction: ACID required."""
        with self.postgres.transaction():
            self.postgres.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                [amount, from_id]
            )
            self.postgres.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                [amount, to_id]
            )
        return True  # Or raise on failure

    def reserve_inventory(self, product_id: str, quantity: int) -> bool:
        """Inventory: Strong consistency to prevent overselling."""
        # Use row-level locking for an atomic, guarded decrement
        result = self.postgres.execute(
            """
            UPDATE inventory SET available = available - ?
            WHERE product_id = ? AND available >= ?
            """,
            [quantity, product_id, quantity]
        )
        return result.rows_affected > 0

    # ==========================================
    # SESSION CONSISTENCY OPERATIONS
    # ==========================================

    def get_cart(self, user_id: str) -> Cart:
        """Shopping cart: Session consistency is sufficient."""
        # Redis with session affinity ensures the user sees their updates
        cart_data = self.redis.get(f"cart:{user_id}")
        return Cart.from_json(cart_data) if cart_data else Cart()

    def add_to_cart(self, user_id: str, item: CartItem):
        """Add to cart: Fast, async replication is fine."""
        cart = self.get_cart(user_id)
        cart.add_item(item)
        self.redis.set(f"cart:{user_id}", cart.to_json(), ex=3600)

    def get_session(self, session_id: str) -> Session:
        """Session data: User only needs to see their own updates."""
        return self.redis.get(f"session:{session_id}")

    # ==========================================
    # EVENTUAL CONSISTENCY OPERATIONS
    # ==========================================

    def get_product(self, product_id: str) -> Product:
        """
        Product details: Eventual consistency is fine.
        Product changes are rare, and briefly stale data is OK.
        """
        return self.cassandra.query(
            "SELECT * FROM products WHERE id = ?",
            [product_id],
            consistency=ConsistencyLevel.EVENTUAL  # maps to Cassandra ONE
        )

    def search_products(self, query: str) -> list[Product]:
        """Search: Eventual consistency (index may be slightly behind)."""
        return self.elasticsearch.search("products", query)

    def log_product_view(self, user_id: str, product_id: str):
        """Analytics event: Fire-and-forget, eventual is fine."""
        self.cassandra.execute(
            "INSERT INTO product_views (user_id, product_id, ts) VALUES (?, ?, ?)",
            [user_id, product_id, datetime.now()],
            consistency=ConsistencyLevel.EVENTUAL  # maps to Cassandra ANY
        )

    def get_recommendations(self, user_id: str) -> list[Product]:
        """Recommendations: Computed periodically, staleness acceptable."""
        # Cached in Redis, refreshed every hour
        cached = self.redis.get(f"recs:{user_id}")
        if cached:
            return [Product.from_id(p) for p in json.loads(cached)]
        # Fall back to Cassandra with eventual consistency
        return self.cassandra.query(
            "SELECT * FROM user_recommendations WHERE user_id = ?",
            [user_id],
            consistency=ConsistencyLevel.EVENTUAL
        )


# The key insight: Different operations have legitimately different needs.
# Using strong consistency everywhere is wasteful.
# Using eventual consistency everywhere is dangerous.
# Polyglot consistency gives each operation what it actually needs.
```

Polyglot consistency adds architectural complexity. You now have multiple data stores to manage, sync, and monitor. Only adopt this pattern if different consistency needs are genuinely present in your application. For simpler applications, a single database with tunable consistency may suffice.
Understanding CAP is one thing; applying it correctly is another. Here are pitfalls that catch even experienced engineers:
Mistake: Assuming Your Partition Strategy Will Work
Many teams design a partition handling strategy but never test it. Typical untested assumptions include:
- "Failover will be automatic and invisible to clients"
- "Write conflicts will be rare enough to resolve by hand"
- "The minority side will detect the partition and fail fast"
These assumptions must be tested under realistic partition conditions. Netflix's Chaos Monkey, and the broader Chaos Engineering discipline, exist precisely because untested strategies fail when you need them most.
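Even a toy model can expose a wrong assumption before production does. The sketch below (all names are ours) simulates a quorum-based CP write path and injects a partition; the belief that "writes always succeed after failover" breaks the moment a majority is unreachable:

```python
class QuorumWriter:
    """Toy CP write path: a write succeeds only if a majority of
    replicas acknowledge it. Partitioned replicas never respond."""

    def __init__(self, replica_count: int):
        self.replicas = [dict() for _ in range(replica_count)]
        self.partitioned = set()   # replica indices unreachable from us

    def write(self, key, value) -> bool:
        acks = 0
        for i, replica in enumerate(self.replicas):
            if i in self.partitioned:
                continue           # no response across the partition
            replica[key] = value
            acks += 1
        return acks > len(self.replicas) // 2   # need a strict majority


cluster = QuorumWriter(replica_count=3)
print(cluster.write("k", "v1"))   # True: all 3 replicas reachable

cluster.partitioned = {1, 2}      # inject a partition: lose 2 of 3
print(cluster.write("k", "v2"))   # False: the minority side must refuse

cluster.partitioned = {2}         # a single node down is fine
print(cluster.write("k", "v3"))   # True: 2 of 3 is still a majority
```

Real chaos experiments do the same thing against live infrastructure, with tools that drop or delay network traffic between nodes; the principle is identical: inject the partition on purpose and verify the system does what your design document claims.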
Mistake: Not Documenting the Trade-off
When you make a CAP decision, document it:
- Which parts of the system are CP and which are AP
- What users will experience during a partition (errors, timeouts, stale reads)
- How conflicts are detected and resolved after the partition heals
- Why this trade-off was chosen over the alternative
Without documentation, the next engineer (or future you) will make assumptions that break the system.
The biggest CAP mistake is pretending your system provides both C and A during partitions. If your documentation says 'highly available AND strongly consistent,' you're either making a false claim, using non-standard definitions, or haven't tested partitions. Be honest about your trade-offs so users can make informed decisions.
We've developed a comprehensive framework for reasoning about CAP trade-offs. The key insights:
- Partition tolerance is not optional, so the real decision is partition behavior: refuse requests (CP) or risk stale data (AP).
- The deciding question: which is worse for your users, an error or incorrect data?
- CAP is a spectrum, not a binary choice: tunable consistency lets you decide per operation.
- PACELC extends the analysis to normal operation, where the trade is latency vs. consistency.
- Polyglot consistency gives each kind of data the guarantee it actually needs.
What's Next:
We've explored the CAP theorem's components and how to make trade-off decisions. The final page of this module examines System Classification—how different distributed systems are categorized based on their CAP behavior, with examples of production systems and their design rationales.
You now have a framework for making CAP trade-offs. You understand the factors that influence the decision, how tunable consistency provides flexibility, the importance of PACELC for normal operation, and the value of polyglot consistency for heterogeneous data. This knowledge enables you to design distributed systems that make conscious, informed trade-offs.