In every quorum-based distributed database—whether Cassandra, Riak, Amazon's original Dynamo, or a custom-built system—there exist three fundamental parameters that determine nearly everything about the system's behavior: N, W, and R. These aren't just configuration options; they're the dials that let you precisely tune where your system sits in the tradeoff space among consistency, availability, durability, and performance.
Understanding these parameters deeply—not just their definitions, but their interactions, implications, and optimal configurations for different scenarios—is what separates engineers who configure databases from engineers who design systems that work at scale.
By the end of this page, you will master the formal definitions of N, W, and R; understand how to calculate optimal values for different workloads; analyze the tradeoffs between various configurations; recognize common configuration patterns used in production systems; and develop intuition for tuning these parameters in your own distributed systems.
Before diving into configurations and tradeoffs, let's establish rigorous definitions for each parameter. These definitions may seem simple, but precision here prevents confusion later.
N — Replication Factor
N represents the total number of replicas (copies) of each piece of data in the system. When you write a key-value pair, the system stores N copies across N different nodes.
W — Write Quorum
W is the number of replicas that must acknowledge a write before the operation is considered successful. The write is sent to all N replicas, but the client receives confirmation after W acknowledgments.
R — Read Quorum
R is the number of replicas that must respond to a read before the system returns a value. The read is sent to multiple replicas, and once R of them respond, the system returns the most recent version among those responses.
| Parameter | What It Controls | Where Typically Set | Changeable Per-Request? |
|---|---|---|---|
| N (Replication Factor) | Number of data copies | Keyspace/Table creation | No (requires data migration) |
| W (Write Quorum) | Writes needed to confirm | Global default + per-operation | Yes |
| R (Read Quorum) | Reads needed to respond | Global default + per-operation | Yes |
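To make these definitions concrete, here is a minimal sketch of a coordinator applying W and R. The `Replica`, `QuorumCoordinator`, and `waitForQuorum` names are hypothetical illustrations, not any particular database's API; real systems add timeouts, hinted handoff, and read repair on top of this basic flow.

```typescript
// Minimal quorum coordinator sketch (illustrative, in-memory replicas only).

interface Versioned<T> {
  value: T;
  timestamp: number; // last-write-wins version for this sketch
}

class Replica<T> {
  private store = new Map<string, Versioned<T>>();

  async write(key: string, record: Versioned<T>): Promise<void> {
    this.store.set(key, record);
  }

  async read(key: string): Promise<Versioned<T> | undefined> {
    return this.store.get(key);
  }
}

class QuorumCoordinator<T> {
  constructor(
    private replicas: Replica<T>[], // N = replicas.length
    private w: number,
    private r: number
  ) {}

  // Send the write to all N replicas, resolve once W have acknowledged.
  async write(key: string, value: T): Promise<void> {
    const record = { value, timestamp: Date.now() };
    const acks = this.replicas.map(rep => rep.write(key, record));
    await waitForQuorum(acks, this.w);
  }

  // Query all N replicas, resolve once R have responded; return the newest version seen.
  async read(key: string): Promise<T | undefined> {
    const reads = this.replicas.map(rep => rep.read(key));
    const responses = await waitForQuorum(reads, this.r);
    const newest = responses
      .filter((v): v is Versioned<T> => v !== undefined)
      .sort((a, b) => b.timestamp - a.timestamp)[0];
    return newest?.value;
  }
}

// Resolve with the first `quorum` successful results; reject once a quorum is unreachable.
async function waitForQuorum<T>(promises: Promise<T>[], quorum: number): Promise<T[]> {
  return new Promise((resolve, reject) => {
    const results: T[] = [];
    let failures = 0;
    for (const p of promises) {
      p.then(res => {
        results.push(res);
        if (results.length === quorum) resolve(results);
      }).catch(() => {
        failures += 1;
        if (failures > promises.length - quorum) reject(new Error('quorum unreachable'));
      });
    }
  });
}
```

The CQL below shows how the same three parameters surface in a real system.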
```sql
-- N: Replication Factor set at keyspace creation
CREATE KEYSPACE financial_data
WITH REPLICATION = {
  'class': 'NetworkTopologyStrategy',
  'datacenter1': 3,  -- N=3 in datacenter1
  'datacenter2': 3   -- N=3 in datacenter2 (total 6 replicas)
};

-- W: Write consistency level (can be set per query)
-- Writing with QUORUM (majority): W = floor(N/2) + 1 = 2
INSERT INTO accounts (id, balance) VALUES ('user-123', 10000)
USING CONSISTENCY QUORUM;

-- W: Writing to ALL replicas (W = N = 3)
INSERT INTO critical_config (key, value) VALUES ('rate_limit', '100')
USING CONSISTENCY ALL;

-- W: Writing with acknowledgment from any single node (W = 1)
INSERT INTO access_logs (user_id, timestamp, action) VALUES ('user-123', now(), 'login')
USING CONSISTENCY ONE;

-- R: Read consistency level (can be set per query)
-- Reading with QUORUM: R = floor(N/2) + 1 = 2
SELECT balance FROM accounts WHERE id = 'user-123'
USING CONSISTENCY QUORUM;

-- R: Reading from ONE replica (fast but potentially stale)
SELECT timestamp FROM access_logs WHERE user_id = 'user-123'
USING CONSISTENCY ONE;

-- Combined: Strong consistency requires W + R > N
-- With N=3, W=QUORUM(2), R=QUORUM(2): 2+2=4 > 3 ✓ Strong consistency
-- With N=3, W=ONE(1), R=ONE(1): 1+1=2 < 3 ✗ Eventual consistency
```

The relationship W + R > N is the cornerstone of quorum-based consistency, but understanding it deeply requires exploring the mathematics from multiple angles.
Why Greater Than, Not Greater Than or Equal?
The strict inequality (>) rather than (≥) is crucial. If W + R = N exactly, it's possible for the write quorum and read quorum to be perfectly disjoint—the write touches one subset of nodes while the read touches the exact complement.
With N = 4, W = 2, R = 2: a write can be acknowledged by nodes {A, B} while a later read queries {C, D}. The two quorums are completely disjoint, so the read can return stale data even though W + R = N.
With W + R = 5 > 4 (for example W = 3, R = 2): any set of 3 write acknowledgments and any set of 2 read responses must share at least one node, so every read is guaranteed to see the latest acknowledged write.
The Overlap Formula Revisited:
The minimum guaranteed overlap is: Overlap = W + R - N
When this value is:
- 1 or more: every read quorum intersects every write quorum, so reads are guaranteed to observe the latest acknowledged write (strong consistency).
- 0 or less: a read quorum can miss the write quorum entirely, so the system provides only eventual consistency.
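The brute-force check below makes the N = 4 counterexample concrete by enumerating quorum sets directly. It is purely illustrative and assumes nothing beyond the definitions above.

```typescript
// Brute-force check of the pigeonhole argument: for given N, W, R, can a write
// quorum and a read quorum be completely disjoint?

function subsetsOfSize(items: string[], size: number): string[][] {
  if (size === 0) return [[]];
  if (items.length < size) return [];
  const [head, ...rest] = items;
  const withHead = subsetsOfSize(rest, size - 1).map(s => [head, ...s]);
  const withoutHead = subsetsOfSize(rest, size);
  return [...withHead, ...withoutHead];
}

function disjointQuorumsPossible(n: number, w: number, r: number): boolean {
  const nodes = Array.from({ length: n }, (_, i) => String.fromCharCode(65 + i)); // A, B, C, ...
  for (const writeSet of subsetsOfSize(nodes, w)) {
    for (const readSet of subsetsOfSize(nodes, r)) {
      const overlap = readSet.filter(node => writeSet.includes(node));
      if (overlap.length === 0) return true; // found a stale-read scenario
    }
  }
  return false;
}

console.log(disjointQuorumsPossible(4, 2, 2)); // true  -> W + R = N allows stale reads
console.log(disjointQuorumsPossible(4, 3, 2)); // false -> W + R > N forces an overlap
```

The broader analysis helper below extends this overlap check with availability and configuration warnings.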
```typescript
interface QuorumAnalysis {
  // Input parameters
  n: number;
  w: number;
  r: number;

  // Consistency analysis
  hasStrongConsistency: boolean;
  minOverlap: number;

  // Availability analysis
  writesAvailableWithFailures: number;
  readsAvailableWithFailures: number;

  // Recommendations
  consistencyLevel: 'strong' | 'eventual' | 'invalid';
  warnings: string[];
}

function analyzeQuorum(n: number, w: number, r: number): QuorumAnalysis {
  const warnings: string[] = [];

  // Validate inputs
  if (w > n) warnings.push(`W(${w}) cannot exceed N(${n})`);
  if (r > n) warnings.push(`R(${r}) cannot exceed N(${n})`);
  if (w < 1) warnings.push('W must be at least 1');
  if (r < 1) warnings.push('R must be at least 1');

  const hasStrongConsistency = w + r > n;
  const minOverlap = Math.max(0, w + r - n);

  // Failure tolerance
  const writesAvailableWithFailures = n - w;
  const readsAvailableWithFailures = n - r;

  // Warnings for edge cases
  if (w === n) {
    warnings.push('W=N: Any node failure will block writes');
  }
  if (r === n) {
    warnings.push('R=N: Any node failure will block reads');
  }
  if (w === 1 && hasStrongConsistency) {
    warnings.push('W=1 with strong consistency requires R=N; single node failure blocks reads');
  }

  let consistencyLevel: 'strong' | 'eventual' | 'invalid';
  if (w > n || r > n || w < 1 || r < 1) {
    consistencyLevel = 'invalid';
  } else if (hasStrongConsistency) {
    consistencyLevel = 'strong';
  } else {
    consistencyLevel = 'eventual';
  }

  return {
    n,
    w,
    r,
    hasStrongConsistency,
    minOverlap,
    writesAvailableWithFailures,
    readsAvailableWithFailures,
    consistencyLevel,
    warnings,
  };
}

// Example usage:
console.log(analyzeQuorum(5, 3, 3));
/* Output:
{
  n: 5, w: 3, r: 3,
  hasStrongConsistency: true,
  minOverlap: 1,
  writesAvailableWithFailures: 2,
  readsAvailableWithFailures: 2,
  consistencyLevel: 'strong',
  warnings: []
}
*/

console.log(analyzeQuorum(5, 2, 2));
/* Output:
{
  n: 5, w: 2, r: 2,
  hasStrongConsistency: false,
  minOverlap: 0,
  writesAvailableWithFailures: 3,
  readsAvailableWithFailures: 3,
  consistencyLevel: 'eventual',
  warnings: []
}
*/
```

The most common configuration in production is N=3 with W=2 and R=2. This provides strong consistency (2+2=4 > 3), tolerates 1 node failure for both reads and writes, requires only 3x storage overhead, and offers reasonable latency (wait for 2 of 3 nodes). It's not always optimal, but it's a safe default that works well for most workloads.
One of the most powerful aspects of quorum systems is the ability to skew the consistency burden toward either reads or writes based on your workload characteristics. This is where the real art of system design emerges.
The Fundamental Tradeoff:
For strong consistency (W + R > N), as you increase one parameter you can decrease the other: with N = 3, the pairs (W=3, R=1), (W=2, R=2), and (W=1, R=3) all satisfy the inequality. What changes is which side of the system pays the latency and availability cost, as the helper below enumerates.
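A small helper (illustrative only, not tied to any database client) lists every valid (W, R) pair for a given replication factor and the node failures each side can tolerate:

```typescript
// Enumerate every (W, R) pair that satisfies W + R > N for a given replication factor.

interface QuorumOption {
  w: number;
  r: number;
  writeFaultTolerance: number; // node failures writes can survive (N - W)
  readFaultTolerance: number;  // node failures reads can survive (N - R)
}

function strongConsistencyOptions(n: number): QuorumOption[] {
  const options: QuorumOption[] = [];
  for (let w = 1; w <= n; w++) {
    for (let r = 1; r <= n; r++) {
      if (w + r > n) {
        options.push({ w, r, writeFaultTolerance: n - w, readFaultTolerance: n - r });
      }
    }
  }
  return options;
}

// For N = 3 this includes (W, R) = (1,3), (2,2), (3,1), among others:
// pick the pair whose latency cost lands on the side your workload can afford.
console.table(strongConsistencyOptions(3));
```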
Read-Heavy Workloads (W High, R Low):
Typical for systems where reads vastly outnumber writes, read latency directly shapes the user experience, and writes are infrequent enough that waiting on most or all replicas is acceptable. Setting W = N and R = 1 makes every read a single fast lookup while preserving W + R > N.
Examples: Product catalogs, content delivery, reference data, configuration services
Write-Heavy Workloads (W Low, R High):
Typical for systems where data arrives in high volume, write latency and availability matter most, and reads are comparatively rare or tolerant of staleness. Setting W = 1 and R = N keeps ingestion fast while preserving W + R > N; in practice, many such systems relax to W = 1, R = 1 and accept eventual consistency.
Examples: Event logging, IoT sensor data, time-series ingestion, audit trails
The Mathematics of Latency Optimization:
Latency in quorum operations is typically dominated by the slowest node in the quorum. This is the "tail latency" problem.
For W = 3 with N = 5: the write completes when the 3rd-fastest replica responds, so latency tracks roughly the median node.
For W = 2 with N = 5: the write completes when the 2nd-fastest replica responds, so even one or two slow nodes barely affect latency.
This is why lower quorums have better latency—you're measuring closer to the median node performance rather than the tail.
```typescript
// Simulate quorum latency based on node response times
function simulateQuorumLatency(
  nodeLatencies: number[], // Array of latencies from each node
  quorum: number           // Number of responses needed
): number {
  // Sort latencies ascending
  const sorted = [...nodeLatencies].sort((a, b) => a - b);

  // Quorum is achieved when we get the q-th response
  // (0-indexed, so quorum-1)
  return sorted[quorum - 1];
}

// Example: 5 nodes with varying latencies
const nodeSpeeds = [10, 25, 50, 100, 500]; // ms (one slow node!)

console.log(`W=1: ${simulateQuorumLatency(nodeSpeeds, 1)}ms`); // 10ms
console.log(`W=2: ${simulateQuorumLatency(nodeSpeeds, 2)}ms`); // 25ms
console.log(`W=3: ${simulateQuorumLatency(nodeSpeeds, 3)}ms`); // 50ms
console.log(`W=4: ${simulateQuorumLatency(nodeSpeeds, 4)}ms`); // 100ms
console.log(`W=5: ${simulateQuorumLatency(nodeSpeeds, 5)}ms`); // 500ms (!)

// Monte Carlo simulation for p99 latency
function simulateP99Latency(
  baseLatency: number,
  variance: number,
  nodeCount: number,
  quorum: number,
  iterations: number = 10000
): number {
  const latencies: number[] = [];

  for (let i = 0; i < iterations; i++) {
    // Generate random latencies for each node
    const nodeLatencies = Array(nodeCount).fill(0).map(() =>
      baseLatency + (Math.random() * variance * 2 - variance)
    );
    latencies.push(simulateQuorumLatency(nodeLatencies, quorum));
  }

  latencies.sort((a, b) => a - b);
  return latencies[Math.floor(iterations * 0.99)]; // p99
}

// Compare p99 latencies for different quorum settings
const base = 20;     // ms base latency
const variance = 10; // ms variance

console.log('P99 Latencies:');
console.log(`W=2: ${simulateP99Latency(base, variance, 5, 2).toFixed(1)}ms`);
console.log(`W=3: ${simulateP99Latency(base, variance, 5, 3).toFixed(1)}ms`);
console.log(`W=4: ${simulateP99Latency(base, variance, 5, 4).toFixed(1)}ms`);
```

One of the most powerful features of modern quorum-based systems is the ability to specify W and R on a per-operation basis. This enables fine-grained consistency control that matches your application's actual requirements.
Why Per-Operation Consistency Matters:
Not all data is equally critical. Consider an e-commerce platform: payment and order records must never appear inconsistent, shopping-cart contents should be fresh but can tolerate brief anomalies, product descriptions change rarely, and page-view counters or clickstream logs can lag with no user-visible harm.
With per-operation consistency, you can optimize each use case without making global tradeoffs.
| Data Type | Write Consistency | Read Consistency | W + R vs N | Rationale |
|---|---|---|---|---|
| Financial transactions | QUORUM (W=2) | QUORUM (R=2) | 4 > 3 ✓ | Strong consistency required |
| User profile data | QUORUM (W=2) | ONE (R=1) | 3 = 3 ✗ | Accept slight staleness for speed |
| Product catalog | ALL (W=3) | ONE (R=1) | 4 > 3 ✓ | Heavy reads, rare writes |
| Session tokens | ONE (W=1) | LOCAL_QUORUM | Varies | Fast writes, local reads |
| Metrics/Logs | ONE (W=1) | ONE (R=1) | 2 < 3 ✗ | Performance over consistency |
| Critical configs | ALL (W=3) | ALL (R=3) | 6 > 3 ✓✓ | Maximum durability and freshness |
```typescript
enum ConsistencyLevel {
  ONE = 'ONE',                    // W or R = 1
  QUORUM = 'QUORUM',              // W or R = floor(N/2) + 1
  ALL = 'ALL',                    // W or R = N
  LOCAL_ONE = 'LOCAL_ONE',        // 1 in local datacenter
  LOCAL_QUORUM = 'LOCAL_QUORUM',  // Quorum in local datacenter
  EACH_QUORUM = 'EACH_QUORUM',    // Quorum in each datacenter
}

interface QueryOptions {
  consistency: ConsistencyLevel;
  timeout?: number;
  retryPolicy?: RetryPolicy;
}

class FinancialService {
  // Critical: Account balances require strong consistency
  async getBalance(accountId: string): Promise<number> {
    return this.db.read(
      'SELECT balance FROM accounts WHERE id = ?',
      [accountId],
      { consistency: ConsistencyLevel.QUORUM } // R=2
    );
  }

  async updateBalance(accountId: string, newBalance: number): Promise<void> {
    await this.db.write(
      'UPDATE accounts SET balance = ? WHERE id = ?',
      [newBalance, accountId],
      { consistency: ConsistencyLevel.QUORUM } // W=2
      // Combined: 2+2 = 4 > 3 = N → Strong consistency
    );
  }

  // Less critical: Transaction history can be slightly stale
  async getTransactionHistory(accountId: string): Promise<Transaction[]> {
    return this.db.read(
      'SELECT * FROM transactions WHERE account_id = ? ORDER BY timestamp DESC',
      [accountId],
      { consistency: ConsistencyLevel.ONE } // R=1 for speed
    );
  }
}

class LoggingService {
  // High volume, low criticality: Fire and forget
  async logEvent(event: LogEvent): Promise<void> {
    await this.db.write(
      'INSERT INTO events (id, type, data, timestamp) VALUES (?, ?, ?, ?)',
      [event.id, event.type, event.data, event.timestamp],
      {
        consistency: ConsistencyLevel.ONE, // W=1
        timeout: 50 // Fail fast
      }
    );
  }
}

class ConfigService {
  // Critical infrastructure: Maximum durability and freshness
  async getConfig(key: string): Promise<string> {
    return this.db.read(
      'SELECT value FROM configs WHERE key = ?',
      [key],
      { consistency: ConsistencyLevel.ALL } // R=N
    );
  }

  async setConfig(key: string, value: string): Promise<void> {
    await this.db.write(
      'UPDATE configs SET value = ? WHERE key = ?',
      [value, key],
      { consistency: ConsistencyLevel.ALL } // W=N
    );
  }
}
```

When using different consistency levels for the same data, be extremely careful about the guarantees. If you write with W=1 and read with R=1, there's no consistency guarantee even if you typically use QUORUM elsewhere. Always analyze the consistency equation for the actual W and R values used in each specific code path, not just the defaults.
When your system spans multiple datacenters (or availability zones), the quorum calculations become more nuanced. The N, W, R model must account for geographic distribution and the significant latency differences between local and cross-datacenter communication.
The Challenge:
Consider N = 6 with 3 replicas in each of 2 datacenters: nodes A, B, C in us-east and D, E, F in us-west.
With W = 4 (majority), a write might need to wait for nodes in both datacenters: no single datacenter holds 4 replicas, so at least one acknowledgment must travel over the cross-datacenter link, adding tens of milliseconds to every write.
Local vs Global Quorums:
To address this, systems like Cassandra introduce LOCAL_QUORUM: a quorum computed only over the replicas in the coordinator's local datacenter (2 of 3 in this example), so the operation never waits on a cross-datacenter round trip.
```typescript
interface DatacenterTopology {
  name: string;
  replicationFactor: number;
  nodes: string[];
}

interface MultiDCQuorumConfig {
  datacenters: DatacenterTopology[];
  globalReplicationFactor: number; // Sum of all DC replication factors
}

function calculateLocalQuorum(dcReplicationFactor: number): number {
  return Math.floor(dcReplicationFactor / 2) + 1;
}

function calculateGlobalQuorum(globalN: number): number {
  return Math.floor(globalN / 2) + 1;
}

// Example: 2 datacenters with RF=3 each
const topology: MultiDCQuorumConfig = {
  datacenters: [
    { name: 'us-east', replicationFactor: 3, nodes: ['A', 'B', 'C'] },
    { name: 'us-west', replicationFactor: 3, nodes: ['D', 'E', 'F'] },
  ],
  globalReplicationFactor: 6,
};

// Consistency levels and their behavior
const consistencyLevels = {
  ONE: {
    description: 'Single node acknowledgment',
    nodesNeeded: 1,
    latency: 'Lowest',
    consistency: 'Weakest'
  },
  LOCAL_ONE: {
    description: 'Single node in local DC',
    nodesNeeded: 1,
    latency: 'Very low (no cross-DC)',
    consistency: 'Weak but fast'
  },
  LOCAL_QUORUM: {
    description: 'Quorum in local DC only',
    nodesNeeded: calculateLocalQuorum(3), // = 2
    latency: 'Low (no cross-DC)',
    consistency: 'Strong within DC'
  },
  QUORUM: {
    description: 'Global quorum across all DCs',
    nodesNeeded: calculateGlobalQuorum(6), // = 4
    latency: 'High (must cross DC)',
    consistency: 'Strong globally'
  },
  EACH_QUORUM: {
    description: 'Quorum in EVERY datacenter',
    nodesNeeded: 2 + 2, // 2 in each DC
    latency: 'Highest (all DCs required)',
    consistency: 'Strongest (survives DC failure)'
  },
  ALL: {
    description: 'All nodes must acknowledge',
    nodesNeeded: 6,
    latency: 'Highest possible',
    consistency: 'Maximum durability'
  },
};

// Latency analysis
console.log('Latency comparison:');
console.log('LOCAL_QUORUM (2 local nodes): ~5-10ms');
console.log('QUORUM (4 nodes, likely cross-DC): ~50-150ms');
console.log('EACH_QUORUM (2 per DC): ~50-150ms');
console.log('ALL (6 nodes): ~75-200ms');
```

| Consistency Level | Nodes Needed | Cross-DC Required? | Survives DC Failure? | Typical Latency |
|---|---|---|---|---|
| ONE | 1 | Optional | No | < 5ms |
| LOCAL_ONE | 1 | No | No | < 5ms |
| LOCAL_QUORUM | 2 | No | No | 5-15ms |
| QUORUM | 4 | Usually | Usually | 50-150ms |
| EACH_QUORUM | 4 (2+2) | Yes | Yes | 50-150ms |
| ALL | 6 | Yes | Yes | 75-200ms |
LOCAL_QUORUM is popular for latency-sensitive applications because it avoids cross-datacenter communication. However, during a datacenter failure, data written with LOCAL_QUORUM in the failed DC may not be visible in surviving DCs until anti-entropy processes complete. If you need immediate consistency across DC failures, you must use EACH_QUORUM or QUORUM, accepting the latency penalty.
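To see why this matters, run the worst-case overlap arithmetic across datacenters. The sketch below assumes the 2-datacenter, RF=3-per-DC topology from earlier; it is a simplification that ignores EACH_QUORUM's per-datacenter placement guarantees.

```typescript
// Worst-case overlap check across datacenters (2 DCs, RF=3 each, global N = 6).
// A LOCAL_QUORUM write in us-east touches 2 of {A, B, C}; a LOCAL_QUORUM read
// in us-west touches 2 of {D, E, F}. Globally W + R = 4, which is not > 6, so
// the read set and write set can be completely disjoint until replication
// catches up.

function guaranteedOverlap(globalN: number, w: number, r: number): number {
  return Math.max(0, w + r - globalN);
}

console.log(guaranteedOverlap(6, 2, 2)); // 0 -> LOCAL_QUORUM write + remote LOCAL_QUORUM read may miss each other
console.log(guaranteedOverlap(6, 4, 4)); // 2 -> global QUORUM write + global QUORUM read always intersect
```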
While W and R can be tuned per-operation, N (replication factor) is typically fixed at keyspace creation and is hard to change. Choosing the right N requires balancing multiple factors.
Factors Favoring Higher N:
Fault Tolerance: With N=5 and W=3, you can lose 2 nodes and still accept writes. With N=3 and W=2, you can only lose 1 node.
Read Scalability: More replicas = more nodes to distribute read load. With N=5 and R=1, reads can be load-balanced across 5 nodes.
Geographic Distribution: Higher N allows placing replicas in more regions for lower user latency.
Data Criticality: Mission-critical data warrants more redundancy.
Factors Favoring Lower N:
Storage Cost: N=5 means 5x storage overhead. For petabyte-scale data, this is significant.
Write Amplification: Every write must be sent to N nodes, consuming N× network bandwidth.
Consistency Coordination: Larger N means more nodes to coordinate, potentially increasing latency.
Operational Complexity: More replicas = more nodes to manage, monitor, and repair.
| Scenario | Recommended N | Rationale |
|---|---|---|
| Development/Testing | 1 | No redundancy needed; save resources |
| Non-critical data (logs, metrics) | 2 | Some redundancy with minimal storage overhead |
| Standard production data | 3 | Industry standard; survives 1 node failure with majority quorum |
| Critical production data | 5 | Survives 2 node failures with majority quorum |
| Multi-datacenter (2 DCs) | 3 per DC (6 total) | Survives single DC failure |
| Global distribution (3+ regions) | 2-3 per region | Balance between coverage and cost |
```typescript
interface DataCharacteristics {
  isCritical: boolean; // Can the business survive data loss?
  dataSize: 'small' | 'medium' | 'large' | 'massive';
  writeVolume: 'low' | 'medium' | 'high';
  readPattern: 'balanced' | 'read-heavy' | 'write-heavy';
  geographicRequirements: 'single-region' | 'multi-region' | 'global';
  budgetConstraint: 'low' | 'medium' | 'high';
}

function recommendReplicationFactor(data: DataCharacteristics): {
  recommendedN: number;
  reasoning: string[];
} {
  const reasoning: string[] = [];
  let n = 3; // Default starting point

  // Criticality adjustment
  if (data.isCritical) {
    n = Math.max(n, 5);
    reasoning.push('Critical data: Starting at N=5 for maximum protection');
  }

  // Size/cost adjustment
  if (data.dataSize === 'massive' && data.budgetConstraint === 'low') {
    n = Math.min(n, 3);
    reasoning.push('Large data + budget constraint: Capping at N=3');
  }

  // Geographic requirements
  if (data.geographicRequirements === 'multi-region') {
    n = Math.max(n, 4); // At least 2 per region for 2 regions
    reasoning.push('Multi-region: Need at least N=4 (2 per region)');
  } else if (data.geographicRequirements === 'global') {
    n = Math.max(n, 6); // At least 2 per region for 3+ regions
    reasoning.push('Global: Need at least N=6 (2 per region for 3 regions)');
  }

  // Read scalability
  if (data.readPattern === 'read-heavy' && data.writeVolume === 'low') {
    n = Math.max(n, 5);
    reasoning.push('Read-heavy: Higher N provides more read replicas');
  }

  // Write volume consideration
  if (data.writeVolume === 'high' && data.dataSize === 'large') {
    n = Math.min(n, 3);
    reasoning.push('High write volume + large data: Limiting N to control write amplification');
  }

  // Ensure odd number for clean majority quorums
  if (n % 2 === 0 && n > 2) {
    n += 1;
    reasoning.push(`Adjusted to odd number N=${n} for clean majority quorum`);
  }

  return { recommendedN: n, reasoning };
}

// Example usage
const recommendation = recommendReplicationFactor({
  isCritical: true,
  dataSize: 'medium',
  writeVolume: 'medium',
  readPattern: 'read-heavy',
  geographicRequirements: 'multi-region',
  budgetConstraint: 'medium',
});

console.log(`Recommended N: ${recommendation.recommendedN}`);
recommendation.reasoning.forEach(r => console.log(`  - ${r}`));
```

N values of 3, 5, 7 are preferred because they allow clean majority quorums. With N=3, majority is 2 (survives 1 failure). With N=4, majority is 3 (survives 1 failure)—same fault tolerance but 33% more storage. With N=5, majority is 3 (survives 2 failures). Even numbers don't improve fault tolerance proportionally to their storage cost.
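The arithmetic behind that odd-number rule is easy to verify with a quick loop (illustrative only):

```typescript
// Majority quorum size and resulting fault tolerance for a range of N values.
for (let n = 2; n <= 7; n++) {
  const majority = Math.floor(n / 2) + 1;
  const failuresTolerated = n - majority;
  console.log(`N=${n}: majority quorum = ${majority}, survives ${failuresTolerated} node failure(s)`);
}
// N=3 and N=4 both survive only 1 failure; N=5 and N=6 both survive 2.
// The even value buys extra storage cost without extra fault tolerance.
```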
Let's examine how major production systems configure N, W, R, and the reasoning behind their choices. These real-world examples provide battle-tested templates for your own systems.
```sql
-- PATTERN 1: Standard Strong Consistency
-- Most common production configuration
CREATE KEYSPACE transactional_data
WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor': 3 };
-- Write: QUORUM (2), Read: QUORUM (2)
-- 2 + 2 = 4 > 3 → Strong consistency
-- Tolerates 1 node failure for both reads and writes

-- PATTERN 2: Read-Heavy Workload
-- News sites, content platforms, product catalogs
CREATE KEYSPACE content_data
WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor': 3 };
-- Write: ALL (3), Read: ONE (1)
-- 3 + 1 = 4 > 3 → Strong consistency
-- Writes are rare but durable; reads are fast
-- Warning: Any node failure blocks writes

-- PATTERN 3: Write-Heavy Workload
-- Logging, metrics, event streaming
CREATE KEYSPACE event_data
WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor': 3 };
-- Write: ONE (1), Read: ONE (1)
-- 1 + 1 = 2 < 3 → Eventual consistency (acceptable for logs)
-- Maximum write throughput and availability

-- PATTERN 4: Multi-Datacenter (Local + Remote)
CREATE KEYSPACE global_data
WITH REPLICATION = {
  'class': 'NetworkTopologyStrategy',
  'us-east': 3,
  'eu-west': 3
};
-- Write: LOCAL_QUORUM, Read: LOCAL_QUORUM
-- Fast local operations, async cross-DC replication
-- During DC failure: may see stale data from surviving DC

-- PATTERN 5: Multi-Datacenter (Strict Consistency)
CREATE KEYSPACE critical_financial_data
WITH REPLICATION = {
  'class': 'NetworkTopologyStrategy',
  'us-east': 3,
  'eu-west': 3
};
-- Write: EACH_QUORUM, Read: QUORUM
-- Every write confirmed in both DCs
-- Higher latency but survives full DC failure with consistency
```

If you're unsure what to choose, start with N=3, W=QUORUM (2), R=QUORUM (2). This configuration is the most widely deployed in production, provides strong consistency, tolerates single node failures, has reasonable storage overhead (3x), and has well-understood operational characteristics. Deviate only when you have specific requirements that justify different tradeoffs.
We've comprehensively explored the three parameters that govern quorum-based replication. Let's consolidate the essential knowledge:
- N is the replication factor, fixed at keyspace creation; W and R are quorum sizes that can usually be tuned per operation.
- W + R > N guarantees that read and write quorums overlap, giving strong consistency; W + R ≤ N gives only eventual consistency.
- Lower quorums improve latency and availability; higher quorums improve consistency and durability.
- Skew W and R toward the side of your workload that can afford to wait: high W for read-heavy systems, low W (and relaxed consistency) for write-heavy ones.
- In multi-datacenter deployments, LOCAL_QUORUM trades cross-DC consistency for local latency, while EACH_QUORUM pays the latency cost to remain consistent through a DC failure.
- N=3 with W=2, R=2 is the battle-tested default; deviate only with a specific reason.
You now have deep mastery of N, W, R parameters and how to configure them for various workloads. In the next page, we'll explore 'Tunable Consistency'—the ability to dynamically adjust these parameters to achieve the optimal balance between consistency, availability, and performance for your specific use case.