Leader-Follower Replication

Level: Intermediate

Duration: 75 mins

Topic: Leader-Follower Replication

Synchronous vs Asynchronous Replication

The Durability-Latency Trade-off

Imagine a financial application processes a wire transfer of $1,000,000. The transaction is written to the leader database and the client receives a success response. One second later, the leader's server experiences a total hardware failure.

Question: Is that million-dollar transfer safe?

The answer depends entirely on one configuration choice: synchronous versus asynchronous replication.

If replication was synchronous, at least one follower received and acknowledged the write before the client was told 'success.' The transfer is safe.
If replication was asynchronous, the write may have existed only on the failed leader. The transfer may be lost.

This single configuration option, whether the leader waits for followers during a commit, is one of the most consequential decisions in database architecture. It directly trades durability guarantees against write performance.

What You Will Learn

By the end of this page, you will deeply understand synchronous replication (strong durability, higher latency), asynchronous replication (lower latency, potential data loss), semi-synchronous modes, the mathematics of the tradeoff, and how to choose the right approach for different scenarios.

Understanding the Commit Timeline

To understand synchronous vs. asynchronous replication, we must first understand the timeline of a write commit. The critical question is: at what point do we tell the client their write succeeded?

Write Commit Timeline
TIME ─────────────────────────────────────────────────────────────────────────▶
 
CLIENT              LEADER                    FOLLOWER             FOLLOWER
                                              (sync)               (async)
  │                   │                         │                    │
  │──(1) WRITE───────▶│                         │                    │
  │                   │                         │                    │
  │                   │ (2) Write to local WAL  │                    │
  │                   │     (durable)           │                    │
  │                   │                         │                    │
  │                   │──────────────(3) Stream to followers─────────│
  │                   │                         │                    │
  │                   │                   ┌─────│                    │
  │                   │◀──(4) ACK─────────┘     │                    │
  │                   │                         │                    │
  │                   │══════════════════════════════════════════════│
  │                   │  SYNCHRONOUS MODE:                           │
  │                   │  Wait for follower ACK before (5)            │
  │                   │══════════════════════════════════════════════│
  │                   │                                              │
  │◀─(5) SUCCESS─────│                                              │
  │                   │                                              │
  │                   │                                        ┌─────│
  │                   │                         (async follower│     │
  │                   │                           catches up   ▼     │
  │                   │                           eventually)        │
 
 
SYNCHRONOUS: Client sees success AFTER follower acknowledged
ASYNCHRONOUS: Client sees success BEFORE follower acknowledged
 
The window between (2) WAL write and (4) follower ACK is the "replication window"
If the leader fails during this window, async replication may lose data

The Five Stages of a Write:

1. Client Request: Application sends write to the leader.
2. Leader WAL Write: Leader durably logs the change to its own disk.
3. Replication Streaming: Leader sends the log entry to followers.
4. Follower Acknowledgment: Followers confirm receipt (or apply, depending on mode).
5. Client Response: Leader tells the client the write succeeded.

The key decision: Does stage 5 wait for stage 4?

Synchronous: Yes. The leader waits for at least one follower's acknowledgment.
Asynchronous: No. The leader responds after stage 2 (its own WAL write).

The Danger Window

In asynchronous mode, there's a window between 'leader wrote to WAL' and 'follower received the data.' If the leader fails during this window, the write is lost—even though the client was told it succeeded. This is not a theoretical concern; it happens in production.
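
The danger window can be made concrete with a small model. The sketch below is illustrative only: `commit_outcome`, `Outcome`, and the timing parameters are hypothetical names, not part of any database API, and the model assumes a follower holds the write once its acknowledgment has reached the leader. It reports whether the client was told the write succeeded and whether the write survives the leader's crash.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    client_saw_success: bool
    survives_on_follower: bool

    @property
    def silently_lost(self) -> bool:
        # The dangerous case: acknowledged to the client but on no surviving node.
        return self.client_saw_success and not self.survives_on_follower


def commit_outcome(mode: str, wal_done_ms: float, follower_ack_ms: float,
                   leader_crash_ms: float) -> Outcome:
    """Model one write; all times are measured from the client's request.

    Simplifying assumption: the leader responds to the client the instant
    its commit condition is met.
    """
    if mode == "async":
        respond_at = wal_done_ms  # stage 5 does not wait for stage 4
    elif mode == "sync":
        respond_at = max(wal_done_ms, follower_ack_ms)  # stage 5 waits for stage 4
    else:
        raise ValueError(f"unknown mode: {mode}")

    client_saw_success = respond_at < leader_crash_ms
    survives = follower_ack_ms < leader_crash_ms
    return Outcome(client_saw_success, survives)


# Leader finishes its WAL write at 1 ms, the follower ACK would arrive at 5 ms,
# and the leader crashes at 4 ms, inside the replication window.
async_result = commit_outcome("async", 1.0, 5.0, 4.0)
sync_result = commit_outcome("sync", 1.0, 5.0, 4.0)
print(async_result.silently_lost)  # True: acknowledged, then lost
print(sync_result.silently_lost)   # False: the client never saw success
```

In the synchronous case the write is still not on a follower, but the client was never told it succeeded, so it can retry safely.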

Synchronous Replication Deep Dive

Synchronous replication ensures that every committed transaction exists on at least two nodes (the leader and one or more followers) before the client is told the transaction succeeded.

This provides a powerful guarantee: no committed data can be lost due to a single node failure.

How Synchronous Replication Works

•Leader writes to local WAL — The change is durably stored on the leader's disk.
•Leader streams to designated synchronous follower(s) — The log entry is transmitted over the network.
•Follower receives, writes to its WAL, and acknowledges — Different modes exist: 'received,' 'written to disk,' or 'applied.'
•Leader waits for acknowledgment — The commit blocks until the required number of followers respond.
•Leader responds to client — Only now is the client told the transaction succeeded.
•If follower doesn't respond (timeout) — Options: wait forever, cancel transaction, or demote to async.

Synchronous Acknowledgment Levels
Level	Follower Action Before ACK	Durability Guarantee	Latency Impact
remote_write (PostgreSQL)	Written to the OS buffer, not yet flushed	Survives a crash of the follower's database process, but not of its operating system	Low (~1-2ms)
on (PostgreSQL)	Flushed (fsync'd) to disk	Survives follower crash immediately	Medium (~5-10ms)
remote_apply (PostgreSQL)	Replayed, so the change is visible to queries	Survives follower crash and is immediately readable on the follower	High (~10-50ms)
Semi-sync (MySQL)	Received and written to the relay log	Present on the follower, but possibly not yet applied	Low-Medium

Synchronous Replication Configurations:

Single Synchronous Follower: The leader designates one follower as synchronous. All commits wait for this one follower. If the synchronous follower fails, the leader either waits forever (availability loss) or falls back to async (durability risk).

Multiple Synchronous Followers: The leader can wait for N out of M followers. PostgreSQL's synchronous_standby_names supports FIRST N (wait for the N highest-priority standbys, in list order, among those currently connected) or ANY N (wait for any N of the listed standbys, whichever acknowledge first). When the list contains more than N standbys, this provides fault tolerance: one sync follower can fail without blocking commits.

Quorum-Based Sync: Advanced systems use quorum writes: commit when a majority (e.g., 2 of 3) acknowledge. This balances durability with availability—no single follower is critical.
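
The difference between priority-based and quorum-based acknowledgment is easy to misread, so here is a minimal sketch of the two rules. It is a simplified model of the semantics described above, not PostgreSQL's implementation; the function names `first_n_satisfied` and `any_n_satisfied` and the replica names are made up for illustration.

```python
def first_n_satisfied(priority_list, n, connected, acked):
    """FIRST N: the N highest-priority connected standbys must all acknowledge."""
    sync_set = [s for s in priority_list if s in connected][:n]
    return len(sync_set) == n and all(s in acked for s in sync_set)


def any_n_satisfied(candidates, n, acked):
    """ANY N (quorum): any N of the listed standbys may acknowledge."""
    return sum(1 for s in candidates if s in acked) >= n


standbys = ["replica1", "replica2", "replica3"]

# replica1 is slow but still connected; replica2 and replica3 have acknowledged.
connected = {"replica1", "replica2", "replica3"}
acked = {"replica2", "replica3"}

print(first_n_satisfied(standbys, 1, connected, acked))  # False: still waiting on replica1
print(any_n_satisfied(standbys, 2, acked))               # True: two of three acknowledged

# replica1 disconnects, so replica2 becomes the highest-priority standby.
print(first_n_satisfied(standbys, 1, {"replica2", "replica3"}, acked))  # True
```

Under this model a slow high-priority standby stalls FIRST N commits, while ANY N proceeds as soon as enough standbys respond.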

Synchronous Adds Latency

Every synchronous commit adds at least one round-trip time to a follower. For same-datacenter replication, this might be 1-5ms. For cross-region replication, it could be 50-200ms. This latency adds to every single write transaction.

Asynchronous Replication Deep Dive

Asynchronous replication decouples the client response from follower replication. The leader commits as soon as its own WAL write completes, without waiting for followers.

This provides the lowest possible write latency—but with a trade-off in durability guarantees.

How Asynchronous Replication Works

•Leader writes to local WAL — The change is durably stored on the leader's disk.
•Leader immediately responds to client — Transaction is 'committed' from the client's perspective.
•Asynchronously, leader streams to followers — This happens in the background, in parallel with new transactions.
•Followers apply changes at their own pace — There's always some lag, typically milliseconds to seconds.
•If the leader fails before replication completes, any writes that had not yet reached a follower are lost, even though the client saw them commit.

Asynchronous Data Loss Scenario
TIME ─────────────────────────────────────────────────────────────────────────▶
 
  t=0ms    t=1ms     t=2ms     t=3ms     t=4ms     t=5ms
    │        │         │         │         │         │
    │        │         │         │         │         │
  LEADER   LEADER    CLIENT    LEADER    ████████   FOLLOWER
  receives writes    sees      streams   ████████   has NOT
  write    to WAL    SUCCESS   to        CRASHES    received
                               follower  ████████   the write
                               (in
                               progress)
 
  RESULT: Client thinks transaction succeeded, but it's LOST
  
  The follower will be promoted to leader, missing this transaction.
  Any external side effects (emails sent, payments initiated) cannot be undone.

The Risk is Real:

Asynchronous data loss isn't just theoretical. Consider these real-world scenarios:

Hardware failure — The leader's server has a total power loss or storage controller failure.
Host crash before fsync. The machine loses power or the kernel panics before buffered WAL writes reach disk, so the write was never durable even on the leader.
Network partition during failover — The leader is unreachable, a follower is promoted, and the old leader comes back with 'orphan' transactions.

Quantifying the Risk:

The data loss window equals the replication lag. With typical async setups:

Same-datacenter: 10-100ms of potential data loss
Cross-region: 100ms-1000ms of potential data loss

For a system doing 1000 writes/second, 100ms of lag means up to 100 potentially lost transactions per failure.

Async is the Default

Most database installations default to asynchronous replication because it's simpler and faster. This is often fine for development or non-critical data, but production systems with durability requirements should explicitly configure synchronous replication.

Semi-Synchronous and Hybrid Modes

Pure synchronous and pure asynchronous represent two extremes. Most production systems operate somewhere in between, using semi-synchronous or hybrid modes that balance durability against availability and performance.

Semi-Synchronous Approaches

•MySQL Semi-Synchronous Replication — Leader waits for at least one follower to acknowledge receipt (not necessarily apply). Falls back to async after timeout.
•PostgreSQL synchronous_commit levels (per-transaction control): off (do not wait even for the local flush), local (wait for the local fsync only), remote_write, on (wait for the synchronous standby to flush), and remote_apply. Different transactions can have different guarantees.
•Raft-Based Systems — Commit requires majority acknowledgment (e.g., 2 of 3, 3 of 5). Provides both durability and availability tolerance.
•Degraded Mode Fallback — System operates synchronously when followers are healthy, degrades to async (with alerts) when followers are lagging or offline.

Replication Mode Comparison
Mode	Durability	Latency	Availability	Complexity
Fully Async	May lose recent writes on leader failure	Lowest (leader-only)	Highest (no dependencies)	Simple
Semi-Sync (1 ACK)	Safe with 1 follower	Medium (+network RTT)	Blocked if sync follower down	Medium
Fully Sync (N ACKs)	Safe with N followers	Highest (+slowest follower)	Blocked if any required follower down	Complex
Quorum (Majority)	Safe as long as only a minority of nodes fail	Medium (+majority RTT)	Tolerates minority failures	Complex

Per-Transaction Control:

Advanced databases allow durability level to be set per-transaction, not just globally. This enables intelligent trade-offs:

-- Critical financial transaction: wait until the synchronous standby has applied the commit
BEGIN;
SET LOCAL synchronous_commit = 'remote_apply';  -- applies to this transaction only
INSERT INTO transfers (amount, from_account, to_account) VALUES (1000000, 'A', 'B');
COMMIT;

-- Low-priority logging: do not wait for followers or for the local WAL flush
BEGIN;
SET LOCAL synchronous_commit = 'off';  -- a crash may lose the most recent commits
INSERT INTO audit_log (event, timestamp) VALUES ('user_login', NOW());
COMMIT;

This approach provides the best of both worlds: critical data gets strong guarantees, bulk/logging operations remain fast.

Timeout Behavior Matters

When a synchronous follower times out, the system must choose: wait forever (availability loss) or fall back to async (potential durability loss). Most production systems choose fallback with alerting. Understand your system's behavior before it matters.
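
The two timeout policies can be sketched in a few lines. This is a toy model under stated assumptions, not how any particular database implements it: `wait_for_follower_ack`, the policy names, and the returned status strings are all hypothetical, and a `threading.Event` stands in for the follower's acknowledgment.

```python
import threading


def wait_for_follower_ack(ack: threading.Event, timeout_s: float,
                          on_timeout: str = "fallback") -> str:
    """Wait for a follower ACK, then apply the configured timeout policy."""
    if ack.wait(timeout_s):
        return "committed_sync"
    if on_timeout == "fallback":
        # Durability is reduced from here on; a real system should raise an alert.
        return "committed_async_degraded"
    if on_timeout == "block":
        ack.wait()  # availability loss: blocks until the follower returns
        return "committed_sync"
    raise ValueError(f"unknown policy: {on_timeout}")


healthy_follower = threading.Event()
healthy_follower.set()            # the ACK has already arrived
silent_follower = threading.Event()  # the ACK never arrives

print(wait_for_follower_ack(healthy_follower, timeout_s=1.0))  # committed_sync
print(wait_for_follower_ack(silent_follower, timeout_s=0.05))  # committed_async_degraded
```

The "block" policy is deliberately not demonstrated against the silent follower, because it would wait indefinitely, which is exactly the availability cost of that choice.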

The Mathematics of the Trade-off

Choosing between sync and async isn't just intuition—we can quantify the trade-offs mathematically. Understanding the numbers helps make informed decisions.

Latency Impact:

Synchronous replication adds at least one round-trip time (RTT) to each transaction. Let's calculate the impact:

Scenario	Network RTT	Baseline Commit	Sync Commit	Overhead
Same rack	0.1ms	1ms	1.1ms	+10%
Same datacenter	1ms	1ms	2ms	+100%
Cross-region (US)	40ms	1ms	41ms	+4000%
Cross-continent	150ms	1ms	151ms	+15000%

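For write-heavy workloads that commit one transaction at a time per connection, synchronous cross-region replication can reduce per-connection throughput by 10-100x. Running more connections concurrently can recover some of that throughput, but each individual commit still pays the round trip.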
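
The latency table can be reproduced with simple arithmetic. The sketch below assumes, as the table does, a 1 ms baseline commit and exactly one round trip added per synchronous commit; the function names are illustrative.

```python
def sync_commit_ms(baseline_ms: float, rtt_ms: float) -> float:
    """Commit latency when the leader waits one round trip for a follower."""
    return baseline_ms + rtt_ms


def overhead_percent(baseline_ms: float, rtt_ms: float) -> float:
    """Added latency relative to the baseline commit, in percent."""
    return rtt_ms / baseline_ms * 100.0


def serial_commits_per_sec(commit_ms: float) -> float:
    """Upper bound for one connection that commits one transaction at a time."""
    return 1000.0 / commit_ms


BASELINE_MS = 1.0
scenarios = {
    "Same rack": 0.1,
    "Same datacenter": 1.0,
    "Cross-region (US)": 40.0,
    "Cross-continent": 150.0,
}

for name, rtt in scenarios.items():
    commit = sync_commit_ms(BASELINE_MS, rtt)
    print(f"{name}: {commit:.1f} ms, +{overhead_percent(BASELINE_MS, rtt):.0f}%, "
          f"{serial_commits_per_sec(commit):.0f} serial commits/sec")
```

The last column shows where the throughput drop comes from: a single connection that manages about 1000 commits per second at 1 ms drops to roughly 24 per second at 41 ms.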

Data Loss Quantification:

Asynchronous replication risks losing data in the 'replication window.' We can estimate expected data loss:

Expected Data Loss Per Failure = Replication Lag × Write Rate

Replication Lag	Write Rate	Transactions at Risk
10ms	100 writes/sec	~1 transaction
100ms	100 writes/sec	~10 transactions
1 second	1000 writes/sec	~1000 transactions
10 seconds	1000 writes/sec	~10000 transactions
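
The rows above follow directly from the formula. A minimal sketch, with `transactions_at_risk` as an illustrative name and lag expressed in milliseconds:

```python
def transactions_at_risk(lag_ms: float, writes_per_sec: float) -> float:
    """Replication lag x write rate: writes that exist only on the leader."""
    return lag_ms * writes_per_sec / 1000.0


rows = [(10, 100), (100, 100), (1000, 1000), (10000, 1000)]
for lag_ms, rate in rows:
    print(f"{lag_ms} ms lag at {rate} writes/sec -> "
          f"~{transactions_at_risk(lag_ms, rate):.0f} transactions at risk")
```

This is an upper-bound estimate for a single failure: it assumes the write rate is steady and that everything inside the lag window is lost.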

Annualized Risk:

If you have one leader failure per year with 100ms lag and 1000 writes/sec:

Expected lost transactions per year: ~100

For financial systems, losing 100 transactions might mean regulatory violations. For analytics, it might be irrelevant.

Decision Framework

•Estimate failure frequency. How often do you expect unplanned leader failures? (1-4 per year per cluster is a rough planning figure; prefer your own incident history where you have it.)
•Measure replication lag — What's your typical and peak async replication lag?
•Calculate transactions at risk — Lag × write rate = transactions potentially lost per failure
•Assign value to lost transactions — Is each transaction worth $0.01 or $10,000?
•Calculate sync latency cost — How much does sync overhead reduce throughput / increase latency?
•Compare costs — Expected data loss cost vs. performance/infrastructure cost of sync
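
The framework above reduces to a short calculation. The sketch below is one possible way to wire the steps together; the function names and the annual cost assigned to synchronous replication are hypothetical placeholders that you would replace with your own measurements.

```python
def expected_annual_loss(failures_per_year: float, lag_ms: float,
                         writes_per_sec: float, value_per_txn: float) -> float:
    """Expected yearly cost of transactions lost to async replication lag."""
    lost_txns = failures_per_year * lag_ms * writes_per_sec / 1000.0
    return lost_txns * value_per_txn


def prefer_sync(annual_loss: float, annual_sync_cost: float) -> bool:
    """Choose sync when the expected loss exceeds what sync would cost."""
    return annual_loss > annual_sync_cost


# One failure per year, 100 ms lag, 1000 writes/sec: ~100 transactions lost.
cheap = expected_annual_loss(1, 100, 1000, value_per_txn=0.01)
valuable = expected_annual_loss(1, 100, 1000, value_per_txn=10_000)

SYNC_COST = 50_000.0  # hypothetical yearly cost of the added latency and hardware

print(cheap, prefer_sync(cheap, SYNC_COST))        # 1.0 False
print(valuable, prefer_sync(valuable, SYNC_COST))  # 1000000.0 True
```

The comparison covers only direct monetary value. Regulatory or reputational consequences of lost transactions would need to be folded into the per-transaction value.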

The Calculation Rarely Matters

In practice, most systems fall into clear categories. Financial/payment systems: synchronous is non-negotiable. Analytics/logging: async is fine. The edge cases where math matters are rare, but understanding the framework helps justify decisions.

Choosing the Right Approach

With a deep understanding of sync and async trade-offs, let's provide concrete guidance for different scenarios.

Use Synchronous When

•Financial transactions (payments, transfers)
•Regulatory compliance requirements
•External side effects depend on commit (emails, webhooks)
•Data is irreplaceable
•Followers are in the same datacenter (low latency)
•Write volume is moderate
•You can tolerate brief unavailability on follower failure

Use Asynchronous When

•High write volume requires maximum throughput
•Cross-region replication where latency is prohibitive
•Data can be reconstructed or is non-critical (logs, caches)
•Eventual consistency is acceptable
•Availability is more important than durability
•The application can handle occasional data loss
•Development and testing environments

Replication Mode by Use Case
Use Case	Recommended Mode	Rationale
Banking/Payments	Synchronous (quorum)	Zero data loss tolerance; regulatory requirements
E-commerce Orders	Semi-sync (at least 1)	Orders are valuable; some latency acceptable
Social Media Posts	Async with monitoring	High volume; eventual consistency acceptable
Analytics/Logging	Fully Async	Reconstructable; volume too high for sync overhead
Cross-Region DR	Async (different from HA)	Latency prohibitive; DR is last resort anyway
Same-DC HA Replica	Synchronous or Semi-sync	Low latency; HA requires durability

Hybrid Strategies:

Same-DC Sync + Cross-Region Async: Maintain a synchronous replica in the same data center for HA and a second replica asynchronously in a remote region for DR. This provides fast failover with strong durability locally, plus protection against datacenter-level disasters.

Per-Table Configuration: Some databases allow replication configuration per table. Critical tables (accounts, orders) replicate synchronously; high-volume tables (events, logs) replicate asynchronously.

Application-Level Routing: The application knows which writes are critical. It can specify durability requirements per transaction, routing critical writes through a synchronous path.
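
Application-level routing could look like the following sketch, which targets PostgreSQL's per-transaction synchronous_commit setting. The `Criticality` classes, the policy mapping, and `wrap_transaction` are assumptions made for illustration; the right mapping depends on your own durability requirements.

```python
from enum import Enum


class Criticality(Enum):
    CRITICAL = "critical"   # e.g. payments, transfers
    STANDARD = "standard"   # e.g. orders
    BULK = "bulk"           # e.g. events, logs


# Hypothetical policy: which synchronous_commit level each class of write gets.
SYNC_COMMIT_LEVEL = {
    Criticality.CRITICAL: "remote_apply",
    Criticality.STANDARD: "on",
    Criticality.BULK: "local",
}


def wrap_transaction(statements, criticality):
    """Return the SQL for one transaction with its durability level set."""
    level = SYNC_COMMIT_LEVEL[criticality]
    return [
        "BEGIN;",
        f"SET LOCAL synchronous_commit = '{level}';",  # scoped to this transaction
        *statements,
        "COMMIT;",
    ]


transfer = wrap_transaction(
    ["INSERT INTO transfers (amount, from_account, to_account) "
     "VALUES (1000000, 'A', 'B');"],
    Criticality.CRITICAL,
)
print("\n".join(transfer))
```

Keeping the policy in one mapping means the durability decision is made once per class of write rather than scattered through the codebase.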

When in Doubt, Start Synchronous

It's easier to relax durability guarantees (switch from sync to async) than to explain lost data to stakeholders. Start with synchronous for critical data paths; optimize to async only when you've proven the need and accepted the risk.

Implementation Across Databases

Different databases implement synchronous/asynchronous replication with varying configurations and semantics. Let's examine the major implementations.

PostgreSQL

•synchronous_commit — Controls durability level: off, local, remote_write, remote_apply, on
•synchronous_standby_names — Specifies which standbys are synchronous and how many must acknowledge
•Syntax examples: FIRST 1 (standby1, standby2), ANY 2 (standby1, standby2, standby3)
•Per-transaction control — SET LOCAL synchronous_commit = 'remote_apply'; within a transaction
•Quorum support — ANY N syntax provides quorum-based acknowledgment

PostgreSQL Synchronous Configuration
# In postgresql.conf (leader)
synchronous_commit = on   # wait for the synchronous standby to flush each commit
synchronous_standby_names = 'FIRST 1 (replica1, replica2)'  # replica1 is synchronous; replica2 takes over if replica1 disconnects
 
# Alternative: quorum mode (any 2 of the 3 listed standbys must acknowledge)
# synchronous_standby_names = 'ANY 2 (replica1, replica2, replica3)'
 
-- In a SQL session: check synchronous status
SELECT application_name, sync_state, sync_priority
FROM pg_stat_replication;
 
-- Per-transaction override (for less critical writes)
BEGIN;
SET LOCAL synchronous_commit = 'local';  -- Don't wait for followers
INSERT INTO logs (event) VALUES ('user_clicked');
COMMIT;

MySQL

•rpl_semi_sync_source_enabled — Enable semi-synchronous on the source (leader)
•rpl_semi_sync_source_wait_for_replica_count — Number of replicas that must acknowledge
•rpl_semi_sync_source_timeout — Milliseconds to wait before falling back to async
•Group Replication — Full Paxos-based consensus for strong consistency
•Fallback behavior — After timeout, reverts to async (configurable)

MongoDB

•Write Concern — Specified per-operation: {w: 1}, {w: 'majority'}, {w: 3}, {w: 'tag'}
•w: 1 — Acknowledge after primary writes (async to secondaries)
•w: 'majority' — Acknowledge after majority of replica set writes (quorum sync)
•w: N — Acknowledge after N nodes write (specific count)
•j: true — Require journal (fsync) before acknowledgment
•wtimeout — Timeout in milliseconds; error if not achieved
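
To see how the `w` values relate to the sync/async spectrum, the sketch below models when a write would count as acknowledged. It is a simplification for illustration: `required_acks` and `is_acknowledged` are hypothetical helpers, the majority is computed as a plain majority of voting members, and tag-based write concerns are not modeled.

```python
def required_acks(w, voting_members: int) -> int:
    """Number of nodes that must hold the write under write concern `w`."""
    if w == "majority":
        return voting_members // 2 + 1
    if isinstance(w, int) and not isinstance(w, bool) and w >= 0:
        return w
    raise ValueError(f"unsupported w value in this sketch: {w!r}")


def is_acknowledged(w, voting_members: int, nodes_written: int) -> bool:
    return nodes_written >= required_acks(w, voting_members)


# Write concern documents in the shape listed above.
critical = {"w": "majority", "j": True, "wtimeout": 5000}
bulk = {"w": 1}

# Three-member replica set where only the primary has the write so far.
print(is_acknowledged(critical["w"], 3, 1))  # False: majority of 3 is 2
print(is_acknowledged(bulk["w"], 3, 1))      # True: the primary alone suffices
print(is_acknowledged(critical["w"], 3, 2))  # True: one secondary has caught up
```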

Terminology Varies

PostgreSQL calls it 'synchronous_commit,' MySQL calls it 'semi-sync,' MongoDB uses 'write concern.' The concepts are the same: how many replicas must acknowledge before the write is considered durable. Learn the terminology for your specific database.

Summary: Synchronous vs Asynchronous

We've explored one of the most fundamental trade-offs in database replication: when to wait for followers during commit. Let's consolidate the essential insights:

Key Takeaways

•Synchronous replication waits for followers before responding to clients — Provides strong durability but adds latency equal to at least one RTT to a follower.
•Asynchronous replication responds immediately after leader WAL write — Provides lowest latency but risks data loss if the leader fails before replication completes.
•Semi-synchronous modes balance the trade-off — Wait for one follower with timeout fallback, per-transaction control, or quorum-based acknowledgment.
•The math is straightforward: lost data = lag × write rate — Quantify the risk to make informed decisions rather than defaulting to one extreme.
•Per-transaction control enables surgical decisions — Critical writes go synchronous; bulk operations go async. Best of both worlds.
•Same-DC sync + cross-region async is a common pattern — Fast local failover with strong durability, plus geographic disaster recovery.

What's Next:

With writes flowing through the leader and replicating (synchronously or asynchronously) to followers, we must address a critical scenario: what happens when the leader fails? The next page explores failover handling—how systems detect leader failure, elect a new leader, and transition without data loss or prolonged downtime.

Page Complete

You now understand the fundamental durability-latency trade-off in database replication. You can analyze when synchronous replication is worth the latency cost and when asynchronous replication's risks are acceptable. Next, we'll explore how systems handle the inevitable: leader failure and failover.

3 / 5

Loading learning content...

System Design (HLD)Leader-Follower Replication

Leader-Follower Replication

LevelIntermediate

Duration75 mins

TopicLeader-Follower Replication

3 / 5

Synchronous vs Asynchronous Replication

The Durability-Latency Trade-off

Question: Is that million-dollar transfer safe?

The answer depends entirely on one configuration choice: synchronous versus asynchronous replication.

If replication was synchronous, at least one follower received and acknowledged the write before the client was told 'success.' The transfer is safe.
If replication was asynchronous, the write may have existed only on the failed leader. The transfer may be lost.

What You Will Learn

Understanding the Commit Timeline

Write Commit Timeline
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
TIME ─────────────────────────────────────────────────────────────────────────▶
 
CLIENT              LEADER                    FOLLOWER             FOLLOWER
                                              (sync)               (async)
  │                   │                         │                    │
  │──(1) WRITE───────▶│                         │                    │
  │                   │                         │                    │
  │                   │──(2) Write to WAL──────▶│ (durable)          │
  │                   │                         │                    │
  │                   │──────────────(3) Stream to followers─────────│
  │                   │                         │                    │
  │                   │                   ┌─────│                    │
  │                   │◀──(4) ACK─────────┘     │                    │
  │                   │                         │                    │
  │                   │══════════════════════════════════════════════│
  │                   │  SYNCHRONOUS MODE:                           │
  │                   │  Wait for follower ACK before (5)            │
  │                   │══════════════════════════════════════════════│
  │                   │                                              │
  │◀─(5) SUCCESS─────│                                              │
  │                   │                                              │
  │                   │                                        ┌─────│
  │                   │                         (async follower│     │
  │                   │                           catches up   ▼     │
  │                   │                           eventually)        │
 
 
SYNCHRONOUS: Client sees success AFTER follower acknowledged
ASYNCHRONOUS: Client sees success BEFORE follower acknowledged
 
The window between (2) WAL write and (4) follower ACK is the "replication window"
If the leader fails during this window, async replication may lose data

The Five Stages of a Write:

Client Request — Application sends write to the leader.
Leader WAL Write — Leader durably logs the change to its own disk.
Replication Streaming — Leader sends the log entry to followers.
Follower Acknowledgment — Followers confirm receipt (or apply, depending on mode).
Client Response — Leader tells the client the write succeeded.

The key decision: Does stage 5 wait for stage 4?

Synchronous: Yes. The leader waits for at least one follower's acknowledgment.
Asynchronous: No. The leader responds after stage 2 (its own WAL write).

The Danger Window

Synchronous Replication Deep Dive

Synchronous replication ensures that every committed transaction exists on at least two nodes (the leader and one or more followers) before the client is told the transaction succeeded.

This provides a powerful guarantee: no committed data can be lost due to a single node failure.

How Synchronous Replication Works

•Leader writes to local WAL — The change is durably stored on the leader's disk.
•Leader streams to designated synchronous follower(s) — The log entry is transmitted over the network.
•Follower receives, writes to its WAL, and acknowledges — Different modes exist: 'received,' 'written to disk,' or 'applied.'
•Leader waits for acknowledgment — The commit blocks until the required number of followers respond.
•Leader responds to client — Only now is the client told the transaction succeeded.
•If follower doesn't respond (timeout) — Options: wait forever, cancel transaction, or demote to async.

Synchronous Acknowledgment Levels
Level	Follower Action Before ACK	Durability Guarantee	Latency Impact
remote_write (PostgreSQL)	Written to OS buffer	Survives follower crash (if fsync soon)	Low (~1-2ms)
remote_flush	fsync'd to disk	Survives follower crash immediately	Medium (~5-10ms)
remote_apply	Applied to data files	Immediately readable on follower	High (~10-50ms)
on (MySQL semi-sync)	Received into relay log	Survives follower crash after apply	Low-Medium

Synchronous Replication Configurations:

Quorum-Based Sync: Advanced systems use quorum writes: commit when a majority (e.g., 2 of 3) acknowledge. This balances durability with availability—no single follower is critical.

Synchronous Adds Latency

Asynchronous Replication Deep Dive

Asynchronous replication decouples the client response from follower replication. The leader commits as soon as its own WAL write completes, without waiting for followers.

This provides the lowest possible write latency—but with a trade-off in durability guarantees.

How Asynchronous Replication Works

•Leader writes to local WAL — The change is durably stored on the leader's disk.
•Leader immediately responds to client — Transaction is 'committed' from the client's perspective.
•Asynchronously, leader streams to followers — This happens in the background, in parallel with new transactions.
•Followers apply changes at their own pace — There's always some lag, typically milliseconds to seconds.
•If leader fails before replication — The uncommitted (from follower perspective) writes are lost.

Asynchronous Data Loss Scenario
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
TIME ─────────────────────────────────────────────────────────────────────────▶
 
  t=0ms    t=1ms     t=2ms     t=3ms     t=4ms     t=5ms
    │        │         │         │         │         │
    │        │         │         │         │         │
  LEADER   LEADER    CLIENT    LEADER    ████████   FOLLOWER
  receives writes    sees      streams   ████████   has NOT
  write    to WAL    SUCCESS   to        CRASHES    received
                               follower  ████████   the write
                               (in
                               progress)
 
  RESULT: Client thinks transaction succeeded, but it's LOST
  
  The follower will be promoted to leader, missing this transaction.
  Any external side effects (emails sent, payments initiated) cannot be undone.

The Risk is Real:

Asynchronous data loss isn't just theoretical. Consider these real-world scenarios:

Hardware failure — The leader's server has a total power loss or storage controller failure.
Process crash without fsync — The database process crashes and the OS hasn't flushed buffers to disk.
Network partition during failover — The leader is unreachable, a follower is promoted, and the old leader comes back with 'orphan' transactions.

Quantifying the Risk:

The data loss window equals the replication lag. With typical async setups:

Same-datacenter: 10-100ms of potential data loss
Cross-region: 100ms-1000ms of potential data loss

For a system doing 1000 writes/second, 100ms of lag means up to 100 potentially lost transactions per failure.

Async is the Default

Semi-Synchronous and Hybrid Modes

Semi-Synchronous Approaches

•MySQL Semi-Synchronous Replication — Leader waits for at least one follower to acknowledge receipt (not necessarily apply). Falls back to async after timeout.
•PostgreSQL synchronous_commit Levels — Per-transaction control: on (wait for local fsync), remote_write, remote_apply, local, off. Different transactions can have different guarantees.
•Raft-Based Systems — Commit requires majority acknowledgment (e.g., 2 of 3, 3 of 5). Provides both durability and availability tolerance.
•Degraded Mode Fallback — System operates synchronously when followers are healthy, degrades to async (with alerts) when followers are lagging or offline.

Replication Mode Comparison
Mode	Durability	Latency	Availability	Complexity
Fully Async	Lost data on leader failure	Lowest (leader-only)	Highest (no dependencies)	Simple
Semi-Sync (1 ACK)	Safe with 1 follower	Medium (+network RTT)	Blocked if sync follower down	Medium
Fully Sync (N ACKs)	Safe with N followers	Highest (+slowest follower)	Blocked if any required follower down	Complex
Quorum (Majority)	Safe with majority failure	Medium (+majority RTT)	Tolerates minority failures	Complex

Per-Transaction Control:

Advanced databases allow durability level to be set per-transaction, not just globally. This enables intelligent trade-offs:

-- Critical financial transaction: wait for two replicas
SET synchronous_commit = 'remote_apply';
BEGIN;
INSERT INTO transfers (amount, from_account, to_account) VALUES (1000000, 'A', 'B');
COMMIT;

-- Low-priority logging: async is fine
SET synchronous_commit = 'off';
BEGIN;
INSERT INTO audit_log (event, timestamp) VALUES ('user_login', NOW());
COMMIT;

This approach provides the best of both worlds: critical data gets strong guarantees, bulk/logging operations remain fast.

Timeout Behavior Matters

The Mathematics of the Trade-off

Choosing between sync and async isn't just intuition—we can quantify the trade-offs mathematically. Understanding the numbers helps make informed decisions.

Latency Impact:

Synchronous replication adds at least one round-trip time (RTT) to each transaction. Let's calculate the impact:

Scenario	Network RTT	Baseline Commit	Sync Commit	Overhead
Same rack	0.1ms	1ms	1.1ms	+10%
Same datacenter	1ms	1ms	2ms	+100%
Cross-region (US)	40ms	1ms	41ms	+4000%
Cross-continent	150ms	1ms	151ms	+15000%

For write-heavy workloads, synchronous cross-region replication can reduce throughput by 10-100x.

Data Loss Quantification:

Asynchronous replication risks losing data in the 'replication window.' We can estimate expected data loss:

Expected Data Loss Per Failure = Replication Lag × Write Rate

Replication Lag	Write Rate	Transactions at Risk
10ms	100 writes/sec	~1 transaction
100ms	100 writes/sec	~10 transactions
1 second	1000 writes/sec	~1000 transactions
10 seconds	1000 writes/sec	~10000 transactions

Annualized Risk:

If you have one leader failure per year with 100ms lag and 1000 writes/sec:

Expected lost transactions per year: ~100

For financial systems, losing 100 transactions might mean regulatory violations. For analytics, it might be irrelevant.

Decision Framework

•Estimate failure frequency — How often do you expect unplanned leader failures? (Industry average: 1-4 per year per cluster)
•Measure replication lag — What's your typical and peak async replication lag?
•Calculate transactions at risk — Lag × write rate = transactions potentially lost per failure
•Assign value to lost transactions — Is each transaction worth $0.01 or $10,000?
•Calculate sync latency cost — How much does sync overhead reduce throughput / increase latency?
•Compare costs — Expected data loss cost vs. performance/infrastructure cost of sync
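The steps above can be sketched as a simple cost comparison. Everything here is an assumption made for illustration: the `Workload` fields, the yearly sync cost figure, and the rule of picking whichever side is cheaper. Real decisions also weigh compliance and reputational cost, which do not reduce to one number.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    failures_per_year: float    # expected unplanned leader failures
    replication_lag_s: float    # typical async replication lag, in seconds
    writes_per_sec: float       # sustained write rate
    value_per_txn: float        # cost of losing one transaction (hypothetical)
    sync_cost_per_year: float   # estimated cost of sync latency/capacity (hypothetical)


def expected_async_loss_per_year(w: Workload) -> float:
    """Failures per year x transactions at risk per failure x value per transaction."""
    at_risk = w.replication_lag_s * w.writes_per_sec
    return w.failures_per_year * at_risk * w.value_per_txn


def recommend(w: Workload) -> str:
    """Pick the cheaper option under this deliberately simplified model."""
    if expected_async_loss_per_year(w) > w.sync_cost_per_year:
        return "synchronous"
    return "asynchronous"


payments = Workload(1, 0.1, 1000, 10_000.0, 50_000.0)
analytics = Workload(1, 0.1, 1000, 0.01, 50_000.0)

print(recommend(payments))   # expected loss far exceeds the assumed sync cost
print(recommend(analytics))  # expected loss is negligible next to the assumed sync cost
```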

The Calculation Rarely Matters

Choosing the Right Approach

With the sync and async trade-offs quantified, here is concrete guidance for different scenarios.

Use Synchronous When

•Financial transactions (payments, transfers)
•Regulatory compliance requirements
•External side effects depend on commit (emails, webhooks)
•Data is irreplaceable
•Followers are in the same datacenter (low latency)
•Write volume is moderate
•You can tolerate brief unavailability on follower failure

Use Asynchronous When

•High write volume requires maximum throughput
•Cross-region replication where latency is prohibitive
•Data can be reconstructed or is non-critical (logs, caches)
•Eventual consistency is acceptable
•Availability is more important than durability
•The application can handle occasional data loss
•Development and testing environments

Replication Mode by Use Case
Use Case	Recommended Mode	Rationale
Banking/Payments	Synchronous (quorum)	Zero data loss tolerance; regulatory requirements
E-commerce Orders	Semi-sync (at least 1)	Orders are valuable; some latency acceptable
Social Media Posts	Async with monitoring	High volume; eventual consistency acceptable
Analytics/Logging	Fully Async	Reconstructable; volume too high for sync overhead
Cross-Region DR	Async (different from HA)	Latency prohibitive; DR is last resort anyway
Same-DC HA Replica	Synchronous or Semi-sync	Low latency; HA requires durability

Hybrid Strategies:

Application-Level Routing: The application knows which writes are critical. It can specify durability requirements per transaction, routing critical writes through a synchronous path.
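One way to picture application-level routing is a small helper that picks a PostgreSQL `synchronous_commit` level per write and wraps the statement accordingly. This is a sketch, not a prescribed design: the write categories in `CRITICAL_KINDS` and both function names are hypothetical, and the statement is assumed to be trusted, already-parameterized SQL.

```python
CRITICAL_KINDS = {"payment", "transfer"}  # hypothetical categories for this example


def synchronous_commit_for(kind: str) -> str:
    """Critical writes wait for the synchronous standby; everything else does not."""
    return "on" if kind in CRITICAL_KINDS else "local"


def wrap_transaction(kind: str, statement: str) -> list[str]:
    """Build the statements for one transaction with a per-transaction durability level."""
    level = synchronous_commit_for(kind)
    return [
        "BEGIN",
        f"SET LOCAL synchronous_commit = '{level}'",  # applies to this transaction only
        statement,
        "COMMIT",
    ]


for line in wrap_transaction("audit", "INSERT INTO audit_log (event) VALUES ('user_login')"):
    print(line + ";")
```

Keeping the decision in one function means the durability policy can be reviewed and changed in a single place instead of being scattered across call sites.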

When in Doubt, Start Synchronous

Implementation Across Databases

Different databases implement synchronous/asynchronous replication with varying configurations and semantics. Let's examine the major implementations.

PostgreSQL

•synchronous_commit — Controls durability level: off, local, remote_write, remote_apply, on
•synchronous_standby_names — Specifies which standbys are synchronous and how many must acknowledge
•Syntax examples: FIRST 1 (standby1, standby2), ANY 2 (standby1, standby2, standby3)
•Per-transaction control — SET LOCAL synchronous_commit = 'remote_apply'; within a transaction
•Quorum support — ANY N syntax provides quorum-based acknowledgment

PostgreSQL Synchronous Configuration
# In postgresql.conf (leader); this file uses # for comments
synchronous_commit = on                    # wait for synchronous standbys to flush WAL before commit returns
synchronous_standby_names = 'FIRST 1 (replica1, replica2)'  # priority order: replica1 is synchronous; replica2 takes over if replica1 disconnects

# Or for quorum mode (any 2 of the 3 must acknowledge)
synchronous_standby_names = 'ANY 2 (replica1, replica2, replica3)'
 
-- Check synchronous status
SELECT application_name, sync_state, sync_priority
FROM pg_stat_replication;
 
-- Per-transaction override (for less critical writes)
BEGIN;
SET LOCAL synchronous_commit = 'local';  -- Don't wait for followers
INSERT INTO logs (event) VALUES ('user_clicked');
COMMIT;
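The difference between `FIRST` and `ANY` is easy to miss, so here is a toy model of when a commit may return. It is a simplification written for this page, not PostgreSQL's implementation: `commit_can_proceed` and its arguments are made up, and it ignores details such as standbys catching up or reconnecting.

```python
def commit_can_proceed(
    method: str,
    num_sync: int,
    names: list[str],
    connected: set[str],
    acked: set[str],
) -> bool:
    """Return True if the leader may report the commit as successful."""
    # Only listed standbys that are currently connected can count.
    candidates = [n for n in names if n in connected]
    if method == "FIRST":
        # Priority-based: the first num_sync connected names are the sync standbys,
        # and every one of them must acknowledge.
        sync_standbys = candidates[:num_sync]
        return len(sync_standbys) == num_sync and all(n in acked for n in sync_standbys)
    if method == "ANY":
        # Quorum-based: any num_sync acknowledgments from the list are enough.
        return sum(1 for n in candidates if n in acked) >= num_sync
    raise ValueError(f"unknown method: {method}")


names = ["replica1", "replica2"]
both = {"replica1", "replica2"}

# Only replica2 has acknowledged so far.
print(commit_can_proceed("FIRST", 1, names, both, {"replica2"}))  # waits for replica1
print(commit_can_proceed("ANY", 1, names, both, {"replica2"}))    # one ack is enough
```

In this model, `FIRST 1` keeps waiting for the highest-priority connected standby even when a lower-priority one has already answered, while `ANY 1` returns as soon as any listed standby acknowledges.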

MySQL

•rpl_semi_sync_source_enabled — Enable semi-synchronous on the source (leader)
•rpl_semi_sync_source_wait_for_replica_count — Number of replicas that must acknowledge
•rpl_semi_sync_source_timeout — Milliseconds to wait before falling back to async
•Group Replication — Full Paxos-based consensus for strong consistency
•Fallback behavior — After timeout, reverts to async (configurable)
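The timeout fallback described above can be modeled in a few lines. This is a conceptual sketch rather than MySQL's actual code path: `semi_sync_commit` is a hypothetical function, and the acknowledgment delays are supplied as plain numbers instead of being measured.

```python
def semi_sync_commit(
    ack_delays_ms: list[float],
    wait_for_replica_count: int,
    timeout_ms: float,
) -> tuple[str, float]:
    """Return (mode, time the commit waited in ms) for one transaction."""
    delays = sorted(ack_delays_ms)
    if len(delays) >= wait_for_replica_count:
        # The commit is released when the N-th fastest acknowledgment arrives.
        nth_ack = delays[wait_for_replica_count - 1]
        if nth_ack <= timeout_ms:
            return ("semi-sync", nth_ack)
    # Not enough acknowledgments in time: stop waiting and continue asynchronously.
    return ("async-fallback", timeout_ms)


print(semi_sync_commit([2.0, 5.0], 1, 1000.0))   # fastest replica answers in time
print(semi_sync_commit([1500.0], 1, 1000.0))     # replica too slow, falls back
```

The fallback case is the one to monitor: once the source has reverted to async, commits are acknowledged without any replica holding them.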

MongoDB

•Write Concern — Specified per-operation: {w: 1}, {w: 'majority'}, {w: 3}, {w: 'tag'}
•w: 1 — Acknowledge after primary writes (async to secondaries)
•w: 'majority' — Acknowledge after majority of replica set writes (quorum sync)
•w: N — Acknowledge after N nodes write (specific count)
•j: true — Require journal (fsync) before acknowledgment
•wtimeout — Timeout in milliseconds; error if not achieved
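To make the `w` values concrete, this sketch computes how many acknowledgments each setting implies for a replica set of a given size. It is a simplified model: `acks_required` is a name invented here, the majority is taken over a plain member count, and custom tag-based write concerns are not modeled.

```python
def acks_required(w, voting_members: int) -> int:
    """Number of nodes that must acknowledge a write before the client is answered."""
    if w == "majority":
        return voting_members // 2 + 1
    if isinstance(w, int) and not isinstance(w, bool) and w >= 0:
        return w
    raise ValueError(f"unsupported write concern in this sketch: {w!r}")


# Write concern documents as they would be passed with an operation.
fast_write = {"w": 1}
durable_write = {"w": "majority", "j": True, "wtimeout": 5000}

for concern in (fast_write, durable_write):
    print(concern, "->", acks_required(concern["w"], 3), "of 3 nodes")
```

With three members, `w: 'majority'` needs two acknowledgments, so the write survives the loss of any single node; `w: 1` needs only the primary.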

Terminology Varies

Summary: Synchronous vs Asynchronous

We've explored one of the most fundamental trade-offs in database replication: when to wait for followers during commit. Let's consolidate the essential insights:

Key Takeaways

•Synchronous replication waits for followers before responding to clients — Provides strong durability but adds latency equal to at least one RTT to a follower.
•Asynchronous replication responds immediately after leader WAL write — Provides lowest latency but risks data loss if the leader fails before replication completes.
•Semi-synchronous modes balance the trade-off — Wait for one follower with timeout fallback, per-transaction control, or quorum-based acknowledgment.
•The math is straightforward: lost data = lag × write rate — Quantify the risk to make informed decisions rather than defaulting to one extreme.
•Per-transaction control enables surgical decisions — Critical writes go synchronous; bulk operations go async. Best of both worlds.
•Same-DC sync + cross-region async is a common pattern — Fast local failover with strong durability, plus geographic disaster recovery.
