In the previous page, we established that all writes flow through a single leader. But having one authoritative copy of the data isn't enough—we need those changes to propagate to followers reliably, efficiently, and with minimal delay.
Consider what followers must accomplish: receive every change the leader makes, apply those changes in the correct order, and recover gracefully when the network fails or the follower itself goes offline.
This page explores the mechanisms that make follower replication work—the protocols, data formats, and operational patterns that keep copies synchronized across the cluster.
By the end of this page, you will understand how replication logs are structured and transmitted, the differences between physical and logical replication, log streaming versus log shipping, how followers recover from outages, and best practices for maintaining healthy replication.
At the heart of leader-to-follower replication is the replication log—a sequential, append-only record of all changes made to the database. This log is the single source of truth for what the follower must apply.
Different databases call this log by different names: PostgreSQL's write-ahead log (WAL), MySQL's binary log (binlog), and MongoDB's oplog, among others.
Despite the naming variations, the concept is identical: an ordered sequence of records, each describing a change to the database.
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                               REPLICATION LOG                               │
│                                                                             │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │ LSN: 00000001 │ Transaction: BEGIN │ Timestamp: 2024-01-15 10:00     │   │
│  │ LSN: 00000002 │ INSERT users (id=1, name='Alice')                    │   │
│  │ LSN: 00000003 │ INSERT orders (user_id=1, total=99.99)               │   │
│  │ LSN: 00000004 │ Transaction: COMMIT                                  │   │
│  ├──────────────────────────────────────────────────────────────────────┤   │
│  │ LSN: 00000005 │ Transaction: BEGIN                                   │   │
│  │ LSN: 00000006 │ UPDATE users SET balance=500 WHERE id=1              │   │
│  │ LSN: 00000007 │ Transaction: COMMIT                                  │   │
│  ├──────────────────────────────────────────────────────────────────────┤   │
│  │ LSN: 00000008 │ Transaction: BEGIN                                   │   │
│  │ LSN: 00000009 │ DELETE FROM orders WHERE id=1                        │   │
│  │ LSN: 00000010 │ Transaction: COMMIT                                  │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
│  LSN = Log Sequence Number (monotonically increasing)                       │
│  The log is append-only: new entries are always added at the end            │
└─────────────────────────────────────────────────────────────────────────────┘
```

LSNs are the heartbeat of replication. Every follower tracks its current LSN—the last log entry it has applied. The difference between the leader's current LSN and the follower's LSN is the 'replication lag' in log-position terms. Recovery after failure means replaying log entries from the last applied LSN.
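The LSN bookkeeping above can be sketched in a few lines of Python. The `ReplicationLog` class and its method names are illustrative, not any real database's API:

```python
from dataclasses import dataclass, field

@dataclass
class LogEntry:
    lsn: int          # monotonically increasing Log Sequence Number
    operation: str    # e.g. "INSERT users (id=1, name='Alice')"

@dataclass
class ReplicationLog:
    entries: list = field(default_factory=list)
    next_lsn: int = 1

    def append(self, operation: str) -> int:
        """Append-only: entries are added at the end, never modified."""
        entry = LogEntry(self.next_lsn, operation)
        self.entries.append(entry)
        self.next_lsn += 1
        return entry.lsn

    def entries_after(self, lsn: int) -> list:
        """Entries a follower still needs, given its last applied LSN."""
        return [e for e in self.entries if e.lsn > lsn]

log = ReplicationLog()
log.append("BEGIN")
log.append("INSERT users (id=1, name='Alice')")
log.append("COMMIT")

follower_lsn = 1                          # follower has applied only the BEGIN
lag = (log.next_lsn - 1) - follower_lsn   # replication lag in log positions
print(lag)                                # 2
```

Catch-up is just `entries_after(follower_lsn)`: the follower replays whatever it has not yet applied, in LSN order.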
Replication logs can encode changes at different levels of abstraction. The two primary approaches are physical replication and logical replication, each with distinct trade-offs.
Physical Replication (Byte-Level)
Physical replication transmits the exact byte-level changes to the database's storage files. The log contains information like "write these 8KB of data to page 42 of file users.dat."
Logical Replication (Operation-Level)
Logical replication transmits the logical operations: "INSERT row (id=1, name='Alice') into table users" or "DELETE rows WHERE age > 100 FROM users." The follower interprets these operations and applies them to its own storage.
| Aspect | Physical Replication | Logical Replication |
|---|---|---|
| Data Format | Byte-level storage changes | Logical SQL-like operations |
| Version Compatibility | Requires identical database versions | Supports different versions |
| Cross-Platform | Same OS, same architecture | Can replicate to different platforms |
| Selective Replication | All or nothing (full database) | Can filter tables/rows/columns |
| Performance | Lower CPU overhead | Higher CPU (must parse/apply) |
| Conflict Detection | Not possible (byte-level) | Possible (understands operations) |
| Use Case | High-availability replicas | Data integration, ETL, CDC |
For high-availability failover replicas (same data center, same purpose as leader), physical replication is typically preferred—it's faster and simpler. For cross-version upgrades, data integration pipelines, or selective replication scenarios, logical replication provides the necessary flexibility.
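To make the difference concrete, here is a sketch of what the two kinds of log record might carry. The record shapes and field names are illustrative, not any database's actual wire format:

```python
from dataclasses import dataclass

# Physical replication: the record describes raw bytes at a storage location.
# Only a follower with a byte-identical storage layout can apply it.
@dataclass
class PhysicalRecord:
    lsn: int
    file: str        # e.g. "users.dat"
    page: int        # page number within the file
    data: bytes      # the new page contents

# Logical replication: the record describes the operation itself.
# Any follower that understands the schema can interpret and apply it,
# which is what enables cross-version and selective replication.
@dataclass
class LogicalRecord:
    lsn: int
    table: str
    op: str          # "INSERT" | "UPDATE" | "DELETE"
    row: dict        # column -> value

phys = PhysicalRecord(lsn=2, file="users.dat", page=42, data=b"\x00" * 8192)
logi = LogicalRecord(lsn=2, table="users", op="INSERT",
                     row={"id": 1, "name": "Alice"})
```

Notice why logical records cost more CPU: the follower must parse the operation and route it through its own storage engine, while a physical record is just a byte copy to a known page.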
Once the replication log exists, how does it get from the leader to followers? Two primary approaches have evolved: log shipping and log streaming.
```
LOG SHIPPING (File-Based)
═══════════════════════════════════════════════════════════════════════════

  LEADER                                        FOLLOWER
  ──────                                        ────────
  ┌─────────┐                                   ┌─────────┐
  │  WAL    │  (1) Archive                      │ Waiting │
  │ Segment │ ────┐                             │   ...   │
  │  001    │     │                             └─────────┘
  └─────────┘     │                                  │
                  ▼                                  │
            ┌──────────┐                             │
            │  Shared  │  (2) Copy                   │
            │ Storage  │ ────────────────────────────┘
            │ (NFS/S3) │                             │
            └──────────┘                             ▼
                                             ┌─────────────┐
                                             │ (3) Apply   │
                                             │ WAL Segment │
                                             │    001      │
                                             └─────────────┘

  ✓ Simple: just file copies           ✗ Lag = segment size (16MB default)
  ✓ Works over unreliable networks     ✗ Minimum latency is segment duration
  ✓ Easy DR to remote locations        ✗ Not suitable for hot standby

LOG STREAMING (Connection-Based)
═══════════════════════════════════════════════════════════════════════════

  LEADER                                        FOLLOWER
  ──────                                        ────────
  ┌─────────────┐                           ┌─────────────┐
  │ WAL Buffer  │       Continuous          │ WAL Receive │
  │             │       TCP Stream          │   Process   │
  │ [New Entry] │──────────────────────────▶│   [Apply]   │
  │ [New Entry] │       (real-time)         │   [Apply]   │
  │ [New Entry] │──────────────────────────▶│   [Apply]   │
  └─────────────┘                           └─────────────┘

  ✓ Real-time: entries sent immediately     ✗ Requires stable network
  ✓ Sub-second lag possible                 ✗ More complex failure handling
  ✓ Follower always near-current
  ✓ Suitable for hot standby
  ✓ Efficient: no file overhead
```

Log Shipping (File-Based)
In log shipping, the leader writes complete log segment files (typically 16MB or 64MB) and archives them to shared storage (NFS, S3, or direct file copy). Followers periodically check for new segments, download them, and apply them.
Use case: Disaster recovery to a remote location over unreliable networks. The follower can be hours or days behind, but it will eventually catch up.
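A file-based catch-up loop can be sketched in Python. The directory layout and the `apply_segment` placeholder are hypothetical; in PostgreSQL, for example, the archive would be populated by the leader's `archive_command` and drained by the follower's `restore_command`:

```python
import os
import shutil

def apply_segment(path: str) -> None:
    """Placeholder: replay every log record in the segment file."""
    pass

def catch_up_once(archive_dir: str, local_wal: str, applied: set) -> list:
    """One polling cycle: find new segments in shared storage,
    copy them locally, and apply them in name (i.e. LSN) order."""
    new = sorted(s for s in os.listdir(archive_dir) if s not in applied)
    for segment in new:
        shutil.copy(os.path.join(archive_dir, segment), local_wal)  # (2) Copy
        apply_segment(os.path.join(local_wal, segment))             # (3) Apply
        applied.add(segment)                                        # track progress
    return new
```

The follower would call `catch_up_once` on a timer; the polling interval plus the segment-fill time is the floor on replication lag, which is why this approach cannot serve a hot standby.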
Log Streaming (Connection-Based)
In log streaming, the follower maintains a persistent TCP connection to the leader. As the leader writes new log entries, it immediately streams them to connected followers. The follower applies entries as they arrive.
Use case: Hot standby replicas that can serve reads and fail over quickly. Lag is typically sub-second under normal conditions.
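The streaming model can be sketched with an in-process queue standing in for the persistent TCP connection (purely illustrative; a real follower reads a binary protocol from a socket):

```python
import queue

def stream_entries(wal_buffer: queue.Queue, applied: list) -> None:
    """Follower side of streaming: apply each entry the moment it
    arrives, instead of waiting for a complete segment file."""
    while True:
        entry = wal_buffer.get()     # blocks on the "connection"
        if entry is None:            # sentinel: connection closed
            break
        applied.append(entry)        # immediate apply -> sub-second lag

buf = queue.Queue()
for e in ["BEGIN", "INSERT users (id=1, name='Alice')", "COMMIT", None]:
    buf.put(e)

applied = []
stream_entries(buf, applied)
print(applied)   # ['BEGIN', "INSERT users (id=1, name='Alice')", 'COMMIT']
```

The contrast with the shipping sketch is the unit of transfer: one log entry instead of one multi-megabyte segment, which is exactly where the latency difference comes from.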
If the leader purges log entries before a follower has consumed them, the follower cannot catch up—it must be rebuilt from scratch. Use replication slots (PostgreSQL) or similar mechanisms to prevent premature log cleanup. But remember: a slot whose follower never reconnects forces the leader to retain logs indefinitely, and the accumulated log can fill the leader's disk.
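In PostgreSQL, for example, a physical replication slot is created and removed with built-in functions; the slot name below is an example:

```sql
-- Create a physical replication slot for a standby. The leader will now
-- retain WAL until this slot's consumer has received it.
SELECT pg_create_physical_replication_slot('standby_1');

-- Drop the slot when the standby is decommissioned; otherwise the
-- retained WAL accumulates indefinitely.
SELECT pg_drop_replication_slot('standby_1');
```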
Receiving log entries is only half the battle—the follower must also apply them to its local data files. This process involves several stages and has its own complexities.
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                          FOLLOWER APPLY PIPELINE                            │
└─────────────────────────────────────────────────────────────────────────────┘

      FROM LEADER
          │
          ▼
┌───────────────────┐
│ (1) RECEIVE       │  Network buffer receives log entries
│     Buffer        │  Entries queued for processing
└───────────────────┘
          │
          ▼
┌───────────────────┐
│ (2) WRITE TO      │  Entries written to local WAL (durable)
│     Local WAL     │  This is the follower's recovery point
└───────────────────┘
          │
          ▼
┌───────────────────┐
│ (3) PARSE &       │  Decode binary log format
│     Decode        │  Validate entry integrity (checksums)
└───────────────────┘
          │
          ▼
┌───────────────────┐
│ (4) APPLY TO      │  Execute the change against local storage
│     Data Files    │  Update tables, indexes, etc.
└───────────────────┘
          │
          ▼
┌───────────────────┐
│ (5) UPDATE        │  Record the new applied LSN
│     Apply Position│  This position is reported to the leader
└───────────────────┘
          │
          ▼
┌───────────────────┐
│ (6) ACKNOWLEDGE   │  Send acknowledgment to leader (if sync)
│     to Leader     │  Report apply position for monitoring
└───────────────────┘
```

Key Apply Process Considerations:
Crash Recovery: If the follower crashes, it restarts from its last applied position (stored durably). Because entries are written to the local WAL before application, recovery replays any entries that were received but not yet fully applied.
Apply Parallelism: Some databases parallelize the apply process by applying non-conflicting transactions concurrently. MySQL Group Replication and PostgreSQL's parallel apply feature both support this. The key challenge is maintaining transaction ordering for conflicting operations.
Apply Lag: The time between the leader writing an entry and the follower applying it is apply lag. This is distinct from network lag (time to transmit). A follower might receive entries quickly but apply them slowly if its disk or CPU is saturated.
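The received-versus-applied distinction, and the durable apply position that crash recovery depends on, can be sketched as follows. The class name and JSON state file are illustrative; real databases persist this position in their own control files:

```python
import json
import os

class ApplyState:
    """Tracks the two positions that define a follower's health:
    the last LSN received and the last LSN durably applied."""

    def __init__(self, state_file: str):
        self.state_file = state_file
        self.received_lsn = 0
        self.applied_lsn = self._load()   # crash recovery: resume from disk

    def _load(self) -> int:
        if os.path.exists(self.state_file):
            with open(self.state_file) as f:
                return json.load(f)["applied_lsn"]
        return 0

    def receive(self, lsn: int) -> None:
        self.received_lsn = lsn

    def apply(self, lsn: int) -> None:
        self.applied_lsn = lsn
        # Persist BEFORE acknowledging, so a crash replays entries
        # rather than silently skipping them.
        with open(self.state_file, "w") as f:
            json.dump({"applied_lsn": lsn}, f)

    def apply_backlog(self) -> int:
        """Entries received but not yet applied: the symptom of a
        saturated disk or CPU on the follower."""
        return self.received_lsn - self.applied_lsn
```

After a crash, constructing a fresh `ApplyState` from the same file resumes at the last durably applied LSN, and any entries already in the local WAL beyond that point are replayed.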
| Cause | Symptom | Resolution |
|---|---|---|
| Slow Disk I/O | Apply position falls behind receive position | Faster storage (SSD), tune I/O scheduler |
| CPU Saturation | High CPU on follower, low on leader | Scale follower resources, reduce read load |
| Large Transactions | Periodic spikes in lag | Break up large transactions, batch smaller |
| Schema Changes (DDL) | Lag spikes during DDL | DDL locks entire table; run during low traffic |
| Network Saturation | Receive buffer grows | Increase network capacity, enable compression |
Many databases support 'hot standby' mode where followers serve read queries while applying changes. This introduces a potential conflict: long-running read queries might conflict with pending apply operations. PostgreSQL resolves this with configurable behavior: cancel the query, delay the apply, or allow stale reads.
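In PostgreSQL this behavior is controlled by follower-side settings; the values below are illustrative, not recommendations:

```
# postgresql.conf on the follower

# How long to delay applying WAL when it conflicts with a running query.
# 0 cancels conflicting queries immediately; -1 waits forever
# (the query wins, but apply lag grows unboundedly).
max_standby_streaming_delay = 30s

# Ask the leader not to vacuum away row versions that standby queries
# still need, reducing conflicts at the cost of bloat on the leader.
hot_standby_feedback = on
```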
Followers go offline. Networks partition. Hardware fails. When a follower comes back online after an outage, it must catch up to the leader's current state. The strategy depends on how far behind the follower is.
```
                    FOLLOWER COMES ONLINE
                            │
                            ▼
                 ┌────────────────────┐
                 │ Compare follower's │
                 │ last LSN to leader │
                 └────────────────────┘
                            │
            ┌───────────────┼───────────────┐
            │               │               │
            ▼               ▼               ▼
      ┌───────────┐   ┌───────────┐   ┌───────────┐
      │ Log entry │   │ Entry in  │   │ Entry not │
      │ available │   │ archive   │   │ available │
      │ on leader │   │ storage   │   │ anywhere  │
      └───────────┘   └───────────┘   └───────────┘
            │               │               │
            ▼               ▼               ▼
      ┌───────────┐   ┌───────────┐   ┌───────────┐
      │  Stream   │   │  Fetch    │   │  Rebuild  │
      │  from     │   │  from     │   │  from     │
      │  leader   │   │  archive  │   │  backup   │
      └───────────┘   └───────────┘   └───────────┘
            │               │               │
            └───────────────┼───────────────┘
                            │
                            ▼
                 ┌────────────────────┐
                 │ Resume streaming   │
                 │ replication        │
                 └────────────────────┘
```

Rebuild Process (pg_basebackup, xtrabackup, mongodump):
When a follower needs complete reconstruction:
Take Base Backup — Create a consistent snapshot of the leader's data files, either by stopping the database momentarily (cold backup) or using consistent snapshot mechanisms (hot backup).
Transfer to Follower — Copy all data files to the new follower. This can be network-intensive for large databases (terabytes = hours or days of transfer).
Record Start Position — Note the log position at the time of the backup. The follower will resume replication from this point.
Start Follower — Initialize the follower with the backup, configure it to stream from the leader starting at the noted position.
Catch Up — The follower applies all log entries from the backup point to the leader's current position.
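The snapshot-plus-replay idea behind these steps can be sketched in Python, with a dict standing in for the data files and a list of key/value writes standing in for the log (all names illustrative):

```python
import copy

def take_base_backup(leader_data: dict, log: list):
    """Steps 1-3: snapshot the leader's data and record the log
    position the snapshot corresponds to."""
    snapshot = copy.deepcopy(leader_data)   # consistent copy of data files
    start_lsn = len(log)                    # log position at backup time
    return snapshot, start_lsn

def catch_up(follower_data: dict, log: list, start_lsn: int) -> int:
    """Step 5: replay every entry written after the snapshot position."""
    for key, value in log[start_lsn:]:
        follower_data[key] = value
    return len(log)                         # new applied position

# The leader keeps accepting writes while the backup is transferred:
leader, log = {}, []
def write(k, v):
    log.append((k, v))
    leader[k] = v

write("a", 1)
write("b", 2)
snapshot, start_lsn = take_base_backup(leader, log)
write("c", 3)                   # happens during the (slow) transfer
catch_up(snapshot, log, start_lsn)
print(snapshot == leader)       # True: follower has converged
```

The crucial detail is recording `start_lsn` at backup time: without it, the follower cannot know which log entries are already reflected in the snapshot and which still need replaying.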
While a follower rebuilds, you have one fewer replica for redundancy. For large databases, rebuilds can take hours or days. Plan ahead: maintain enough replicas that losing one doesn't threaten availability, and consider techniques like incremental backup or storage-level snapshots to speed up rebuilds.
Healthy replication is invisible—problems are not. Monitoring replication health is critical for catching issues before they become outages. Here's what to watch and how to verify correctness.
```sql
-- View replication status on the leader
SELECT client_addr, state,
       sent_lsn, write_lsn, flush_lsn, replay_lsn,
       write_lag, flush_lag, replay_lag
FROM pg_stat_replication;

-- Check replication slots and their lag
SELECT slot_name, active,
       pg_size_pretty(
         pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
       ) AS lag_size
FROM pg_replication_slots;

-- On the follower, check if recovery is complete
SELECT pg_is_in_recovery(),
       pg_last_wal_receive_lsn(),
       pg_last_wal_replay_lsn();
```

Data Verification:
Beyond monitoring metrics, you should periodically verify that follower data actually matches the leader. Techniques include comparing per-table row counts, computing checksums over table contents on both sides, and running dedicated tools such as pt-table-checksum for MySQL.
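One such check, an order-independent per-table checksum, can be sketched in Python (the hashing scheme is illustrative; production tools chunk tables and push the hashing into the database):

```python
import hashlib

def table_checksum(rows) -> str:
    """Order-independent checksum of a table's rows, so leader and
    follower can be compared even if physical row order differs."""
    digest = hashlib.sha256()
    for row in sorted(repr(r) for r in rows):
        digest.update(row.encode())
    return digest.hexdigest()

leader_rows   = [(1, "Alice"), (2, "Bob")]
follower_rows = [(2, "Bob"), (1, "Alice")]   # same data, different order
print(table_checksum(leader_rows) == table_checksum(follower_rows))  # True
```

Run the same computation against leader and follower and compare digests; a mismatch on any table means the follower has diverged and should be investigated or rebuilt.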
Set lag alerts well below your tolerance threshold. If your application can tolerate 30 seconds of lag, alert at 10 seconds. This gives you time to investigate and remediate before users experience stale data.
Years of operational experience have distilled specific practices for maintaining healthy, reliable follower replication. These recommendations apply across database systems.
| Setting | Recommendation | Rationale |
|---|---|---|
| Follower Count | Minimum 2 (prefer 3+) | Survive one failure while maintaining read capacity |
| Synchronous Replicas | At most 1 (for critical data) | Balance durability against write latency |
| Log Retention | Longer than the longest expected follower outage | Avoid full rebuilds after transient outages |
| Read Routing | Load balance across followers | Distribute read load, reduce leader burden |
| Backup Source | Prefer followers over leader | Reduce leader load, followers are already consistent |
Replication protects against hardware failure but not against data corruption or accidental deletion. If you DELETE all rows from a table on the leader, followers replicate the DELETE. Maintain independent backups (periodic snapshots + log archives) for point-in-time recovery.
We've explored the follower side of leader-follower replication—how followers receive, store, and apply changes from the leader to maintain synchronized copies: the replication log and its LSNs, physical versus logical formats, log shipping versus streaming, the apply pipeline, catch-up and rebuild strategies, and the monitoring practices that keep it all healthy.
What's Next:
We've covered how writes flow through the leader and how followers replicate. But we've glossed over a critical detail: when does replication happen relative to commit? The next page dives deep into synchronous versus asynchronous replication—the trade-off between durability guarantees and write performance that shapes every production deployment.