Your application becomes successful. Traffic grows. Your single database server, once handling 100 queries per second comfortably, now struggles with 10,000 queries per second. The CPU is pegged at 100%, queries are timing out, and users are complaining.
You could scale vertically—buy a bigger server with more CPU, RAM, and faster SSDs. But vertical scaling has limits, and costs grow superlinearly: a server with 2x the performance often costs 4x as much, and there is a hard ceiling on what hardware you can buy.
Read replicas offer a different path: horizontal scaling for read operations. Instead of one database handling all queries, you deploy multiple copies—replicas—that stay synchronized with the primary. Read traffic is distributed across replicas, multiplying your read throughput while the primary focuses on writes.
This pattern is foundational to scaling databases. Nearly every major web application—from Twitter to Stripe to Netflix—relies on read replicas to handle read-heavy workloads.
By the end of this page, you will understand: how database replication works at a low level; the spectrum from synchronous to asynchronous replication; replication lag and its implications; patterns for routing reads to replicas safely; replica failover and promotion; and production configurations for PostgreSQL and MySQL.
Before diving into implementation, let's understand the fundamental problem read replicas solve and why they're so effective.
The Read/Write Ratio:
Most applications are heavily read-oriented:
| Application Type | Typical Read/Write Ratio |
|---|---|
| Social media feed | 100:1 to 1000:1 |
| E-commerce catalog | 50:1 to 200:1 |
| News/content sites | 500:1 to 5000:1 |
| Analytics dashboards | 100:1 |
| Banking/financial | 10:1 to 50:1 |
If 95% of your database operations are reads, reads outnumber writes roughly 19:1—so scaling reads addresses about 20x more of your traffic than scaling writes would.
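To make that concrete, here is a back-of-envelope capacity model (a sketch with illustrative numbers, not a benchmark; the function names are invented for this example). It assumes reads spread evenly across replicas while all writes stay on the primary.

```typescript
// Back-of-envelope capacity model (illustrative only).
// Assumption: reads are distributed evenly; the primary keeps all writes.
function readQpsPerReplica(totalQps: number, readFraction: number, replicas: number): number {
  return (totalQps * readFraction) / replicas;
}

function writeQpsOnPrimary(totalQps: number, readFraction: number): number {
  return totalQps * (1 - readFraction);
}

// 10,000 qps at 95% reads, spread over 4 replicas:
// each replica sees roughly 2,375 read qps; the primary keeps roughly 500 write qps.
const perReplica = readQpsPerReplica(10_000, 0.95, 4);
const onPrimary = writeQpsOnPrimary(10_000, 0.95);
```

Under these assumptions, adding replicas divides the dominant (read) load, which is why the read/write ratio determines how much leverage replicas give you.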
What replicas provide (and don't):
Read replicas only scale read operations. All writes still go to the primary and are replicated to followers. For write-heavy workloads, you need different strategies: sharding (partitioning data across multiple primaries), or using databases designed for write scaling (like Cassandra or CockroachDB).
Database replication works by shipping changes from the primary to replicas. The mechanism varies by database, but the core concept is consistent: capture changes on primary, transmit to replicas, apply on replicas.
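That capture-transmit-apply loop can be sketched in a few lines of TypeScript (an in-memory toy, not a real database): the primary appends each change to an ordered log, and a replica applies entries in order while tracking how far behind it is.

```typescript
// In-memory toy model of log-shipping replication (illustrative only).
// Capture: the primary appends every change to an ordered log.
// Transmit/apply: the replica pulls unseen entries and replays them in order.
type Change = { key: string; value: string };

class Primary {
  log: Change[] = [];
  data = new Map<string, string>();

  write(key: string, value: string): void {
    this.data.set(key, value);
    this.log.push({ key, value }); // change captured in the log
  }
}

class Replica {
  data = new Map<string, string>();
  private replayed = 0; // position in the primary's log

  catchUp(primary: Primary): void {
    while (this.replayed < primary.log.length) {
      const { key, value } = primary.log[this.replayed++];
      this.data.set(key, value); // change applied locally
    }
  }

  lag(primary: Primary): number {
    return primary.log.length - this.replayed; // unapplied entries
  }
}
```

Between a write on the primary and the replica's next catch-up pass, `lag()` is nonzero: that window is replication lag in miniature.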
PostgreSQL Streaming Replication:
PostgreSQL uses Write-Ahead Log (WAL) shipping:
```
PostgreSQL Streaming Replication Flow

   Application
       │  INSERT INTO orders VALUES (...)
       ▼
┌───────────────────────────────────────────┐
│ PRIMARY                                   │
│                                           │
│  Query Parser                             │
│      │                                    │
│      ▼                                    │
│  WAL Buffer (memory)                      │
│      │                                    │
│      ▼                                    │
│  WAL Writer ──→ WAL Files (pg_wal dir)    │
│                     │                     │
│                     ▼                     │
│                WAL Sender ←── replica conn│
└─────────────────────┬─────────────────────┘
                      │  TCP connection
                      │  WAL stream (replication)
                      ▼
┌───────────────────────────────────────────┐
│ REPLICA                                   │
│                                           │
│  WAL Receiver                             │
│      │                                    │
│      ▼                                    │
│  WAL Files (local copy)                   │
│      │                                    │
│      ▼                                    │
│  Startup Process (replay)                 │
│      │                                    │
│      ▼                                    │
│  Applied to tables                        │
│  (replica now has the INSERT)             │
└───────────────────────────────────────────┘
```

MySQL Replication:
MySQL uses binary log (binlog) replication: the primary records changes in its binlog; each replica's I/O thread streams those events into a local relay log, and an applier (SQL) thread replays them against the replica's tables.
MySQL 8.0+ Group Replication:
Modern MySQL supports group replication where multiple nodes use Paxos-like consensus for synchronous, multi-primary replication—a significant advancement over traditional async replication.
| Database | Mechanism | Log Type | Notes |
|---|---|---|---|
| PostgreSQL | Streaming replication | WAL (Write-Ahead Log) | Physical replication; byte-identical replicas |
| PostgreSQL | Logical replication | Logical decoding | Table-level; cross-version compatible; allows transforms |
| MySQL | Async binlog | Binary Log | Statement or row-based; traditional approach |
| MySQL | Semi-sync | Binary Log | At least one replica acknowledges before commit returns |
| MySQL | Group Replication | Binary Log + Paxos | Multi-primary; automatic failover; MySQL 8.0+ |
| SQL Server | Always On AG | Transaction Log | Synchronous or async; automatic failover |
| MongoDB | Replica Set | Oplog | Automatic failover; majority write concern available |
The spectrum between synchronous and asynchronous replication represents a fundamental trade-off between consistency and performance/availability.
```sql
-- PostgreSQL synchronous replication configuration

-- On PRIMARY: postgresql.conf
-- Require sync commit from at least one replica
synchronous_commit = on   -- 'on' = wait for sync replica, 'off' = async
synchronous_standby_names = 'FIRST 1 (replica1, replica2)'

-- Sync modes available:
-- synchronous_commit = off           -- Full async (fastest, least safe)
-- synchronous_commit = local         -- Wait for local WAL flush only
-- synchronous_commit = remote_write  -- Wait for replica to write WAL (no fsync)
-- synchronous_commit = on            -- Wait for replica to flush WAL to disk
-- synchronous_commit = remote_apply  -- Wait for replica to apply (strongest)

-- Check replication status
SELECT
  client_addr,
  state,
  sync_state,  -- 'sync', 'async', 'potential', 'quorum'
  sent_lsn,
  write_lsn,
  flush_lsn,
  replay_lsn,
  pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;

-- Per-transaction sync level (override global setting)
SET synchronous_commit = 'remote_apply';  -- For critical transaction
BEGIN;
  UPDATE accounts SET balance = balance - 1000 WHERE id = 123;
  INSERT INTO audit_log VALUES (...);
COMMIT;  -- Waits for replica to apply

SET synchronous_commit = 'local';  -- For less critical transaction
INSERT INTO page_views VALUES (...);  -- Returns immediately
```

Many production systems use semi-synchronous replication: wait for at least one replica (out of several) to acknowledge. This provides data safety without letting a single slow replica block all writes. Configure synchronous_standby_names = 'FIRST 1 (replica1, replica2, replica3)' — commit proceeds as soon as any one replica confirms.
Replication lag is the delay between when a change is committed on the primary and when it becomes visible on a replica. For asynchronous replication, lag is inevitable—the question is how much and how to handle it.
What causes lag:
| Cause | Description | Mitigation |
|---|---|---|
| Network latency | Time to transmit WAL over network | Co-locate replicas; use faster network |
| Replica CPU | Replay is single-threaded (PostgreSQL) | Faster replica CPU; logical replication |
| Heavy write load | More WAL to ship and replay | More replicas; faster disks |
| Long-running replica queries | Replay pauses rather than cancel conflicting queries on the replica | Set hot_standby_feedback; schedule heavy reads off-peak |
| Replica maintenance | ANALYZE, index builds | Perform during low-traffic periods |
Measuring lag:
```sql
-- On PRIMARY: check replication lag for all replicas
SELECT
  client_addr AS replica,
  state,
  pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS lag_bytes,
  pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) / 1024 / 1024 AS lag_mb,
  pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_pending_bytes,
  write_lag,   -- Time lag for write
  flush_lag,   -- Time lag for fsync
  replay_lag   -- Time lag for apply (PostgreSQL 10+)
FROM pg_stat_replication;

-- On REPLICA: check how far behind
SELECT
  pg_last_wal_receive_lsn() AS received_lsn,
  pg_last_wal_replay_lsn() AS replayed_lsn,
  pg_wal_lsn_diff(
    pg_last_wal_receive_lsn(),
    pg_last_wal_replay_lsn()
  ) AS replay_pending_bytes,
  pg_last_xact_replay_timestamp() AS last_replayed_at,
  NOW() - pg_last_xact_replay_timestamp() AS time_since_replay;

-- Alert if lag exceeds threshold
-- (typically run as a monitoring check)
SELECT CASE
    WHEN pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) > 1024*1024*100
      THEN 'CRITICAL: Replica lag > 100MB'
    WHEN pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) > 1024*1024*10
      THEN 'WARNING: Replica lag > 10MB'
    ELSE 'OK'
  END AS status
FROM pg_stat_replication;
```

Strategies for handling lag:
A user creates a post, then immediately views their profile. If the write goes to primary but the read goes to a lagged replica, the post is missing! This breaks user trust. Always ensure read-your-writes consistency for user-facing operations—route the user's subsequent reads to primary for a few seconds after writes, or use causal consistency tokens.
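A causal token can be as simple as a WAL position (LSN) captured right after the write. The sketch below compares LSNs to decide where a read may go; the helper names are invented for this example, and the LSN strings would come from queries like SELECT pg_current_wal_lsn() on the primary and SELECT pg_last_wal_replay_lsn() on the replica.

```typescript
// A PostgreSQL LSN like '16/B374D848' is two hex numbers: high/low 32 bits.
// Convert to a single integer so positions can be compared.
function lsnToBigInt(lsn: string): bigint {
  const [hi, lo] = lsn.split('/');
  return (BigInt(`0x${hi}`) << 32n) | BigInt(`0x${lo}`);
}

// The replica is safe to read from only if it has replayed past the token
// (the WAL position captured on the primary right after the user's write).
function isCaughtUp(replicaReplayLsn: string, token: string): boolean {
  return lsnToBigInt(replicaReplayLsn) >= lsnToBigInt(token);
}

// Routing decision: read from the replica if caught up, else the primary.
function routeRead(replicaReplayLsn: string, token: string): 'replica' | 'primary' {
  return isCaughtUp(replicaReplayLsn, token) ? 'replica' : 'primary';
}
```

Compared to time-based stickiness, an LSN token is exact: the read falls back to the primary only while the replica genuinely has not replayed that specific write yet.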
Effectively routing reads to replicas requires thoughtful design. Here are production-proven patterns.
The application explicitly decides write vs read destinations. Most flexible but requires careful implementation.
```typescript
// Application-level read/write split
import { Pool } from 'pg';

// Separate connection pools
const primaryPool = new Pool({
  host: 'db-primary.internal',
  database: 'myapp',
  max: 10,
});

const replicaPool = new Pool({
  host: 'db-replica.internal',
  database: 'myapp',
  max: 30, // More replica connections for read load
});

type Intent = 'read' | 'write' | 'read-after-write';

// Minimal profile shape assumed for this example
interface User {
  id: string;
  name: string;
  bio: string;
}

// Track recent writes per user for read-your-writes consistency
const recentWrites = new Map<string, number>();
const WRITE_STICK_DURATION = 5000; // 5 seconds

function getPool(intent: Intent, userId?: string): Pool {
  if (intent === 'write') {
    if (userId) {
      recentWrites.set(userId, Date.now());
    }
    return primaryPool;
  }

  // Check if user recently wrote - ensure read-your-writes
  if (userId) {
    const lastWrite = recentWrites.get(userId);
    if (lastWrite && Date.now() - lastWrite < WRITE_STICK_DURATION) {
      console.log(`Routing ${userId} to primary (recent write)`);
      return primaryPool;
    }
  }

  return replicaPool;
}

// Usage in a repository
class UserRepository {
  async findById(userId: string, requestingUserId?: string) {
    const pool = getPool('read', requestingUserId);
    const result = await pool.query(
      'SELECT * FROM users WHERE id = $1',
      [userId]
    );
    return result.rows[0];
  }

  async updateProfile(userId: string, data: Partial<User>) {
    const pool = getPool('write', userId);
    await pool.query(
      'UPDATE users SET name = $1, bio = $2 WHERE id = $3',
      [data.name, data.bio, userId]
    );
    // Next reads by this user go to primary for 5 seconds
  }
}
```

When the primary fails, a replica must be promoted to become the new primary. This is one of the most critical operations in database administration.
Failover approaches:
| Approach | Description | Recovery Time | Data Loss Risk |
|---|---|---|---|
| Manual failover | DBA manually promotes replica | Minutes to hours | Depends on lag at failure |
| Automated failover | Tools detect failure, auto-promote | Seconds to minutes | Potential for split-brain |
| Managed services | AWS RDS, Cloud SQL handle it | 1-2 minutes typical | Provider's SLA |
PostgreSQL manual promotion:
```bash
# PostgreSQL manual failover procedure

# 1. Verify primary is truly down (avoid split-brain!)
pg_isready -h db-primary.internal -p 5432
# Should show: "could not connect to server"

# 2. Ensure replica has all available data
# On replica: check if caught up
psql -h db-replica1.internal -c "
  SELECT pg_last_wal_receive_lsn(),
         pg_last_wal_replay_lsn(),
         pg_last_xact_replay_timestamp();"

# 3. Stop accepting connections to old primary (if accessible)
# This prevents split-brain if the primary comes back
# On old primary (if accessible):
pg_ctl stop -D /var/lib/postgresql/data -m fast

# 4. Promote replica to primary
# On replica:
pg_ctl promote -D /var/lib/postgresql/data
# Or: SELECT pg_promote();  (PostgreSQL 12+)
# Promotion removes standby.signal and the server exits recovery

# 5. Verify promotion succeeded
psql -h db-replica1.internal -c "SELECT pg_is_in_recovery();"
# Should return 'f' (false = not in recovery = is primary)

# 6. Update application configuration to point to new primary
# (Or update DNS/load balancer to point to new primary)

# 7. Reconfigure other replicas to follow new primary
# On other replicas, update primary_conninfo in postgresql.conf:
# primary_conninfo = 'host=db-replica1.internal port=5432 ...'
# Then restart: pg_ctl restart

# 8. (Later) Add old primary back as replica
# After old primary is recovered, reinitialize it as replica:
pg_basebackup -h new-primary.internal -D /var/lib/postgresql/data -U replicator -P
```

Automated failover tools:
| Tool | Database | Features |
|---|---|---|
| Patroni | PostgreSQL | Distributed consensus (etcd/Consul), auto-failover, REST API |
| repmgr | PostgreSQL | Simpler setup, manual or automatic failover |
| Orchestrator | MySQL | Topology discovery, automated failover, web UI |
| MHA | MySQL | Master High Availability, mature but dated |
| ProxySQL | MySQL | Failover detection with health checks |
Split-brain occurs when two nodes both believe they are the primary and accept writes. This causes data divergence that's extremely difficult to reconcile. Prevention: use STONITH (Shoot The Other Node In The Head) via fencing to ensure old primary can't accept writes; require quorum before promotion; use managed services that handle this correctly.
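The quorum rule is simple to state precisely: a node may initiate promotion only when a strict majority of the cluster agrees the primary is down. A minimal sketch (hypothetical helper; real tools like Patroni delegate this to etcd or Consul):

```typescript
// A strict majority is required so that at most one partition can ever
// promote: two disjoint groups cannot both exceed half the cluster.
function hasQuorum(votesForFailover: number, clusterSize: number): boolean {
  return votesForFailover > Math.floor(clusterSize / 2);
}

hasQuorum(2, 3); // 2 of 3 nodes agree: majority, promotion may proceed
hasQuorum(1, 2); // a lone node in a 2-node cluster is not a majority
hasQuorum(2, 4); // a 2-2 tie is not a majority either
```

This is also why production clusters use an odd number of voters: a 3-node cluster tolerates one failure, while a 2-node cluster can never form a majority after losing a node.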
What's next:
While read replicas scale read throughput, query caching eliminates query execution entirely for repeated requests. The next page explores query caching—from application-level caching with Redis to database query caches and result caching patterns.
You now understand how read replicas work, the spectrum of replication consistency, how to handle replication lag, patterns for routing reads, and the critical process of failover. Next, we'll cover query caching to further reduce database load.