Your primary database has failed. Whether from hardware failure, cascading software bugs, network partition, or human error, the single source of truth for your data is suddenly unavailable. Writes are failing. Depending on your architecture, reads may be failing too. Every second of downtime costs money, reputation, and user trust.
This is the moment replica promotion is designed for. Among your read replicas—previously read-only copies—one must be selected, prepared, and promoted to become the new primary. This promoted replica will accept writes, become the new source of truth, and the remaining replicas will reconfigure to follow it.
Replica promotion is perhaps the most critical operation in database administration. Done correctly, it restores service quickly with minimal data loss. Done incorrectly, it can cause data corruption, split-brain scenarios, or extended outages. This page provides the deep understanding needed to design, implement, and operate failover systems with confidence.
By the end of this page, you will understand failover detection mechanisms, the mechanics of replica promotion across different databases, how to prevent data loss during promotion, strategies for automated versus manual failover, and patterns for application-level failover handling.
Before promotion can occur, the system must detect that failover is necessary. This detection must be accurate (avoid false positives that cause unnecessary failovers) and fast (minimize downtime). Balancing these requirements is challenging.
Types of failures to detect:
| Failure Type | Symptoms | Detection Method | Detection Challenge |
|---|---|---|---|
| Hardware crash | Complete unresponsiveness, connection failures | Connection timeout, ping failure | Fast and reliable to detect |
| Process crash | Connection refused, database not running | Connection failure, process monitoring | Quick detection via health checks |
| Disk failure | I/O errors, corrupted responses | Health check queries failing | May partially respond initially |
| Network partition | Some nodes reachable, others not | Asymmetric failure detection | False positives if detector is isolated |
| Performance degradation | Responses slow, queries timeout | Latency thresholds, timeout rates | Distinguishing transient from permanent |
| Replication break | Replicas falling behind, replication errors | Replication monitoring | May not require promotion |
Detection mechanisms:
- Health checks send periodic probes to the primary.
- Consensus-based detection uses multiple monitors that must agree on a failure before acting.
- Replicas as witnesses lets the replicas themselves report on primary health.
```typescript
// Failover detection with consensus

interface HealthCheckResult {
  monitor: string;
  primaryReachable: boolean;
  querySucceeded: boolean;
  latencyMs: number;
  timestamp: Date;
}

interface FailoverDecision {
  shouldFailover: boolean;
  reason: string;
  votes: { monitor: string; vote: boolean }[];
}

class FailoverDetector {
  private monitors: string[];
  private healthCheckIntervalMs: number;
  private failureThreshold: number;  // Consecutive failures before vote
  private quorumRequirement: number; // Minimum monitors that must agree
  private failureCounts: Map<string, number> = new Map();

  constructor(config: {
    monitors: string[];
    healthCheckIntervalMs: number;
    failureThreshold: number;
  }) {
    this.monitors = config.monitors;
    this.healthCheckIntervalMs = config.healthCheckIntervalMs;
    this.failureThreshold = config.failureThreshold;
    this.quorumRequirement = Math.floor(this.monitors.length / 2) + 1;
  }

  async checkHealth(monitorId: string, primary: DatabaseConnection): Promise<HealthCheckResult> {
    const startTime = Date.now();

    try {
      // Test 1: Can we connect?
      await primary.connect();

      // Test 2: Can we run a query?
      await primary.query('SELECT 1');

      // Test 3: Can we write? (use heartbeat table)
      await primary.query(`
        INSERT INTO _healthcheck (monitor_id, timestamp)
        VALUES ($1, NOW())
        ON CONFLICT (monitor_id) DO UPDATE SET timestamp = NOW()
      `, [monitorId]);

      this.failureCounts.set(monitorId, 0);

      return {
        monitor: monitorId,
        primaryReachable: true,
        querySucceeded: true,
        latencyMs: Date.now() - startTime,
        timestamp: new Date(),
      };
    } catch (error) {
      const failures = (this.failureCounts.get(monitorId) ?? 0) + 1;
      this.failureCounts.set(monitorId, failures);

      return {
        monitor: monitorId,
        primaryReachable: false,
        querySucceeded: false,
        latencyMs: Date.now() - startTime,
        timestamp: new Date(),
      };
    }
  }

  collectFailoverVotes(): FailoverDecision {
    const votes: { monitor: string; vote: boolean }[] = [];

    for (const monitor of this.monitors) {
      const failures = this.failureCounts.get(monitor) ?? 0;
      const vote = failures >= this.failureThreshold;
      votes.push({ monitor, vote });
    }

    const yesVotes = votes.filter(v => v.vote).length;
    const shouldFailover = yesVotes >= this.quorumRequirement;

    return {
      shouldFailover,
      reason: shouldFailover
        ? `Quorum reached: ${yesVotes}/${this.monitors.length} monitors detect failure`
        : `Quorum not reached: ${yesVotes}/${this.monitors.length} (need ${this.quorumRequirement})`,
      votes,
    };
  }
}
```

A major risk is 'split-brain': a network partition causes monitors to incorrectly believe the primary is down and promote a replica, while the original primary is still running and accepting writes. Now two nodes believe they are primary, leading to divergent data. Prevention requires: 1) Fencing the old primary (STONITH—Shoot The Other Node In The Head), 2) Quorum-based decisions, 3) VIP/DNS failover that prevents writes to the old primary.
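To make the fencing requirement concrete, here is a minimal sketch of a guard that refuses to promote until the old primary is verifiably fenced. The `fence` and `promote` callbacks are assumptions standing in for your infrastructure's actual mechanisms (firewall rules, a cloud power-off API, a STONITH device):

```typescript
// Sketch: never promote until the old primary is verifiably fenced.
// `fence` and `promote` are assumed callbacks into your infrastructure;
// they are not part of any specific library.

type FenceResult = { fenced: boolean; method: string };

async function fenceThenPromote(
  fence: () => Promise<FenceResult>,
  promote: () => Promise<void>
): Promise<string> {
  const result = await fence();
  if (!result.fenced) {
    // Refusing to promote is safer than risking split-brain:
    // two writable primaries diverge and cannot be trivially merged.
    throw new Error(`Fencing failed (${result.method}); promotion aborted`);
  }
  await promote();
  return `Promoted after fencing via ${result.method}`;
}
```

The key property is ordering: promotion only runs after fencing is confirmed, so at no point can two writable primaries coexist.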
Once failover is decided, the selected replica must be promoted. The promotion process differs by database but follows a general pattern:
PostgreSQL promotion process:
PostgreSQL provides several methods for promoting a standby to primary:
```bash
#!/bin/bash
# PostgreSQL replica promotion

# Method 1: pg_ctl promote (requires shell access)
# This is the traditional method
pg_ctl promote -D /var/lib/postgresql/data

# Method 2: promote trigger file (configured in recovery.conf/postgresql.conf)
# Configured via: promote_trigger_file = '/tmp/promote_trigger'
touch /tmp/promote_trigger

# Method 3: pg_promote() function (PostgreSQL 12+, requires superuser)
# Can be called via SQL
psql -c "SELECT pg_promote();"

# Verification: Check if no longer in recovery
psql -c "SELECT pg_is_in_recovery();"
# Should return 'f' (false) after promotion

# After promotion, update pg_hba.conf to allow replication connections
# and configure remaining replicas to follow new primary

# On other replicas, update primary_conninfo:
# primary_conninfo = 'host=new-primary.internal user=replicator ...'
# Then restart PostgreSQL or signal with pg_ctl reload
```

PostgreSQL uses 'timelines' to distinguish between different histories after promotion. When a standby is promoted, it starts a new timeline. Other standbys must be reconfigured to follow this new timeline. recovery_target_timeline = 'latest' in standby configuration helps automate this.
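Promotion does not complete instantly, so orchestration code typically polls `SELECT pg_is_in_recovery()` until it returns false. A minimal sketch, with the query wrapped in an assumed `queryInRecovery` callback so it is independent of any particular client library:

```typescript
// Sketch: poll until the promoted node reports it has left recovery.
// `queryInRecovery` is an assumed callback that runs
// `SELECT pg_is_in_recovery()` and returns the boolean result.

async function waitForPromotion(
  queryInRecovery: () => Promise<boolean>,
  timeoutMs: number,
  pollIntervalMs: number
): Promise<number> {
  const deadline = Date.now() + timeoutMs;
  let polls = 0;
  while (Date.now() < deadline) {
    polls++;
    if (!(await queryInRecovery())) {
      return polls; // Node has left recovery: it is now a primary
    }
    await new Promise(r => setTimeout(r, pollIntervalMs));
  }
  throw new Error(`Node still in recovery after ${timeoutMs}ms`);
}
```

A timeout here matters: if promotion never completes, the orchestrator should fail loudly rather than route writes to a node that is still read-only.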
When multiple replicas exist, choosing which to promote is critical. The wrong choice can result in unnecessary data loss or a new primary that's undersized for the role.
Selection criteria:

- Replication position: the replica with the most WAL applied loses the least data.
- Synchronous status: a synchronously replicated standby guarantees no committed data loss.
- Health: the candidate must itself be healthy enough to serve as primary.
- Capacity: CPU and memory should be adequate for the primary's write load.
- Location: a replica in the preferred region avoids cross-region latency for writers.
```typescript
// Replica selection logic for promotion

interface ReplicaCandidate {
  id: string;
  host: string;
  walPosition: bigint;    // PostgreSQL LSN or MySQL GTID position
  isSynchronous: boolean; // Was it synchronously replicated?
  lagMs: number;          // Replication lag before failure
  cpuCores: number;
  memoryGb: number;
  region: string;
  lastHealthCheck: Date;
  healthScore: number;    // 0-100
}

interface SelectionResult {
  selectedReplica: ReplicaCandidate;
  reason: string;
  alternativeCandidates: ReplicaCandidate[];
  estimatedDataLoss: string;
}

class PromotionCandidateSelector {
  private preferredRegion: string;
  private minimumHealthScore: number = 80;

  constructor(preferredRegion: string) {
    this.preferredRegion = preferredRegion;
  }

  select(candidates: ReplicaCandidate[], primaryLastPosition: bigint): SelectionResult {
    if (candidates.length === 0) {
      throw new Error('No replica candidates available for promotion');
    }

    // Filter unhealthy candidates
    const healthy = candidates.filter(c => c.healthScore >= this.minimumHealthScore);

    if (healthy.length === 0) {
      console.warn('No healthy candidates; using best available');
      return this.selectBest(candidates, primaryLastPosition);
    }

    return this.selectBest(healthy, primaryLastPosition);
  }

  private selectBest(candidates: ReplicaCandidate[], primaryLastPosition: bigint): SelectionResult {
    // Score each candidate
    const scored = candidates.map(c => ({
      candidate: c,
      score: this.computeScore(c, primaryLastPosition),
    }));

    // Sort by score descending
    scored.sort((a, b) => b.score - a.score);

    const selected = scored[0].candidate;
    const dataLoss = primaryLastPosition - selected.walPosition;

    return {
      selectedReplica: selected,
      reason: this.explainSelection(selected, scored[0].score),
      alternativeCandidates: scored.slice(1).map(s => s.candidate),
      estimatedDataLoss: this.formatDataLoss(dataLoss),
    };
  }

  private computeScore(candidate: ReplicaCandidate, primaryLastPosition: bigint): number {
    let score = 0;

    // Replication position (most important) - up to 50 points
    const positionRatio = Number(candidate.walPosition) / Number(primaryLastPosition);
    score += Math.min(50, positionRatio * 50);

    // Synchronous bonus - 20 points
    if (candidate.isSynchronous) {
      score += 20;
    }

    // Health score - up to 15 points
    score += (candidate.healthScore / 100) * 15;

    // Capacity (normalized) - up to 10 points
    const capacityScore = Math.min(10, (candidate.cpuCores * candidate.memoryGb) / 100);
    score += capacityScore;

    // Region preference - 5 points
    if (candidate.region === this.preferredRegion) {
      score += 5;
    }

    return score;
  }

  private explainSelection(candidate: ReplicaCandidate, score: number): string {
    const reasons: string[] = [];
    if (candidate.isSynchronous) {
      reasons.push('synchronously replicated (no data loss)');
    }
    reasons.push(`replication position ${candidate.walPosition}`);
    reasons.push(`health score ${candidate.healthScore}`);
    if (candidate.region === this.preferredRegion) {
      reasons.push('preferred region');
    }
    return `Selected ${candidate.id}: ${reasons.join(', ')}. Total score: ${score.toFixed(1)}`;
  }

  private formatDataLoss(bytesLoss: bigint): string {
    if (bytesLoss <= 0n) return 'None (replica fully synchronized)';
    if (bytesLoss < 1024n) return `~${bytesLoss} bytes`;
    if (bytesLoss < 1024n * 1024n) return `~${Number(bytesLoss / 1024n)} KB`;
    return `~${Number(bytesLoss / 1024n / 1024n)} MB`;
  }
}
```

With asynchronous replication, some committed transactions on the primary may not have reached any replica before failure. This data is lost. The only prevention is synchronous replication (which impacts write latency) or accepting the business risk of potential loss.
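Estimating data loss in bytes means subtracting WAL positions. PostgreSQL reports LSNs as strings like `16/B374D848`: two hexadecimal halves, the high and low 32 bits of a 64-bit position. A small sketch of the conversion needed before the kind of lag arithmetic used in `estimatedDataLoss`:

```typescript
// Sketch: convert a PostgreSQL LSN string ("high/low" in hex) to a
// 64-bit byte position, so two positions can be subtracted.

function lsnToBytes(lsn: string): bigint {
  const parts = lsn.split('/');
  if (parts.length !== 2) {
    throw new Error(`Malformed LSN: ${lsn}`);
  }
  const high = BigInt(`0x${parts[0]}`);
  const low = BigInt(`0x${parts[1]}`);
  return (high << 32n) + low;
}

// Positive result: the replica is behind by that many WAL bytes.
function lsnLagBytes(primaryLsn: string, replicaLsn: string): bigint {
  return lsnToBytes(primaryLsn) - lsnToBytes(replicaLsn);
}
```

Note that WAL byte lag is an upper bound on useful data lost; the missing bytes include internal records as well as row changes.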
Should failover happen automatically or require human decision? Both approaches have merits, and many organizations use a hybrid model.
Hybrid approaches:

- Automatic detection, manual confirmation: monitors detect the failure and prepare the candidate, but an operator must approve before promotion executes.
- Automatic for known patterns, manual for unknowns: unambiguous failures (for example, a hard crash confirmed by quorum) fail over automatically, while ambiguous signals page an operator instead.
- Automatic with escape hatch: failover proceeds automatically after a countdown unless an operator cancels it in time.
```typescript
// Hybrid failover controller with approval workflow

type FailoverApprovalStatus = 'pending' | 'approved' | 'rejected' | 'timeout';

interface FailoverRequest {
  id: string;
  detectedAt: Date;
  reason: string;
  candidateReplica: string;
  status: FailoverApprovalStatus;
  approvedBy?: string;
  executedAt?: Date;
}

class HybridFailoverController {
  private readonly autoApprovalTimeoutMs: number;
  private readonly requiresApproval: boolean;
  private pendingFailover: FailoverRequest | null = null;

  constructor(config: {
    autoApprovalTimeoutMs: number; // 0 = never auto-approve
    requiresApproval: boolean;
  }) {
    this.autoApprovalTimeoutMs = config.autoApprovalTimeoutMs;
    this.requiresApproval = config.requiresApproval;
  }

  async initiateFailover(reason: string, candidate: string): Promise<FailoverRequest> {
    const request: FailoverRequest = {
      id: crypto.randomUUID(),
      detectedAt: new Date(),
      reason,
      candidateReplica: candidate,
      status: 'pending',
    };

    this.pendingFailover = request;

    // Send alerts
    await this.sendAlerts(request);

    if (!this.requiresApproval) {
      // Immediate automatic failover
      return this.executeFailover(request);
    }

    if (this.autoApprovalTimeoutMs > 0) {
      // Start timeout for auto-approval
      this.startAutoApprovalTimer(request);
    }

    return request;
  }

  async approveFailover(requestId: string, approver: string): Promise<FailoverRequest> {
    if (!this.pendingFailover || this.pendingFailover.id !== requestId) {
      throw new Error('No matching pending failover request');
    }
    this.pendingFailover.status = 'approved';
    this.pendingFailover.approvedBy = approver;
    return this.executeFailover(this.pendingFailover);
  }

  async rejectFailover(requestId: string, rejector: string): Promise<void> {
    if (!this.pendingFailover || this.pendingFailover.id !== requestId) {
      throw new Error('No matching pending failover request');
    }
    this.pendingFailover.status = 'rejected';
    console.log(`Failover rejected by ${rejector}`);
    await this.sendNotification(`Failover REJECTED by ${rejector}. Manual intervention required.`);
    this.pendingFailover = null;
  }

  private startAutoApprovalTimer(request: FailoverRequest): void {
    setTimeout(async () => {
      if (this.pendingFailover?.id === request.id && request.status === 'pending') {
        console.warn(`Auto-approving failover after ${this.autoApprovalTimeoutMs}ms timeout`);
        request.status = 'timeout';
        await this.executeFailover(request);
      }
    }, this.autoApprovalTimeoutMs);
  }

  private async executeFailover(request: FailoverRequest): Promise<FailoverRequest> {
    console.log(`Executing failover to ${request.candidateReplica}`);
    try {
      // 1. Fence old primary (prevent split-brain)
      await this.fenceOldPrimary();

      // 2. Promote candidate
      await this.promoteReplica(request.candidateReplica);

      // 3. Reconfigure other replicas
      await this.reconfigureReplicas(request.candidateReplica);

      // 4. Update routing
      await this.updateRouting(request.candidateReplica);

      request.executedAt = new Date();
      request.status = request.status === 'pending' ? 'approved' : request.status;

      await this.sendNotification(`Failover COMPLETE to ${request.candidateReplica}`);
    } catch (error) {
      await this.sendNotification(`Failover FAILED: ${error}`);
      throw error;
    }

    this.pendingFailover = null;
    return request;
  }

  private async fenceOldPrimary(): Promise<void> {
    // Implementations vary: revoke network access, kill VM, etc.
  }

  private async promoteReplica(replicaId: string): Promise<void> {
    // Database-specific promotion
  }

  private async reconfigureReplicas(newPrimaryId: string): Promise<void> {
    // Point remaining replicas to new primary
  }

  private async updateRouting(newPrimaryId: string): Promise<void> {
    // Update DNS, VIP, proxy config, etc.
  }

  private async sendAlerts(request: FailoverRequest): Promise<void> {
    // PagerDuty, Slack, email, etc.
  }

  private async sendNotification(message: string): Promise<void> {
    // Notification channels
  }
}
```

Database failover doesn't happen in isolation—applications must handle the transition gracefully. Connection pools hold stale connections. In-flight queries need retrying. Routing must update to the new primary.
```typescript
// Application-level database failover handling

interface DatabaseConfig {
  primaryEndpoint: string;
  replicaEndpoints: string[];
  dnsRefreshMs: number;
}

class FailoverAwareConnection {
  private primaryPool: ConnectionPool;
  private replicaPools: ConnectionPool[];
  private config: DatabaseConfig;
  private lastKnownPrimary: string;

  constructor(config: DatabaseConfig) {
    this.config = config;
    this.startDnsRefresh();
  }

  // Execute with automatic retry on failover
  async executeWithRetry<T>(
    query: string,
    params: unknown[],
    options: {
      isWrite: boolean;
      maxRetries: number;
      baseDelayMs: number;
    }
  ): Promise<T> {
    let lastError: Error | null = null;

    for (let attempt = 0; attempt < options.maxRetries; attempt++) {
      try {
        const pool = options.isWrite ? this.primaryPool : this.selectReplicaPool();
        return await pool.query(query, params);
      } catch (error) {
        lastError = error as Error;

        if (this.isRetryableError(error)) {
          const delay = options.baseDelayMs * Math.pow(2, attempt);
          console.warn(`Query failed (attempt ${attempt + 1}), retrying in ${delay}ms`);
          await this.sleep(delay);

          // Force connection pool refresh
          await this.refreshConnectionPools();
          continue;
        }

        throw error; // Non-retryable error
      }
    }

    throw new Error(`Query failed after ${options.maxRetries} retries: ${lastError?.message}`);
  }

  private isRetryableError(error: unknown): boolean {
    const message = (error as Error).message?.toLowerCase() ?? '';

    // Connection-related errors are retryable during failover
    const retryablePatterns = [
      'connection refused',
      'connection reset',
      'connection terminated',
      'cannot connect',
      'server closed',
      'read only', // Trying to write to replica
      'not primary',
      'timeout',
    ];

    return retryablePatterns.some(p => message.includes(p));
  }

  private startDnsRefresh(): void {
    setInterval(async () => {
      try {
        const endpoints = await this.resolveDnsEndpoints();
        if (endpoints.primary !== this.lastKnownPrimary) {
          console.log(`Primary endpoint changed: ${this.lastKnownPrimary} -> ${endpoints.primary}`);
          await this.refreshConnectionPools();
          this.lastKnownPrimary = endpoints.primary;
        }
      } catch (error) {
        console.error('DNS refresh failed', error);
      }
    }, this.config.dnsRefreshMs);
  }

  private async resolveDnsEndpoints(): Promise<{ primary: string; replicas: string[] }> {
    // Resolve DNS to get current endpoints
    // Implementation depends on DNS setup
    return { primary: '', replicas: [] };
  }

  private async refreshConnectionPools(): Promise<void> {
    // Close existing connections and create new pools
    await this.primaryPool?.end();
    for (const pool of this.replicaPools) {
      await pool?.end();
    }

    // Recreate with current endpoints
    this.primaryPool = new ConnectionPool(await this.getCurrentPrimaryEndpoint());
    this.replicaPools = await this.createReplicaPools();
  }

  private async getCurrentPrimaryEndpoint(): Promise<string> {
    // Prefer the freshly resolved endpoint; fall back to configuration
    const endpoints = await this.resolveDnsEndpoints();
    return endpoints.primary || this.config.primaryEndpoint;
  }

  private async createReplicaPools(): Promise<ConnectionPool[]> {
    return this.config.replicaEndpoints.map(e => new ConnectionPool(e));
  }

  private selectReplicaPool(): ConnectionPool {
    // Round-robin or other selection
    return this.replicaPools[Math.floor(Math.random() * this.replicaPools.length)];
  }

  private sleep(ms: number): Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}
```

Low DNS TTLs (30-60 seconds) enable faster failover but increase DNS query load. High TTLs (minutes to hours) reduce DNS load but delay failover visibility. Many organizations use low TTLs for database endpoints specifically. Also ensure application DNS caching respects TTLs—some runtimes cache indefinitely by default.
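The caching caveat can be addressed in application code with a lookup cache that never serves an entry past its advertised TTL. A sketch with the resolver and clock as injected assumptions, so it is independent of any particular DNS library:

```typescript
// Sketch: a lookup cache that never serves an entry past its TTL.
// `resolve` and `now` are assumed injection points (for example,
// a wrapper around your DNS library and Date.now).

interface CachedEntry { value: string; expiresAt: number }

class TtlDnsCache {
  private entries = new Map<string, CachedEntry>();

  constructor(
    private resolve: (host: string) => Promise<{ value: string; ttlSeconds: number }>,
    private now: () => number = Date.now
  ) {}

  async lookup(host: string): Promise<string> {
    const cached = this.entries.get(host);
    if (cached && cached.expiresAt > this.now()) {
      return cached.value; // Still fresh: no DNS query issued
    }
    const fresh = await this.resolve(host);
    this.entries.set(host, {
      value: fresh.value,
      expiresAt: this.now() + fresh.ttlSeconds * 1000,
    });
    return fresh.value;
  }
}
```

Injecting the clock also makes TTL behavior testable without waiting in real time.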
Promotion is not the end—it's the beginning of recovery. Several operations must follow to restore full redundancy and prepare for future failures.
Rebuilding the old primary:
The failed primary should not be simply restarted and rejoined. Its data may be divergent (transactions committed after the promotion candidate's last position). Common approaches:
- Complete rebuild: provision a new replica using pg_basebackup (PostgreSQL) or a fresh snapshot from the new primary. Slowest, but the safest approach.
- Rewind (PostgreSQL): pg_rewind can rewind a divergent data directory back to the point where the timelines forked, then replay WAL from the new primary. Faster than a rebuild, but it requires wal_log_hints = on or data checksums to have been enabled, and a consistently shut-down cluster.
- GTID-based rejoin (MySQL): with GTIDs, the old primary can potentially resync by discarding local-only transactions and replaying from the new primary. Requires careful verification.
If the old primary accepted writes after the promotion (split-brain scenario), blindly rejoining could introduce duplicate or conflicting data. Always verify data consistency and use rebuild/rewind techniques rather than simple restart.
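The choice between rebuild and rewind can be encoded as a simple guard. This is a sketch under assumed inputs: whether split-brain writes need auditing before they would be discarded, and whether pg_rewind's preconditions (a consistent data directory plus wal_log_hints or data checksums) hold:

```typescript
// Sketch: choose a rejoin strategy for a failed primary.
// The state fields are assumptions gathered by operators or tooling,
// not values any database reports directly.

interface OldPrimaryState {
  mustAuditDivergentWrites: boolean; // split-brain writes need review first
  cleanlyShutDown: boolean;          // pg_rewind wants a consistent data directory
  walLogHintsOrChecksums: boolean;   // pg_rewind precondition
}

function chooseRejoinStrategy(state: OldPrimaryState): 'rewind' | 'rebuild' {
  // pg_rewind discards diverged transactions; if they must be reviewed,
  // dump them first and rebuild instead.
  if (state.mustAuditDivergentWrites) return 'rebuild';
  if (state.cleanlyShutDown && state.walLogHintsOrChecksums) return 'rewind';
  return 'rebuild';
}
```

Either path ends the same way: the old primary rejoins strictly as a replica of the new primary, never as a second writer.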
Replica promotion is the critical operation that converts a read replica into a primary, restoring write capability after primary failure. Done correctly, it minimizes downtime and data loss.
Module complete:
This completes the Read Replicas module. You've learned how to offload read traffic, handle replication lag, balance loads across replicas, maintain consistency, and orchestrate failover through replica promotion. These patterns are foundational for building scalable, highly-available SQL database architectures.
You now have comprehensive knowledge of read replica architectures—from basic traffic offloading through sophisticated failover automation. This knowledge enables you to design, implement, and operate database systems that scale to handle substantial read loads while maintaining high availability through effective redundancy and failover strategies.