Data Sync Patterns - Learning Module

Async Replication Patterns

When Consistency Can Wait

Synchronous protocols like 2PC and consensus algorithms provide strong consistency, but at a cost: write latency grows with network distance and is bounded by the slowest required participant, and a write cannot commit while the required participants are unreachable. For many applications, such as social media feeds, shopping carts, and user preferences, eventual consistency with high availability is the better trade-off.

Asynchronous replication decouples the write acknowledgment from replication. The primary accepts writes immediately and replicates to followers in the background. This enables:

•Low write latency, bounded by the local commit (no waiting for replicas)
•Continued write availability when replicas are slow or unreachable
•Geographic distribution without cross-region latency on the write path

But it introduces replication lag and the possibility of data loss if the primary fails before replication completes. Understanding these trade-offs is essential for building systems that balance performance, availability, and durability.
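
To make the data-loss window concrete, here is a minimal sketch. It assumes every write gets a monotonically increasing sequence number and that each replica reports the last sequence number it has applied. The function `writesAtRisk` and its parameters are hypothetical names chosen for illustration, not part of any real database API.

```typescript
// Hypothetical helper: how many acknowledged writes would be lost if the
// primary failed right now and the most up-to-date replica were promoted.
function writesAtRisk(primarySeq: number, replicaSeqs: number[]): number {
    if (replicaSeqs.length === 0) {
        // No replicas: every write exists only on the primary
        return primarySeq;
    }
    const bestReplica = Math.max(...replicaSeqs);
    // Writes the primary has acknowledged but no replica has applied yet
    return Math.max(0, primarySeq - bestReplica);
}

// The primary has acknowledged 1000 writes.
const lagging = writesAtRisk(1000, [998, 990]);   // 2 writes exist only on the primary
const caughtUp = writesAtRisk(1000, [1000, 990]); // 0: one replica has everything
```

The number is only a snapshot: it changes with every write and every replication round, which is why lag is usually monitored continuously rather than checked once.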

What You Will Learn

By the end of this page, you will understand primary-replica replication, semi-synchronous replication, multi-primary replication, conflict resolution strategies, and how to choose the right replication strategy for your system's requirements.

Primary-Replica (Single-Leader) Replication

The most common replication pattern: one node is the primary (leader), accepting all writes. Changes flow asynchronously to replicas (followers), which serve read requests.

┌─────────────────────────────────────────────────────────────┐
│                   PRIMARY-REPLICA REPLICATION               │
│                                                              │
│   ┌─────────────┐                                           │
│   │   PRIMARY   │◄───── All writes go here                  │
│   │   (Leader)  │                                           │
│   └──────┬──────┘                                           │
│          │                                                   │
│          │ Async replication (binlog/WAL streaming)         │
│          │                                                   │
│   ┌──────┴──────┬──────────────┬──────────────┐            │
│   ▼             ▼              ▼              ▼            │
│ ┌───────┐  ┌───────┐     ┌───────┐     ┌───────┐          │
│ │Replica│  │Replica│     │Replica│     │Replica│          │
│ │   1   │  │   2   │     │   3   │     │   4   │          │
│ └───────┘  └───────┘     └───────┘     └───────┘          │
│                                                              │
│ ◄────────────── Read requests distributed here ────────────►│
└─────────────────────────────────────────────────────────────┘

primary-replica.ts
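// Primary-Replica Async Replication Implementation
// Storage and network transport are in-memory stubs, for illustration only.

type Operation = 'INSERT' | 'UPDATE' | 'DELETE';

interface WriteOperation {
    operation: Operation;
    table: string;
    data: Record<string, unknown>;
}

interface WriteResult {
    success: boolean;
    sequenceNumber: number;
}

interface ReplicationEvent extends WriteOperation {
    sequenceNumber: number;
    timestamp: Date;
}

class PrimaryNode {
    private sequenceNumber = 0;
    private writeAheadLog: ReplicationEvent[] = [];
    private replicas: ReplicaNode[] = [];
    private replicationQueue: ReplicationEvent[] = [];
    private draining = false;

    addReplica(replica: ReplicaNode): void {
        this.replicas.push(replica);
    }

    async write(operation: WriteOperation): Promise<WriteResult> {
        // 1. Assign the next sequence number and append to the WAL
        //    (durably, before applying or acknowledging)
        const event = this.createReplicationEvent(operation);
        await this.appendToWAL(event);

        // 2. Execute the write locally
        const result = await this.executeLocally(event);

        // 3. Queue for async replication; the queue drains in the background
        this.replicationQueue.push(event);
        this.triggerAsyncReplication();

        // 4. Acknowledge to the client immediately, without waiting for replicas
        return result;
    }

    private triggerAsyncReplication(): void {
        // Allow only one drain loop at a time so events leave the primary in order
        if (this.draining) return;
        this.draining = true;

        // Non-blocking: replicate in the background
        setImmediate(async () => {
            try {
                while (this.replicationQueue.length > 0) {
                    const event = this.replicationQueue.shift()!;

                    // Fan out to all replicas. allSettled means one failing replica
                    // does not block the others; a real system would retry or let
                    // the replica catch up from the WAL.
                    await Promise.allSettled(
                        this.replicas.map(replica =>
                            replica.receiveReplicationEvent(event)
                        )
                    );
                }
            } finally {
                this.draining = false;
            }
        });
    }

    // Lag per replica, measured in events not yet applied
    getReplicationLag(): Map<string, number> {
        const lags = new Map<string, number>();

        for (const replica of this.replicas) {
            const lastApplied = replica.getLastAppliedSequence();
            lags.set(replica.id, this.sequenceNumber - lastApplied);
        }

        return lags;
    }

    private createReplicationEvent(operation: WriteOperation): ReplicationEvent {
        return { ...operation, sequenceNumber: ++this.sequenceNumber, timestamp: new Date() };
    }

    private async appendToWAL(event: ReplicationEvent): Promise<void> {
        this.writeAheadLog.push(event); // Stub: a real WAL would fsync to disk
    }

    private async executeLocally(event: ReplicationEvent): Promise<WriteResult> {
        // Stub: apply the change to local storage
        return { success: true, sequenceNumber: event.sequenceNumber };
    }
}

class ReplicaNode {
    private lastAppliedSequence = 0;
    private replicationBuffer: ReplicationEvent[] = [];

    constructor(public readonly id: string) {}

    // Assumes calls are serialized per replica (the primary awaits each event)
    async receiveReplicationEvent(event: ReplicationEvent): Promise<void> {
        // Buffer events that arrive out of order
        this.replicationBuffer.push(event);
        this.replicationBuffer.sort((a, b) => a.sequenceNumber - b.sequenceNumber);

        // Apply events in order
        while (this.replicationBuffer.length > 0) {
            const next = this.replicationBuffer[0];

            if (next.sequenceNumber === this.lastAppliedSequence + 1) {
                await this.applyEvent(next);
                this.lastAppliedSequence = next.sequenceNumber;
                this.replicationBuffer.shift();
            } else if (next.sequenceNumber <= this.lastAppliedSequence) {
                // Duplicate - discard
                this.replicationBuffer.shift();
            } else {
                // Gap - wait for missing events
                break;
            }
        }
    }

    getLastAppliedSequence(): number {
        return this.lastAppliedSequence;
    }

    private async applyEvent(event: ReplicationEvent): Promise<void> {
        // Stub: apply the change to local storage
    }
}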

Async Replication Trade-offs

•Durability Risk — If the primary fails before a write has been replicated, that write is lost. A write is only as durable as the number of nodes that hold it; under pure async replication that may be the primary alone.
•Replication Lag — Replicas may serve stale data. Read-your-writes requires routing reads to primary or tracking lag.
•Failover Complexity — A newly elected primary may be missing writes that the old primary had already acknowledged, causing conflicts or data loss when the old primary rejoins.
•Performance Benefit — Write latency is local-only; replica count doesn't affect write performance.
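
One way to get read-your-writes despite replication lag is to have each client remember the sequence number of its latest write and to route its reads only to nodes that have applied at least that sequence. The sketch below assumes such sequence numbers exist and that replica progress is known to the router. `ReplicaStatus` and `chooseReadTarget` are hypothetical names, and returning the string `'primary'` is an illustrative convention.

```typescript
interface ReplicaStatus {
    id: string;
    lastAppliedSequence: number;
}

// Hypothetical router: pick a node guaranteed to contain the client's own latest write.
function chooseReadTarget(
    clientLastWriteSeq: number,
    replicas: ReplicaStatus[]
): string {
    // Any replica that has applied the client's latest write is safe to read from
    const fresh = replicas.filter(r => r.lastAppliedSequence >= clientLastWriteSeq);
    if (fresh.length === 0) {
        // No replica has caught up yet: fall back to the primary
        return 'primary';
    }
    // Simple choice: the most caught-up replica; a real router would also balance load
    return fresh.reduce((best, r) =>
        r.lastAppliedSequence > best.lastAppliedSequence ? r : best
    ).id;
}

const replicaStatuses: ReplicaStatus[] = [
    { id: 'replica-1', lastAppliedSequence: 41 },
    { id: 'replica-2', lastAppliedSequence: 45 },
];
const targetA = chooseReadTarget(43, replicaStatuses); // 'replica-2'
const targetB = chooseReadTarget(50, replicaStatuses); // 'primary'
```

Clients that have not written recently can read from any replica, so the primary only absorbs reads that actually need fresh data.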

Semi-Synchronous Replication

Pure async replication risks data loss; fully synchronous replication sacrifices availability. Semi-synchronous replication is a middle ground: wait for at least one replica to acknowledge before confirming to the client.

Configurations:

Mode	Wait For	Durability	Latency	Availability
Async	None	1 copy	Lowest	Highest
Semi-sync (1 replica)	1 replica	2 copies	Medium	High
Semi-sync (majority)	⌊N/2⌋ replicas (N = total nodes, including the primary)	⌊N/2⌋+1 copies	Higher	Medium
Fully sync	All replicas	All copies	Highest	Lowest
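
The table can be checked with a little arithmetic. The sketch below assumes N counts every node in the cluster, including the primary, and that the primary always holds one copy of the write. The names `Mode`, `replicaAcksRequired`, and `copiesAtAck` are hypothetical.

```typescript
type Mode = 'async' | 'semi-sync-one' | 'semi-sync-majority' | 'fully-sync';

// n = total nodes in the cluster, including the primary
function replicaAcksRequired(mode: Mode, n: number): number {
    switch (mode) {
        case 'async': return 0;
        case 'semi-sync-one': return Math.min(1, n - 1);
        case 'semi-sync-majority': return Math.floor(n / 2);
        case 'fully-sync': return n - 1;
    }
}

// The primary always holds one copy, so copies at acknowledgment = acks + 1
function copiesAtAck(mode: Mode, n: number): number {
    return replicaAcksRequired(mode, n) + 1;
}

// Worked example for a 5-node cluster (1 primary + 4 replicas):
//   async              -> 0 acks, 1 copy
//   semi-sync-one      -> 1 ack,  2 copies
//   semi-sync-majority -> 2 acks, 3 copies (a majority of 5)
//   fully-sync         -> 4 acks, 5 copies
const allModes: Mode[] = ['async', 'semi-sync-one', 'semi-sync-majority', 'fully-sync'];
const fiveNodeSummary = allModes.map(mode => ({
    mode,
    acks: replicaAcksRequired(mode, 5),
    copies: copiesAtAck(mode, 5),
}));
```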

semi-sync.ts
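// Semi-Synchronous Replication Implementation
// Storage and network transport are stubs, for illustration only.

type SyncMode = 'ASYNC' | 'SEMI_SYNC_ONE' | 'SEMI_SYNC_MAJORITY' | 'FULLY_SYNC';

// Minimal supporting types so the example is self-contained
interface WriteOperation { table: string; data: Record<string, unknown>; }
interface WriteResult { success: boolean; sequenceNumber: number; }
interface ReplicationEvent { sequenceNumber: number; operation: WriteOperation; }

interface SyncReplica {
    // Resolves once the replica has durably received the event
    receiveAndAck(event: ReplicationEvent): Promise<void>;
}

class ReplicationTimeoutError extends Error {}

class SemiSyncPrimary {
    private replicas: SyncReplica[] = [];
    private syncMode: SyncMode = 'SEMI_SYNC_ONE';
    private syncTimeout = 500; // ms
    private fallbackToAsync = true;
    private sequenceNumber = 0;

    async write(operation: WriteOperation): Promise<WriteResult> {
        // Execute locally and log. The event is returned directly so that
        // concurrent writes cannot pick up each other's events.
        const { result, event } = await this.executeAndLog(operation);

        // Determine how many replica acks we need
        const requiredAcks = this.calculateRequiredAcks();

        if (requiredAcks === 0) {
            // Async mode - replicate in the background
            this.replicateAsync(event);
            return result;
        }

        // Wait for required acknowledgments (with timeout)
        try {
            await this.waitForAcks(event, requiredAcks, this.syncTimeout);
        } catch (error) {
            // Timeout or too many replica failures - decide what to do
            if (this.shouldFallbackToAsync()) {
                console.warn('Semi-sync wait failed - falling back to async');
                // Write is durable on the primary only; proceed anyway
            } else {
                // The write is already applied locally, so the client must
                // treat this as "outcome unknown", not as "rolled back"
                throw new ReplicationTimeoutError('Failed to replicate');
            }
        }

        return result;
    }

    private waitForAcks(
        event: ReplicationEvent,
        required: number,
        timeoutMs: number
    ): Promise<void> {
        return new Promise((resolve, reject) => {
            if (required > this.replicas.length) {
                reject(new Error('Not enough replicas configured'));
                return;
            }

            let ackCount = 0;
            let failCount = 0;
            const timeout = setTimeout(() => {
                reject(new Error('Replication timeout'));
            }, timeoutMs);

            // Send to all replicas in parallel
            for (const replica of this.replicas) {
                replica.receiveAndAck(event).then(
                    () => {
                        if (++ackCount >= required) {
                            clearTimeout(timeout);
                            resolve();
                        }
                    },
                    () => {
                        // Replica failed - others might succeed, but fail fast
                        // once the required count is no longer reachable
                        if (this.replicas.length - ++failCount < required) {
                            clearTimeout(timeout);
                            reject(new Error('Too few replicas available'));
                        }
                    }
                );
            }
        });
    }

    private calculateRequiredAcks(): number {
        switch (this.syncMode) {
            case 'ASYNC': return 0;
            case 'SEMI_SYNC_ONE': return 1;
            case 'SEMI_SYNC_MAJORITY':
                // Majority of the whole cluster (replicas + primary). The primary
                // already holds one copy, so floor(N/2) acks give floor(N/2)+1 copies.
                return Math.floor((this.replicas.length + 1) / 2);
            case 'FULLY_SYNC': return this.replicas.length;
        }
    }

    private async executeAndLog(
        operation: WriteOperation
    ): Promise<{ result: WriteResult; event: ReplicationEvent }> {
        // Stub: apply locally and append to the WAL
        const sequenceNumber = ++this.sequenceNumber;
        return {
            result: { success: true, sequenceNumber },
            event: { sequenceNumber, operation },
        };
    }

    private replicateAsync(event: ReplicationEvent): void {
        // Fire and forget: failures are ignored here; a real system would retry
        for (const replica of this.replicas) {
            replica.receiveAndAck(event).catch(() => undefined);
        }
    }

    private shouldFallbackToAsync(): boolean {
        return this.fallbackToAsync;
    }
}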

MySQL & PostgreSQL Semi-Sync

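MySQL's semi-synchronous replication waits for an ACK from one replica by default (the count is configurable) and reverts to asynchronous replication if no ACK arrives within a configured timeout. PostgreSQL's synchronous_commit can be set per transaction, allowing critical transactions to wait for synchronous standbys while routine operations commit asynchronously.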
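
As a rough illustration of the per-transaction setting, the sketch below only builds the SQL statements an application might send; it does not talk to a database. `SET LOCAL synchronous_commit` is real PostgreSQL syntax and applies until the end of the current transaction. The helper `withSynchronousCommit` and the `payments` and `page_views` tables are hypothetical.

```typescript
// Valid values of PostgreSQL's synchronous_commit setting
type SyncCommit = 'off' | 'local' | 'remote_write' | 'on' | 'remote_apply';

// Hypothetical helper: wrap statements in a transaction with its own
// durability level. SET LOCAL lasts only until the transaction ends.
function withSynchronousCommit(level: SyncCommit, statements: string[]): string[] {
    return [
        'BEGIN',
        `SET LOCAL synchronous_commit = '${level}'`,
        ...statements,
        'COMMIT',
    ];
}

// Critical write: wait until synchronous standbys have applied the commit
const paymentTx = withSynchronousCommit('remote_apply', [
    "INSERT INTO payments (order_id, amount) VALUES (42, 19.99)",
]);

// Routine write: return without waiting for the local WAL flush
const pageViewTx = withSynchronousCommit('off', [
    "INSERT INTO page_views (path) VALUES ('/home')",
]);
```

The levels that wait on standbys only take effect when synchronous standbys are configured on the server (`synchronous_standby_names`); otherwise the commit waits for the local flush only.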

Multi-Primary (Multi-Leader) Replication

In multi-primary replication, multiple nodes accept writes, each replicating to the others. This enables:

Geographic write locality — Users write to nearby datacenters
Write availability — System continues if any primary is up
Offline operation — Clients can write locally, sync later

But it introduces the fundamental challenge: write conflicts. Two primaries might concurrently modify the same record.

┌─────────────────────────────────────────────────────────────────┐
│                   MULTI-PRIMARY REPLICATION                      │
│                                                                   │
│   US-EAST                              EU-WEST                   │
│  ┌─────────────┐                     ┌─────────────┐             │
│  │  Primary 1  │◄───────────────────►│  Primary 2  │             │
│  │  (Writes)   │   Bidirectional     │  (Writes)   │             │
│  └──────┬──────┘   Replication       └──────┬──────┘             │
│         │                                    │                    │
│  ┌──────┴──────┐                     ┌──────┴──────┐             │
│  │  Replicas   │                     │  Replicas   │             │
│  └─────────────┘                     └─────────────┘             │
│                                                                   │
│   Users write to         ◄──CONFLICT──►      Users write to     │
│   nearest primary                            nearest primary     │
└─────────────────────────────────────────────────────────────────┘

multi-primary.ts
// Multi-Primary Replication with Conflict Detection
// Peer transport is an in-process stub, for illustration only.

interface VectorClock {
    [nodeId: string]: number;
}

interface VersionedRecord {
    data: Record<string, unknown>;
    vectorClock: VectorClock;
    lastModifiedBy: string;
}

interface WriteResult {
    success: boolean;
    version: VectorClock;
}

type ClockOrder = 'BEFORE' | 'AFTER' | 'EQUAL' | 'CONCURRENT';

class MultiPrimaryNode {
    private vectorClock: VectorClock = {};
    private store: Map<string, VersionedRecord> = new Map();
    private pendingConflicts: Map<string, VersionedRecord[]> = new Map();
    private peers: MultiPrimaryNode[] = [];

    constructor(public readonly id: string) {}

    addPeer(peer: MultiPrimaryNode): void {
        this.peers.push(peer);
    }

    async write(key: string, data: Record<string, unknown>): Promise<WriteResult> {
        // Increment our position in the vector clock
        this.vectorClock[this.id] = (this.vectorClock[this.id] || 0) + 1;

        const record: VersionedRecord = {
            data,
            vectorClock: { ...this.vectorClock },
            lastModifiedBy: this.id
        };

        this.store.set(key, record);

        // Replicate to other primaries in the background; the client is
        // acknowledged without waiting for peers (asynchronous replication)
        void this.replicateToPeers(key, record);

        return { success: true, version: record.vectorClock };
    }

    async receiveReplication(key: string, incomingRecord: VersionedRecord): Promise<void> {
        const existing = this.store.get(key);

        if (!existing) {
            // No conflict - just store
            this.store.set(key, incomingRecord);
            this.mergeVectorClock(incomingRecord.vectorClock);
            return;
        }

        const relationship = this.compareVectorClocks(
            existing.vectorClock,
            incomingRecord.vectorClock
        );

        switch (relationship) {
            case 'BEFORE':
                // Incoming is newer - replace
                this.store.set(key, incomingRecord);
                this.mergeVectorClock(incomingRecord.vectorClock);
                break;

            case 'AFTER':
            case 'EQUAL':
                // Existing is newer, or this is a duplicate delivery - ignore incoming
                break;

            case 'CONCURRENT':
                // CONFLICT! Both modified concurrently
                await this.handleConflict(key, existing, incomingRecord);
                // We have now seen the incoming version, so later local writes
                // should be ordered after it
                this.mergeVectorClock(incomingRecord.vectorClock);
                break;
        }
    }

    private compareVectorClocks(a: VectorClock, b: VectorClock): ClockOrder {
        let aBeforeB = false;
        let bBeforeA = false;

        const allNodes = new Set([...Object.keys(a), ...Object.keys(b)]);

        for (const node of allNodes) {
            const aVal = a[node] || 0;
            const bVal = b[node] || 0;

            if (aVal < bVal) aBeforeB = true;
            if (bVal < aVal) bBeforeA = true;
        }

        if (aBeforeB && bBeforeA) return 'CONCURRENT'; // Each has changes the other lacks
        if (aBeforeB) return 'BEFORE';
        if (bBeforeA) return 'AFTER';
        return 'EQUAL'; // Identical clocks: same version, not a conflict
    }

    private mergeVectorClock(incoming: VectorClock): void {
        // Element-wise maximum of the two clocks
        for (const node of Object.keys(incoming)) {
            this.vectorClock[node] = Math.max(this.vectorClock[node] || 0, incoming[node]);
        }
    }

    private async replicateToPeers(key: string, record: VersionedRecord): Promise<void> {
        // allSettled: an unreachable peer does not block delivery to the others
        await Promise.allSettled(
            this.peers.map(peer => peer.receiveReplication(key, record))
        );
    }

    private async handleConflict(
        key: string,
        local: VersionedRecord,
        incoming: VersionedRecord
    ): Promise<void> {
        // Strategy 1: Last-Write-Wins (simple but loses data)
        // Strategy 2: Store both versions, let application resolve
        // Strategy 3: Automatic merge (if data structure supports it)

        // Example: Store both for later resolution
        this.pendingConflicts.set(key, [local, incoming]);

        console.warn(`Conflict detected for key ${key}`);
    }
}
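
A short worked example may help show how vector clock comparison separates ordered writes from concurrent ones. It is a standalone sketch that does not use the class above; `compareClocks`, `mergeClocks`, and the node names `us-east` and `eu-west` are illustrative assumptions.

```typescript
type Clock = Record<string, number>;
type Ordering = 'BEFORE' | 'AFTER' | 'EQUAL' | 'CONCURRENT';

function compareClocks(a: Clock, b: Clock): Ordering {
    let aLess = false;
    let bLess = false;
    for (const node of Object.keys(a).concat(Object.keys(b))) {
        const av = a[node] ?? 0;
        const bv = b[node] ?? 0;
        if (av < bv) aLess = true;
        if (bv < av) bLess = true;
    }
    if (aLess && bLess) return 'CONCURRENT';
    if (aLess) return 'BEFORE';
    if (bLess) return 'AFTER';
    return 'EQUAL';
}

// Element-wise maximum: the result has seen everything either clock has seen
function mergeClocks(a: Clock, b: Clock): Clock {
    const merged: Clock = { ...a };
    for (const node of Object.keys(b)) {
        merged[node] = Math.max(merged[node] ?? 0, b[node] ?? 0);
    }
    return merged;
}

// Both primaries write the same key without having seen each other's write
const usWrite: Clock = { 'us-east': 1 };
const euWrite: Clock = { 'eu-west': 1 };
const firstCheck = compareClocks(usWrite, euWrite); // 'CONCURRENT': a real conflict

// eu-west then receives the us-east write, merges clocks, and writes again
const euSeen = mergeClocks(euWrite, usWrite);
const euSecondWrite: Clock = { ...euSeen, 'eu-west': euSeen['eu-west'] + 1 };
const secondCheck = compareClocks(usWrite, euSecondWrite); // 'BEFORE': safe to overwrite
```

The second write carries evidence that its author had already seen the us-east version, which is what lets a receiver replace the older record without treating it as a conflict.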

Common Conflict Resolution Strategies

•Last-Write-Wins (LWW) — Highest timestamp wins. Simple but loses data. Requires synchronized clocks.
•First-Write-Wins — First version persists. Useful for immutable-style data.
•Merge Function — Application-specific logic combines conflicting values (e.g., CRDT merge).
•Application Resolution — Store all versions; let the application or user decide.
•Operational Transformation — Transform concurrent operations to both apply (used in collaborative editing).
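
Two of these strategies are small enough to sketch. The first is a last-write-wins register with a deterministic tie-break; the second is a grow-only counter (G-Counter), one of the simplest CRDTs, whose merge never loses an increment. All type and function names here are illustrative assumptions, and the timestamps are plain numbers standing in for clock readings.

```typescript
// Last-Write-Wins register
interface Timestamped<T> {
    value: T;
    timestamp: number; // e.g. milliseconds since epoch, from the writer's clock
    nodeId: string;
}

function lwwResolve<T>(a: Timestamped<T>, b: Timestamped<T>): Timestamped<T> {
    if (a.timestamp !== b.timestamp) {
        return a.timestamp > b.timestamp ? a : b;
    }
    // Tie-break on node id so every node picks the same winner
    return a.nodeId > b.nodeId ? a : b;
}

// Grow-only counter: one slot per node, each node increments only its own slot
type GCounter = Record<string, number>;

function gCounterMerge(a: GCounter, b: GCounter): GCounter {
    const merged: GCounter = { ...a };
    for (const node of Object.keys(b)) {
        merged[node] = Math.max(merged[node] ?? 0, b[node] ?? 0);
    }
    return merged;
}

function gCounterValue(counter: GCounter): number {
    return Object.keys(counter).reduce((sum, node) => sum + (counter[node] ?? 0), 0);
}

// LWW: the later write wins and the earlier one is silently discarded
const winner = lwwResolve(
    { value: 'blue', timestamp: 100, nodeId: 'us-east' },
    { value: 'green', timestamp: 105, nodeId: 'eu-west' }
); // value 'green'

// G-Counter: us-east counted 3 locally; eu-west counted 2 and had seen 1 from us-east
const mergedCounter = gCounterMerge({ 'us-east': 3 }, { 'us-east': 1, 'eu-west': 2 });
const total = gCounterValue(mergedCounter); // 5
```

The contrast is the point: LWW resolves every conflict by dropping one side, while a merge function designed for the data type can keep the effect of both writes.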

Replication Topologies

How replication flows between nodes matters for latency, fault tolerance, and conflict detection.

Multi-Primary Replication Topologies
Topology	Description	Pros	Cons
Circular	A→B→C→A	Low overhead, simple	Single failure breaks chain
Star	Hub fans out to all	Simple routing	Hub is bottleneck/SPOF
All-to-All	Every node to every node	Most fault tolerant	O(n²) connections, conflict complexity
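
The connection overhead in the table can be made concrete by counting directed replication links for n nodes. This sketch assumes the star topology replicates in both directions between the hub and each leaf, since in a multi-primary setup the leaves accept writes too. `replicationLinks` is a hypothetical helper name.

```typescript
type Topology = 'circular' | 'star' | 'all-to-all';

// Number of directed replication links among n nodes
function replicationLinks(topology: Topology, n: number): number {
    if (n < 2) return 0;
    switch (topology) {
        case 'circular': return n;             // each node forwards to one successor
        case 'star': return 2 * (n - 1);       // hub to each leaf, and each leaf to hub
        case 'all-to-all': return n * (n - 1); // every ordered pair of nodes
    }
}

// For 5 nodes: circular 5, star 8, all-to-all 20
const linksForFive = {
    circular: replicationLinks('circular', 5),
    star: replicationLinks('star', 5),
    allToAll: replicationLinks('all-to-all', 5),
};
```

Doubling the cluster to 10 nodes doubles the circular count to 10 but raises the all-to-all count to 90, which is the quadratic growth the table refers to.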

Topology in Practice

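Many multi-datacenter multi-primary deployments use all-to-all replication for fault tolerance, combined with a conflict resolution strategy (often LWW for simplicity). MySQL Group Replication in multi-primary mode also connects every member to every other, but it detects conflicts by certifying transactions at commit time rather than resolving them afterwards with LWW. CockroachDB takes a different route: it replicates each range through Raft consensus, so it is not an asynchronous multi-primary system.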

Summary and Looking Ahead

Key Takeaways

•Async replication decouples write acknowledgment from replication for lower latency
•Primary-replica is the simplest pattern, but the primary is a single point of failure for writes
•Semi-synchronous balances durability and latency by waiting for some replicas
•Multi-primary enables geographic write locality but introduces conflicts
•Vector clocks detect concurrent writes; resolution strategies handle conflicts

What's Next:

The final page brings everything together with a decision framework for choosing synchronization approaches. We'll explore how to evaluate your system's requirements and select the right replication strategy.

You now understand async replication patterns—from simple primary-replica to complex multi-primary with conflict resolution. These patterns form the backbone of highly available, globally distributed systems.