When a user in Tokyo places an order on a global e-commerce platform, that order is processed in Tokyo. When a user in Frankfurt browses the same catalog, they're served from Frankfurt. When both users' orders affect the same inventory item, somehow the system maintains a coherent view of available stock—despite the 9,000 kilometers separating these data centers.
This is active-active multi-region architecture: a topology where all regions serve production traffic simultaneously, each capable of handling any operation for any user. Unlike active-passive, where a single region handles all traffic and others wait in standby, active-active distributes load globally while maintaining data consistency across the planet.
The benefits are compelling: users everywhere experience low latency, capacity scales beyond single-region limits, and the failure of any region is absorbed seamlessly by survivors. But these benefits come at a steep price in architectural and operational complexity. Active-active is not a pattern to adopt lightly—it's the apex of distributed systems design, demanding sophisticated solutions to problems that don't exist in simpler topologies.
By the end of this page, you will understand the two primary active-active patterns (sharded and replicated), design consistency models appropriate for global systems, implement conflict resolution strategies, and navigate the operational challenges inherent in active-active deployments.
Active-active architectures come in two fundamentally different patterns, each with distinct characteristics, tradeoffs, and appropriate use cases.
Pattern 1: Geographically Sharded Active-Active
In this pattern, each region "owns" a subset of data. Ownership is determined by a partitioning scheme—often by user geography, account ID range, or tenant assignment.
This pattern avoids the hardest problems of active-active (multi-writer conflicts) by ensuring that each data item has a single writer. It's simpler to implement and reason about.
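To make ownership concrete, here is a minimal sketch of how a service layer might route requests in a geo-sharded deployment: every write goes to the user's home region, while reads may stay local when some staleness is acceptable. The region names, UserDirectory lookup, and routing policy are illustrative assumptions, not a prescribed API.

```typescript
// Sketch of geo-sharded routing: exactly one region may write a given user's data.
// Region names, the directory lookup, and the read policy are assumptions.

type Region = 'us-east' | 'eu-west' | 'ap-northeast';

interface UserDirectory {
  // Maps each user to the single region that owns (and writes) their data
  homeRegionOf(userId: string): Promise<Region>;
}

class ShardedRouter {
  constructor(
    private readonly directory: UserDirectory,
    private readonly localRegion: Region
  ) {}

  /** Every write for this user must execute in their home region. */
  async writeRegionFor(userId: string): Promise<Region> {
    return this.directory.homeRegionOf(userId);
  }

  /** Reads can stay local when a slightly stale answer is acceptable. */
  async readRegionFor(userId: string, requireOwnedCopy: boolean): Promise<Region> {
    return requireOwnedCopy ? this.directory.homeRegionOf(userId) : this.localRegion;
  }
}
```

A common refinement is to encode the home region in the user ID itself so routing needs no directory lookup at request time.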
Pattern 2: Fully Replicated Active-Active (Multi-Master)
In this pattern, all data is replicated to all regions, and any region can accept writes for any data.
This pattern provides the ultimate in flexibility and failover (any region can handle any request), but introduces the full complexity of distributed consensus and conflict resolution.
Choosing Between Patterns
The choice between sharded and replicated active-active depends on your data access patterns:
Choose Sharded When: users and their data have a natural home region, cross-region access to the same records is rare, and you want to avoid multi-writer conflicts entirely.
Choose Fully Replicated When: any record may need to be read or written from any region, every region must be able to take over all traffic instantly, and you are prepared to detect and resolve write conflicts.
Most organizations should start with geographically sharded active-active, which delivers the majority of latency and availability benefits without multi-master complexity. Only move to fully replicated when you've proven the need through user behavior data showing significant cross-region access patterns.
In active-active systems, consistency guarantees become a crucial design decision. The CAP theorem imposes real constraints: during network partitions between regions (which happen regularly), you cannot have both perfect consistency and continuous availability. You must choose.
Eventual Consistency
The most common model for active-active systems, eventual consistency guarantees that if no new updates are made to a piece of data, all reads will eventually return the same value. The "eventually" may be milliseconds or seconds, depending on replication lag.
Session Consistency (Read-Your-Writes)
A stronger guarantee: a user always sees their own writes, even if other users temporarily see older data.
This model handles the most jarring user experience issue—"I just submitted that, where is it?"—without requiring global synchronization.
```typescript
/**
 * Session Consistency Implementation
 *
 * Ensures users see their own writes even in an eventually consistent
 * multi-region system. Uses write tokens to route reads appropriately.
 */

interface WriteToken {
  timestamp: number;
  region: string;
  logSequence: string; // Database position marker
}

interface ReadOptions {
  userId: string;
  writeToken?: WriteToken;
}

class SessionConsistentReader {
  private readonly localRegion: string;
  private readonly replicationLagMs: number;

  constructor(localRegion: string, estimatedReplicationLagMs: number = 500) {
    this.localRegion = localRegion;
    this.replicationLagMs = estimatedReplicationLagMs;
  }

  /**
   * Reads data with session consistency.
   * If the user has a recent write token, ensures the read
   * sees at least that write.
   */
  async read<T>(
    table: string,
    key: string,
    options: ReadOptions
  ): Promise<T> {
    const { writeToken } = options;

    // No write token: local read is fine (no consistency requirement)
    if (!writeToken) {
      return this.localRead(table, key);
    }

    // Write token exists: determine if local replica is caught up
    const tokenAge = Date.now() - writeToken.timestamp;

    // If write was in local region, local read is always consistent
    if (writeToken.region === this.localRegion) {
      return this.localRead(table, key);
    }

    // If write was recent (likely not yet replicated), read from write region
    if (tokenAge < this.replicationLagMs * 2) {
      return this.crossRegionRead(table, key, writeToken.region);
    }

    // Write is old enough that replication likely completed.
    // Verify by checking local replica position against token
    const localPosition = await this.getLocalReplicaPosition();
    if (this.positionIsAfter(localPosition, writeToken.logSequence)) {
      // Local replica has caught up
      return this.localRead(table, key);
    } else {
      // Local replica is behind: read from write region
      return this.crossRegionRead(table, key, writeToken.region);
    }
  }

  /**
   * Writes data and returns a token for session consistency.
   */
  async write<T>(
    table: string,
    key: string,
    value: T,
    userId: string
  ): Promise<{ result: T; writeToken: WriteToken }> {
    // Write to local region (which is primary for this user)
    const result = await this.localWrite(table, key, value);

    // Generate write token for session consistency
    const writeToken: WriteToken = {
      timestamp: Date.now(),
      region: this.localRegion,
      logSequence: await this.getCurrentLogSequence()
    };

    // Store token for user's session
    await this.storeUserWriteToken(userId, writeToken);

    return { result, writeToken };
  }

  private async localRead<T>(table: string, key: string): Promise<T> {
    // Read from local region database
    return db.region(this.localRegion).read(table, key);
  }

  private async localWrite<T>(table: string, key: string, value: T): Promise<T> {
    // Write to local region database
    return db.region(this.localRegion).write(table, key, value);
  }

  private async crossRegionRead<T>(
    table: string,
    key: string,
    region: string
  ): Promise<T> {
    // Read from specified region (higher latency)
    return db.region(region).read(table, key);
  }

  private async getLocalReplicaPosition(): Promise<string> {
    // Get current replication position from local database
    const result = await db.region(this.localRegion)
      .query('SELECT pg_last_wal_replay_lsn()');
    return result.rows[0].pg_last_wal_replay_lsn;
  }

  private async getCurrentLogSequence(): Promise<string> {
    // Get current write position
    const result = await db.region(this.localRegion)
      .query('SELECT pg_current_wal_lsn()');
    return result.rows[0].pg_current_wal_lsn;
  }

  private positionIsAfter(current: string, required: string): boolean {
    // Compare log sequence numbers
    return current >= required; // Simplified; real impl uses LSN parsing
  }

  private async storeUserWriteToken(
    userId: string,
    token: WriteToken
  ): Promise<void> {
    // Store in distributed cache with TTL
    await cache.set(
      `write-token:${userId}`,
      token,
      { ttlSeconds: 60 } // Token expires after replication guaranteed complete
    );
  }
}
```

Causal Consistency
A stronger model that preserves causality: if operation A happened before operation B (and B could have depended on A), all observers see A before B. This prevents anomalies like seeing a reply before the original post.
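A common implementation technique is to tag each replicated event with the version vector its writer had observed, and to buffer incoming events until those dependencies are visible locally, as in the sketch below. The event shape and vector handling are simplified assumptions rather than a complete replication protocol.

```typescript
// Sketch of causal delivery: events are applied only after every write their
// producer had observed has been applied locally. Shapes are illustrative.

type VersionVector = Record<string, number>;

interface ReplicatedEvent {
  origin: string;       // region that produced the event
  seq: number;          // per-origin sequence number
  deps: VersionVector;  // what the writer had already seen when it wrote
  apply: () => void;    // apply the event to local state
}

class CausalApplier {
  private seen: VersionVector = {};
  private pending: ReplicatedEvent[] = [];

  private satisfied(deps: VersionVector): boolean {
    // Every write the producer observed must already be applied here
    return Object.entries(deps).every(([region, n]) => (this.seen[region] ?? 0) >= n);
  }

  receive(event: ReplicatedEvent): void {
    this.pending.push(event);
    // Apply everything whose causal dependencies are met; buffer the rest
    let progressed = true;
    while (progressed) {
      progressed = false;
      for (const e of [...this.pending]) {
        if (this.satisfied(e.deps)) {
          e.apply();
          this.seen[e.origin] = Math.max(this.seen[e.origin] ?? 0, e.seq);
          this.pending = this.pending.filter(p => p !== e);
          progressed = true;
        }
      }
    }
  }
}
```

Under this scheme a reply whose dependencies include the original post simply waits in the buffer until the post arrives, which is exactly the anomaly causal consistency is meant to prevent.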
Strong Consistency (Global Serialization)
All operations appear to execute in a single global order, and reads always return the most recent write.
Strong consistency in active-active is possible (Google's Spanner proves this with TrueTime), but the latency cost makes it appropriate only for critical operations like financial transactions.
Hybrid Approaches
Real systems often combine consistency models for different data types:
| Data Type | Recommended Model | Rationale |
|---|---|---|
| Account balances | Strong/CP | Financial correctness required |
| Inventory counts | Strong with fallback | Prevent overselling, degrade gracefully |
| User sessions | Session consistency | Users must see own data |
| User profiles | Eventual | Low conflict rate, not business-critical |
| Social feeds | Causal | Conversation order matters |
| Analytics events | Eventual | Aggregated anyway, order not critical |
| Collaborative docs | Causal + CRDT | Real-time sync with order preservation |
In fully replicated active-active systems, conflicts are inevitable. When two regions accept writes to the same data at approximately the same time, the system must detect this conflict and resolve it deterministically.
Understanding Conflicts
A conflict occurs when two operations modify the same data without awareness of each other. Consider a user profile updated concurrently in two regions: us-east writes a new email address while eu-west writes a new phone number, each before the other's change has replicated.
Without conflict resolution, one update would randomly overwrite the other, potentially losing data.
Conflict Detection and Resolution Mechanisms
Last-Write-Wins (LWW)
The simplest approach: timestamp each write, accept the one with the latest timestamp.
Version Vectors / Vector Clocks
Track a logical clock per region, detecting concurrent modifications:
"""Conflict Resolution Strategies for Multi-Region Active-Active This module implements several conflict resolution approacheswith their tradeoffs and appropriate use cases."""from dataclasses import dataclass, fieldfrom datetime import datetimefrom typing import Optional, Dict, Any, Callablefrom enum import Enumimport json class ConflictStrategy(Enum): LAST_WRITE_WINS = "lww" FIRST_WRITE_WINS = "fww" MERGE = "merge" CUSTOM = "custom" @dataclassclass VersionVector: """ Vector clock for tracking causality across regions. Each region maintains its own logical timestamp. """ clocks: Dict[str, int] = field(default_factory=dict) def increment(self, region: str) -> 'VersionVector': """Increment the clock for a region (on write).""" new_clocks = self.clocks.copy() new_clocks[region] = new_clocks.get(region, 0) + 1 return VersionVector(new_clocks) def merge(self, other: 'VersionVector') -> 'VersionVector': """Merge two vectors (take max of each component).""" regions = set(self.clocks.keys()) | set(other.clocks.keys()) merged = { r: max(self.clocks.get(r, 0), other.clocks.get(r, 0)) for r in regions } return VersionVector(merged) def is_concurrent_with(self, other: 'VersionVector') -> bool: """ Check if two version vectors are concurrent (neither dominates). This indicates a conflict. """ self_newer = False other_newer = False regions = set(self.clocks.keys()) | set(other.clocks.keys()) for region in regions: self_val = self.clocks.get(region, 0) other_val = other.clocks.get(region, 0) if self_val > other_val: self_newer = True if other_val > self_val: other_newer = True return self_newer and other_newer def dominates(self, other: 'VersionVector') -> bool: """Check if self happened after other (self dominates).""" if self.is_concurrent_with(other): return False regions = set(self.clocks.keys()) | set(other.clocks.keys()) return all( self.clocks.get(r, 0) >= other.clocks.get(r, 0) for r in regions ) @dataclassclass VersionedValue: """A value with version tracking for conflict detection.""" value: Any version: VersionVector timestamp: datetime origin_region: str class ConflictResolver: """ Handles conflict resolution for multi-region replication. """ def __init__(self, local_region: str): self.local_region = local_region self.custom_resolvers: Dict[str, Callable] = {} def register_custom_resolver( self, entity_type: str, resolver: Callable[[VersionedValue, VersionedValue], VersionedValue] ): """Register a custom conflict resolver for an entity type.""" self.custom_resolvers[entity_type] = resolver def resolve( self, entity_type: str, local: VersionedValue, remote: VersionedValue, strategy: ConflictStrategy = ConflictStrategy.LAST_WRITE_WINS ) -> VersionedValue: """ Resolve conflict between local and remote versions. Returns the winning value. 
""" # Check if there's actually a conflict if local.version.dominates(remote.version): return local # Local is newer, no conflict if remote.version.dominates(local.version): return remote # Remote is newer, no conflict # Concurrent versions: need resolution if strategy == ConflictStrategy.LAST_WRITE_WINS: return self._resolve_lww(local, remote) elif strategy == ConflictStrategy.FIRST_WRITE_WINS: return self._resolve_fww(local, remote) elif strategy == ConflictStrategy.MERGE: return self._resolve_merge(local, remote) elif strategy == ConflictStrategy.CUSTOM: if entity_type in self.custom_resolvers: return self.custom_resolvers[entity_type](local, remote) raise ValueError(f"No custom resolver for {entity_type}") def _resolve_lww( self, local: VersionedValue, remote: VersionedValue ) -> VersionedValue: """Last-Write-Wins: use timestamp, break ties with region name.""" if local.timestamp > remote.timestamp: winner = local elif remote.timestamp > local.timestamp: winner = remote else: # Tie-breaker: deterministic region ordering winner = local if local.origin_region < remote.origin_region else remote # Merge version vectors for accurate causality tracking return VersionedValue( value=winner.value, version=local.version.merge(remote.version), timestamp=max(local.timestamp, remote.timestamp), origin_region=winner.origin_region ) def _resolve_fww( self, local: VersionedValue, remote: VersionedValue ) -> VersionedValue: """First-Write-Wins: preserve the original value.""" if local.timestamp < remote.timestamp: winner = local elif remote.timestamp < local.timestamp: winner = remote else: winner = local if local.origin_region < remote.origin_region else remote return VersionedValue( value=winner.value, version=local.version.merge(remote.version), timestamp=winner.timestamp, origin_region=winner.origin_region ) def _resolve_merge( self, local: VersionedValue, remote: VersionedValue ) -> VersionedValue: """ Merge strategy: attempt to combine values. Works for additive operations (sets, counters). """ # Handle different merge scenarios if isinstance(local.value, set) and isinstance(remote.value, set): merged_value = local.value | remote.value elif isinstance(local.value, dict) and isinstance(remote.value, dict): # Deep merge for dictionaries merged_value = self._deep_merge(local.value, remote.value) elif isinstance(local.value, (int, float)) and isinstance(remote.value, (int, float)): # For counters, this is tricky - need delta-based approach merged_value = max(local.value, remote.value) else: # Fall back to LWW for non-mergeable types return self._resolve_lww(local, remote) return VersionedValue( value=merged_value, version=local.version.merge(remote.version), timestamp=max(local.timestamp, remote.timestamp), origin_region=self.local_region ) def _deep_merge(self, dict1: dict, dict2: dict) -> dict: """Deep merge two dictionaries, handling nested conflicts.""" result = dict1.copy() for key, value in dict2.items(): if key in result: if isinstance(result[key], dict) and isinstance(value, dict): result[key] = self._deep_merge(result[key], value) else: # Conflict at leaf: take later value (could be configurable) result[key] = value else: result[key] = value return result # CRDT Example: G-Counter (Grow-only counter)@dataclassclass GCounter: """ A grow-only counter CRDT. Can be incremented in any region without coordination. Merge always converges to the correct total. 
""" counts: Dict[str, int] = field(default_factory=dict) def increment(self, region: str, amount: int = 1) -> 'GCounter': """Increment the counter in a specific region.""" new_counts = self.counts.copy() new_counts[region] = new_counts.get(region, 0) + amount return GCounter(new_counts) def value(self) -> int: """Get the total count across all regions.""" return sum(self.counts.values()) def merge(self, other: 'GCounter') -> 'GCounter': """Merge two counters (take max from each region).""" regions = set(self.counts.keys()) | set(other.counts.keys()) merged = { r: max(self.counts.get(r, 0), other.counts.get(r, 0)) for r in regions } return GCounter(merged) # Usage exampleif __name__ == "__main__": # Simulate concurrent counter updates in two regions counter_us = GCounter() counter_eu = GCounter() # US increments 3 times counter_us = counter_us.increment("us-east", 3) # EU increments 5 times (concurrently) counter_eu = counter_eu.increment("eu-west", 5) # After replication, both regions merge counter_us = counter_us.merge(counter_eu) counter_eu = counter_eu.merge(counter_us) print(f"US counter: {counter_us.value()}") # 8 print(f"EU counter: {counter_eu.value()}") # 8 # Both converge to 8 without coordination!Conflict-Free Replicated Data Types (CRDTs)
CRDTs are data structures mathematically designed to merge without conflicts. They guarantee that all replicas converge to the same state, regardless of the order operations are applied.
Common CRDT types: grow-only and increment/decrement counters (G-Counter, PN-Counter), add-only and add/remove sets (G-Set, OR-Set), last-writer-wins registers and maps, and sequence CRDTs used for collaborative text editing.
CRDTs are ideal for data that can be modeled as counters, sets, or maps. They enable strong eventual consistency: replicas may temporarily diverge, but they always converge to the same state.
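As a companion to the G-Counter shown earlier, the sketch below illustrates a last-writer-wins register, one of the simplest CRDTs: merge picks the higher timestamp and uses the region name as a deterministic tie-breaker. The timestamp source and tie-break rule are assumptions for the example.

```typescript
// Sketch of an LWW-Register CRDT. Merge is commutative, associative, and
// idempotent, so replicas converge regardless of merge order.

class LWWRegister<T> {
  constructor(
    public readonly value: T,
    public readonly timestamp: number, // e.g. hybrid logical clock or wall clock
    public readonly region: string     // tie-breaker for equal timestamps
  ) {}

  merge(other: LWWRegister<T>): LWWRegister<T> {
    if (other.timestamp > this.timestamp) return other;
    if (other.timestamp < this.timestamp) return this;
    // Equal timestamps: deterministic tie-break so every region picks the same winner
    return other.region > this.region ? other : this;
  }
}

// Two regions set the value concurrently; after exchanging states, both converge.
const a = new LWWRegister('blue', 1700000000123, 'us-east');
const b = new LWWRegister('green', 1700000000456, 'eu-west');
console.log(a.merge(b).value); // 'green'
console.log(b.merge(a).value); // 'green'
```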
Application-Level Resolution
For complex domain objects, automatic resolution may be insufficient. Application-level resolution surfaces conflicts to users or applies domain-specific logic:
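As an illustration, a shopping-cart resolver might union the line items from both versions and keep the larger quantity per item, rather than letting one region's cart silently overwrite the other. The cart shape, helper names, and the choice to auto-merge (versus flagging the conflict for user review) are assumptions in this sketch.

```typescript
// Sketch of a domain-specific conflict resolver for shopping carts.
// The cart shape and the auto-merge policy are illustrative assumptions.

interface Cart { items: Record<string, number> } // sku -> quantity

interface Conflict<T> {
  local: T;
  remote: T;
  resolved?: T;
  needsUserReview: boolean;
}

function resolveCartConflict(local: Cart, remote: Cart): Conflict<Cart> {
  const merged: Cart = { items: { ...local.items } };
  for (const [sku, qty] of Object.entries(remote.items)) {
    // Keep the larger quantity per line item rather than picking a winner
    merged.items[sku] = Math.max(merged.items[sku] ?? 0, qty);
  }
  // Auto-merge is reasonable for carts; a conflicting payment-method change,
  // by contrast, might set needsUserReview = true and surface both versions.
  return { local, remote, resolved: merged, needsUserReview: false };
}
```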
Every resolved conflict represents potential data loss—one write was preferred over another. Audit conflict resolution: log all resolutions, monitor conflict rates, and alert when they spike. High conflict rates often indicate user experience issues or design problems.
Active-active architectures introduce challenges that rarely appear in single-region or active-passive systems. These must be addressed during design, not discovered in production.
ID Generation
In single-region systems, auto-incrementing database IDs work perfectly. In active-active, concurrent inserts in different regions would produce collisions. Solutions:
UUIDs (v4): Random 128-bit identifiers with negligible collision probability
ULIDs / UUIDs (v7): Time-sorted universally unique identifiers
Snowflake IDs: Twitter's approach with timestamp + machine ID + sequence
Range-Prefixed IDs: Each region gets a prefix range (e.g., US: 1-1B, EU: 1B-2B)
```typescript
/**
 * Distributed ID Generation Strategies for Active-Active
 */

// Strategy 1: ULID - Universally Unique Lexicographically Sortable Identifier
// Format: 01ARZ3NDEKTSV4RRFFQ69G5FAV (26 characters, Crockford Base32)
// First 10 chars = timestamp, Last 16 chars = randomness

import { ulid, decodeTime } from 'ulid';

class ULIDGenerator {
  generate(): string {
    return ulid(); // e.g., "01ARZ3NDEKTSV4RRFFQ69G5FAV"
  }

  getTimestamp(id: string): Date {
    const timestamp = decodeTime(id);
    return new Date(timestamp);
  }
}

// Strategy 2: Snowflake-inspired ID
// 64-bit: [1 bit unused][41 bits timestamp][10 bits region+machine][12 bits sequence]

class SnowflakeGenerator {
  private readonly epoch = 1609459200000n; // Jan 1, 2021
  private readonly regionId: bigint;
  private readonly machineId: bigint;
  private sequence: bigint = 0n;
  private lastTimestamp: bigint = 0n;

  constructor(regionId: number, machineId: number) {
    // 5 bits for region (32 regions), 5 bits for machine (32 per region)
    if (regionId < 0 || regionId > 31) {
      throw new Error('Region ID must be 0-31');
    }
    if (machineId < 0 || machineId > 31) {
      throw new Error('Machine ID must be 0-31');
    }
    this.regionId = BigInt(regionId);
    this.machineId = BigInt(machineId);
  }

  generate(): bigint {
    let timestamp = BigInt(Date.now()) - this.epoch;

    if (timestamp === this.lastTimestamp) {
      // Same millisecond: increment sequence
      this.sequence = (this.sequence + 1n) & 0xFFFn; // 12 bits
      if (this.sequence === 0n) {
        // Sequence exhausted, wait for next millisecond
        while (timestamp <= this.lastTimestamp) {
          timestamp = BigInt(Date.now()) - this.epoch;
        }
      }
    } else {
      this.sequence = 0n;
    }

    this.lastTimestamp = timestamp;

    // Compose ID:
    // timestamp (41 bits) | region (5 bits) | machine (5 bits) | sequence (12 bits)
    return (timestamp << 22n) |
           (this.regionId << 17n) |
           (this.machineId << 12n) |
           this.sequence;
  }

  parse(id: bigint): { timestamp: Date; regionId: number; machineId: number; sequence: number } {
    return {
      timestamp: new Date(Number((id >> 22n) + this.epoch)),
      regionId: Number((id >> 17n) & 0x1Fn),
      machineId: Number((id >> 12n) & 0x1Fn),
      sequence: Number(id & 0xFFFn)
    };
  }
}

// Strategy 3: Region-Prefixed with Local Sequence
class RegionPrefixedGenerator {
  private readonly regionPrefix: string;
  private sequence: number = 0;

  // Region prefixes provide namespace isolation
  private static readonly REGION_PREFIXES: Record<string, string> = {
    'us-east': 'USE',
    'us-west': 'USW',
    'eu-west': 'EUW',
    'ap-northeast': 'APN'
  };

  constructor(region: string) {
    const prefix = RegionPrefixedGenerator.REGION_PREFIXES[region];
    if (!prefix) {
      throw new Error(`Unknown region: ${region}`);
    }
    this.regionPrefix = prefix;
  }

  generate(): string {
    const timestamp = Date.now().toString(36); // Base36 timestamp
    const seq = (this.sequence++).toString(36).padStart(4, '0');
    const random = Math.random().toString(36).substring(2, 6);
    return `${this.regionPrefix}-${timestamp}-${seq}-${random}`;
    // e.g., "USE-lpq5k8z-0001-a7b2"
  }
}
```

Time Synchronization
Many active-active patterns rely on timestamps: last-write-wins, conflict detection, ordering. But clock drift between regions can cause incorrect ordering. Common mitigations include tightly monitored NTP with alerting on drift, hybrid logical clocks that combine physical time with a logical counter, bounded-uncertainty clocks such as Spanner's TrueTime, and avoiding wall-clock ordering altogether in favor of version vectors.
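The sketch below shows the core of a hybrid logical clock: it advances with wall-clock time when possible and falls back to a logical counter otherwise, so causally related events still order correctly across regions. This is a simplified illustration; production HLCs also bound divergence from physical time.

```typescript
// Minimal hybrid logical clock sketch (simplified; drift bounding omitted).

interface HLCTimestamp { physical: number; counter: number }

class HybridLogicalClock {
  private lastPhysical = 0;
  private counter = 0;

  /** Timestamp a local event or outgoing message. */
  now(): HLCTimestamp {
    const wall = Date.now();
    if (wall > this.lastPhysical) {
      this.lastPhysical = wall;
      this.counter = 0;
    } else {
      this.counter += 1; // clock didn't advance: use the logical counter
    }
    return { physical: this.lastPhysical, counter: this.counter };
  }

  /** Merge in a timestamp received from another region. */
  receive(remote: HLCTimestamp): HLCTimestamp {
    const wall = Date.now();
    const maxPhysical = Math.max(wall, this.lastPhysical, remote.physical);
    if (maxPhysical === this.lastPhysical && maxPhysical === remote.physical) {
      this.counter = Math.max(this.counter, remote.counter) + 1;
    } else if (maxPhysical === remote.physical) {
      this.counter = remote.counter + 1;
    } else if (maxPhysical === this.lastPhysical) {
      this.counter += 1;
    } else {
      this.counter = 0; // local wall clock is strictly ahead of everything seen
    }
    this.lastPhysical = maxPhysical;
    return { physical: this.lastPhysical, counter: this.counter };
  }
}
```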
Cross-Region Transactions
Some operations must be atomic across regions—transferring money between accounts homed in different regions, for example. Options include two-phase commit across regions (simple to reason about, but it stalls whenever a participating region is slow or unreachable), saga-style workflows with compensating actions, and redesigning the operation so the affected data is re-homed to a single region first.
Avoid cross-region transactions whenever possible. When unavoidable, accept the latency cost and design for partial failures.
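Where a cross-region operation is unavoidable, a saga with compensating actions is often more resilient than holding a distributed lock: each step is a local transaction in its own region, and a failure triggers an explicit undo. The ledger interface, naming, and error handling below are illustrative assumptions.

```typescript
// Sketch of a saga-style cross-region transfer: debit in the source region,
// credit in the destination region, compensate if the second step fails.

interface RegionalLedger {
  debit(accountId: string, amountCents: number, txId: string): Promise<void>;
  credit(accountId: string, amountCents: number, txId: string): Promise<void>;
  refund(accountId: string, amountCents: number, txId: string): Promise<void>;
}

async function transferAcrossRegions(
  source: RegionalLedger,      // ledger in the sender's home region
  destination: RegionalLedger, // ledger in the recipient's home region
  fromAccount: string,
  toAccount: string,
  amountCents: number,
  txId: string                 // idempotency key so retries are safe
): Promise<'committed' | 'compensated'> {
  await source.debit(fromAccount, amountCents, txId);
  try {
    await destination.credit(toAccount, amountCents, txId);
    return 'committed';
  } catch {
    // Destination region failed or is partitioned: undo the first step.
    // The refund must itself be retried until it succeeds (or escalated).
    await source.refund(fromAccount, amountCents, txId);
    return 'compensated';
  }
}
```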
Hit Rate and Cache Warming
In active-active, cache effectiveness becomes region-specific: each region's cache is warmed only by its own traffic, so hit rates differ between regions, and a user who fails over to another region initially lands on a cold cache.
Strategies: pre-warm caches in a surviving region before deliberately shifting traffic to it, replicate cache invalidation events across regions so stale entries are evicted promptly, and provision each region to absorb failover traffic at a temporarily reduced hit rate.
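One concrete approach, sketched below, is to fan out invalidation events (rather than cached values) to all regions over a global pub/sub channel so every replica evicts stale entries instead of waiting for TTL expiry. The pub/sub and cache interfaces are illustrative assumptions, not a specific product API.

```typescript
// Sketch: cross-region cache invalidation fan-out. Interfaces are assumptions.

interface CacheClient { delete(key: string): Promise<void> }
interface GlobalBus {
  publish(topic: string, message: string): Promise<void>;
  subscribe(topic: string, handler: (message: string) => Promise<void>): void;
}

class ReplicatedCacheInvalidator {
  constructor(
    private readonly bus: GlobalBus,
    private readonly localCache: CacheClient,
    private readonly localRegion: string
  ) {
    // Every region evicts on invalidations originating anywhere else
    this.bus.subscribe('cache-invalidations', async (msg) => {
      const { key, origin } = JSON.parse(msg) as { key: string; origin: string };
      if (origin !== this.localRegion) {
        await this.localCache.delete(key);
      }
    });
  }

  /** Call after a local write: evict locally, then tell other regions. */
  async invalidate(key: string): Promise<void> {
    await this.localCache.delete(key);
    await this.bus.publish(
      'cache-invalidations',
      JSON.stringify({ key, origin: this.localRegion })
    );
  }
}
```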
Active-active bugs are often timing-dependent and difficult to reproduce. Consider two users updating the same document from different regions: 99% of the time it works, because latency separates their writes. But occasionally the updates collide, and your conflict resolution has a subtle bug. These issues can lurk for months before manifesting at a critical moment.
Running active-active in production requires operational practices far beyond single-region or active-passive deployments. The combination of continuous traffic in all regions with cross-region data dependencies creates a uniquely challenging operational environment.
Deployment Strategies
Deploying to active-active systems requires careful orchestration:
Sequential Rolling Deployment
Parallel Deployment with Canary per Region
Feature Flags for Regional Rollout
The Schema Change Problem
Database schema changes in active-active are particularly challenging:
All schema changes must be backward-compatible, as regions run different code versions during deployments.
Cross-Region Observability
Monitoring active-active requires correlating data across regions: distributed traces that follow a request across region boundaries, dashboards that compare per-region latency, error rates, and traffic share side by side, and replication-lag and conflict-rate metrics for every region pair.
Incident Response Complexity
Incidents in active-active systems are harder to diagnose and resolve: a symptom observed in one region may originate from replication lag or conflicting writes in another, and common mitigations such as shifting traffic change load patterns in every remaining region at once.
On-Call Requirements
Active-active typically requires follow-the-sun on-call coverage, runbooks that work regardless of which region is impaired, and responders in every region who are empowered to shift traffic or degrade features without waiting on a central team.
Active-active demands more communication: cross-region standups, global deployment coordination, shared incident review. Budget 20-30% additional engineering overhead for coordination activities alone. This isn't waste—it's necessary investment in system coherence.
We've explored the apex of multi-region architecture: active-active systems that serve traffic from all regions simultaneously. Let's consolidate the key principles: start with geographically sharded active-active and move to full replication only when access patterns demand it, choose consistency models per data type rather than globally, make conflict detection and resolution explicit and audited, design ID generation and time handling for multiple concurrent writers, and budget for the operational overhead that global deployments impose.
What's Next
We've examined the architectural patterns. The next two pages dive into the critical enabling technologies: data replication across regions (the mechanisms that synchronize data) and traffic routing (how requests reach the right region). These technologies make multi-region possible.
You now understand active-active multi-region architecture—its patterns, consistency models, conflict resolution strategies, implementation challenges, and operational requirements. This is the most complex multi-region topology, and mastering it positions you to design systems at global scale.