In the previous page, we established that data duplication is often necessary in microservices. The question becomes: how do we keep those copies synchronized? The answer, in most modern systems, is event-driven data synchronization.
The fundamental idea is simple: when the owner of data changes it, the owner publishes an event. Services that maintain copies subscribe to these events and update their local views accordingly. There's no direct coupling between producer and consumer—only a shared understanding of event structure.
This pattern enables the loose coupling that microservices promise. The Customer Service doesn't know which services consume customer events or what they do with them. The Order Service doesn't care when Customer Service publishes—it just updates when events arrive. Each service evolves independently.
By the end of this page, you will understand the mechanics of event-driven synchronization, event design patterns for data sync, handling event ordering and idempotency, building reliable event consumers, and the infrastructure considerations for production event systems.
Event-driven data synchronization follows a publish-subscribe pattern. The data owner (publisher) emits events when data changes. Services that need copies (subscribers) consume events and update their local state.
The flow:
1. Customer Service updates customer email
2. Customer Service writes to its database
3. Customer Service publishes CustomerEmailChanged event
4. Event broker (Kafka, RabbitMQ, etc.) stores and delivers event
5. Order Service receives event
6. Order Service updates its local customer view
7. Notification Service receives event
8. Notification Service updates its contact preferences
Each step is asynchronous. The Customer Service doesn't wait for consumers to process. Consumers process at their own pace. If a consumer is down, events queue until it recovers.
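The flow above can be sketched with an in-memory stand-in for the broker. This is a minimal sketch, not a real broker API — all class and event names here are illustrative; a real broker (Kafka, RabbitMQ) adds durability, ordering, and delivery guarantees this omits.

```typescript
// Minimal in-memory pub/sub illustrating the flow: the publisher
// emits once; every subscriber updates its own local view.

interface CustomerEmailChanged {
  type: 'CustomerEmailChanged';
  customerId: string;
  email: string;
}

type Handler = (event: CustomerEmailChanged) => void;

class InMemoryBroker {
  private subscribers: Handler[] = [];

  subscribe(handler: Handler): void {
    this.subscribers.push(handler);
  }

  publish(event: CustomerEmailChanged): void {
    // Deliver to every subscriber; the publisher doesn't know who they are
    for (const handler of this.subscribers) handler(event);
  }
}

const broker = new InMemoryBroker();

// Order Service keeps a local copy of customer emails
const orderServiceView = new Map<string, string>();
broker.subscribe((e) => orderServiceView.set(e.customerId, e.email));

// Notification Service keeps its own copy, independently
const notificationView = new Map<string, string>();
broker.subscribe((e) => notificationView.set(e.customerId, e.email));

// Customer Service writes to its database (not shown), then publishes
broker.publish({
  type: 'CustomerEmailChanged',
  customerId: 'cust_123',
  email: 'jane.new@example.com',
});
```

Note that the publisher's code never mentions the Order or Notification Service — that is the loose coupling the pattern buys.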
| Component | Role | Responsibility |
|---|---|---|
| Publisher | Data owner service | Emit events after state changes; ensure at-least-once delivery to broker |
| Event Broker | Message infrastructure | Store events durably; deliver to all subscribers; maintain ordering |
| Subscriber | Consumer service | Process events idempotently; update local state; handle failures |
| Event | Data change description | Carry enough information for consumers to update; include metadata |
| Consumer Group | Subscriber instances | Allow scaling consumers; balance load; track processing position |
Why events, not direct updates?
You might wonder: why doesn't Customer Service just call Order Service directly when customer data changes? Several reasons:

- Coupling: the publisher would need to know every consumer and its API; adding a consumer means changing the publisher.
- Availability: if a consumer is down, a direct call fails or blocks the publisher's write path; with events, the broker buffers until the consumer recovers.
- Scalability: one publish serves any number of consumers, instead of one synchronous call per consumer per change.
Event-driven sync is inherently eventually consistent. There's a window between the publisher writing and the subscriber updating. This window is typically milliseconds to seconds in healthy systems, but can extend during backpressure or failures. Your design must tolerate this latency.
Event design significantly impacts how well your sync works. Poor event design leads to lost updates, duplicate processing, and debugging nightmares. Good design makes sync reliable and understandable.
Thin Events contain minimal information—typically just an identifier and the type of change.
```json
{
  "type": "CustomerUpdated",
  "customerId": "cust_123",
  "timestamp": "2024-01-15T10:30:00Z"
}
```
Consumers must call the source service to get current data.
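That callback to the source is the defining cost of thin events. A sketch below makes it concrete; the `fetchCustomer` client is a hypothetical stand-in for a call to the owning service, written synchronously for brevity (a real client call would be async HTTP).

```typescript
// Thin-event consumer: the event only says *what* changed, so the
// consumer must call back to the source service for current state.

interface ThinEvent {
  type: 'CustomerUpdated';
  customerId: string;
  timestamp: string;
}

interface Customer {
  customerId: string;
  name: string;
  email: string;
}

// Hypothetical stand-in for GET /customers/{id} on the owning service
function fetchCustomer(customerId: string): Customer {
  return { customerId, name: 'Jane Doe', email: 'jane.new@example.com' };
}

const thinLocalView = new Map<string, Customer>();

function handleThinEvent(event: ThinEvent): void {
  // One extra round-trip per event: the price of thin events
  const current = fetchCustomer(event.customerId);
  thinLocalView.set(event.customerId, current);
}

handleThinEvent({
  type: 'CustomerUpdated',
  customerId: 'cust_123',
  timestamp: '2024-01-15T10:30:00Z',
});
```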
Fat Events contain the full entity state or all changed fields.
```json
{
  "type": "CustomerUpdated",
  "customerId": "cust_123",
  "timestamp": "2024-01-15T10:30:00Z",
  "data": {
    "name": "Jane Doe",
    "email": "jane.new@example.com",
    "status": "active"
  }
}
```
Consumers have all needed information without additional calls.
Recommendation for data sync: Use fat events. For synchronization purposes, fat events are generally preferred because:

- Consumers update their local views without calling back to the source, so sync keeps working even when the source service is down or overloaded.
- A burst of thin events would become a burst of API calls against the source (read amplification); fat events avoid it.
- Replayed or delayed events still carry the state they described, which makes replay and debugging tractable.
Every sync event should include:
| Field | Purpose | Example |
|---|---|---|
| eventId | Unique identifier for deduplication | evt_a1b2c3d4 |
| eventType | Distinguishes event types | CustomerCreated |
| aggregateId | The entity this event concerns | cust_123 |
| timestamp | When the change occurred | ISO 8601 date |
| version | Entity version after change | 15 |
| correlationId | Trace across services | Request ID |
| causationId | What triggered this event | Previous event ID |
```typescript
// ===================================================
// COMPREHENSIVE EVENT STRUCTURE FOR DATA SYNC
// ===================================================
// A well-designed event includes everything consumers
// need for reliable, idempotent processing.
// ===================================================

interface BaseEvent {
  // Unique event identifier - for deduplication
  eventId: string;

  // Event type discriminator
  eventType: string;

  // The entity this event describes
  aggregateType: string;
  aggregateId: string;

  // Ordering within the entity - critical for sync
  version: number;

  // Timing information
  timestamp: Date;

  // Tracing
  correlationId: string; // Original request ID
  causationId: string;   // Event that caused this event

  // Source information
  sourceService: string;
  sourceInstance: string;
}

interface CustomerCreatedEvent extends BaseEvent {
  eventType: 'CustomerCreated';
  aggregateType: 'Customer';
  data: {
    email: string;
    name: string;
    status: 'active' | 'pending';
    createdAt: Date;
  };
}

interface CustomerUpdatedEvent extends BaseEvent {
  eventType: 'CustomerUpdated';
  aggregateType: 'Customer';

  // Include both old and new values for debugging
  // (optional but helpful)
  changes: Array<{
    field: string;
    oldValue: unknown;
    newValue: unknown;
  }>;

  // Full current state after update
  data: {
    email: string;
    name: string;
    status: 'active' | 'suspended' | 'deleted';
    updatedAt: Date;
  };
}

interface CustomerDeletedEvent extends BaseEvent {
  eventType: 'CustomerDeleted';
  aggregateType: 'Customer';

  // Deletion reason for audit
  reason: string;
  deletedBy: string;
}

// Type union for type-safe handling
type CustomerEvent =
  | CustomerCreatedEvent
  | CustomerUpdatedEvent
  | CustomerDeletedEvent;

// ===================================================
// EVENT PUBLISHING EXAMPLE
// ===================================================

class CustomerService {
  private eventPublisher: EventPublisher;

  async updateCustomer(
    customerId: string,
    updates: CustomerUpdates
  ): Promise<Customer> {
    const customer = await this.repository.findById(customerId);

    // Track changes for the event
    const changes = this.calculateChanges(customer, updates);

    // Apply updates
    Object.assign(customer, updates);
    customer.version += 1;
    customer.updatedAt = new Date();

    // Persist
    await this.repository.save(customer);

    // Publish comprehensive event
    await this.eventPublisher.publish<CustomerUpdatedEvent>({
      eventId: generateEventId(),
      eventType: 'CustomerUpdated',
      aggregateType: 'Customer',
      aggregateId: customerId,
      version: customer.version, // NEW version
      timestamp: customer.updatedAt,
      correlationId: getCurrentCorrelationId(),
      causationId: getCausationId(),
      sourceService: 'customer-service',
      sourceInstance: getInstanceId(),
      changes: changes,
      data: {
        email: customer.email,
        name: customer.name,
        status: customer.status,
        updatedAt: customer.updatedAt,
      },
    });

    return customer;
  }

  private calculateChanges(
    before: Customer,
    updates: CustomerUpdates
  ): Array<{ field: string; oldValue: unknown; newValue: unknown }> {
    const changes = [];
    for (const [key, value] of Object.entries(updates)) {
      if (before[key] !== value) {
        changes.push({
          field: key,
          oldValue: before[key],
          newValue: value,
        });
      }
    }
    return changes;
  }
}
```

Two challenges haunt event-driven synchronization: ordering and duplicate processing. Understanding and handling these is essential for reliable sync.
Events for the same entity can arrive out of order due to:

- Publisher retries after timeouts (the original may arrive after the retry)
- Network delays and broker redelivery
- Events for one entity spread across partitions or queues
- Parallel consumer instances processing at different speeds
Example:

- Version 1: customer created
- Version 2: email updated
- Version 3: name updated

If the consumer receives them in order 1 → 3 → 2, applying the fat event for version 2 after version 3 would overwrite the newer state and revert the name change!
Solution: Version-Based Updates
When processing an event:

1. Load the local view and read its stored version.
2. If the event's version is less than or equal to the stored version, skip it — it is stale or a duplicate.
3. Otherwise, apply the event and persist the new version atomically with the data.
```typescript
// ===================================================
// VERSION-BASED ORDERING PROTECTION
// ===================================================
// Only apply events that move the version forward.
// Stale events are safely ignored.
// ===================================================

interface LocalCustomerView {
  customerId: string;
  name: string;
  email: string;
  version: number;     // Track which version we have
  lastEventId: string; // Track last processed event
  lastUpdated: Date;
}

class CustomerViewUpdater {
  private repository: LocalCustomerViewRepository;

  async handleCustomerEvent(event: CustomerEvent): Promise<void> {
    const local = await this.repository.findById(event.aggregateId);

    if (local) {
      // ORDERING CHECK: Only apply if event advances version
      if (event.version <= local.version) {
        console.log(
          `Skipping stale event. Local version: ${local.version}, ` +
          `Event version: ${event.version}`
        );
        return; // Safely ignore out-of-order event
      }

      // IDEMPOTENCY CHECK: Have we processed this exact event?
      if (event.eventId === local.lastEventId) {
        console.log(`Duplicate event ${event.eventId}, skipping`);
        return; // Safely ignore duplicate
      }
    }

    // Process based on event type
    switch (event.eventType) {
      case 'CustomerCreated':
        await this.handleCreated(event);
        break;
      case 'CustomerUpdated':
        await this.handleUpdated(event, local);
        break;
      case 'CustomerDeleted':
        await this.handleDeleted(event);
        break;
    }
  }

  private async handleUpdated(
    event: CustomerUpdatedEvent,
    local: LocalCustomerView | null
  ): Promise<void> {
    if (!local) {
      // We're missing the Created event - fetch full state
      console.warn(
        `Received update for unknown customer ${event.aggregateId}. ` +
        `Fetching from source.`
      );
      await this.fetchAndStoreCustomer(event.aggregateId);
      return;
    }

    // Apply update
    await this.repository.update(event.aggregateId, {
      name: event.data.name,
      email: event.data.email,
      version: event.version,
      lastEventId: event.eventId,
      lastUpdated: new Date(),
    });
  }

  private async fetchAndStoreCustomer(customerId: string): Promise<void> {
    // Fallback: call source service to get current state
    // This handles missed events, startup, etc.
    const current = await this.customerServiceClient.getCustomer(customerId);
    await this.repository.upsert({
      customerId: current.id,
      name: current.name,
      email: current.email,
      version: current.version,
      lastEventId: 'fetched-from-source',
      lastUpdated: new Date(),
    });
  }
}
```

At-least-once delivery is the only practical guarantee for distributed events. Networks fail, consumers crash, acknowledgments get lost. Events will be delivered multiple times.
Your consumer must be idempotent: processing the same event twice produces the same result as processing once.
Techniques for idempotency:

- Deduplication store: record each processed eventId; skip events you have already seen.
- Version checks: apply an event only if it advances the entity's version (a conditional `UPDATE ... WHERE version = ?`).
- Natural idempotency: write full state as an upsert, so reapplying the same event produces the same row.

Some systems claim "exactly-once" semantics. Don't trust it for business logic. Always design consumers to be idempotent. Even "exactly-once" systems have edge cases where duplicates occur.
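The version-check technique can be sketched against an in-memory store that mimics a conditional SQL update. This is illustrative only — the `applyIfNewer` helper and table shape are assumptions, not a real data-access API.

```typescript
// Idempotent apply via a version check: the update succeeds only if
// the event's version is greater than what we already have. Stale
// events and exact duplicates both become safe no-ops.

interface Row {
  customerId: string;
  email: string;
  version: number;
}

const table = new Map<string, Row>();
table.set('cust_123', {
  customerId: 'cust_123',
  email: 'old@example.com',
  version: 4,
});

// Mimics: UPDATE customers SET email = ?, version = ?
//         WHERE customer_id = ? AND version < ?
function applyIfNewer(
  customerId: string,
  email: string,
  eventVersion: number
): boolean {
  const row = table.get(customerId);
  if (!row || row.version >= eventVersion) return false; // stale or duplicate
  table.set(customerId, { customerId, email, version: eventVersion });
  return true;
}

const event = { customerId: 'cust_123', email: 'new@example.com', version: 5 };

const first = applyIfNewer(event.customerId, event.email, event.version);  // applies
const second = applyIfNewer(event.customerId, event.email, event.version); // no-op
```

Processing the same event twice leaves the row identical to processing it once — the definition of idempotency above.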
A reliable event consumer must handle failures gracefully, process efficiently, and never lose events. This section covers the patterns that production consumers use.
The publisher must ensure events are published if and only if the database write succeeds. The Transactional Outbox pattern achieves this:

1. Within the same database transaction as the state change, insert the event into an outbox table.
2. Commit — the state change and the pending event become durable atomically.
3. A separate relay process (or CDC connector) reads unsent outbox rows and publishes them to the broker.
4. After the broker confirms delivery, the relay marks the rows as sent.

This ensures no event is lost if publish fails, and no event is published for rolled-back transactions.
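A minimal sketch of the outbox mechanics, with an in-memory "database" standing in for real transactions. All names are illustrative, and the transactional atomicity is sketched, not real ACID.

```typescript
// Transactional outbox sketch: the entity write and the outbox insert
// commit together; a relay publishes from the outbox afterwards.

interface OutboxRow {
  id: number;
  payload: string;
  sentAt?: Date;
}

const customers = new Map<string, { email: string }>();
const outbox: OutboxRow[] = [];
let nextId = 1;

// "Transaction": both writes happen together (in a real database,
// inside one ACID transaction so a rollback discards both).
function updateEmailTx(customerId: string, email: string): void {
  customers.set(customerId, { email });
  outbox.push({
    id: nextId++,
    payload: JSON.stringify({
      type: 'CustomerEmailChanged',
      customerId,
      email,
    }),
  });
}

// Relay: publish unsent rows, then mark them sent.
// If publish throws, the row stays unsent and is retried later.
const published: string[] = [];
function runRelay(publish: (payload: string) => void): void {
  for (const row of outbox) {
    if (!row.sentAt) {
      publish(row.payload);
      row.sentAt = new Date();
    }
  }
}

updateEmailTx('cust_123', 'jane.new@example.com');
runRelay((p) => published.push(p));
```

Note the relay itself is at-least-once: if it crashes between publishing and marking sent, the row is published again — another reason consumers must be idempotent.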
Consumers should only acknowledge (ack) events after successful processing:
1. Receive event from broker
2. Start database transaction
3. Update local view
4. Commit transaction
5. Acknowledge event to broker
If step 4 fails, the event remains unacknowledged and will be redelivered. This is the at-least-once guarantee in action.
Critical: Your processing must be idempotent, because steps 4 and 5 can succeed or fail independently — the transaction may commit but the acknowledgment may be lost, in which case the broker redelivers an already-applied event.
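The ack-after-processing sequence can be sketched with a toy queue; the event is only removed after the handler succeeds. The `ToyQueue` class is an illustrative stand-in for a broker's delivery/acknowledgment protocol.

```typescript
// Ack-after-processing: the queue forgets an event only once the
// handler has completed. A thrown error leaves the event queued for
// redelivery -- the at-least-once guarantee in miniature.

class ToyQueue<T> {
  private events: T[] = [];

  enqueue(event: T): void {
    this.events.push(event);
  }

  // Deliver head of queue; ack (remove) only if the handler succeeds
  deliverOnce(handler: (event: T) => void): void {
    const event = this.events[0];
    if (event === undefined) return;
    try {
      handler(event);      // steps 2-4: process (transactionally in real life)
      this.events.shift(); // step 5: acknowledge
    } catch {
      // No ack: event stays queued and will be redelivered
    }
  }

  get depth(): number {
    return this.events.length;
  }
}

const queue = new ToyQueue<string>();
queue.enqueue('evt_1');

let attempts = 0;
const flakyHandler = (_e: string) => {
  attempts++;
  if (attempts === 1) throw new Error('transient failure');
};

queue.deliverOnce(flakyHandler); // fails -> not acked, still queued
const depthAfterFailure = queue.depth;
queue.deliverOnce(flakyHandler); // succeeds -> acked
```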
Some events consistently fail processing—malformed data, unexpected types, bugs. After N retries, move these to a Dead Letter Queue (DLQ):
```typescript
// ===================================================
// RELIABLE EVENT CONSUMER IMPLEMENTATION
// ===================================================
// Demonstrates: idempotency, retry logic, DLQ handling,
// and proper acknowledgment sequencing.
// ===================================================

interface ConsumerConfig {
  maxRetries: number;
  retryDelayMs: number;
  deadLetterTopic: string;
}

class ReliableEventConsumer {
  private config: ConsumerConfig;
  private processor: EventProcessor;
  private publisher: EventPublisher;
  private processedEvents: ProcessedEventStore;

  async consumeEvent(event: CustomerEvent): Promise<void> {
    const consumeId = `${event.eventId}-${event.aggregateId}`;

    // ==========================================
    // STEP 1: Check idempotency
    // ==========================================
    if (await this.processedEvents.exists(consumeId)) {
      console.log(`Already processed ${event.eventId}, skipping`);
      return; // Already processed, acknowledge and move on
    }

    // ==========================================
    // STEP 2: Process with retry logic
    // ==========================================
    let lastError: Error | null = null;

    for (let attempt = 1; attempt <= this.config.maxRetries; attempt++) {
      try {
        await this.processWithTransaction(event);

        // ==========================================
        // STEP 3: Record successful processing
        // ==========================================
        await this.processedEvents.markProcessed(consumeId, {
          processedAt: new Date(),
          eventType: event.eventType,
        });

        // Success! Return to acknowledge
        return;
      } catch (error) {
        lastError = error as Error;
        console.error(
          `Attempt ${attempt} failed for ${event.eventId}: ${lastError.message}`
        );

        if (attempt < this.config.maxRetries) {
          // Exponential backoff
          const delay = this.config.retryDelayMs * Math.pow(2, attempt - 1);
          await sleep(delay);
        }
      }
    }

    // ==========================================
    // STEP 4: Send to Dead Letter Queue
    // ==========================================
    console.error(
      `All retries exhausted for ${event.eventId}. Sending to DLQ.`
    );

    await this.publisher.publish(this.config.deadLetterTopic, {
      originalEvent: event,
      error: lastError?.message,
      failedAt: new Date(),
      attempts: this.config.maxRetries,
    });

    // Acknowledge original event to unblock queue
    // The DLQ event now tracks the failure
  }

  private async processWithTransaction(event: CustomerEvent): Promise<void> {
    // Use database transaction for atomic local update
    await this.db.transaction(async (tx) => {
      switch (event.eventType) {
        case 'CustomerCreated':
          await this.processor.handleCreated(tx, event);
          break;
        case 'CustomerUpdated':
          await this.processor.handleUpdated(tx, event);
          break;
        case 'CustomerDeleted':
          await this.processor.handleDeleted(tx, event);
          break;
        default:
          // Unknown event type - log and skip
          console.warn(`Unknown event type: ${(event as any).eventType}`);
      }
    });
  }
}

// ===================================================
// DLQ PROCESSOR - For manual investigation
// ===================================================

class DeadLetterProcessor {
  async reviewAndRetry(dlqEvent: DeadLetterEvent): Promise<void> {
    // Manual review determined event is now processable
    // (e.g., bug was fixed, data was corrected)

    // Re-publish to original topic for reprocessing
    await this.publisher.publish('customer-events', dlqEvent.originalEvent);

    // Mark DLQ entry as resolved
    await this.dlqStore.markResolved(dlqEvent.id, {
      resolution: 'retried',
      resolvedAt: new Date(),
      resolvedBy: getCurrentUser(),
    });
  }
}
```

The event broker is the critical infrastructure for event-driven sync. Choosing and configuring it correctly determines system reliability.
| Broker | Strengths | Considerations |
|---|---|---|
| Apache Kafka | Massive scale; log-based; replay support; strong ordering per partition | Operational complexity; requires ZooKeeper or KRaft |
| RabbitMQ | Feature-rich; mature; flexible routing; good for smaller scale | No built-in replay; messages deleted after consumption |
| Amazon SQS + SNS | Fully managed; scales automatically; pay-per-use | Limited replay; eventual consistency between SNS→SQS |
| Google Pub/Sub | Fully managed; global; message retention with replay | Ordering requires configuration; GCP-specific |
| Azure Event Hubs | Fully managed; Kafka-compatible; strong Azure integration | Azure-specific ecosystem |
Partitioning for Ordering
Most brokers only guarantee ordering within a partition. For entity-based sync, partition by entity ID:
Partition key = customerId
→ All events for customer X go to same partition
→ Events for customer X processed in order
Be careful: a partition carries events for many entities, and they are consumed strictly in sequence — so one entity with a burst of events can delay every other entity in the same partition (head-of-line blocking).
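A sketch of key-based partitioning — hashing the entity ID so every event for one customer lands on the same partition. The hash function here is illustrative; real brokers use their own (Kafka's default producer partitioner uses murmur2, for example).

```typescript
// Stable partition assignment: same key -> same partition, always.
// That is what gives per-entity ordering on a partitioned broker.

function partitionFor(key: string, partitionCount: number): number {
  // Simple deterministic string hash (illustrative only)
  let hash = 0;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 31 + key.charCodeAt(i)) | 0;
  }
  return Math.abs(hash) % partitionCount;
}

const partitions = 12;
const p1 = partitionFor('cust_123', partitions);
const p2 = partitionFor('cust_123', partitions);
// Every event for cust_123 maps to the same partition,
// so its events are consumed in publish order.
```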
Retention for Replay
Log-based brokers (Kafka) retain events for a configurable period. This enables:

- Bootstrapping new consumers by replaying history
- Recovering from consumer bugs: fix the code, rewind the offset, reprocess
- Auditing and debugging past data changes
Set retention based on:

- The longest consumer outage you must survive without data loss
- Whether new consumers bootstrap from the log or from an API
- Storage cost at your event volume
Consumer Groups for Scaling
Consumer groups allow horizontal scaling:

- Partitions are divided among the instances in a group
- Each partition is consumed by exactly one instance at a time, preserving per-partition order
- Adding instances raises throughput, up to the number of partitions
Important: During rebalancing, some events may be processed twice. Idempotency handles this.
Unless you have specific reasons, start with managed brokers (AWS MSK, Confluent Cloud, GCP Pub/Sub). Operating Kafka or RabbitMQ clusters is non-trivial. Focus your engineering on business logic, not broker maintenance.
Event-driven sync has a fundamental challenge: what if you miss events? Retention expires, bugs cause events to be dropped, or a new consumer comes online without historical data. You need strategies for catching up.
Periodically, the publisher emits a "snapshot" event containing the full current state. Consumers can use these to reset/verify their view.
CustomerSnapshot event:
- Contains complete customer data
- Emitted on schedule (daily) and/or on demand
- Version allows ordering with regular events
- Consumer can fully reset local view from snapshot
Trade-off: Larger events; only periodic consistency check.
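On the consumer side, a snapshot participates in the same version-ordering check as regular events, since it carries a version. A sketch, with illustrative names:

```typescript
// Snapshot consumption: reset the local view from the snapshot,
// but only if the snapshot is newer than what we already hold --
// the same "never go backwards" rule used for regular events.

interface LocalView {
  name: string;
  email: string;
  version: number;
}

interface CustomerSnapshot {
  type: 'CustomerSnapshot';
  customerId: string;
  version: number;
  data: { name: string; email: string };
}

const views = new Map<string, LocalView>();
views.set('cust_123', {
  name: 'Old Name',
  email: 'old@example.com',
  version: 7,
});

function applySnapshot(snap: CustomerSnapshot): boolean {
  const local = views.get(snap.customerId);
  if (local && local.version >= snap.version) return false; // stale snapshot
  views.set(snap.customerId, { ...snap.data, version: snap.version });
  return true;
}

const applied = applySnapshot({
  type: 'CustomerSnapshot',
  customerId: 'cust_123',
  version: 10,
  data: { name: 'Jane Doe', email: 'jane.new@example.com' },
});
```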
The publisher provides an API for bulk fetching:
GET /customers?updatedSince=2024-01-01T00:00:00Z
→ Returns all customers modified after that time
GET /customers?ids=cust_123,cust_456
→ Returns specific customers on demand
Consumers use this for:

- Initial bootstrap before subscribing to events
- Fetching an entity when an event references data they don't have
- Periodic reconciliation against the source of truth
With a log-based broker (Kafka), consumers can replay from any offset:
1. Consumer starts fresh
2. Set consumer offset to earliest retained event
3. Process all historical events
4. Reach end of log → now tracking real-time
Critical: Ensure your consumer is idempotent and handles version ordering. Replaying years of events will include duplicates and out-of-order segments.
```typescript
// ===================================================
// HYBRID CATCH-UP STRATEGY
// ===================================================
// Combines initial API fetch with ongoing event processing
// for reliable state synchronization.
// ===================================================

class CustomerViewSynchronizer {
  private customerClient: CustomerServiceClient;
  private eventConsumer: EventConsumer;
  private localView: LocalCustomerViewRepository;

  async initialize(): Promise<void> {
    console.log('Starting customer view synchronization...');

    // ==========================================
    // PHASE 1: Initial bulk load from API
    // ==========================================
    console.log('Phase 1: Fetching current state from API...');

    let cursor: string | undefined;
    let totalLoaded = 0;

    do {
      const batch = await this.customerClient.listCustomers({
        cursor,
        limit: 1000,
      });

      for (const customer of batch.customers) {
        await this.localView.upsert({
          customerId: customer.id,
          name: customer.name,
          email: customer.email,
          version: customer.version,
          lastEventId: 'initial-load',
          lastUpdated: new Date(),
        });
      }

      totalLoaded += batch.customers.length;
      cursor = batch.nextCursor;
      console.log(`Loaded ${totalLoaded} customers...`);
    } while (cursor);

    console.log(`Phase 1 complete: ${totalLoaded} customers loaded`);

    // ==========================================
    // PHASE 2: Subscribe to real-time events
    // ==========================================
    console.log('Phase 2: Starting real-time event subscription...');

    // Note: We might process some events for data we just loaded,
    // but idempotency and version checks handle this safely.
    await this.eventConsumer.subscribe('customer-events', {
      // Start from recent events to catch anything that happened
      // during API fetch
      startFrom: 'earliest-retained',
      handler: (event) => this.handleEvent(event),
    });

    console.log('Synchronization initialized successfully');
  }

  async handleMissingEntity(customerId: string): Promise<void> {
    // Called when we receive an event for an unknown entity
    // Fetch it from the source API
    console.warn(`Fetching missing customer ${customerId} from API`);

    try {
      const customer = await this.customerClient.getCustomer(customerId);
      await this.localView.upsert({
        customerId: customer.id,
        name: customer.name,
        email: customer.email,
        version: customer.version,
        lastEventId: 'api-fetch',
        lastUpdated: new Date(),
      });
    } catch (error) {
      if (error.code === 'NOT_FOUND') {
        console.warn(
          `Customer ${customerId} not found in source - may be deleted`
        );
      } else {
        throw error;
      }
    }
  }

  async reconcile(): Promise<ReconciliationResult> {
    // Periodic reconciliation to catch any drift
    console.log('Running periodic reconciliation...');

    // Get checksums from source
    const sourceChecksums = await this.customerClient.getChecksums();

    // Compare with local
    const localChecksums = await this.localView.getChecksums();

    const discrepancies: string[] = [];
    for (const [customerId, sourceChecksum] of Object.entries(sourceChecksums)) {
      if (localChecksums[customerId] !== sourceChecksum) {
        discrepancies.push(customerId);
      }
    }

    // Fetch and fix discrepancies
    for (const customerId of discrepancies) {
      await this.handleMissingEntity(customerId);
    }

    return {
      totalChecked: Object.keys(sourceChecksums).length,
      discrepanciesFound: discrepancies.length,
      discrepanciesFixed: discrepancies.length,
    };
  }
}
```

No matter how reliable your event pipeline, run periodic reconciliation. It catches bugs, edge cases, and the unexpected. Many production systems reconcile daily or weekly as a background job.
At scale, event-driven sync faces performance challenges. Understanding these helps you design for high throughput and low latency.
Measuring Sync Latency:
Track the time between event publication and consumer processing complete:
Sync Latency = ConsumerProcessedAt - EventGeneratedAt
Set alerts for:

- Sync latency exceeding your tolerance (e.g., p99 above a few seconds)
- Consumer lag that grows over time instead of holding steady
- Dead Letter Queue depth above its normal baseline
Consumer Lag is especially critical: it indicates your consumer can't keep up with event rate, and latency will continuously increase.
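Computing the latency metric is straightforward once events carry a generation timestamp; a sketch follows, where the helper names and the 5-second threshold are illustrative choices, not prescriptions.

```typescript
// Sync latency: time from event generation to processing complete.
// Flag samples that exceed an alert threshold.

interface LatencySample {
  eventId: string;
  latencyMs: number;
}

function syncLatencyMs(eventGeneratedAt: Date, consumerProcessedAt: Date): number {
  return consumerProcessedAt.getTime() - eventGeneratedAt.getTime();
}

function breachesThreshold(
  samples: LatencySample[],
  thresholdMs: number
): LatencySample[] {
  return samples.filter((s) => s.latencyMs > thresholdMs);
}

const latency = syncLatencyMs(
  new Date('2024-01-15T10:30:00.000Z'),
  new Date('2024-01-15T10:30:00.250Z')
);

const alerts = breachesThreshold(
  [
    { eventId: 'evt_1', latencyMs: latency },
    { eventId: 'evt_2', latencyMs: 7500 },
  ],
  5000 // e.g., alert if sync takes longer than 5 seconds
);
```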
If consumers fall behind, event brokers queue more data, using more memory and disk. Eventually, oldest events may be dropped (retention exceeded) or the broker slows/crashes. Monitor consumer lag aggressively and scale consumers before lag becomes critical.
Event-driven data synchronization is the standard pattern for maintaining data consistency across microservices. It enables loose coupling, high availability, and independent service evolution while providing eventual consistency.
What's next:
Event-driven sync works well for keeping local views updated. But sometimes you need to query data across multiple services to fulfill a request. The next page explores patterns for cross-service queries—joining distributed data without sacrificing service autonomy.
You now understand event-driven data synchronization—the backbone of distributed data consistency. By publishing rich events, handling ordering and idempotency, building reliable consumers, and planning for catch-up scenarios, you can maintain eventually consistent views across your microservices architecture. Next, we'll tackle the challenge of querying across service boundaries.