Choreography vs Orchestration: Coordination Patterns in Event-Driven Systems

Level: Advanced

Duration: 90 mins

Topic: Choreography vs Orchestration

Choreography — Decentralized Coordination

The Dance of Autonomous Services

Imagine a jazz ensemble where musicians improvise together without a conductor. Each player responds to what they hear, contributing their part based on established patterns and mutual understanding. No one gives explicit instructions—yet the music comes together coherently because each musician knows their role and responds appropriately to the others.

This is choreography in distributed systems. Services react to events independently, each knowing its responsibilities without any central authority dictating the flow. The result, when done well, is a system that's resilient, scalable, and naturally decoupled.

But just as a jazz ensemble requires skilled musicians who understand the genre deeply, choreography requires services that are thoughtfully designed to participate in a larger dance without stepping on each other's toes. Get it right, and you have a beautifully autonomous system. Get it wrong, and you have chaos masquerading as architecture.

What You Will Learn

By the end of this page, you will understand choreography as a coordination pattern: its philosophical foundations, implementation mechanics, event design requirements, testing strategies, and the specific scenarios where it excels. You'll be able to design choreographed workflows that maintain consistency without central control.

Understanding Choreography

Choreography is a coordination pattern where the workflow emerges from independent services reacting to events, rather than being directed by a central controller. Each service subscribes to relevant events, performs its work, and emits new events that trigger subsequent services. There is no single entity that knows or controls the entire workflow.

The fundamental principle: In choreography, services are reactive rather than commanded. They observe the world, respond when they see something relevant, and announce what they've done. The overall business process emerges from these individual reactions, like a complex pattern emerging from simple rules in cellular automata.

Choreography vs Traditional Request-Response
Characteristic	Request-Response	Choreography
Control Flow	Caller knows and controls the sequence	No single entity controls the sequence
Coupling	Caller coupled to callees	Services coupled only to events
Synchrony	Typically synchronous	Inherently asynchronous
Failure Handling	Caller handles callee failures	Each service handles its own failures
Scalability	Limited by calling service capacity	Each service scales independently
Visibility	Caller sees the entire flow	No single point sees the entire flow

An illustrative example:

Consider an e-commerce order process. In a choreographed system:

1. Order Service receives a purchase request and emits OrderPlaced
2. Payment Service hears OrderPlaced, processes payment, emits PaymentCompleted
3. Inventory Service hears PaymentCompleted, reserves items, emits InventoryReserved
4. Shipping Service hears InventoryReserved, schedules shipment, emits ShipmentScheduled
5. Notification Service hears ShipmentScheduled, sends confirmation email

No service knows the complete workflow. Order Service doesn't know that its event will eventually trigger shipping. Payment Service doesn't know whether it's the first or fifth step. Each service simply reacts to what it observes and announces what it does.
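
To make the flow concrete, here is a minimal sketch of the same chain running on an in-memory bus. The `InMemoryBus` class and `registerServices` function are hypothetical names invented for this illustration, and the synchronous dispatch stands in for what would be a durable, asynchronous broker in a real system.

```typescript
// Minimal in-memory event bus (illustrative only; not a durable broker).
type EventName =
  | 'OrderPlaced'
  | 'PaymentCompleted'
  | 'InventoryReserved'
  | 'ShipmentScheduled';

interface WorkflowEvent {
  type: EventName;
  orderId: string;
}

type Handler = (event: WorkflowEvent) => void;

class InMemoryBus {
  private readonly handlers = new Map<EventName, Handler[]>();
  // Records every published event type, in publication order.
  readonly log: EventName[] = [];

  subscribe(type: EventName, handler: Handler): void {
    const existing = this.handlers.get(type) ?? [];
    this.handlers.set(type, [...existing, handler]);
  }

  publish(event: WorkflowEvent): void {
    this.log.push(event.type);
    for (const handler of this.handlers.get(event.type) ?? []) {
      handler(event);
    }
  }
}

// Each "service" knows only the event it listens for and the event it emits.
function registerServices(bus: InMemoryBus): void {
  // Payment Service
  bus.subscribe('OrderPlaced', (e) =>
    bus.publish({ type: 'PaymentCompleted', orderId: e.orderId }));
  // Inventory Service
  bus.subscribe('PaymentCompleted', (e) =>
    bus.publish({ type: 'InventoryReserved', orderId: e.orderId }));
  // Shipping Service
  bus.subscribe('InventoryReserved', (e) =>
    bus.publish({ type: 'ShipmentScheduled', orderId: e.orderId }));
}

const demoBus = new InMemoryBus();
registerServices(demoBus);
demoBus.publish({ type: 'OrderPlaced', orderId: 'order-123' });
// demoBus.log now holds the four event types in the order they were published.
```

Notice that the sequence appears nowhere in the code as a whole; it exists only as the sum of the individual subscriptions.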

The Essence of Choreography

In choreography, the workflow is implicit in the event subscriptions, not explicit in any single service's code. Understanding the complete flow requires examining multiple services and their event relationships—a fundamental shift from traditional architectures where control flow is visible in calling code.

Philosophical Foundations

Choreography isn't just a technical pattern—it embodies specific philosophical commitments about how distributed systems should work. Understanding these foundations helps you decide when choreography aligns with your goals.

Tell, Don't Ask: In traditional systems, services often ask others for information: "What's the inventory level? Okay, now reserve it." In choreography, services tell the world what happened: "Payment succeeded." Other services react based on their own logic. This inversion reduces temporal coupling—the teller doesn't wait for the reactor.

Loose Coupling, High Cohesion: Choreographed services are loosely coupled because they communicate only through events. A service doesn't import another's code or even know another exists. It knows only about event types. Simultaneously, each service remains highly cohesive—focused on one responsibility and containing all logic to fulfill it.

Autonomous Operation: Each service in a choreographed system is operationally autonomous. It can deploy, scale, and fail independently. This autonomy is profound: you can replace, upgrade, or remove a service without changing other services (though you must consider event contract compatibility).

Core Principles of Choreography

•Event Sovereignty — Events are the single source of truth for what happened. Services trust events over queries.
•Domain Ownership — Each service owns its domain completely, making all decisions within that domain autonomously.
•Temporal Decoupling — Producer and consumer don't need to be available simultaneously; events are durable.
•Evolutionary Architecture — New services can join the dance by subscribing to events; existing services need not change.
•Eventual Consistency — The system converges to consistency over time rather than maintaining it synchronously.

Conway's Law Alignment

Choreography aligns naturally with autonomous teams. When each team owns a service completely, choreography lets them move independently. The event contracts become the interfaces between teams, much like APIs but with less temporal coupling. This makes choreography particularly attractive in organizations embracing microservices and team autonomy.

Event Design for Choreography

Events in a choreographed system carry an outsized importance. They aren't just notifications—they're the contracts that allow autonomous services to coordinate. Poor event design undermines the entire pattern.

Domain Events vs Integration Events:

Domain events capture something that happened within a bounded context: OrderPlaced, PaymentFailed, InventoryReserved. They express business facts, not technical operations.

Integration events are designed explicitly for cross-service communication. They may be derived from domain events but are crafted for external consumption—versioned, documented, and stable.

In choreography, both types appear, but integration events require particular care. They become the interface between autonomous services.
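
One way to keep the two kinds of event apart is an explicit mapping step at the service boundary. The sketch below assumes hypothetical shapes (`OrderPlacedDomainEvent`, `OrderPlacedIntegrationEvent`) and a hypothetical internal field, `internalRiskNotes`, that must not be published; your own fields will differ.

```typescript
// Internal domain event: free to change along with the service's own model.
interface OrderPlacedDomainEvent {
  orderId: string;
  customerId: string;
  lineItems: { sku: string; qty: number; unitPriceCents: number }[];
  internalRiskNotes: string; // hypothetical field that must never leave the service
}

// Published integration event: a versioned, documented contract.
interface OrderPlacedIntegrationEvent {
  eventType: 'OrderPlaced';
  version: '1.0';
  orderId: string;
  customerId: string;
  itemCount: number;
  totalAmountCents: number;
}

// The mapping is the only place where the internal model meets the contract.
function toIntegrationEvent(
  domain: OrderPlacedDomainEvent,
): OrderPlacedIntegrationEvent {
  return {
    eventType: 'OrderPlaced',
    version: '1.0',
    orderId: domain.orderId,
    customerId: domain.customerId,
    itemCount: domain.lineItems.reduce((sum, item) => sum + item.qty, 0),
    totalAmountCents: domain.lineItems.reduce(
      (sum, item) => sum + item.qty * item.unitPriceCents,
      0,
    ),
  };
}
```

Because consumers only ever see the mapped shape, the internal event can be renamed or restructured without a contract change.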

Well-Designed Choreography Events
// Events should be self-describing, immutable records of facts
interface OrderPlacedEvent {
  // Metadata
  readonly eventId: string;          // Unique identifier for idempotency
  readonly eventType: 'OrderPlaced';
  readonly timestamp: string;         // ISO 8601
  readonly version: '1.0';           // Schema version
  readonly correlationId: string;    // For tracing across services
  readonly causationId?: string;     // What event caused this one
  
  // Business payload
  readonly orderId: string;
  readonly customerId: string;
  readonly items: readonly OrderItem[];
  readonly totalAmount: Money;
  readonly currency: string;
  readonly shippingAddress: Address;
  
  // Context for consumers
  readonly customerTier: 'standard' | 'premium' | 'vip';
  readonly isFirstOrder: boolean;
}
 
interface PaymentCompletedEvent {
  readonly eventId: string;
  readonly eventType: 'PaymentCompleted';
  readonly timestamp: string;
  readonly version: '1.0';
  readonly correlationId: string;
  readonly causationId: string;       // Links to OrderPlaced
  
  readonly orderId: string;
  readonly paymentId: string;
  readonly amount: Money;
  readonly paymentMethod: 'card' | 'bank_transfer' | 'wallet';
  readonly transactionReference: string;
  
  // Information for downstream services
  readonly fraudScore: number;        // 0-100, helps shipping decide
}

Critical event design principles for choreography:

1. Include Enough Context: Consumers shouldn't need to call back to producers for more information. The event should contain everything needed for a consumer to make decisions. This reduces coupling and enables true autonomy.

2. Use Past Tense: Events represent facts that have already happened: OrderPlaced, not PlaceOrder; PaymentCompleted, not ProcessPayment. This semantic clarity prevents confusion between commands and events.

3. Design for Unknown Consumers: You don't know who will consume your events. Include information broadly useful, but never include sensitive data that shouldn't propagate (like raw credit card numbers).

4. Support Correlation: Include correlation IDs so consumers can trace events across the entire workflow. Include causation IDs to establish event lineage—which event caused this one?

5. Version Explicitly: Events are contracts. Include version numbers and design for backward compatibility. Consumers of v1.0 should still work when v1.1 is published.
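
As a rough illustration of principle 5, a consumer can accept any event whose major version it supports and ignore minor bumps. This sketch assumes a "major.minor" version string and an assumed policy in which minor versions only add optional fields; `parseVersion` and `canConsume` are hypothetical helper names, not part of any library.

```typescript
interface VersionedEvent {
  eventType: string;
  version: string; // assumed format: "major.minor", e.g. "1.1"
}

function parseVersion(version: string): { major: number; minor: number } {
  const match = /^(\d+)\.(\d+)$/.exec(version);
  if (!match) {
    throw new Error(`Unrecognized version: ${version}`);
  }
  return { major: Number(match[1]), minor: Number(match[2]) };
}

// Assumed policy: minor bumps are additive, major bumps may break consumers.
function canConsume(supportedMajor: number, event: VersionedEvent): boolean {
  return parseVersion(event.version).major === supportedMajor;
}

// A consumer written against v1.0 keeps accepting v1.1 but rejects v2.0.
const acceptsMinorBump = canConsume(1, { eventType: 'OrderPlaced', version: '1.1' });
const acceptsMajorBump = canConsume(1, { eventType: 'OrderPlaced', version: '2.0' });
```

Under this policy, a consumer should also ignore unknown fields rather than reject them, or additive changes will still break it.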

The Fat Event Trap

Including enough context doesn't mean including everything. Events that carry entire aggregates create coupling through shared data models and bloat message sizes. Include what consumers need, not what producers have. If you find events growing excessively large, you may be mixing concerns or designing too coarsely.

Implementation Patterns

Implementing choreography requires patterns that ensure reliable event delivery, proper event handling, and consistent state management. Let's examine the essential patterns that make choreography work in production.

Pattern 1: The Transactional Outbox

A critical challenge in choreography is ensuring that state changes and event publication happen atomically. If a service updates its database but fails before publishing the event, the system becomes inconsistent.

The Outbox Pattern solves this by writing events to an "outbox" table in the same database transaction as the state change. A separate process then publishes events from the outbox to the message broker.

Transactional Outbox Pattern
TypeScript
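import { v4 as uuid } from 'uuid'; // `uuid` npm package
// Database, MessageBroker, logger, and the domain types (Order, PlaceOrderCommand)
// are application-specific and assumed to exist.

// Outbox table schema (PostgreSQL)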
// CREATE TABLE outbox (
//   id UUID PRIMARY KEY,
//   aggregate_type VARCHAR(255) NOT NULL,
//   aggregate_id VARCHAR(255) NOT NULL,
//   event_type VARCHAR(255) NOT NULL,
//   payload JSONB NOT NULL,
//   created_at TIMESTAMP DEFAULT NOW(),
//   published_at TIMESTAMP NULL
// );
 
class OrderService {
  // The service never publishes directly; it only writes to the outbox table
  constructor(private readonly db: Database) {}
 
  async placeOrder(command: PlaceOrderCommand): Promise<Order> {
    // Single transaction ensures atomicity
    return this.db.transaction(async (tx) => {
      // 1. Create the order
      const order = Order.create(command);
      await tx.orders.insert(order);
      
      // 2. Write event to outbox in SAME transaction
      const event: OrderPlacedEvent = {
        eventId: uuid(),
        eventType: 'OrderPlaced',
        timestamp: new Date().toISOString(),
        version: '1.0',
        correlationId: command.correlationId,
        orderId: order.id,
        customerId: command.customerId,
        items: order.items,
        totalAmount: order.totalAmount,
        currency: order.currency,
        shippingAddress: command.shippingAddress,
        customerTier: command.customerTier,
        isFirstOrder: await this.isFirstOrder(tx, command.customerId),
      };
      
      await tx.outbox.insert({
        id: event.eventId,
        aggregateType: 'Order',
        aggregateId: order.id,
        eventType: event.eventType,
        payload: event,
        createdAt: new Date(),
        publishedAt: null,
      });
      
      return order;
    });
  }
}
 
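// Separate process: Outbox Publisher
// Publishing and marking are two separate steps, so a crash between them
// republishes the event on the next poll: delivery is at-least-once, not exactly-once.
class OutboxPublisher {
  constructor(
    private readonly db: Database,
    private readonly messageBroker: MessageBroker,
  ) {}

  async poll(): Promise<void> {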
    const events = await this.db.outbox.findMany({
      where: { publishedAt: null },
      orderBy: { createdAt: 'asc' },
      limit: 100,
    });
    
    for (const event of events) {
      try {
        await this.messageBroker.publish(
          event.eventType,
          event.payload
        );
        await this.db.outbox.update({
          where: { id: event.id },
          data: { publishedAt: new Date() },
        });
      } catch (error) {
        // Log and continue; will retry on next poll
        logger.error('Failed to publish event', { eventId: event.id, error });
      }
    }
  }
}

Pattern 2: Idempotent Consumers

In distributed systems, events may be delivered more than once. Network issues, retries, and at-least-once delivery semantics all contribute. Consumers must be idempotent—processing the same event multiple times produces the same result as processing it once.

Strategies for idempotency:

•Event ID Tracking: Store processed event IDs; check before processing
•Natural Idempotency: Design operations that are naturally idempotent (setting a value, not incrementing)
•Conditional Updates: Use database conditions that prevent duplicate effects
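
The second and third strategies can be sketched without a database. The example below is an in-memory stand-in: `InventoryLedger` and `recordShippingAddress` are hypothetical names, and the `Map` lookup plays the role that a conditional `UPDATE ... WHERE` or a unique constraint would play in a real store.

```typescript
type ReserveResult = 'reserved' | 'duplicate' | 'insufficient_stock';

class InventoryLedger {
  private available: number;
  private readonly reservedByOrder = new Map<string, number>();

  constructor(initialStock: number) {
    this.available = initialStock;
  }

  // Conditional update: stock is taken only if this order has no reservation yet,
  // so a redelivered event leaves the stock level unchanged.
  reserve(orderId: string, quantity: number): ReserveResult {
    if (this.reservedByOrder.has(orderId)) return 'duplicate';
    if (quantity > this.available) return 'insufficient_stock';
    this.available -= quantity;
    this.reservedByOrder.set(orderId, quantity);
    return 'reserved';
  }

  get stockRemaining(): number {
    return this.available;
  }
}

// Natural idempotency: assigning an absolute value gives the same end state
// no matter how many times the same event is applied.
function recordShippingAddress(
  addresses: Map<string, string>,
  orderId: string,
  address: string,
): void {
  addresses.set(orderId, address);
}
```

A decrement without the guard (`available -= quantity` on every delivery) is the non-idempotent version of the same operation: each duplicate would remove stock again.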

Idempotent Event Consumer
TypeScript
class PaymentConsumer {
  constructor(
    private readonly db: Database,
    private readonly paymentGateway: PaymentGateway,
  ) {}
 
  async handleOrderPlaced(event: OrderPlacedEvent): Promise<void> {
    // Idempotency check: Have we processed this event?
    const processed = await this.db.processedEvents.findUnique({
      where: { eventId: event.eventId }
    });
    
    if (processed) {
      logger.info('Event already processed, skipping', { 
        eventId: event.eventId 
      });
      return;
    }
    
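    // Record the payment, the processed-event marker, and the outgoing event atomically
    await this.db.transaction(async (tx) => {
      // Re-check inside the transaction. This narrows the race window but does not
      // close it: a unique constraint on processedEvents.eventId is what makes a
      // concurrent duplicate fail at insert time and roll back.
      const exists = await tx.processedEvents.findUnique({
        where: { eventId: event.eventId }
      });
      if (exists) return;
      
      // Process the payment. The gateway call is not part of the database
      // transaction, so the idempotency key is what makes a retry after rollback safe.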
      const payment = await this.paymentGateway.charge({
        orderId: event.orderId,
        amount: event.totalAmount,
        customerId: event.customerId,
        idempotencyKey: event.eventId, // Gateway-level idempotency
      });
      
      // Store payment record
      await tx.payments.create({
        orderId: event.orderId,
        paymentId: payment.id,
        amount: event.totalAmount,
        status: payment.status,
      });
      
      // Mark event as processed
      await tx.processedEvents.create({
        eventId: event.eventId,
        processedAt: new Date(),
        processor: 'PaymentConsumer',
      });
      
      // Write outgoing event to outbox
      if (payment.status === 'completed') {
        await tx.outbox.insert({
          id: uuid(),
          aggregateType: 'Payment',
          aggregateId: payment.id,
          eventType: 'PaymentCompleted',
          payload: this.buildPaymentCompletedEvent(event, payment),
        });
      }
    });
  }
}

Idempotency Key Propagation

Notice how the event ID becomes the idempotency key for the payment gateway. This propagation of idempotency through the entire chain—from initial event through external API calls—is crucial for reliable choreography. Always pass correlation and idempotency information through the entire workflow.
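
A small helper can make this propagation hard to forget. The sketch below is one possible shape, not a prescribed API: `causedBy` and `idempotencyKeyFor` are hypothetical names, the ID generator is injected so the example stays deterministic, and prefixing the key with an operation name is an added assumption (the handler above passes the raw event ID).

```typescript
interface EventMetadata {
  eventId: string;
  correlationId: string;
  causationId?: string;
}

// Hypothetical ID source; production code would typically use a UUID library.
type IdGenerator = () => string;

// Derive metadata for an event emitted in reaction to `parent`.
function causedBy(parent: EventMetadata, newId: IdGenerator): EventMetadata {
  return {
    eventId: newId(),
    correlationId: parent.correlationId, // unchanged for the whole workflow
    causationId: parent.eventId,         // the direct parent event
  };
}

// Build an idempotency key for an outbound call triggered by `trigger`.
// Assumption: one key per (operation, triggering event) pair.
function idempotencyKeyFor(trigger: EventMetadata, operation: string): string {
  return `${operation}:${trigger.eventId}`;
}

// Usage with a counter-based generator.
let nextId = 0;
const sequentialId: IdGenerator = () => `evt-${++nextId}`;

const orderPlacedMeta: EventMetadata = { eventId: 'evt-0', correlationId: 'corr-42' };
const paymentCompletedMeta = causedBy(orderPlacedMeta, sequentialId);
const chargeKey = idempotencyKeyFor(orderPlacedMeta, 'charge');
```

Redelivering the same OrderPlaced event yields the same `chargeKey`, which is what lets a gateway that supports idempotency keys recognize the repeat.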

Error Handling Without Central Control

In choreography, there's no central controller to catch exceptions and coordinate recovery. Each service must handle its own failures and communicate them through events. This distributed error handling is both a challenge and a strength.

The Principle of Compensating Events:

When something goes wrong, services emit failure events that trigger compensating actions in other services. Instead of rolling back a distributed transaction (which choreography doesn't support), you emit events that undo or compensate for previous work.

Example: Payment Failure in Order Processing:

Compensating Events Flow
TypeScript
// Happy path:
// OrderPlaced → PaymentCompleted → InventoryReserved → ShipmentScheduled
 
// Payment fails:
// OrderPlaced → PaymentFailed
 
// Inventory reservation fails (after payment succeeded):
// OrderPlaced → PaymentCompleted → InventoryReservationFailed
//                                        ↓
// PaymentService reacts to InventoryReservationFailed:
//                               → PaymentRefunded
//                                        ↓
// OrderService reacts to PaymentRefunded:
//                               → OrderCancelled
 
class PaymentService {
  async handleInventoryReservationFailed(
    event: InventoryReservationFailedEvent
  ): Promise<void> {
    // Find the payment for this order
    const payment = await this.db.payments.findUnique({
      where: { orderId: event.orderId }
    });
    
    if (!payment || payment.status !== 'completed') {
      return; // Nothing to refund
    }
    
    await this.db.transaction(async (tx) => {
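      // Process refund. The triggering event ID is passed as the idempotency key
      // (assuming the gateway accepts one for refunds, as it does for charges),
      // so a redelivered failure event cannot refund twice.
      const refund = await this.paymentGateway.refund({
        paymentId: payment.paymentId,
        amount: payment.amount,
        reason: 'inventory_unavailable',
        idempotencyKey: event.eventId,
      });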
      
      // Update payment status
      await tx.payments.update({
        where: { paymentId: payment.paymentId },
        data: { status: 'refunded', refundId: refund.id },
      });
      
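      // Emit compensating event; one ID serves as both outbox row ID and event ID,
      // matching the OrderPlaced example in the outbox pattern
      const refundedEventId = uuid();
      await tx.outbox.insert({
        id: refundedEventId,
        aggregateType: 'Payment',
        aggregateId: payment.paymentId,
        eventType: 'PaymentRefunded',
        payload: {
          eventId: refundedEventId,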
          eventType: 'PaymentRefunded',
          timestamp: new Date().toISOString(),
          version: '1.0',
          correlationId: event.correlationId,
          causationId: event.eventId,
          orderId: event.orderId,
          paymentId: payment.paymentId,
          refundAmount: payment.amount,
          reason: 'inventory_unavailable',
        },
      });
    });
  }
}
 
class OrderService {
  async handlePaymentRefunded(event: PaymentRefundedEvent): Promise<void> {
    const order = await this.db.orders.findUnique({
      where: { orderId: event.orderId }
    });
    
    if (!order || order.status === 'cancelled') {
      return; // Already cancelled or doesn't exist
    }
    
    await this.db.transaction(async (tx) => {
      await tx.orders.update({
        where: { orderId: event.orderId },
        data: { 
          status: 'cancelled',
          cancelledAt: new Date(),
          cancelReason: event.reason,
        },
      });
      
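      // One ID serves as both outbox row ID and event ID
      const cancelledEventId = uuid();
      await tx.outbox.insert({
        id: cancelledEventId,
        aggregateType: 'Order',
        aggregateId: event.orderId,
        eventType: 'OrderCancelled',
        payload: {
          eventId: cancelledEventId,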
          eventType: 'OrderCancelled',
          timestamp: new Date().toISOString(),
          version: '1.0',
          correlationId: event.correlationId,
          causationId: event.eventId,
          orderId: event.orderId,
          reason: event.reason,
        },
      });
    });
  }
}

Error Handling Best Practices

•Emit Failure Events Explicitly: Don't just log failures; emit events so other services can react. PaymentFailed, InventoryReservationFailed, ShipmentDelayed are all valid business events.
•Design Compensation Logic Carefully: Compensating actions may not perfectly mirror the original. A refund might not equal the charge (fees, partial fulfillment). Model this explicitly.
•Handle Event Ordering Carefully: Failure events might arrive before or after related success events due to network conditions. Wall-clock timestamps from different services are subject to clock skew, so rely on logical ordering (per-aggregate sequence numbers or causation IDs) and treat timestamps as a tiebreaker at most.
•Consider Dead Letter Queues: Events that fail processing repeatedly should move to a DLQ for manual intervention, not block the queue forever.
•Monitor Compensation Chains: Track when compensations occur. Frequent compensations might indicate design problems, not just business exceptions.
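
The dead letter queue practice can be sketched as a bounded retry loop. This is a simplified, synchronous illustration: `RetryingConsumer` is a hypothetical class, the dead letter "queue" is just an array, and real brokers usually add backoff between redeliveries, which is omitted here.

```typescript
interface DeadLetter<T> {
  event: T;
  attempts: number;
  lastError: string;
}

class RetryingConsumer<T> {
  // Events that exhausted their attempts, kept for manual inspection.
  readonly deadLetters: DeadLetter<T>[] = [];
  private readonly handler: (event: T) => void;
  private readonly maxAttempts: number;

  constructor(handler: (event: T) => void, maxAttempts: number) {
    this.handler = handler;
    this.maxAttempts = maxAttempts;
  }

  // Returns true if the event was handled, false if it was dead-lettered.
  deliver(event: T): boolean {
    let lastError = '';
    for (let attempt = 1; attempt <= this.maxAttempts; attempt++) {
      try {
        this.handler(event);
        return true;
      } catch (error) {
        lastError = error instanceof Error ? error.message : String(error);
      }
    }
    // Give up on this event so later events are not blocked behind it.
    this.deadLetters.push({ event, attempts: this.maxAttempts, lastError });
    return false;
  }
}
```

Because the handler may run several times for one event, this only works safely when the handler is idempotent, as described in Pattern 2.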

The Compensation Cascade Risk

Be careful with compensation chains that create their own failure events. If refunding payment fails, you now have a PaymentRefundFailed event that might trigger more compensations. Design circuit breakers and terminal states to prevent infinite compensation loops.
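
One way to bound a compensation chain is to track it as a small state machine with terminal states. The sketch below is an assumption-laden illustration: the state names, the `MAX_REFUND_ATTEMPTS` limit of 3, and the `escalate_to_operator` action are all invented for this example.

```typescript
type RefundState = 'pending' | 'refunded' | 'manual_review';
type RefundOutcome = 'PaymentRefunded' | 'PaymentRefundFailed';
type NextAction = 'none' | 'retry_refund' | 'escalate_to_operator';

interface RefundTracker {
  state: RefundState;
  failedAttempts: number;
}

const MAX_REFUND_ATTEMPTS = 3; // assumed limit for this sketch

function onRefundOutcome(
  tracker: RefundTracker,
  outcome: RefundOutcome,
): { tracker: RefundTracker; action: NextAction } {
  // Terminal states absorb late or duplicate outcome events,
  // so no further compensation can be triggered from them.
  if (tracker.state !== 'pending') {
    return { tracker, action: 'none' };
  }
  if (outcome === 'PaymentRefunded') {
    return { tracker: { ...tracker, state: 'refunded' }, action: 'none' };
  }
  const failedAttempts = tracker.failedAttempts + 1;
  if (failedAttempts >= MAX_REFUND_ATTEMPTS) {
    // Stop retrying automatically and hand the case to a human.
    return {
      tracker: { state: 'manual_review', failedAttempts },
      action: 'escalate_to_operator',
    };
  }
  return { tracker: { state: 'pending', failedAttempts }, action: 'retry_refund' };
}
```

The important property is that every path ends in `refunded` or `manual_review` after a bounded number of steps, whatever sequence of failure events arrives.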

Visibility and Debugging

The greatest challenge with choreography is visibility. When no single service knows the complete workflow, how do you understand what's happening? How do you debug a failed order when the relevant events span five services?

Distributed Tracing is Essential:

Correlation IDs aren't just nice-to-have—they're mandatory. Every event must carry a correlation ID that traces back to the initiating action. Tools like Jaeger, Zipkin, or cloud-native equivalents (AWS X-Ray, Google Cloud Trace) can visualize the entire flow.

The Correlation Graph:

Build the ability to reconstruct the complete event chain for any workflow instance. Given an order ID or correlation ID, you should be able to see every event in the chain, in order, with timing information.

Workflow Visualization Query
SQL
-- Query to reconstruct event chain for an order
-- This assumes events are stored in an event store or log
 
WITH RECURSIVE event_chain AS (
    -- Start with the initial event
    SELECT 
        e.event_id,
        e.event_type,
        e.timestamp,
        e.correlation_id,
        e.causation_id,
        e.service_name,
        e.payload,
        1 as depth
    FROM events e
    WHERE e.correlation_id = :correlationId
      AND e.causation_id IS NULL
    
    UNION ALL
    
    -- Recursively find caused events
    SELECT 
        e.event_id,
        e.event_type,
        e.timestamp,
        e.correlation_id,
        e.causation_id,
        e.service_name,
        e.payload,
        ec.depth + 1
    FROM events e
    JOIN event_chain ec ON e.causation_id = ec.event_id
)
SELECT 
    event_type,
    service_name,
    timestamp,
    depth,
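    -- EXTRACT(MILLISECONDS ...) returns only the seconds field of an interval,
    -- so it wraps for gaps of a minute or more; EPOCH gives the full duration
    EXTRACT(EPOCH FROM 
        timestamp - LAG(timestamp) OVER (ORDER BY depth, timestamp)
    ) * 1000 as ms_since_previous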
FROM event_chain
ORDER BY depth, timestamp;
 
-- Example output:
-- | event_type          | service_name | timestamp           | depth | ms_since_previous |
-- |---------------------|--------------|---------------------|-------|-------------------|
-- | OrderPlaced         | orders       | 2024-01-15 10:00:00 | 1     | NULL              |
-- | PaymentCompleted    | payments     | 2024-01-15 10:00:02 | 2     | 2000              |
-- | InventoryReserved   | inventory    | 2024-01-15 10:00:03 | 3     | 1000              |
-- | ShipmentScheduled   | shipping     | 2024-01-15 10:00:05 | 4     | 2000              |

Process Mining and Visualization:

For complex choreographies, consider process mining tools that can reconstruct workflow patterns from event logs. These tools visualize the actual flows in your system, which may differ from your intended design.

Monitoring Points:

Key Monitoring Metrics for Choreography

•Event Throughput: Events produced/consumed per service per second
•Event Latency: Time between correlated events (e.g., OrderPlaced to ShipmentScheduled)
•Consumer Lag: How far behind consumers are from producers
•Error Rates: Failed event processing, DLQ depth
•Completion Rates: What percentage of initiated workflows reach expected end states?
•Compensation Frequency: How often are compensating events triggered?
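
Several of these metrics can be derived directly from recorded events. The following sketch computes completion rate, mean end-to-end latency, and compensation count from an in-memory list; the `RecordedEvent` shape and the `computeWorkflowMetrics` function are hypothetical, and a production system would more likely compute these in a metrics pipeline or with queries against the event store.

```typescript
interface RecordedEvent {
  correlationId: string;
  eventType: string;
  timestampMs: number; // epoch milliseconds
}

interface WorkflowMetrics {
  started: number;
  completed: number;
  completionRate: number;       // completed / started, 0 when nothing started
  meanLatencyMs: number | null; // null when nothing completed
  compensations: number;
}

function computeWorkflowMetrics(
  events: RecordedEvent[],
  startType: string,
  endType: string,
  compensationTypes: string[],
): WorkflowMetrics {
  const startedAt = new Map<string, number>();
  const endedAt = new Map<string, number>();
  let compensations = 0;

  for (const event of events) {
    if (event.eventType === startType) startedAt.set(event.correlationId, event.timestampMs);
    if (event.eventType === endType) endedAt.set(event.correlationId, event.timestampMs);
    if (compensationTypes.includes(event.eventType)) compensations += 1;
  }

  // Latency is measured only for workflows that have both a start and an end event.
  const latencies: number[] = [];
  startedAt.forEach((startMs, correlationId) => {
    const endMs = endedAt.get(correlationId);
    if (endMs !== undefined) latencies.push(endMs - startMs);
  });

  const total = latencies.reduce((sum, ms) => sum + ms, 0);
  return {
    started: startedAt.size,
    completed: latencies.length,
    completionRate: startedAt.size === 0 ? 0 : latencies.length / startedAt.size,
    meanLatencyMs: latencies.length === 0 ? null : total / latencies.length,
    compensations,
  };
}
```

A workflow that ends in a compensated state (for example, OrderCancelled) counts as not completed here; whether that is the right definition depends on which end states you consider expected.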

The Observability Investment

Choreography requires significant investment in observability tooling. Without it, you're flying blind. Budget time and resources for distributed tracing, event stores for replay and debugging, and dashboards that show workflow health. This isn't optional—it's the cost of doing choreography correctly.

Testing Choreographed Workflows

Testing choreographed systems requires strategies that span multiple services without introducing tight coupling. The testing pyramid remains relevant, but each level takes on new characteristics.

Unit Testing: Focus on Event Handling Logic

Unit tests verify that a service correctly handles incoming events and produces correct outgoing events. Mock the message infrastructure; test the business logic.

Unit Testing Event Handlers
TypeScript
describe('PaymentService', () => {
  let service: PaymentService;
  let mockDb: MockDatabase;
  let mockPaymentGateway: MockPaymentGateway;
  let capturedOutboxEvents: OutboxEntry[];
 
  beforeEach(() => {
    mockDb = createMockDatabase();
    mockPaymentGateway = createMockPaymentGateway();
    capturedOutboxEvents = [];
    
    // Capture events written to outbox
    mockDb.outbox.insert = jest.fn((entry) => {
      capturedOutboxEvents.push(entry);
      return Promise.resolve(entry);
    });
    
    service = new PaymentService(mockDb, mockPaymentGateway);
  });
 
  describe('handleOrderPlaced', () => {
    it('should process payment and emit PaymentCompleted', async () => {
      const orderPlacedEvent = createOrderPlacedEvent({
        orderId: 'order-123',
        totalAmount: { amount: 9999, currency: 'USD' },
      });
      
      mockPaymentGateway.charge.mockResolvedValue({
        id: 'payment-456',
        status: 'completed',
        transactionRef: 'tx-789',
      });
      
      await service.handleOrderPlaced(orderPlacedEvent);
      
      // Verify payment was processed
      expect(mockPaymentGateway.charge).toHaveBeenCalledWith(
        expect.objectContaining({
          orderId: 'order-123',
          amount: { amount: 9999, currency: 'USD' },
        })
      );
      
      // Verify correct event was emitted
      expect(capturedOutboxEvents).toHaveLength(1);
      expect(capturedOutboxEvents[0].eventType).toBe('PaymentCompleted');
      expect(capturedOutboxEvents[0].payload).toMatchObject({
        orderId: 'order-123',
        paymentId: 'payment-456',
        correlationId: orderPlacedEvent.correlationId,
        causationId: orderPlacedEvent.eventId,
      });
    });
    
    it('should emit PaymentFailed when gateway fails', async () => {
      const orderPlacedEvent = createOrderPlacedEvent({
        orderId: 'order-123',
      });
      
      mockPaymentGateway.charge.mockRejectedValue(
        new PaymentDeclinedError('Insufficient funds')
      );
      
      await service.handleOrderPlaced(orderPlacedEvent);
      
      expect(capturedOutboxEvents[0].eventType).toBe('PaymentFailed');
      expect(capturedOutboxEvents[0].payload.reason).toBe('insufficient_funds');
    });
    
    it('should be idempotent for duplicate events', async () => {
      const orderPlacedEvent = createOrderPlacedEvent({
        orderId: 'order-123',
        eventId: 'event-duplicate',
      });
      
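      // The gateway must succeed on the first delivery so the handler runs to completion
      mockPaymentGateway.charge.mockResolvedValue({
        id: 'payment-456',
        status: 'completed',
        transactionRef: 'tx-789',
      });
      
      // First processing
      await service.handleOrderPlaced(orderPlacedEvent);
      const firstCallCount = mockPaymentGateway.charge.mock.calls.length;
      expect(firstCallCount).toBe(1);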
      
      // Mark as processed in mock
      mockDb.processedEvents.findUnique.mockResolvedValue({
        eventId: 'event-duplicate',
      });
      
      // Second processing of same event
      await service.handleOrderPlaced(orderPlacedEvent);
      
      // Should not process again
      expect(mockPaymentGateway.charge).toHaveBeenCalledTimes(firstCallCount);
    });
  });
});

Integration Testing: Consumer Contract Testing

Contract testing verifies that services communicate correctly through events. Each consumer maintains contracts describing what events it expects; producers verify they satisfy those contracts.

Consumer Contract Testing with Pact
TypeScript
// Consumer defines expected event format
// payment-service/pacts/order-service-messages.ts
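import {
  MessageConsumerPact,
  Matchers,
  synchronousBodyHandler,
} from '@pact-foundation/pact';

// like and eachLike are Pact matchers: they match on type and shape, not exact value
const { like, eachLike } = Matchers;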
 
describe('Payment Service - Order Events Contract', () => {
  const messagePact = new MessageConsumerPact({
    consumer: 'PaymentService',
    provider: 'OrderService',
    dir: './pacts',
  });
 
  it('consumes OrderPlaced events', () => {
    return messagePact
      .expectsToReceive('an OrderPlaced event')
      .withContent({
        eventType: 'OrderPlaced',
        eventId: like('uuid-string'),
        timestamp: like('2024-01-15T10:00:00Z'),
        version: '1.0',
        correlationId: like('correlation-id'),
        orderId: like('order-id'),
        customerId: like('customer-id'),
        totalAmount: {
          amount: like(9999),
          currency: like('USD'),
        },
        items: eachLike({
          productId: like('product-id'),
          quantity: like(1),
          unitPrice: like(1999),
        }),
      })
      .withMetadata({ contentType: 'application/json' })
      .verify(synchronousBodyHandler(async (event: OrderPlacedEvent) => {
        // Verify our consumer can handle this event shape
        const handler = new PaymentEventHandler();
        await expect(handler.validateEvent(event)).resolves.toBe(true);
      }));
  });
});
 
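// Producer verifies it satisfies consumer contracts
// order-service/tests/pact-verification.test.ts
import { MessageProviderPact } from '@pact-foundation/pact';

describe('Order Service - Pact Verification', () => {
  it('satisfies PaymentService expectations for OrderPlaced', async () => {
    const verifier = new MessageProviderPact({
      provider: 'OrderService',
      // Pact names the generated file <consumer>-<provider>.json; adjust the path
      // to wherever the consumer's pact is shared (or fetch it from a Pact Broker)
      pactUrls: ['./pacts/PaymentService-OrderService.json'],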
      messageProviders: {
        'an OrderPlaced event': async () => {
          const order = createTestOrder();
          return OrderPlacedEvent.create(order, 'test-correlation-id');
        },
      },
    });
    
    await verifier.verify();
  });
});

End-to-End Testing Strategy

For end-to-end tests, deploy all services and verify complete workflows: place an order and verify that a shipment is eventually scheduled. Use the event store to validate intermediate states. These tests are expensive but essential for verifying the choreography works as a whole.
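
Validating intermediate states against the event store can be as simple as checking that the expected event types appear, in order, for a given correlation ID. The sketch below assumes a hypothetical `StoredEvent` shape with a per-store `sequence` number and a hypothetical `verifyWorkflow` helper; waiting for the workflow to settle (polling with a timeout) is left out.

```typescript
interface StoredEvent {
  correlationId: string;
  eventType: string;
  sequence: number; // assumed monotonically increasing position in the store
}

interface WorkflowCheck {
  ok: boolean;
  observed: string[]; // event types seen for this workflow, in store order
  missing: string[];  // expected steps that were not reached
}

// Checks that `expected` appears as an ordered subsequence of the observed events.
// Extra events in between (for example, notifications) are tolerated.
function verifyWorkflow(
  events: StoredEvent[],
  correlationId: string,
  expected: string[],
): WorkflowCheck {
  const observed = events
    .filter((e) => e.correlationId === correlationId)
    .sort((a, b) => a.sequence - b.sequence)
    .map((e) => e.eventType);

  let cursor = 0;
  for (const eventType of observed) {
    if (cursor < expected.length && eventType === expected[cursor]) {
      cursor += 1;
    }
  }

  const missing = expected.slice(cursor);
  return { ok: missing.length === 0, observed, missing };
}
```

When a check fails, the `missing` list points at the first step the workflow never reached, which is usually the service to investigate first.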

Summary: When Choreography Shines

We've explored choreography as a coordination pattern for event-driven architectures. Let's consolidate the key insights:

Key Takeaways

•Choreography is decentralized coordination — No single service controls the workflow; it emerges from services reacting to events.
•Event design is critical — Events are contracts between autonomous services; include context, version explicitly, and design for unknown consumers.
•Transactional outbox ensures reliability — Write events to a database outbox in the same transaction as state changes, so an event is recorded if and only if the change commits; a separate publisher then delivers it at least once.
•Idempotency is mandatory — Every event handler must safely handle duplicate events; use event ID tracking and idempotent operations.
•Error handling requires compensation — Without central control, services emit failure events that trigger compensating actions in other services.
•Observability is non-negotiable — Correlation IDs, distributed tracing, and event chain visualization are essential for debugging choreographed systems.

Choreography excels when:

•Services are owned by different teams who want autonomy
•New services need to join the workflow without changes to existing ones
•Loose coupling is prioritized over end-to-end visibility
•The core sequence of steps is well understood and changes rarely
•Scale and resilience are more important than strict consistency

In the next page, we'll explore the alternative: Orchestration, where a central service explicitly controls the workflow. You'll see how the same order processing workflow looks with centralized control, understand the tradeoffs, and learn when each approach is appropriate.

Page Complete

You now understand choreography as a coordination pattern: its philosophy, implementation requirements, error handling approach, and testing strategies. You can design choreographed workflows where services react autonomously to events, creating loosely coupled systems that scale independently. Next, we'll examine orchestration—the centralized alternative.


Choreography vs Traditional Request-Response
Characteristic	Request-Response	Choreography
Control Flow	Caller knows and controls the sequence	No single entity controls the sequence
Coupling	Caller coupled to callees	Services coupled only to events
Synchrony	Typically synchronous	Inherently asynchronous
Failure Handling	Caller handles callee failures	Each service handles its own failures
Scalability	Limited by calling service capacity	Each service scales independently
Visibility	Caller sees the entire flow	No single point sees the entire flow

An illustrative example:

Consider an e-commerce order process. In a choreographed system:

1. Order Service receives a purchase request and emits OrderPlaced
2. Payment Service hears OrderPlaced, processes payment, emits PaymentCompleted
3. Inventory Service hears PaymentCompleted, reserves items, emits InventoryReserved
4. Shipping Service hears InventoryReserved, schedules shipment, emits ShipmentScheduled
5. Notification Service hears ShipmentScheduled, sends confirmation email
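
The chain above can be sketched with a tiny in-memory event bus. This is a toy under stated assumptions: `InMemoryBus`, `ChoreoEvent`, and the synchronous delivery are invented for illustration, and a real system would use a durable, asynchronous broker. What it shows is that each service knows only the event it listens for and the event it emits.

```typescript
// Minimal in-memory event bus (illustrative only; not a durable broker).
type ChoreoEvent = { type: string; orderId: string };
type ChoreoHandler = (e: ChoreoEvent) => void;

class InMemoryBus {
  private readonly handlers = new Map<string, ChoreoHandler[]>();
  readonly log: string[] = [];

  subscribe(type: string, handler: ChoreoHandler): void {
    const existing = this.handlers.get(type) ?? [];
    this.handlers.set(type, [...existing, handler]);
  }

  publish(e: ChoreoEvent): void {
    this.log.push(e.type);
    for (const handler of this.handlers.get(e.type) ?? []) {
      handler(e);
    }
  }
}

const bus = new InMemoryBus();

// Each pair is one service: [event it reacts to, event it emits].
const reactions: Array<[string, string]> = [
  ['OrderPlaced', 'PaymentCompleted'],        // Payment Service
  ['PaymentCompleted', 'InventoryReserved'],  // Inventory Service
  ['InventoryReserved', 'ShipmentScheduled'], // Shipping Service
];
for (const [listensTo, emits] of reactions) {
  bus.subscribe(listensTo, (e) => bus.publish({ type: emits, orderId: e.orderId }));
}

// Notification Service only consumes; it emits nothing further.
const sentEmails: string[] = [];
bus.subscribe('ShipmentScheduled', (e) => {
  sentEmails.push(e.orderId);
});

// Order Service starts the workflow; no service calls another directly.
bus.publish({ type: 'OrderPlaced', orderId: 'order-123' });
console.log(bus.log.join(' -> '));
```

Adding a sixth participant, for example an analytics service, would only require one more `subscribe` call; none of the existing handlers would change.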

The Essence of Choreography

Philosophical Foundations

Core Principles of Choreography

•Event Sovereignty — Events are the single source of truth for what happened. Services trust events over queries.
•Domain Ownership — Each service owns its domain completely, making all decisions within that domain autonomously.
•Temporal Decoupling — Producer and consumer don't need to be available simultaneously; events are durable.
•Evolutionary Architecture — New services can join the dance by subscribing to events; existing services need not change.
•Eventual Consistency — The system converges to consistency over time rather than maintaining it synchronously.

The Conway's Law Alignment

Event Design for Choreography

Domain Events vs Integration Events:

Domain events capture something that happened within a bounded context: OrderPlaced, PaymentFailed, InventoryReserved. They express business facts, not technical operations.

Integration events are designed explicitly for cross-service communication. They may be derived from domain events but are crafted for external consumption—versioned, documented, and stable.

In choreography, both types appear, but integration events require particular care. They become the interface between autonomous services.
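
A minimal sketch of that distinction follows, assuming hypothetical shapes for both event kinds: the internal domain event is free to change with the Order model, while the published integration event exposes only stable, documented fields. The field names and the `toIntegrationEvent` mapper are illustrative assumptions.

```typescript
// Internal domain event: may change whenever the Order model changes (hypothetical shape).
interface OrderPlacedDomainEvent {
  orderId: string;
  customerId: string;
  lines: { sku: string; qty: number; unitPriceCents: number }[];
  internalRiskFlags: string[]; // internal detail that should not leak to other services
}

// Published integration event: versioned and stable (hypothetical shape).
interface OrderPlacedIntegrationEvent {
  eventType: 'OrderPlaced';
  version: '1.0';
  orderId: string;
  customerId: string;
  items: { productId: string; quantity: number; unitPrice: number }[];
  totalAmount: number; // minor units (cents)
}

// Maps the internal event to the public contract, dropping internal-only fields.
function toIntegrationEvent(d: OrderPlacedDomainEvent): OrderPlacedIntegrationEvent {
  const items = d.lines.map((l) => ({
    productId: l.sku,
    quantity: l.qty,
    unitPrice: l.unitPriceCents,
  }));
  const totalAmount = items.reduce((sum, i) => sum + i.quantity * i.unitPrice, 0);
  return {
    eventType: 'OrderPlaced',
    version: '1.0',
    orderId: d.orderId,
    customerId: d.customerId,
    items,
    totalAmount,
  };
}

const publishedEvent = toIntegrationEvent({
  orderId: 'order-123',
  customerId: 'customer-9',
  lines: [{ sku: 'product-1', qty: 2, unitPriceCents: 1999 }],
  internalRiskFlags: ['manual-review'],
});
console.log(publishedEvent.totalAmount); // 3998
```

Keeping the mapping in one function gives the producing team a single place to review whenever the internal model changes, so the public contract changes only deliberately.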

Well-Designed Choreography Events
// Events should be self-describing, immutable records of facts
interface OrderPlacedEvent {
  // Metadata
  readonly eventId: string;          // Unique identifier for idempotency
  readonly eventType: 'OrderPlaced';
  readonly timestamp: string;         // ISO 8601
  readonly version: '1.0';           // Schema version
  readonly correlationId: string;    // For tracing across services
  readonly causationId?: string;     // What event caused this one
  
  // Business payload
  readonly orderId: string;
  readonly customerId: string;
  readonly items: readonly OrderItem[];
  readonly totalAmount: Money;
  readonly currency: string;
  readonly shippingAddress: Address;
  
  // Context for consumers
  readonly customerTier: 'standard' | 'premium' | 'vip';
  readonly isFirstOrder: boolean;
}
 
interface PaymentCompletedEvent {
  readonly eventId: string;
  readonly eventType: 'PaymentCompleted';
  readonly timestamp: string;
  readonly version: '1.0';
  readonly correlationId: string;
  readonly causationId: string;       // Links to OrderPlaced
  
  readonly orderId: string;
  readonly paymentId: string;
  readonly amount: Money;
  readonly paymentMethod: 'card' | 'bank_transfer' | 'wallet';
  readonly transactionReference: string;
  
  // Information for downstream services
  readonly fraudScore: number;        // 0-100, helps shipping decide
}
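
The `correlationId` and `causationId` fields in these interfaces are easiest to keep consistent when every handler derives the metadata of its outgoing event from the event it is handling. The helpers below are a sketch under stated assumptions: `startWorkflow`, `causedBy`, and the counter-based ID generator are hypothetical, and real code would use UUIDs.

```typescript
interface EventMetadata {
  eventId: string;
  correlationId: string;
  causationId?: string;
}

let nextEventNumber = 0;
// Stand-in ID generator so the sketch runs without dependencies; use UUIDs in practice.
const newEventId = (): string => `evt-${++nextEventNumber}`;

// First event in a workflow: starts a new correlation and has no cause.
function startWorkflow(correlationId: string): EventMetadata {
  return { eventId: newEventId(), correlationId };
}

// Any later event: same correlation, caused by the event being handled.
function causedBy(handled: EventMetadata): EventMetadata {
  return {
    eventId: newEventId(),
    correlationId: handled.correlationId,
    causationId: handled.eventId,
  };
}

const orderPlacedMeta = startWorkflow('corr-42');
const paymentCompletedMeta = causedBy(orderPlacedMeta);
const inventoryReservedMeta = causedBy(paymentCompletedMeta);
console.log(inventoryReservedMeta);
```

The correlation ID stays constant for the whole workflow, while each causation ID points one step back, which is what lets a later query rebuild the chain.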

Critical event design principles for choreography:

4. Support Correlation: Include correlation IDs so consumers can trace events across the entire workflow. Include causation IDs to establish event lineage—which event caused this one?

5. Version Explicitly: Events are contracts. Include version numbers and design for backward compatibility. Consumers of v1.0 should still work when v1.1 is published.
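
Backward compatibility is mostly a consumer-side discipline: read only the fields you need, ignore the rest, and reject only versions you cannot interpret. The sketch below assumes a `major.minor` version scheme in which minor versions only add fields; `readOrderPlaced` and `OrderPlacedV1View` are hypothetical names.

```typescript
// The subset of OrderPlaced that this consumer actually needs (v1.x).
interface OrderPlacedV1View {
  orderId: string;
  totalAmount: { amount: number; currency: string };
}

// Accepts any 1.x event, ignores unknown fields, rejects other major versions.
function readOrderPlaced(raw: unknown): OrderPlacedV1View {
  if (typeof raw !== 'object' || raw === null) {
    throw new Error('event must be an object');
  }
  const fields = raw as Record<string, unknown>;
  const version = String(fields['version'] ?? '');
  if (version.split('.')[0] !== '1') {
    throw new Error(`unsupported OrderPlaced version: ${version}`);
  }
  const orderId = fields['orderId'];
  const total = (fields['totalAmount'] ?? {}) as Record<string, unknown>;
  const amount = total['amount'];
  const currency = total['currency'];
  if (typeof orderId !== 'string' || typeof amount !== 'number' || typeof currency !== 'string') {
    throw new Error('OrderPlaced is missing required fields');
  }
  return { orderId, totalAmount: { amount, currency } };
}

// A v1.1 event with a field this consumer has never heard of.
const v11Event = {
  eventType: 'OrderPlaced',
  version: '1.1',
  orderId: 'order-123',
  totalAmount: { amount: 9999, currency: 'USD' },
  giftWrap: true,
};
const orderView = readOrderPlaced(v11Event);
console.log(orderView.orderId, orderView.totalAmount.amount);
```

The unknown `giftWrap` field is simply not copied into the view, so a producer can publish v1.1 without coordinating a release with this consumer.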

The Fat Event Trap

Implementation Patterns

Pattern 1: The Transactional Outbox

Transactional Outbox Pattern
TypeScript
// Outbox table schema (PostgreSQL)
// CREATE TABLE outbox (
//   id UUID PRIMARY KEY,
//   aggregate_type VARCHAR(255) NOT NULL,
//   aggregate_id VARCHAR(255) NOT NULL,
//   event_type VARCHAR(255) NOT NULL,
//   payload JSONB NOT NULL,
//   created_at TIMESTAMP DEFAULT NOW(),
//   published_at TIMESTAMP NULL
// );
 
class OrderService {
  constructor(
    private readonly db: Database,
    private readonly eventPublisher: EventPublisher // For polling, not direct use
  ) {}
 
  async placeOrder(command: PlaceOrderCommand): Promise<Order> {
    // Single transaction ensures atomicity
    return this.db.transaction(async (tx) => {
      // 1. Create the order
      const order = Order.create(command);
      await tx.orders.insert(order);
      
      // 2. Write event to outbox in SAME transaction
      const event: OrderPlacedEvent = {
        eventId: uuid(),
        eventType: 'OrderPlaced',
        timestamp: new Date().toISOString(),
        version: '1.0',
        correlationId: command.correlationId,
        orderId: order.id,
        customerId: command.customerId,
        items: order.items,
        totalAmount: order.totalAmount,
        currency: order.currency,
        shippingAddress: command.shippingAddress,
        customerTier: command.customerTier,
        isFirstOrder: await this.isFirstOrder(tx, command.customerId),
      };
      
      await tx.outbox.insert({
        id: event.eventId,
        aggregateType: 'Order',
        aggregateId: order.id,
        eventType: event.eventType,
        payload: event,
        createdAt: new Date(),
        publishedAt: null,
      });
      
      return order;
    });
  }
}
 
// Separate process: Outbox Publisher
class OutboxPublisher {
  constructor(
    private readonly db: Database,
    private readonly messageBroker: MessageBroker
  ) {}

  async poll(): Promise<void> {
    const events = await this.db.outbox.findMany({
      where: { publishedAt: null },
      orderBy: { createdAt: 'asc' },
      limit: 100,
    });
    
    for (const event of events) {
      try {
        await this.messageBroker.publish(
          event.eventType,
          event.payload
        );
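        // If the process crashes before this update commits, the event is published again
        // on the next poll: delivery is at-least-once, so consumers must be idempotent
        await this.db.outbox.update({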
          where: { id: event.id },
          data: { publishedAt: new Date() },
        });
      } catch (error) {
        // Log and continue; will retry on next poll
        logger.error('Failed to publish event', { eventId: event.id, error });
      }
    }
  }
}

Pattern 2: Idempotent Consumers

Strategies for idempotency:

Event ID Tracking: Store processed event IDs; check before processing
Natural Idempotency: Design operations that are naturally idempotent (setting a value, not incrementing)
Conditional Updates: Use database conditions that prevent duplicate effects
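
The three strategies can be compared side by side in a small sketch. The in-memory `Set` and `Map` objects stand in for database tables, and every name here is a hypothetical chosen for illustration.

```typescript
// In-memory stand-ins for database tables (illustrative only).
const processedEventIds = new Set<string>();
const stockByProduct = new Map<string, number>([['product-1', 10]]);
const orderStatusById = new Map<string, string>([['order-123', 'pending']]);

// 1. Event ID tracking: skip events that were already applied.
function reserveStock(eventId: string, productId: string, quantity: number): boolean {
  if (processedEventIds.has(eventId)) return false; // duplicate delivery
  stockByProduct.set(productId, (stockByProduct.get(productId) ?? 0) - quantity);
  processedEventIds.add(eventId);
  return true;
}

// 2. Natural idempotency: setting a value gives the same result however often it runs.
function markPaid(orderId: string): void {
  orderStatusById.set(orderId, 'paid');
}

// 3. Conditional update: act only if the current state still allows it.
function cancelIfPending(orderId: string): boolean {
  if (orderStatusById.get(orderId) !== 'pending') return false;
  orderStatusById.set(orderId, 'cancelled');
  return true;
}

// The same event delivered twice decrements stock only once.
const firstDelivery = reserveStock('evt-1', 'product-1', 2);
const secondDelivery = reserveStock('evt-1', 'product-1', 2);

// Repeating a "set" is harmless; a late cancellation is refused.
markPaid('order-123');
markPaid('order-123');
const lateCancel = cancelIfPending('order-123');
console.log(stockByProduct.get('product-1'), orderStatusById.get('order-123'), lateCancel);
```

In a real database the check and the write in strategies 1 and 3 must happen atomically, for example through a unique constraint or an `UPDATE ... WHERE status = 'pending'`; the single-threaded sketch hides that concern.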

Idempotent Event Consumer
TypeScript
class PaymentConsumer {
  constructor(
    private readonly db: Database,
    private readonly paymentGateway: PaymentGateway,
  ) {}
 
  async handleOrderPlaced(event: OrderPlacedEvent): Promise<void> {
    // Idempotency check: Have we processed this event?
    const processed = await this.db.processedEvents.findUnique({
      where: { eventId: event.eventId }
    });
    
    if (processed) {
      logger.info('Event already processed, skipping', { 
        eventId: event.eventId 
      });
      return;
    }
    
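    // Do the work and record the event as processed in one transaction
    await this.db.transaction(async (tx) => {
      // Re-check inside the transaction. This narrows the race window but does not close it;
      // a unique constraint on processedEvents.eventId is what makes a concurrent duplicate fail and roll back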
      const exists = await tx.processedEvents.findUnique({
        where: { eventId: event.eventId }
      });
      if (exists) return;
      
      // Process the payment
      const payment = await this.paymentGateway.charge({
        orderId: event.orderId,
        amount: event.totalAmount,
        customerId: event.customerId,
        idempotencyKey: event.eventId, // Gateway-level idempotency
      });
      
      // Store payment record
      await tx.payments.create({
        orderId: event.orderId,
        paymentId: payment.id,
        amount: event.totalAmount,
        status: payment.status,
      });
      
      // Mark event as processed
      await tx.processedEvents.create({
        eventId: event.eventId,
        processedAt: new Date(),
        processor: 'PaymentConsumer',
      });
      
      // Write outgoing event to outbox
      if (payment.status === 'completed') {
        await tx.outbox.insert({
          id: uuid(),
          aggregateType: 'Payment',
          aggregateId: payment.id,
          eventType: 'PaymentCompleted',
          payload: this.buildPaymentCompletedEvent(event, payment),
        });
      }
    });
  }
}

Idempotency Key Propagation

Error Handling Without Central Control

The Principle of Compensating Events:

Example: Payment Failure in Order Processing:

Compensating Events Flow
TypeScript
// Happy path:
// OrderPlaced → PaymentCompleted → InventoryReserved → ShipmentScheduled
 
// Payment fails:
// OrderPlaced → PaymentFailed
 
// Inventory reservation fails (after payment succeeded):
// OrderPlaced → PaymentCompleted → InventoryReservationFailed
//                                        ↓
// PaymentService reacts to InventoryReservationFailed:
//                               → PaymentRefunded
//                                        ↓
// OrderService reacts to PaymentRefunded:
//                               → OrderCancelled
 
class PaymentService {
  async handleInventoryReservationFailed(
    event: InventoryReservationFailedEvent
  ): Promise<void> {
    // Find the payment for this order
    const payment = await this.db.payments.findUnique({
      where: { orderId: event.orderId }
    });
    
    if (!payment || payment.status !== 'completed') {
      return; // Nothing to refund
    }
    
    await this.db.transaction(async (tx) => {
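      // Process refund; the triggering event ID doubles as the gateway idempotency key so a
      // redelivered event cannot refund twice (assumes refund accepts such a key, as charge does)
      const refund = await this.paymentGateway.refund({
        paymentId: payment.paymentId,
        amount: payment.amount,
        reason: 'inventory_unavailable',
        idempotencyKey: event.eventId,
      });
      
      // Update payment status
      await tx.payments.update({
        where: { paymentId: payment.paymentId },
        data: { status: 'refunded', refundId: refund.id },
      });
      
      // Emit compensating event; the outbox row and the event share one ID
      const refundedEventId = uuid();
      await tx.outbox.insert({
        id: refundedEventId,
        aggregateType: 'Payment',
        aggregateId: payment.paymentId,
        eventType: 'PaymentRefunded',
        payload: {
          eventId: refundedEventId,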
          eventType: 'PaymentRefunded',
          timestamp: new Date().toISOString(),
          version: '1.0',
          correlationId: event.correlationId,
          causationId: event.eventId,
          orderId: event.orderId,
          paymentId: payment.paymentId,
          refundAmount: payment.amount,
          reason: 'inventory_unavailable',
        },
      });
    });
  }
}
 
class OrderService {
  async handlePaymentRefunded(event: PaymentRefundedEvent): Promise<void> {
    const order = await this.db.orders.findUnique({
      where: { orderId: event.orderId }
    });
    
    if (!order || order.status === 'cancelled') {
      return; // Already cancelled or doesn't exist
    }
    
    await this.db.transaction(async (tx) => {
      await tx.orders.update({
        where: { orderId: event.orderId },
        data: { 
          status: 'cancelled',
          cancelledAt: new Date(),
          cancelReason: event.reason,
        },
      });
      
      // The outbox row and the event share one ID
      const cancelledEventId = uuid();
      await tx.outbox.insert({
        id: cancelledEventId,
        aggregateType: 'Order',
        aggregateId: event.orderId,
        eventType: 'OrderCancelled',
        payload: {
          eventId: cancelledEventId,
          eventType: 'OrderCancelled',
          timestamp: new Date().toISOString(),
          version: '1.0',
          correlationId: event.correlationId,
          causationId: event.eventId,
          orderId: event.orderId,
          reason: event.reason,
        },
      });
    });
  }
}

Error Handling Best Practices

•Emit Failure Events Explicitly — Don't just log failures; emit events so other services can react. PaymentFailed, InventoryReservationFailed, ShipmentDelayed are all valid business events.
•Design Compensation Logic Carefully — Compensating actions may not perfectly mirror the original. A refund might not equal the charge (fees, partial fulfillment). Model this explicitly.
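•Handle Event Ordering Carefully — Failure events might arrive before or after related success events due to network conditions. Prefer logical ordering (per-aggregate sequence numbers or causation IDs) over wall-clock timestamps, which can disagree across services because of clock skew.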
•Consider Dead Letter Queues — Events that fail processing repeatedly should move to a DLQ for manual intervention, not block the queue forever.
•Monitor Compensation Chains — Track when compensations occur. Frequent compensations might indicate design problems, not just business exceptions.
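
The dead letter queue practice can be sketched as a bounded retry loop that parks a failing event instead of blocking the events behind it. Everything here is an assumption made for illustration: the `processWithDlq` helper, the in-memory array used as the DLQ, and the default of three attempts. Real consumers would normally wait with backoff between attempts and rely on the broker's own DLQ support.

```typescript
interface QueuedEvent {
  eventId: string;
  eventType: string;
}

interface DeadLetter {
  event: QueuedEvent;
  attempts: number;
  lastError: string;
}

// Tries the handler up to maxAttempts times; parks the event in the DLQ if it keeps failing.
function processWithDlq(
  queued: QueuedEvent,
  handler: (e: QueuedEvent) => void,
  deadLetters: DeadLetter[],
  maxAttempts = 3,
): boolean {
  let lastError = '';
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      handler(queued);
      return true;
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  deadLetters.push({ event: queued, attempts: maxAttempts, lastError });
  return false;
}

const deadLetterQueue: DeadLetter[] = [];
const handledIds: string[] = [];

// Hypothetical handler that always fails on one malformed event.
const handleInventoryReserved = (e: QueuedEvent): void => {
  if (e.eventId === 'evt-poison') throw new Error('malformed payload');
  handledIds.push(e.eventId);
};

const incoming: QueuedEvent[] = [
  { eventId: 'evt-poison', eventType: 'InventoryReserved' },
  { eventId: 'evt-2', eventType: 'InventoryReserved' },
];
for (const e of incoming) {
  processWithDlq(e, handleInventoryReserved, deadLetterQueue);
}
console.log(deadLetterQueue.length, handledIds); // 1 [ 'evt-2' ]
```

The poison event ends up in the DLQ with its last error attached, and the healthy event behind it is still processed.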

The Compensation Cascade Risk

Visibility and Debugging

Distributed Tracing is Essential:

The Correlation Graph:

Workflow Visualization Query
SQL
-- Query to reconstruct event chain for an order
-- This assumes events are stored in an event store or log
 
WITH RECURSIVE event_chain AS (
    -- Start with the initial event
    SELECT 
        e.event_id,
        e.event_type,
        e.timestamp,
        e.correlation_id,
        e.causation_id,
        e.service_name,
        e.payload,
        1 as depth
    FROM events e
    WHERE e.correlation_id = :correlationId
      AND e.causation_id IS NULL
    
    UNION ALL
    
    -- Recursively find caused events
    SELECT 
        e.event_id,
        e.event_type,
        e.timestamp,
        e.correlation_id,
        e.causation_id,
        e.service_name,
        e.payload,
        ec.depth + 1
    FROM events e
    JOIN event_chain ec ON e.causation_id = ec.event_id
)
SELECT 
    event_type,
    service_name,
    timestamp,
    depth,
    -- EPOCH yields the whole interval in seconds; MILLISECONDS would return only the seconds field
    EXTRACT(EPOCH FROM 
        timestamp - LAG(timestamp) OVER (ORDER BY depth, timestamp)
    ) * 1000 as ms_since_previous
FROM event_chain
ORDER BY depth, timestamp;
 
-- Example output:
-- | event_type          | service_name | timestamp           | depth | ms_since_previous |
-- |---------------------|--------------|---------------------|-------|-------------------|
-- | OrderPlaced         | orders       | 2024-01-15 10:00:00 | 1     | NULL              |
-- | PaymentCompleted    | payments     | 2024-01-15 10:00:02 | 2     | 2000              |
-- | InventoryReserved   | inventory    | 2024-01-15 10:00:03 | 3     | 1000              |
-- | ShipmentScheduled   | shipping     | 2024-01-15 10:00:05 | 4     | 2000              |

Process Mining and Visualization:

Monitoring Points:

Key Monitoring Metrics for Choreography

•Event Throughput: Events produced/consumed per service per second
•Event Latency: Time between correlated events (e.g., OrderPlaced to ShipmentScheduled)
•Consumer Lag: How far behind consumers are from producers
•Error Rates: Failed event processing, DLQ depth
•Completion Rates: What percentage of initiated workflows reach expected end states?
•Compensation Frequency: How often are compensating events triggered?
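
Several of these metrics can be derived from the event stream itself once events are grouped by correlation ID. The function below is a rough sketch, assuming a hypothetical `ObservedEvent` record with epoch-millisecond timestamps; it computes completion rate, average end-to-end latency, and how many workflows triggered a compensation.

```typescript
interface ObservedEvent {
  correlationId: string;
  eventType: string;
  timestampMs: number; // epoch milliseconds
}

interface WorkflowStats {
  started: number;
  completed: number;
  compensated: number;
  completionRate: number;      // completed / started
  avgLatencyMs: number | null; // start event to end event, completed workflows only
}

function workflowStats(
  observed: readonly ObservedEvent[],
  startType: string,
  endType: string,
  compensationTypes: readonly string[],
): WorkflowStats {
  // Group events by workflow (correlation ID).
  const byWorkflow = new Map<string, ObservedEvent[]>();
  for (const e of observed) {
    const group = byWorkflow.get(e.correlationId) ?? [];
    group.push(e);
    byWorkflow.set(e.correlationId, group);
  }

  let started = 0;
  let completed = 0;
  let compensated = 0;
  let totalLatencyMs = 0;
  byWorkflow.forEach((group) => {
    const first = group.find((e) => e.eventType === startType);
    if (!first) return; // start of this workflow was not observed
    started++;
    const last = group.find((e) => e.eventType === endType);
    if (last) {
      completed++;
      totalLatencyMs += last.timestampMs - first.timestampMs;
    }
    if (group.some((e) => compensationTypes.indexOf(e.eventType) !== -1)) compensated++;
  });

  return {
    started,
    completed,
    compensated,
    completionRate: started === 0 ? 0 : completed / started,
    avgLatencyMs: completed === 0 ? null : totalLatencyMs / completed,
  };
}

const observedEvents: ObservedEvent[] = [
  { correlationId: 'corr-1', eventType: 'OrderPlaced', timestampMs: 0 },
  { correlationId: 'corr-1', eventType: 'ShipmentScheduled', timestampMs: 5000 },
  { correlationId: 'corr-2', eventType: 'OrderPlaced', timestampMs: 1000 },
  { correlationId: 'corr-2', eventType: 'PaymentRefunded', timestampMs: 4000 },
];
const stats = workflowStats(observedEvents, 'OrderPlaced', 'ShipmentScheduled', [
  'PaymentRefunded',
  'OrderCancelled',
]);
console.log(stats);
```

With the sample data, one of two workflows completes (in 5000 ms) and the other is compensated, so the completion rate is 0.5. In production these numbers would more likely come from a metrics pipeline than from an in-process scan.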

The Observability Investment

Testing Choreographed Workflows

Testing choreographed systems requires strategies that span multiple services without introducing tight coupling. The testing pyramid remains relevant, but each level takes on new characteristics.

Unit Testing: Focus on Event Handling Logic

Unit tests verify that a service correctly handles incoming events and produces correct outgoing events. Mock the message infrastructure; test the business logic.

Unit Testing Event Handlers
TypeScript
describe('PaymentService', () => {
  let service: PaymentService;
  let mockDb: MockDatabase;
  let mockPaymentGateway: MockPaymentGateway;
  let capturedOutboxEvents: OutboxEntry[];
 
  beforeEach(() => {
    mockDb = createMockDatabase();
    mockPaymentGateway = createMockPaymentGateway();
    capturedOutboxEvents = [];
    
    // Capture events written to outbox
    mockDb.outbox.insert = jest.fn((entry) => {
      capturedOutboxEvents.push(entry);
      return Promise.resolve(entry);
    });
    
    service = new PaymentService(mockDb, mockPaymentGateway);
  });
 
  describe('handleOrderPlaced', () => {
    it('should process payment and emit PaymentCompleted', async () => {
      const orderPlacedEvent = createOrderPlacedEvent({
        orderId: 'order-123',
        totalAmount: { amount: 9999, currency: 'USD' },
      });
      
      mockPaymentGateway.charge.mockResolvedValue({
        id: 'payment-456',
        status: 'completed',
        transactionRef: 'tx-789',
      });
      
      await service.handleOrderPlaced(orderPlacedEvent);
      
      // Verify payment was processed
      expect(mockPaymentGateway.charge).toHaveBeenCalledWith(
        expect.objectContaining({
          orderId: 'order-123',
          amount: { amount: 9999, currency: 'USD' },
        })
      );
      
      // Verify correct event was emitted
      expect(capturedOutboxEvents).toHaveLength(1);
      expect(capturedOutboxEvents[0].eventType).toBe('PaymentCompleted');
      expect(capturedOutboxEvents[0].payload).toMatchObject({
        orderId: 'order-123',
        paymentId: 'payment-456',
        correlationId: orderPlacedEvent.correlationId,
        causationId: orderPlacedEvent.eventId,
      });
    });
    
    it('should emit PaymentFailed when gateway fails', async () => {
      const orderPlacedEvent = createOrderPlacedEvent({
        orderId: 'order-123',
      });
      
      mockPaymentGateway.charge.mockRejectedValue(
        new PaymentDeclinedError('Insufficient funds')
      );
      
      await service.handleOrderPlaced(orderPlacedEvent);
      
      expect(capturedOutboxEvents[0].eventType).toBe('PaymentFailed');
      expect(capturedOutboxEvents[0].payload.reason).toBe('insufficient_funds');
    });
    
    it('should be idempotent for duplicate events', async () => {
      const orderPlacedEvent = createOrderPlacedEvent({
        orderId: 'order-123',
        eventId: 'event-duplicate',
      });
      
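      // First processing (the gateway must return a result, or the handler cannot read payment.status)
      mockPaymentGateway.charge.mockResolvedValue({
        id: 'payment-456',
        status: 'completed',
        transactionRef: 'tx-789',
      });
      await service.handleOrderPlaced(orderPlacedEvent);
      const firstCallCount = mockPaymentGateway.charge.mock.calls.length;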
      
      // Mark as processed in mock
      mockDb.processedEvents.findUnique.mockResolvedValue({
        eventId: 'event-duplicate',
      });
      
      // Second processing of same event
      await service.handleOrderPlaced(orderPlacedEvent);
      
      // Should not process again
      expect(mockPaymentGateway.charge).toHaveBeenCalledTimes(firstCallCount);
    });
  });
});

