Loading learning content...
Understanding what an event handler is provides a foundation, but knowing how to design excellent handlers requires a deeper understanding of principles that make handlers robust, maintainable, and production-ready.
The difference between a handler that works in tests and one that survives years in production lies not in the business logic, but in the application of design principles that anticipate and mitigate real-world challenges: duplicate messages, out-of-order delivery, partial failures, and the inevitable evolution of event schemas.
By the end of this page, you will master the core principles that guide event handler design: idempotency, error handling strategies, resilience patterns, and the architectural principles that keep handlers maintainable as systems grow. These principles apply across languages, frameworks, and domains.
Idempotency is the single most important principle for event handlers. In distributed systems with at-least-once delivery guarantees, handlers will inevitably receive duplicate events. Network retries, message broker redeliveries, and infrastructure failures all conspire to send the same event multiple times.
An idempotent handler produces the same result whether it processes an event once or multiple times. This doesn't mean the handler does nothing on subsequent calls—it means the observable outcome remains consistent.
In distributed systems, you should assume every event will be delivered at least twice. Infrastructure failures, network partitions, and consumer crashes all trigger redelivery. Handlers that aren't idempotent will double-charge customers, send duplicate emails, and corrupt data.
Strategies for achieving idempotency:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100
// Idempotency Pattern 1: Explicit Deduplication Storeclass IdempotentPaymentHandler implements EventHandler<OrderPaidEvent> { readonly eventTypes = ['OrderPaid']; constructor( private readonly paymentService: PaymentService, private readonly idempotencyStore: IdempotencyStore, private readonly logger: Logger ) {} async handle(event: OrderPaidEvent): Promise<void> { // Check for prior processing using event ID if (await this.idempotencyStore.hasProcessed(event.id)) { this.logger.info('Event already processed, skipping', { eventId: event.id }); return; } // Process the payment await this.paymentService.capturePayment(event.payload.paymentId); // Mark as processed AFTER successful completion await this.idempotencyStore.markProcessed(event.id, { processedAt: new Date(), orderId: event.payload.orderId }); }} // Idempotency Pattern 2: Conditional Update with Versionclass IdempotentOrderStatusHandler implements EventHandler<OrderShippedEvent> { readonly eventTypes = ['OrderShipped']; constructor(private readonly orderRepo: OrderRepository) {} async handle(event: OrderShippedEvent): Promise<void> { // Conditional update - only applies if order is in 'processing' state // Running twice is safe: second attempt finds order in 'shipped' state and fails gracefully const updated = await this.orderRepo.updateStatus({ orderId: event.payload.orderId, newStatus: 'shipped', expectedCurrentStatus: 'processing', // Precondition trackingNumber: event.payload.trackingNumber, shippedAt: event.timestamp }); if (!updated) { // Order wasn't in 'processing' state - either already shipped or invalid transition // This is expected on duplicate events - just log and continue console.log('Order status update skipped - precondition not met'); } }} // Idempotency Pattern 3: Upsert with Event IDclass IdempotentProjectionHandler implements EventHandler<UserCreatedEvent> { readonly eventTypes = ['UserCreated']; constructor(private readonly userReadModelRepo: UserReadModelRepository) {} async handle(event: UserCreatedEvent): Promise<void> { // Upsert is naturally idempotent - insert or update if exists await this.userReadModelRepo.upsert({ userId: event.payload.userId, // Natural key email: event.payload.email, name: event.payload.name, createdAt: event.timestamp, lastEventId: event.id, lastUpdated: new Date() }); // Re-processing simply overwrites with the same (or newer) data }} // The IdempotencyStore interfaceinterface IdempotencyStore { hasProcessed(eventId: string): Promise<boolean>; markProcessed(eventId: string, metadata?: Record<string, unknown>): Promise<void>;} // Redis-based implementationclass RedisIdempotencyStore implements IdempotencyStore { constructor( private readonly redis: Redis, private readonly ttlSeconds: number = 7 * 24 * 60 * 60 // 7 days ) {} async hasProcessed(eventId: string): Promise<boolean> { const result = await this.redis.get(`idempotency:${eventId}`); return result !== null; } async markProcessed(eventId: string, metadata?: Record<string, unknown>): Promise<void> { await this.redis.setex( `idempotency:${eventId}`, this.ttlSeconds, JSON.stringify({ processedAt: new Date(), ...metadata }) ); }}Always check for prior processing at the beginning of your handler, not the end. This ensures you skip duplicate work early and avoid partially re-executing logic. The idempotency store should be your first call, before any business logic.
Event handlers must manage their own errors without propagating to a caller. This requires explicit error handling strategies that classify failures and respond appropriately. Not all errors are created equal—some warrant retry, others require human intervention, and some can be safely ignored.
| Error Type | Examples | Strategy | Handler Action |
|---|---|---|---|
| Transient | Network timeout, database lock, rate limit | Retry with backoff | Throw/rethrow to trigger retry |
| Permanent | Invalid data, business rule violation | Dead-letter | Log error, send to DLQ, acknowledge message |
| Poison Message | Unparseable event, schema mismatch | Dead-letter immediately | Don't retry, send to DLQ for investigation |
| Expected/Graceful | Entity not found (deleted), already processed | Skip silently | Log at info/debug level, return successfully |
| Infrastructure | Service unavailable, circuit breaker open | Defer processing | Dead-letter with retry-after timestamp |
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123
// Comprehensive error handling in handlersclass RobustOrderHandler implements EventHandler<OrderPlacedEvent> { readonly eventTypes = ['OrderPlaced']; constructor( private readonly orderService: OrderService, private readonly deadLetterQueue: DeadLetterQueue, private readonly logger: Logger, private readonly metrics: Metrics ) {} async handle(event: OrderPlacedEvent): Promise<void> { try { await this.processOrder(event); this.metrics.increment('order.handler.success'); } catch (error) { await this.handleError(event, error); } } private async processOrder(event: OrderPlacedEvent): Promise<void> { // Validate event data this.validateEvent(event); // Process the order await this.orderService.processOrder(event.payload); } private validateEvent(event: OrderPlacedEvent): void { if (!event.payload.orderId) { throw new PermanentError('Missing orderId in event payload'); } if (!event.payload.items?.length) { throw new PermanentError('Order must have at least one item'); } } private async handleError(event: OrderPlacedEvent, error: unknown): Promise<void> { this.metrics.increment('order.handler.error'); // Classify the error if (error instanceof PermanentError) { // Permanent errors go straight to dead-letter this.logger.error('Permanent error processing order', { eventId: event.id, orderId: event.payload.orderId, error: error.message }); await this.deadLetterQueue.send({ event, error: error.message, errorType: 'permanent', timestamp: new Date() }); // Return successfully - don't retry return; } if (error instanceof NotFoundError) { // Entity not found - likely deleted between event creation and handling this.logger.warn('Order not found, may have been deleted', { eventId: event.id, orderId: event.payload.orderId }); // Return successfully - this is expected in eventually consistent systems return; } if (error instanceof CircuitBreakerOpenError) { // Downstream service unavailable this.logger.warn('Downstream service unavailable', { eventId: event.id, service: error.serviceName }); // Dead-letter with retry-after await this.deadLetterQueue.send({ event, error: error.message, errorType: 'infrastructure', retryAfter: new Date(Date.now() + 5 * 60 * 1000), // 5 minutes timestamp: new Date() }); return; } // Transient or unknown error - rethrow to trigger retry this.logger.error('Transient error processing order, will retry', { eventId: event.id, orderId: event.payload.orderId, error: error instanceof Error ? error.message : String(error) }); throw error; // Rethrow for retry }} // Custom error types for classificationclass PermanentError extends Error { constructor(message: string) { super(message); this.name = 'PermanentError'; }} class NotFoundError extends Error { constructor(public readonly entityType: string, public readonly entityId: string) { super(`${entityType} not found: ${entityId}`); this.name = 'NotFoundError'; }} class CircuitBreakerOpenError extends Error { constructor(public readonly serviceName: string) { super(`Circuit breaker open for service: ${serviceName}`); this.name = 'CircuitBreakerOpenError'; }}A dead-letter queue is essential for production event systems. It captures events that cannot be processed, preserving them for investigation and manual replay. Without a DLQ, failed events are lost, and problems go undetected. Always implement DLQ handling for permanent errors.
An autonomous handler is self-contained: it has everything it needs to process an event without relying on prior handlers, shared state, or specific execution order. Autonomy is fundamental to scalability—autonomous handlers can be parallelized, distributed across machines, and scaled independently.
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152
// Autonomous handler - self-contained and independentclass AutonomousInventoryHandler implements EventHandler<OrderPlacedEvent> { readonly eventTypes = ['OrderPlaced']; constructor( private readonly inventoryRepo: InventoryRepository, private readonly eventBus: EventBus, private readonly logger: Logger ) {} async handle(event: OrderPlacedEvent): Promise<void> { // The event contains all necessary data - no external queries needed const { orderId, items, timestamp } = event.payload; // Process independently - other handlers' success/failure doesn't affect us for (const item of items) { // Use timestamp for handling out-of-order events const reserved = await this.inventoryRepo.reserveStock({ sku: item.sku, quantity: item.quantity, orderId: orderId, reservedAt: timestamp, // Only reserve if this is newer than existing reservation ifNewerThan: await this.getExistingReservationTime(orderId, item.sku) }); if (!reserved) { // Insufficient stock - this handler publishes its own event await this.eventBus.publish({ type: 'InventoryReservationFailed', id: generateId(), timestamp: new Date(), payload: { orderId, sku: item.sku, requestedQuantity: item.quantity } }); return; } } // Success - publish downstream event await this.eventBus.publish({ type: 'InventoryReserved', id: generateId(), timestamp: new Date(), payload: { orderId, items } }); } private async getExistingReservationTime(orderId: string, sku: string): Promise<Date | null> { const existing = await this.inventoryRepo.findReservation(orderId, sku); return existing?.reservedAt ?? null; }}Autonomous handlers depend on events carrying sufficient data. This argues for 'fat events' that include all relevant information rather than just IDs that require additional lookups. The trade-off is larger message sizes, but the autonomy gained is usually worth it.
Handlers interact with external systems—databases, APIs, message queues—that can fail. Building resilience into handlers ensures they degrade gracefully and recover automatically from transient failures.
Retry with exponential backoff handles transient failures by reaxtempting operations with increasing delays. This gives downstream systems time to recover while avoiding overwhelming them with immediate retries.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566
class RetryPolicy { constructor( private readonly maxAttempts: number = 3, private readonly baseDelayMs: number = 100, private readonly maxDelayMs: number = 10000 ) {} async execute<T>(operation: () => Promise<T>): Promise<T> { let lastError: Error | undefined; for (let attempt = 1; attempt <= this.maxAttempts; attempt++) { try { return await operation(); } catch (error) { lastError = error as Error; if (attempt === this.maxAttempts) { break; // No more retries } if (!this.isRetryable(error)) { break; // Non-retryable error } // Calculate delay with exponential backoff + jitter const delay = Math.min( this.baseDelayMs * Math.pow(2, attempt - 1) + Math.random() * 100, this.maxDelayMs ); console.log(`Attempt ${attempt} failed, retrying in ${delay}ms`); await this.sleep(delay); } } throw lastError; } private isRetryable(error: unknown): boolean { // Retry network errors, timeouts, rate limits if (error instanceof Error) { const message = error.message.toLowerCase(); return message.includes('timeout') || message.includes('network') || message.includes('rate limit') || message.includes('503') || message.includes('429'); } return false; } private sleep(ms: number): Promise<void> { return new Promise(resolve => setTimeout(resolve, ms)); }} // Usage in handlerclass ResilientHandler implements EventHandler<SomeEvent> { private readonly retryPolicy = new RetryPolicy(3, 100, 5000); async handle(event: SomeEvent): Promise<void> { await this.retryPolicy.execute(async () => { await this.externalService.call(event.payload); }); }}The Single Responsibility Principle (SRP) applies directly to event handlers: each handler should have one reason to change. This means handlers should focus on a single concern, making them easier to understand, test, and maintain.
Signs a handler is doing too much:
The remedy: Split into multiple focused handlers that each subscribe to the same event.
❌ Violating SRP
1234567891011121314151617181920212223242526
// Fat handler doing too many thingsclass OrderPlacedHandler { async handle(event: OrderPlacedEvent) { // Concern 1: Inventory await this.inventoryService .reserve(event.payload.items); // Concern 2: Notifications await this.emailService .sendConfirmation(event.payload); await this.smsService .sendConfirmation(event.payload); // Concern 3: Analytics await this.analyticsService .trackOrder(event.payload); // Concern 4: Fraud detection await this.fraudService .analyze(event.payload); // Concern 5: Loyalty points await this.loyaltyService .awardPoints(event.payload); }}✅ Following SRP
123456789101112131415161718192021222324252627282930313233343536373839404142
// Focused handlers - each with one jobclass InventoryReservationHandler { readonly eventTypes = ['OrderPlaced']; async handle(event: OrderPlacedEvent) { await this.inventoryService .reserve(event.payload.items); }} class OrderConfirmationHandler { readonly eventTypes = ['OrderPlaced']; async handle(event: OrderPlacedEvent) { await this.emailService .sendConfirmation(event.payload); await this.smsService .sendConfirmation(event.payload); }} class OrderAnalyticsHandler { readonly eventTypes = ['OrderPlaced']; async handle(event: OrderPlacedEvent) { await this.analyticsService .trackOrder(event.payload); }} class FraudDetectionHandler { readonly eventTypes = ['OrderPlaced']; async handle(event: OrderPlacedEvent) { await this.fraudService .analyze(event.payload); }} class LoyaltyPointsHandler { readonly eventTypes = ['OrderPlaced']; async handle(event: OrderPlacedEvent) { await this.loyaltyService .awardPoints(event.payload); }}Split handlers provide fault isolation (email failure doesn't block inventory), independent testing (test each concern separately), parallel execution (handlers run concurrently), and independent scaling (high-priority handlers get more resources).
Event handlers should execute quickly—typically completing in milliseconds to a few seconds. Long-running handlers create several problems:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960
// Pattern: Delegate long-running work to background jobsclass VideoUploadedHandler implements EventHandler<VideoUploadedEvent> { readonly eventTypes = ['VideoUploaded']; constructor( private readonly jobQueue: JobQueue, private readonly videoRepo: VideoRepository ) {} async handle(event: VideoUploadedEvent): Promise<void> { // Quick database update await this.videoRepo.updateStatus(event.payload.videoId, 'processing'); // Delegate transcoding (takes minutes) to background job await this.jobQueue.enqueue('transcode-video', { videoId: event.payload.videoId, sourceUrl: event.payload.sourceUrl, formats: ['720p', '1080p', '4k'] }); // Delegate thumbnail generation await this.jobQueue.enqueue('generate-thumbnails', { videoId: event.payload.videoId, sourceUrl: event.payload.sourceUrl, timestamps: [0, 10, 30, 60] }); // Handler completes in milliseconds // Background jobs will publish events when they complete }} // Pattern: Publish and proceedclass OrderHandler implements EventHandler<OrderPaidEvent> { readonly eventTypes = ['OrderPaid']; constructor( private readonly eventBus: EventBus, private readonly orderRepo: OrderRepository ) {} async handle(event: OrderPaidEvent): Promise<void> { // Quick: Update order status await this.orderRepo.updateStatus(event.payload.orderId, 'paid'); // Publish events for other handlers instead of doing everything here await this.eventBus.publish({ type: 'OrderReadyForFulfillment', id: generateId(), timestamp: new Date(), payload: { orderId: event.payload.orderId, items: event.payload.items } }); // Handler completes immediately // Fulfillment, notification, etc. happen in separate handlers }}In event-driven systems, requests span multiple handlers across multiple services. Without proper observability, debugging becomes guesswork. Handlers must emit sufficient telemetry to trace event flow, identify failures, and measure performance.
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192939495
class ObservableHandler implements EventHandler<OrderEvent> { readonly eventTypes = ['OrderPlaced']; constructor( private readonly orderService: OrderService, private readonly logger: Logger, private readonly metrics: Metrics, private readonly tracer: Tracer ) {} async handle(event: OrderEvent): Promise<void> { const startTime = Date.now(); // Create trace span const span = this.tracer.startSpan('OrderPlacedHandler.handle', { attributes: { 'event.id': event.id, 'event.type': event.type, 'order.id': event.payload.orderId, 'correlation.id': event.metadata?.correlationId } }); // Structured log entry this.logger.info('Handler started', { eventId: event.id, eventType: event.type, orderId: event.payload.orderId, correlationId: event.metadata?.correlationId, handler: 'OrderPlacedHandler' }); try { // Create child span for service call const serviceSpan = this.tracer.startSpan('orderService.process', { parent: span }); await this.orderService.process(event.payload); serviceSpan.end(); // Record success metrics const duration = Date.now() - startTime; this.metrics.increment('handler.success', { handler: 'OrderPlacedHandler', eventType: 'OrderPlaced' }); this.metrics.histogram('handler.duration', duration, { handler: 'OrderPlacedHandler' }); // Success log this.logger.info('Handler completed', { eventId: event.id, orderId: event.payload.orderId, durationMs: duration, outcome: 'success' }); span.setStatus({ code: SpanStatusCode.OK }); } catch (error) { const duration = Date.now() - startTime; // Record failure metrics this.metrics.increment('handler.failure', { handler: 'OrderPlacedHandler', eventType: 'OrderPlaced', errorType: error.constructor.name }); // Error log with full context this.logger.error('Handler failed', { eventId: event.id, orderId: event.payload.orderId, durationMs: duration, outcome: 'failure', error: error.message, stack: error.stack }); span.recordException(error); span.setStatus({ code: SpanStatusCode.ERROR, message: error.message }); throw error; } finally { span.end(); } }}Include correlation IDs in every log and span. When an order fails fulfillment, you need to trace back through handler logs across services. Without correlation IDs linking related events, debugging distributed failures becomes nearly impossible.
We've covered the core principles that distinguish robust event handlers from brittle ones. These principles are battle-tested patterns from production systems handling millions of events daily.
What's next:
With design principles established, we'll explore the architectural decision of single vs multiple handlers for the same event. When should one handler do everything? When should you split into multiple handlers? The next page examines the trade-offs and provides clear guidance for this common design decision.
You now understand the core principles for designing robust event handlers. These principles—idempotency, error handling, autonomy, resilience, and observability—form the foundation for building production-ready event-driven systems.