Your e-commerce platform processes a PaymentCharged event and deducts $500 from the customer's payment method. Everything works perfectly. Then, due to a network hiccup, the event is redelivered and processed again. Another $500 is deducted. The customer is now charged $1,000 for a $500 order.
This is not a theoretical concern—it happens in production systems every day. Duplicate events are not a bug; they are a fundamental characteristic of distributed messaging systems. The question isn't whether duplicates will occur, but how your system will handle them when they inevitably do.
On this page, we'll explore why duplicates occur, the mathematical principles behind idempotency, and battle-tested patterns for building systems that handle duplicates gracefully.
By the end of this page, you will understand why duplicate events are unavoidable in distributed systems, the difference between at-least-once and exactly-once semantics, how to implement idempotent event handlers, deduplication strategies at various layers of the stack, and how to design business operations that are naturally resilient to duplicates.
To understand why duplicates are inevitable, we must understand the fundamental delivery semantics available in distributed systems.
The Three Delivery Semantics
At-Most-Once Delivery: Each message is delivered zero or one times. Duplicates are impossible, but messages can be lost. This is achieved by not retrying failed deliveries.
At-Least-Once Delivery: Each message is delivered one or more times. Message loss is prevented through retries, but duplicates may occur.
Exactly-Once Delivery: Each message is delivered exactly once. This is the holy grail—no loss, no duplicates.
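Which of the first two semantics you end up with usually comes down to when the consumer acknowledges a message relative to processing it. The sketch below uses a hypothetical handle/commitOffset API (not tied to any particular broker client) purely to illustrate the trade-off: acknowledging before processing gives at-most-once, acknowledging after gives at-least-once.

```typescript
// Hypothetical types and helpers - illustration only
interface Message { offset: number; value: string; }
declare function handle(message: Message): Promise<void>;
declare function commitOffset(offset: number): Promise<void>;

// At-most-once: acknowledge first. If handle() crashes, the message is lost,
// but it will never be redelivered.
async function consumeAtMostOnce(message: Message): Promise<void> {
  await commitOffset(message.offset);
  await handle(message);
}

// At-least-once: acknowledge only after processing succeeds. If the consumer
// crashes before commitOffset(), the message is redelivered - a duplicate.
async function consumeAtLeastOnce(message: Message): Promise<void> {
  await handle(message);
  await commitOffset(message.offset);
}
```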
The critical insight is this: true exactly-once delivery is impossible in distributed systems without coordination overhead that makes it impractical at scale. This is a consequence of the Two Generals Problem and the fundamental impossibility results in distributed computing. In practice, duplicates enter your system through several mechanisms:
| Source | Mechanism | Example |
|---|---|---|
| Producer Retries | Producer doesn't receive ACK, retries sending | Kafka producer timeout, message sent twice |
| Consumer Failures | Consumer processes message, crashes before ACK | Consumer restarts, reprocesses unacked messages |
| Network Partitions | ACK lost in network, producer or broker retries | Split-brain scenarios, partition recovery |
| Consumer Rebalancing | Consumer group rebalances, uncommitted offsets re-assigned | Kafka rebalance during processing |
| Broker Replication | Broker fails after write, replica doesn't have ACK | Primary-replica failover with reprocessing |
| Infrastructure Recovery | Disaster recovery restores from backup including already-processed events | Database restore to point-in-time |
The Producer Retry Problem
Consider the sequence of events when a producer publishes a message:

1. The producer sends the message to the broker.
2. The broker writes the message to the partition log.
3. The broker sends an acknowledgment (ACK) back to the producer.
4. The producer receives the ACK and marks the send as complete.
Network failures can occur at any step. If the failure happens between steps 3 and 4—the broker sent the ACK, but the producer never received it—the producer will retry. The message is now in the broker's log twice.
```typescript
// Timeline of a duplicate event scenario

// T=0: Producer sends OrderPlaced event
producer.send({
  topic: 'orders',
  messages: [{ key: 'order-123', value: JSON.stringify(orderPlacedEvent) }],
});
// Message reaches broker (T=50ms), broker writes to partition (T=52ms)
// Broker sends ACK (T=53ms)

// T=55ms: Network hiccup - ACK packet is lost
// Producer waits for ACK...

// T=1055ms: Producer timeout (default 1 second)
// Producer retries send
producer.send({
  topic: 'orders',
  // Same message sent again
  messages: [{ key: 'order-123', value: JSON.stringify(orderPlacedEvent) }],
});
// Broker receives second copy, writes to partition
// Broker sends ACK, this time it arrives

// Result: 'orders' topic now contains TWO identical OrderPlaced events
// Consumer will process both unless it implements deduplication

// Note: Kafka's idempotent producer (enable.idempotence=true) solves
// this specific scenario but introduces other complexity and limitations
```

The Consumer Processing Problem
Even if the broker guarantees no duplicate messages, consumers themselves can create effective duplicates through crash recovery.
```typescript
// Timeline of consumer-created duplicate processing

// T=0: Consumer polls, receives messages [M1, M2, M3]
// T=10ms: Consumer processes M1 (PaymentCharged - charges customer $500)
// T=50ms: Consumer processes M2 (InventoryReserved)
// T=100ms: Consumer crashes! 🔥
// M3 was not processed
// Offsets for M1 and M2 were not committed (commits happen after processing)

// T=5000ms: Consumer restarts
// T=5001ms: Consumer polls, receives messages [M1, M2, M3] again!
// T=5011ms: Consumer processes M1 again (PaymentCharged - charges customer AGAIN!)

// Customer is now charged $1000 for a $500 order 😱

// The fundamental problem: we can't atomically perform
// "process message AND commit offset" in a single operation
// They are separate operations that can fail independently
```

Some messaging systems advertise 'exactly-once semantics' (Kafka, for example, has 'exactly-once' mode). Read the fine print carefully. These features typically mean 'exactly-once within the messaging system' through idempotent producers and transactional consumers. They do NOT guarantee exactly-once delivery to your application logic. Your side effects (database writes, API calls, emails) can still be duplicated.
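To make that fine print concrete, here is a hedged sketch of what Kafka's 'exactly-once' mode looks like with a kafkajs-style client (topic names, group ID, and offsets are placeholders): a transactional producer atomically writes output messages and commits the consumer's input offsets. Nothing in the transaction covers your own database writes, API calls, or emails, which is why the idempotency patterns below are still needed.

```typescript
import { Kafka } from 'kafkajs';

// Sketch: Kafka "exactly-once" mode with kafkajs (names and IDs are placeholders).
// The transaction spans Kafka writes and offset commits only - NOT your side effects.
const kafka = new Kafka({ clientId: 'payments-service', brokers: ['broker-1:9092'] });
const producer = kafka.producer({
  transactionalId: 'payments-processor-1',
  idempotent: true,
  maxInFlightRequests: 1,
});

async function forwardWithKafkaTransaction(
  inputTopic: string,
  partition: number,
  offset: string,
  resultMessage: { key: string; value: string }
): Promise<void> {
  await producer.connect();
  const transaction = await producer.transaction();
  try {
    // Output message and input-offset commit succeed or fail together
    await transaction.send({ topic: 'payment-results', messages: [resultMessage] });
    await transaction.sendOffsets({
      consumerGroupId: 'payments-consumer-group',
      topics: [{ topic: inputTopic, partitions: [{ partition, offset }] }],
    });
    await transaction.commit();
  } catch (error) {
    await transaction.abort();
    throw error;
  }
}
```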
What Is Idempotency?
An operation is idempotent if performing it multiple times has the same effect as performing it once. Mathematically, a function f is idempotent if:
f(f(x)) = f(x)
For event processing, this means:
process(process(state, event), event) = process(state, event)
Processing an event twice (or any number of times) should leave the system in the same state as processing it once.
Natural Idempotency vs. Artificial Idempotency
Some operations are naturally idempotent; others require engineering to become idempotent.
| Naturally Idempotent | NOT Naturally Idempotent |
|---|---|
| SET balance = 500 | balance = balance + 50 |
| DELETE WHERE id = 'X' | INSERT INTO table (...) |
| PUT /resource/123 (full replacement) | POST /resources (create) |
| User email = 'new@email.com' | Increment counter by 1 |
| Order status = 'shipped' | Send email notification |
| Cache entry = value | Append to log file |
Turning Non-Idempotent Operations into Idempotent Ones
The key insight is that any operation can be made idempotent by adding a unique identifier and tracking which identifiers have been processed.
The general pattern is:
```typescript
// Transforming a non-idempotent operation into an idempotent one

// NON-IDEMPOTENT: Incrementing a counter
class NonIdempotentCounter {
  private count = 0;

  increment(): number {
    this.count += 1;
    return this.count;
  }
}
// increment() + increment() = 2
// NOT idempotent

// IDEMPOTENT: Using event IDs to track processed increments
class IdempotentCounter {
  private count = 0;
  private processedEvents = new Set<string>();

  increment(eventId: string): number {
    if (this.processedEvents.has(eventId)) {
      // Already processed this event - return current count
      return this.count;
    }
    this.count += 1;
    this.processedEvents.add(eventId);
    return this.count;
  }
}
// increment('e1') + increment('e1') = 1
// Idempotent!

// But there's a problem: what if we crash between incrementing and recording?
// We need ATOMIC recording and processing
```

The Atomicity Requirement
For idempotency to truly work, the following must happen atomically:

1. Check whether the event has already been processed.
2. Execute the business logic.
3. Record the event as processed.
If these steps aren't atomic, there's a window where duplicates can slip through.
```typescript
// Achieving atomicity through database transactions

interface IdempotentPaymentHandler {
  handlePaymentCharged(
    eventId: string,
    customerId: string,
    amount: number
  ): Promise<void>;
}

class AtomicIdempotentPaymentHandler implements IdempotentPaymentHandler {
  constructor(private readonly db: Database) {}

  async handlePaymentCharged(
    eventId: string,
    customerId: string,
    amount: number
  ): Promise<void> {
    // Use database transaction for atomicity
    await this.db.transaction(async (tx) => {
      // Step 1: Check if event has been processed (with lock)
      const existing = await tx.query(
        'SELECT 1 FROM processed_events WHERE event_id = $1 FOR UPDATE',
        [eventId]
      );

      if (existing.rows.length > 0) {
        // Already processed - return without performing any side effects
        console.log(`Skipping duplicate event ${eventId}`);
        return;
      }

      // Step 2: Perform business logic
      await tx.query(
        'UPDATE customer_balances SET balance = balance - $1 WHERE customer_id = $2',
        [amount, customerId]
      );

      // Step 3: Record event as processed
      await tx.query(
        'INSERT INTO processed_events (event_id, processed_at) VALUES ($1, NOW())',
        [eventId]
      );

      // All three steps commit together or roll back together
    });
  }
}

// The key insight: both the business logic (balance update) and the
// idempotency record (processed_events insert) are in the same transaction.
// If ANYTHING fails, both roll back - no partial state.
```

For simple idempotency needs, you can use a unique constraint on event_id in your target table. The INSERT will fail (constraint violation) if the event was already processed. This is simpler than explicit checking but only works when your business logic is a single INSERT.
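As a sketch of that simpler variant (the table and column names here are illustrative), you can let a UNIQUE constraint on event_id do the duplicate check and treat a unique-violation error as "already processed":

```typescript
// Sketch: relying on a UNIQUE (event_id) constraint instead of an explicit check.
// Assumes a PostgreSQL-style driver that surfaces unique violations as code '23505'
// and a hypothetical payments table - adapt names to your schema.
async function recordPaymentOnce(
  db: Database,
  eventId: string,
  customerId: string,
  amount: number
): Promise<void> {
  try {
    await db.query(
      `INSERT INTO payments (event_id, customer_id, amount)
       VALUES ($1, $2, $3)`,
      [eventId, customerId, amount]
    );
  } catch (error) {
    if ((error as any).code === '23505') {
      // Unique violation: this event was already recorded - safe to skip
      console.log(`Skipping duplicate event ${eventId}`);
      return;
    }
    throw error;
  }
}
```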
There are several patterns for implementing idempotency, each with different trade-offs. Let's explore the most common approaches.
Pattern 1: Idempotency Key Store
The most straightforward pattern: maintain a store of processed event IDs.
```typescript
interface IdempotencyStore {
  hasProcessed(eventId: string): Promise<boolean>;
  markProcessed(eventId: string, result?: unknown): Promise<void>;
  getResult(eventId: string): Promise<unknown | null>;
}

class RedisIdempotencyStore implements IdempotencyStore {
  constructor(
    private readonly redis: Redis,
    private readonly ttlSeconds: number = 7 * 24 * 60 * 60 // 7 days
  ) {}

  async hasProcessed(eventId: string): Promise<boolean> {
    const exists = await this.redis.exists(`idempotency:${eventId}`);
    return exists === 1;
  }

  async markProcessed(eventId: string, result?: unknown): Promise<void> {
    await this.redis.set(
      `idempotency:${eventId}`,
      JSON.stringify({ processedAt: Date.now(), result }),
      'EX',
      this.ttlSeconds
    );
  }

  async getResult(eventId: string): Promise<unknown | null> {
    const data = await this.redis.get(`idempotency:${eventId}`);
    if (!data) return null;
    return JSON.parse(data).result;
  }
}

// Usage with result caching (for APIs that need to return consistent responses)
class IdempotentEventProcessor {
  constructor(
    private readonly store: IdempotencyStore,
    private readonly handler: (event: DomainEvent) => Promise<unknown>
  ) {}

  async process(event: DomainEvent): Promise<unknown> {
    const eventId = event.metadata.eventId;

    // Check if already processed
    const cachedResult = await this.store.getResult(eventId);
    if (cachedResult !== null) {
      console.log(`Returning cached result for ${eventId}`);
      return cachedResult;
    }

    // Not processed - execute handler
    const result = await this.handler(event);

    // Store result for future duplicate requests
    await this.store.markProcessed(eventId, result);

    return result;
  }
}
```

Pattern 2: Optimistic Locking with Version Numbers
For aggregate-based event handling, optimistic locking ensures that duplicate events can't corrupt state.
```typescript
interface VersionedAggregate {
  id: string;
  version: number;
  lastEventId: string;
  data: unknown;
}

class OptimisticLockingHandler {
  constructor(private readonly db: Database) {}

  async handleEvent(event: DomainEvent): Promise<void> {
    const aggregateId = event.data.aggregateId;

    // Load current aggregate with version
    const aggregate = await this.db.query<VersionedAggregate>(
      'SELECT * FROM aggregates WHERE id = $1',
      [aggregateId]
    );

    if (!aggregate) {
      // First event for this aggregate
      await this.createAggregate(event);
      return;
    }

    // Check if this event was already applied
    if (aggregate.lastEventId === event.metadata.eventId) {
      console.log(`Event ${event.metadata.eventId} already applied`);
      return;
    }

    // Apply event and update version with optimistic lock
    const newData = this.applyEvent(aggregate.data, event);

    const updated = await this.db.query(
      `UPDATE aggregates
       SET data = $1, version = version + 1, last_event_id = $2
       WHERE id = $3 AND version = $4`,
      [newData, event.metadata.eventId, aggregateId, aggregate.version]
    );

    if (updated.rowCount === 0) {
      // Another process updated concurrently - retry
      console.log(`Optimistic lock conflict for ${aggregateId}, retrying...`);
      await this.handleEvent(event); // Recursive retry
    }
  }

  private applyEvent(currentData: unknown, event: DomainEvent): unknown {
    // Apply event to get new state
    // This should be a pure function
    return { ...currentData, ...event.data };
  }

  private async createAggregate(event: DomainEvent): Promise<void> {
    try {
      await this.db.query(
        `INSERT INTO aggregates (id, version, last_event_id, data)
         VALUES ($1, 1, $2, $3)`,
        [event.data.aggregateId, event.metadata.eventId, event.data]
      );
    } catch (error) {
      // Handle unique constraint violation (race condition on create)
      if ((error as any).code === '23505') {
        // Another process created it - apply as update
        await this.handleEvent(event);
      } else {
        throw error;
      }
    }
  }
}
```

Pattern 3: Outbox Pattern with Deduplication
When your event handler needs to publish new events, the outbox pattern combined with deduplication gives you effectively exactly-once publishing semantics.
```typescript
// The problem: how do we ensure that processing an event and
// publishing resulting events happens exactly once?

class OutboxPatternHandler {
  constructor(private readonly db: Database) {}

  async handleOrderPlaced(event: OrderPlacedEvent): Promise<void> {
    const eventId = event.metadata.eventId;
    const orderId = event.data.orderId;

    await this.db.transaction(async (tx) => {
      // 1. Check idempotency
      const processed = await tx.query(
        'SELECT 1 FROM processed_events WHERE event_id = $1',
        [eventId]
      );
      if (processed.rows.length > 0) return;

      // 2. Process business logic
      await tx.query(
        'INSERT INTO orders (id, status, ...) VALUES ($1, $2, ...)',
        [orderId, 'placed']
      );

      // 3. Write events to outbox (same transaction as business logic!)
      const inventoryEvent = {
        type: 'InventoryReservationRequested',
        data: { orderId, items: event.data.items },
        metadata: { correlationId: event.metadata.correlationId }
      };

      await tx.query(
        `INSERT INTO event_outbox (event_id, topic, payload, created_at)
         VALUES ($1, $2, $3, NOW())`,
        [crypto.randomUUID(), 'inventory-events', JSON.stringify(inventoryEvent)]
      );

      // 4. Mark original event as processed
      await tx.query(
        'INSERT INTO processed_events (event_id) VALUES ($1)',
        [eventId]
      );
    });

    // Outbox table is polled by a separate process that publishes to Kafka
    // If publishing fails, it retries. If message was already published,
    // consumers handle deduplication on their end.
  }
}

// Outbox poller (separate process)
class OutboxPoller {
  // Assumes the same Database wrapper as above and a kafkajs-style producer
  constructor(
    private readonly db: Database,
    private readonly kafka: Producer
  ) {}

  async pollAndPublish(): Promise<void> {
    const events = await this.db.query(
      `SELECT * FROM event_outbox
       WHERE published = false
       ORDER BY created_at
       LIMIT 100
       FOR UPDATE SKIP LOCKED`
    );

    for (const event of events.rows) {
      try {
        await this.kafka.send({
          topic: event.topic,
          messages: [{ value: event.payload }],
        });

        await this.db.query(
          'UPDATE event_outbox SET published = true WHERE id = $1',
          [event.id]
        );
      } catch (error) {
        // Will retry on next poll
        console.error(`Failed to publish ${event.id}: ${error}`);
      }
    }
  }
}
```

Instead of polling the outbox, you can use Change Data Capture (CDC) tools like Debezium to stream database changes to Kafka. This reduces latency and eliminates polling overhead. The trade-off is additional infrastructure complexity.
Deduplication can be implemented at various layers of your system, each with different characteristics and trade-offs.
Layer 1: Message Broker Level
Some message brokers offer built-in deduplication.
| Broker | Deduplication Feature | Limitations |
|---|---|---|
| Kafka | Idempotent Producer (enable.idempotence=true) | Only dedupes producer retries; consumer duplicates not covered |
| AWS SQS FIFO | Content-based deduplication or MessageDeduplicationId | 5-minute deduplication window; limited throughput |
| RabbitMQ | No native deduplication | Must implement at consumer level |
| Pulsar | Message deduplication on broker | Configurable window; memory overhead |
| Azure Event Hubs | No native deduplication | Must implement at consumer level |
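For Kafka specifically, the producer-side half of this story is a configuration switch. The sketch below uses a kafkajs-style client (consistent with the send() calls shown earlier); the broker addresses and client ID are placeholders. Remember that this only deduplicates producer retries - consumer-side duplicates still need the layers below.

```typescript
import { Kafka } from 'kafkajs';

// Sketch: enabling Kafka's idempotent producer (broker list and IDs are placeholders).
const kafka = new Kafka({
  clientId: 'orders-service',
  brokers: ['broker-1:9092', 'broker-2:9092'],
});

// idempotent: true makes the broker discard duplicates caused by producer retries
// (per-partition sequence numbers). It does NOT cover duplicates created by
// consumer restarts or rebalances.
const producer = kafka.producer({
  idempotent: true,
  maxInFlightRequests: 1,
});

async function publishOrderPlaced(orderPlacedEvent: object): Promise<void> {
  await producer.connect();
  await producer.send({
    topic: 'orders',
    messages: [{ key: 'order-123', value: JSON.stringify(orderPlacedEvent) }],
  });
}
```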
Layer 2: Consumer Infrastructure Level
Deduplication can be implemented as a wrapper around your event handlers.
```typescript
// Deduplication as a middleware/decorator pattern

interface EventHandler<T> {
  handle(event: DomainEvent<T>): Promise<void>;
}

class DeduplicatingHandler<T> implements EventHandler<T> {
  constructor(
    private readonly inner: EventHandler<T>,
    private readonly deduplicationStore: DeduplicationStore,
    private readonly options: {
      keyExtractor?: (event: DomainEvent<T>) => string;
      ttlMs?: number;
    } = {}
  ) {}

  async handle(event: DomainEvent<T>): Promise<void> {
    const dedupeKey = this.options.keyExtractor
      ? this.options.keyExtractor(event)
      : event.metadata.eventId;

    // Attempt to acquire deduplication lock
    const acquired = await this.deduplicationStore.tryAcquire(
      dedupeKey,
      this.options.ttlMs ?? 7 * 24 * 60 * 60 * 1000 // 7 days default
    );

    if (!acquired) {
      console.log(`Duplicate event detected: ${dedupeKey}`);
      return;
    }

    try {
      await this.inner.handle(event);
    } catch (error) {
      // On failure, release the deduplication lock so retry works
      await this.deduplicationStore.release(dedupeKey);
      throw error;
    }
  }
}

// Redis-based deduplication store
class RedisDeduplicationStore implements DeduplicationStore {
  constructor(private readonly redis: Redis) {}

  async tryAcquire(key: string, ttlMs: number): Promise<boolean> {
    // SET NX (only if not exists) with expiry
    const result = await this.redis.set(
      `dedupe:${key}`,
      Date.now().toString(),
      'PX',
      ttlMs,
      'NX'
    );
    return result === 'OK';
  }

  async release(key: string): Promise<void> {
    await this.redis.del(`dedupe:${key}`);
  }
}

// Usage
const orderHandler = new DeduplicatingHandler(
  new OrderPlacedHandler(),
  new RedisDeduplicationStore(redis),
  { ttlMs: 24 * 60 * 60 * 1000 } // 24 hour deduplication window
);
```

Layer 3: Application/Domain Level
The most robust deduplication happens at the domain level, where it's tied to business logic.
```typescript
// Domain-level deduplication: idempotency is a business concern

class PaymentService {
  constructor(private readonly db: Database) {}

  async processPayment(
    paymentId: string, // Client-generated idempotency key
    customerId: string,
    amount: number,
    paymentMethod: PaymentMethod
  ): Promise<PaymentResult> {
    // Check if payment already processed
    const existing = await this.db.query<Payment>(
      'SELECT * FROM payments WHERE idempotency_key = $1',
      [paymentId]
    );

    if (existing) {
      // Return the same result - idempotent behavior
      return {
        paymentId: existing.id,
        status: existing.status,
        message: 'Payment already processed (idempotent response)',
      };
    }

    // Process new payment
    try {
      const stripeCharge = await stripe.charges.create(
        {
          amount,
          currency: 'usd',
          customer: customerId,
        },
        { idempotencyKey: paymentId } // Stripe also supports idempotency keys
      );

      await this.db.query(
        `INSERT INTO payments (idempotency_key, stripe_charge_id, status, amount, customer_id)
         VALUES ($1, $2, 'completed', $3, $4)`,
        [paymentId, stripeCharge.id, amount, customerId]
      );

      return { paymentId, status: 'completed' };
    } catch (error) {
      // Record failure (prevents retrying failed payments)
      await this.db.query(
        `INSERT INTO payments (idempotency_key, status, error, amount, customer_id)
         VALUES ($1, 'failed', $2, $3, $4)`,
        [paymentId, (error as Error).message, amount, customerId]
      );

      return { paymentId, status: 'failed', error: (error as Error).message };
    }
  }
}

// Key insight: the idempotency key (paymentId) is part of the domain model,
// not just infrastructure. This makes idempotency a first-class citizen.
```

Best practice is to implement deduplication at multiple layers. Broker-level deduplication catches producer retries. Consumer-level deduplication catches consumer restarts. Domain-level deduplication catches business-level duplicates (e.g., user clicking 'Submit' twice). Each layer catches duplicates the others might miss.
As your system grows, managing deduplication state becomes a challenge in itself. You can't store event IDs forever—the storage would grow unbounded.
TTL-Based Expiration
The most common approach is time-based expiration: store event IDs for a fixed window (e.g., 7 days), then delete them.
```typescript
// Time-based deduplication with automatic cleanup

class TTLDeduplicationStore {
  constructor(
    private readonly redis: Redis,
    private readonly ttlSeconds: number = 7 * 24 * 60 * 60 // 7 days
  ) {}

  async markProcessed(eventId: string): Promise<void> {
    // SETEX: set with automatic expiry
    await this.redis.setex(
      `dedupe:${eventId}`,
      this.ttlSeconds,
      Date.now().toString()
    );
  }

  async isProcessed(eventId: string): Promise<boolean> {
    return (await this.redis.exists(`dedupe:${eventId}`)) === 1;
  }
}

// For PostgreSQL, use partition pruning:
const postgresSetup = `
-- Partitioned table for deduplication with automatic cleanup
CREATE TABLE processed_events (
  event_id VARCHAR(64) NOT NULL,
  processed_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
) PARTITION BY RANGE (processed_at);

-- Create partitions for each day (automate with pg_cron or similar)
CREATE TABLE processed_events_2024_01_01 PARTITION OF processed_events
  FOR VALUES FROM ('2024-01-01') TO ('2024-01-02');

-- Drop old partitions (much faster than DELETE)
DROP TABLE processed_events_2024_12_24; -- 7+ days old

-- Index for fast lookups
-- Note: a unique index on a partitioned table must include the partition key,
-- so uniqueness here is on (event_id, processed_at)
CREATE UNIQUE INDEX idx_processed_events_id
  ON processed_events (event_id, processed_at);
`;
```

Bloom Filters for Memory-Efficient Deduplication
When you need to deduplicate a very high volume of events without storing every ID, Bloom filters provide a probabilistic solution.
```typescript
import { BloomFilter } from 'bloom-filters';

// Bloom filter: probabilistic data structure
// - If filter says "not seen": definitely not seen
// - If filter says "seen": probably seen (small false positive rate)

class BloomFilterDeduplicator {
  private filter: BloomFilter;
  private exactStore: Set<string>;

  constructor(
    expectedItems: number = 10_000_000,
    falsePositiveRate: number = 0.001 // 0.1% false positives
  ) {
    this.filter = BloomFilter.create(expectedItems, falsePositiveRate);
    // Small exact store for recent items (handles false positives)
    this.exactStore = new Set();
  }

  async shouldProcess(eventId: string): Promise<boolean> {
    // Fast path: check Bloom filter
    if (!this.filter.has(eventId)) {
      // Definitely not seen before
      this.filter.add(eventId);
      this.addToExactStore(eventId);
      return true;
    }

    // Bloom filter says "maybe seen"
    // Check exact store for certainty
    if (this.exactStore.has(eventId)) {
      // Definitely a duplicate
      return false;
    }

    // False positive from Bloom filter
    // This is rare (0.1%) - treat as new
    this.addToExactStore(eventId);
    return true;
  }

  private addToExactStore(eventId: string): void {
    // Keep exact store bounded (e.g., last 100k events)
    if (this.exactStore.size > 100_000) {
      const firstKey = this.exactStore.values().next().value;
      this.exactStore.delete(firstKey);
    }
    this.exactStore.add(eventId);
  }
}

// Trade-offs:
// ✅ Very memory efficient (10M items in roughly 18MB at a 0.1% false-positive rate)
// ✅ O(1) lookups
// ⚠️ Cannot remove items (once added, always "seen")
// ⚠️ False positives possible (without the exact-store check, ~0.1% of unique
//    events would be incorrectly skipped)

// Best for: high-volume streaming where occasional false positives are acceptable
// Not suitable for: financial transactions, critical business events
```

Hierarchical Deduplication
For very high scale, use a hierarchical approach: fast in-memory cache for recent events, backed by persistent storage for the full TTL window.
```typescript
class HierarchicalDeduplicationStore {
  constructor(
    private readonly l1Cache: LRUCache<string, boolean>, // In-memory, fast
    private readonly l2Cache: Redis,                     // Redis, medium
    private readonly l3Store: Database,                  // PostgreSQL, persistent
    private readonly ttlSeconds: number = 7 * 24 * 60 * 60
  ) {}

  async isProcessed(eventId: string): Promise<boolean> {
    // Level 1: Check in-memory cache (sub-millisecond)
    if (this.l1Cache.has(eventId)) {
      return true;
    }

    // Level 2: Check Redis (~1-2ms)
    const inRedis = await this.l2Cache.exists(`dedupe:${eventId}`);
    if (inRedis) {
      this.l1Cache.set(eventId, true);
      return true;
    }

    // Level 3: Check database (~5-20ms)
    const inDb = await this.l3Store.query(
      'SELECT 1 FROM processed_events WHERE event_id = $1',
      [eventId]
    );
    if (inDb.rows.length > 0) {
      this.l1Cache.set(eventId, true);
      await this.l2Cache.setex(`dedupe:${eventId}`, this.ttlSeconds, '1');
      return true;
    }

    return false;
  }

  async markProcessed(eventId: string): Promise<void> {
    // Write to all levels (write-through)
    await Promise.all([
      (() => { this.l1Cache.set(eventId, true); })(),
      this.l2Cache.setex(`dedupe:${eventId}`, this.ttlSeconds, '1'),
      this.l3Store.query(
        'INSERT INTO processed_events (event_id) VALUES ($1) ON CONFLICT DO NOTHING',
        [eventId]
      ),
    ]);
  }
}

// Cache hit rates in practice:
// L1 (in-memory): ~70-90% for recent, hot events
// L2 (Redis): ~5-20% for warm events
// L3 (PostgreSQL): ~5-10% for cold events

// Result: most deduplication checks are sub-millisecond
```

Your TTL window must be longer than the maximum time between an event being published and a duplicate arriving. Consider: producer retry timeouts, consumer lag, disaster recovery scenarios, and manual replay operations. A common production value is 7 days, but analyze your specific failure modes.
When your event handlers interact with external systems (payment processors, email services, third-party APIs), duplicate prevention becomes more complex. You can't roll back an external API call.
Pattern: Idempotency Keys for External APIs
Many APIs support idempotency keys. Always use them.
```typescript
// Stripe supports idempotency keys natively
const stripePayment = await stripe.paymentIntents.create(
  {
    amount: 5000,
    currency: 'usd',
    customer: customerId,
  },
  {
    idempotencyKey: `payment-${orderId}-${attemptNumber}`,
  }
);
// If you call this again with the same idempotencyKey,
// Stripe returns the original response without re-charging

// For APIs that don't support idempotency, wrap with local tracking
class IdempotentAPIClient {
  constructor(
    private readonly api: ExternalAPI,
    private readonly store: IdempotencyStore
  ) {}

  async sendRequest<T>(
    idempotencyKey: string,
    request: APIRequest
  ): Promise<T> {
    // Check if already called
    const existing = await this.store.get<T>(idempotencyKey);
    if (existing !== null) {
      return existing;
    }

    // Make the API call
    const response = await this.api.call<T>(request);

    // Store the response for future duplicate calls
    await this.store.set(idempotencyKey, response);

    return response;
  }
}

// The risk: if we crash after API call but before storing response,
// the next attempt will make a duplicate API call.
// For critical operations, consider the Outbox pattern instead.
```

Pattern: Email/Notification Deduplication
Sending duplicate emails is a common and embarrassing bug. Here's how to prevent it.
```typescript
class IdempotentEmailService {
  constructor(
    private readonly emailProvider: EmailProvider,
    private readonly db: Database
  ) {}

  async sendEmail(
    notificationId: string, // Unique per notification intent
    to: string,
    subject: string,
    body: string
  ): Promise<void> {
    // Atomically check and record in single query
    const result = await this.db.query(`
      INSERT INTO sent_emails (notification_id, sent_at, recipient)
      VALUES ($1, NOW(), $2)
      ON CONFLICT (notification_id) DO NOTHING
      RETURNING notification_id
    `, [notificationId, to]);

    if (result.rows.length === 0) {
      // Conflict = already sent
      console.log(`Email ${notificationId} already sent, skipping`);
      return;
    }

    // First time seeing this notificationId - send the email
    try {
      await this.emailProvider.send({ to, subject, body });
    } catch (error) {
      // On failure, remove the record so retry works
      await this.db.query(
        'DELETE FROM sent_emails WHERE notification_id = $1',
        [notificationId]
      );
      throw error;
    }
  }
}

// Usage: generate deterministic notification IDs
const notificationId = `order-confirmation-${orderId}`;
await emailService.sendEmail(
  notificationId,
  customer.email,
  'Order Confirmed!',
  renderOrderConfirmation(order)
);

// Even if OrderPlaced event is processed twice,
// only one email is sent
```

Generate idempotency keys deterministically from the event data, not randomly. If you use a random UUID as an idempotency key, and the same event is processed twice (each time generating a new random UUID), you'll still have duplicates. Instead, derive the key from stable event properties: eventId, or entity + action + version.
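A minimal sketch of that advice, using a hypothetical event shape: derive the key from fields that are identical on every redelivery, so every copy of the same event maps to the same key.

```typescript
// Hypothetical event shape for illustration
interface OrderEvent {
  metadata: { eventId: string };
  data: { orderId: string };
}

// BAD: a fresh random key per processing attempt - redeliveries get different keys
function badKey(): string {
  return crypto.randomUUID();
}

// GOOD: derived from stable event properties - redeliveries map to the same key
function notificationKeyFor(event: OrderEvent): string {
  return `order-confirmation-${event.data.orderId}`;
}

// ALSO GOOD: reuse the event's own ID, which is identical on redelivery
function dedupeKeyFor(event: OrderEvent): string {
  return event.metadata.eventId;
}
```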
Duplicate events are a fundamental characteristic of distributed messaging systems, not a bug to be fixed. Designing for duplicates is essential for building reliable event-driven systems.
What's Next
Duplicates arrive more than once; lost events don't arrive at all. In the next page, we'll explore lost events—how events can be lost in distributed systems, detection strategies, and patterns for ensuring delivery guarantees.
You now understand why duplicate events are inevitable, the mathematics behind idempotency, and practical patterns for building systems that handle duplicates gracefully. These techniques are essential for operating event-driven systems reliably in production.