We've explored at-most-once (messages may be lost) and at-least-once (messages may be duplicated). The obvious next question is: Can we achieve exactly-once delivery—where every message is delivered precisely one time, with no losses and no duplicates?
The answer is nuanced—and understanding this nuance is critical for designing distributed systems correctly.
The short answer: True exactly-once delivery is impossible in a distributed system. This isn't a limitation of current technology—it's a fundamental property of distributed computing, proven mathematically. However, we can achieve exactly-once processing semantics through careful design, which for most practical purposes is equivalent.
This page explores why exactly-once delivery is impossible, what we can actually achieve, and the patterns that let us build systems that behave as if exactly-once were possible.
By the end of this page, you will understand the fundamental impossibility of exactly-once delivery, the Two Generals Problem that underlies this impossibility, the distinction between delivery and processing semantics, and the practical patterns that achieve effectively exactly-once behavior. You will be able to critically evaluate vendor claims of 'exactly-once delivery' and design systems accordingly.
To understand why exactly-once delivery is impossible, we must understand a fundamental problem in distributed systems: The Two Generals Problem.
The Scenario:
Two armies (let's call them Army A and Army B) are positioned on opposite sides of a valley, planning to attack an enemy city in the middle. They can only communicate by sending messengers through the valley—but messengers can be captured (lost) by the enemy. If both armies attack together, they win. If only one attacks, they lose.
For the attack to succeed:

1. Army A sends a messenger to Army B: "Attack at dawn."
2. Army B must confirm receipt, since Army A won't risk attacking alone.
3. Army A must confirm that it received Army B's confirmation, since Army B won't risk attacking alone either.
Here's the problem: How does Army A know that Army B received the confirmation that Army A received their confirmation?
The Infinite Regress:

1. Army A sends: "Attack at dawn."
2. Army B replies: "Confirmed."
3. Army A replies: "I received your confirmation."
4. Army B replies: "I received your confirmation of my confirmation."
5. ...

This continues infinitely. There is always a final message whose receipt cannot be confirmed. No matter how many acknowledgments you add, you can never reach certain agreement.
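The sender's dilemma behind this regress can be made concrete with a small Monte Carlo simulation (an illustration added here, not part of the formal argument): never retrying risks losing the message, while retrying until an acknowledgment arrives risks duplicating it, because a message that arrived but whose ACK was lost gets resent.

```typescript
/**
 * Illustrative simulation of the sender's dilemma on a lossy link.
 * "Never retry" loses messages; "retry until ACKed" duplicates them.
 */
function simulate(lossRate: number, retries: number, trials: number) {
  let lost = 0;
  let duplicated = 0;
  for (let t = 0; t < trials; t++) {
    let deliveries = 0;
    for (let attempt = 0; attempt <= retries; attempt++) {
      const delivered = Math.random() >= lossRate;  // message crosses the valley?
      if (delivered) deliveries++;
      const ackArrived = delivered && Math.random() >= lossRate; // ACK crosses back?
      if (ackArrived) break; // the sender stops ONLY when it sees an ACK
    }
    if (deliveries === 0) lost++;
    if (deliveries > 1) duplicated++;
  }
  return { lost: lost / trials, duplicated: duplicated / trials };
}

// No retries: some sends are simply lost, but never duplicated.
console.log(simulate(0.1, 0, 100_000));
// Aggressive retries: losses nearly vanish, but duplicates appear whenever
// a message arrived and only its ACK was lost.
console.log(simulate(0.1, 10, 100_000));
```

No retry budget makes both numbers zero at once: lowering `retries` trades duplicates for losses, and raising it trades losses for duplicates.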
The Mathematical Proof:
This was formally proven: no deterministic protocol can guarantee consensus in a system with even a single unreliable communication link. The proof uses a simple argument:

1. Suppose, for contradiction, that some protocol guarantees agreement, and consider the one that does so with the fewest messages.
2. The final message of that protocol can be lost, and its sender cannot act differently based on whether it arrived.
3. If the protocol still guarantees agreement when that final message is lost, the message was unnecessary, contradicting minimality.
4. If it does not, the protocol fails to guarantee agreement, contradicting the assumption.

Either way, no such protocol exists.
This proof applies directly to exactly-once delivery.
```typescript
/**
 * Why exactly-once delivery is impossible:
 * A code demonstration of the Two Generals Problem
 */

interface Message {
  id: string;
  payload: unknown;
}

interface MessageResult {
  delivered: boolean;
  ackReceived: boolean;
}

class UnreliableNetwork {
  // Messages have a chance of being lost
  private lossRate: number = 0.1; // 10% loss

  send(message: any): boolean {
    return Math.random() > this.lossRate;
  }
}

class Producer {
  private network = new UnreliableNetwork();

  async sendExactlyOnce(message: Message): Promise<boolean> {
    // Attempt 1: Send the message
    const sent = this.network.send(message);

    if (!sent) {
      // We don't know if it was delivered
      // Option A: Retry → might cause duplicate
      // Option B: Don't retry → might be lost
      return false;
    }

    // Assume the broker received and persisted
    // Now wait for the acknowledgment
    const ackReceived = this.network.send({ type: 'ack' }); // ACK also uses the network!

    if (!ackReceived) {
      // Broker received and persisted (we know this as omniscient observers)
      // But the ACK was lost
      //
      // From the producer's perspective: Did it work?
      // We don't know!
      //
      // If we retry: DUPLICATE
      // If we don't retry: MAYBE LOST (we don't know)
      //
      // THIS is why exactly-once is impossible.
      // No amount of ACKs can resolve this uncertainty.
    }

    return ackReceived;
  }
}

/**
 * The fundamental insight:
 *
 * At the last step of any protocol, there is always a final message.
 * The sender of that message can NEVER know if it was received.
 *
 * Therefore:
 * - If we assume success → might be wrong (message lost)
 * - If we assume failure and retry → might cause duplicate
 *
 * There is no third option that guarantees exactly-once.
 */
```

The impossibility of exactly-once delivery is not a limitation of current technology—it's a mathematical truth about distributed systems with unreliable communication. No amount of engineering excellence can overcome it. Any system claiming 'exactly-once delivery' is either using a very specific definition, or is misleading you.
Understanding the distinction between delivery semantics and processing semantics is crucial for navigating claims about exactly-once support.
Delivery Semantics: How many times a message is transported from producer to consumer—the network layer.
Processing Semantics: How many times a message's effects are applied to the system state—the application layer.
The key insight: While exactly-once delivery is impossible, exactly-once processing is achievable through idempotency and deduplication.
| Aspect | Delivery Semantics | Processing Semantics |
|---|---|---|
| Definition | How many times message transits the network | How many times message effects are applied |
| Exactly-once possible? | No (Two Generals Problem) | Yes (with idempotency) |
| Responsibility | Messaging infrastructure | Application layer |
| Implementation | Retries, acknowledgments, persistence | Deduplication, idempotent operations |
| Example | Consumer receives message 3 times | Order created exactly once despite 3 deliveries |
```typescript
/**
 * Achieving exactly-once PROCESSING semantics
 * despite at-least-once DELIVERY semantics
 */

class OrderProcessor {
  private db: Database;
  private processedMessageIds: Set<string> = new Set();

  /**
   * This handler may be called MULTIPLE times for the same message
   * (at-least-once delivery) but the ORDER will only be created ONCE
   * (exactly-once processing)
   */
  async handleOrderMessage(message: OrderMessage): Promise<void> {
    // STEP 1: Check if we've processed this message before
    // This is the DEDUPLICATION step
    if (this.processedMessageIds.has(message.messageId)) {
      console.log(`Message ${message.messageId} already processed, skipping`);
      return; // Idempotent: second processing is a no-op
    }

    // Check the database for more durable deduplication
    const existingOrder = await this.db.orders.findUnique({
      where: { orderId: message.orderId }
    });

    if (existingOrder) {
      console.log(`Order ${message.orderId} already exists, skipping`);
      // Mark as processed so we skip faster next time
      this.processedMessageIds.add(message.messageId);
      return;
    }

    // STEP 2: Create the order (first time only)
    await this.db.orders.create({
      data: {
        orderId: message.orderId,
        customerId: message.customerId,
        items: message.items,
        processedFromMessageId: message.messageId, // Track for debugging
      }
    });

    // STEP 3: Mark as processed
    this.processedMessageIds.add(message.messageId);
    console.log(`Order ${message.orderId} created successfully`);
  }
}

/**
 * From the MESSAGE perspective (delivery):
 * - Message might be delivered 1, 2, or 3 times
 * - This is AT-LEAST-ONCE delivery
 *
 * From the ORDER perspective (processing):
 * - Order is created exactly once
 * - This is EXACTLY-ONCE processing
 *
 * The combination gives us the behavior we want:
 * - No order is lost (at-least-once)
 * - No duplicate orders (deduplication)
 */
```

When Kafka, Pulsar, or other systems advertise 'exactly-once', they mean exactly-once processing semantics within their ecosystem—achieved through idempotent producers, transactional consumers, and deduplication. They do NOT mean exactly-once delivery, which remains impossible. Always verify what a vendor means by the term.
While true exactly-once delivery is impossible, we can build systems that behave as if they were exactly-once through a combination of three techniques:

1. Deduplication: track which messages have already been processed and skip repeats.
2. Idempotent operations: design state changes so that applying them twice has the same effect as applying them once.
3. Atomic processing: make the state change and the deduplication record succeed or fail together.
Together, these give us effectively exactly-once (sometimes called effectively-once or exactly-once processing semantics).
```typescript
/**
 * Complete pattern for effective exactly-once semantics
 */

interface ProcessedMessage {
  messageId: string;
  processedAt: Date;
  result: 'success' | 'failure';
}

class EffectivelyExactlyOnceProcessor {
  constructor(private db: Database) {}

  /**
   * Process a message with effectively exactly-once semantics.
   *
   * This is achieved by:
   * 1. Deduplication check (filter already-processed messages)
   * 2. Idempotent state change (atomic with deduplication record)
   * 3. Record processing (prevent future reprocessing)
   */
  async processExactlyOnce<T>(
    messageId: string,
    processFunc: () => Promise<T>
  ): Promise<{ processed: boolean; result?: T }> {
    // STEP 1: Check if already processed (deduplication)
    const existing = await this.db.processedMessages.findUnique({
      where: { messageId }
    });

    if (existing) {
      // Duplicate detected - skip processing
      // This is what makes it "exactly-once" from the state perspective
      console.log(`Duplicate message ${messageId} detected`);
      return { processed: false };
    }

    // STEP 2: Process and record atomically
    // Use a transaction to ensure atomicity
    return await this.db.$transaction(async (tx) => {
      // Double-check within the transaction (handle race conditions)
      const doubleCheck = await tx.processedMessages.findUnique({
        where: { messageId }
      });

      if (doubleCheck) {
        return { processed: false }; // Race condition, skip
      }

      // Execute the actual business logic
      const result = await processFunc();

      // Record that we processed this message
      await tx.processedMessages.create({
        data: {
          messageId,
          processedAt: new Date(),
          result: 'success',
        }
      });

      return { processed: true, result };
    });
  }
}

// Usage in a consumer
const processor = new EffectivelyExactlyOnceProcessor(db);

consumer.on('message', async (message) => {
  const { processed, result } = await processor.processExactlyOnce(
    message.id,
    async () => {
      // This function runs exactly once per unique message.id
      // even if the message is delivered multiple times
      return await orderService.createOrder({
        orderId: message.payload.orderId,
        customerId: message.payload.customerId,
        items: message.payload.items,
      });
    }
  );

  if (processed) {
    console.log('Order created:', result);
  } else {
    console.log('Duplicate message, order already created');
  }

  // Acknowledge the message (even if duplicate - it's handled)
  await message.ack();
});
```

Note that the state change and the deduplication record must be atomic (in the same transaction). If they're not atomic, there's a window where the state changes but the deduplication record isn't written, potentially allowing a duplicate to slip through if a crash occurs at exactly the wrong time.
Apache Kafka introduced "exactly-once semantics" (EOS) in version 0.11. Understanding how Kafka achieves this provides valuable insight into what 'exactly-once' really means in practice.
Important Disclaimer: Kafka's EOS applies to Kafka-to-Kafka workflows—consuming from one topic and producing to another. It does NOT provide exactly-once guarantees for external side effects (API calls, database writes, etc.).
Kafka's Three Pillars of EOS:

1. Idempotent producers: the broker tracks producer IDs and sequence numbers, so producer retries never create duplicates in the log.
2. Transactions: writes to multiple partitions and topics are committed or aborted atomically.
3. Transactional consume-transform-produce: consumer offsets are committed in the same transaction as the produced output.
```typescript
import { Kafka } from 'kafkajs';

const kafka = new Kafka({
  clientId: 'order-service',
  brokers: ['kafka:9092'],
});

/**
 * Idempotent Producer
 *
 * The broker assigns a Producer ID (PID) and tracks sequence numbers.
 * If the producer retries and sends the same (PID, sequence) pair,
 * the broker detects the duplicate and ignores it.
 *
 * This solves: Producer retry causing duplicate messages
 */
const idempotentProducer = kafka.producer({
  // Enable idempotency
  idempotent: true,

  // Required for idempotency
  maxInFlightRequests: 5, // Max 5 (Kafka requirement)

  // Retry settings become safer with idempotency
  retry: {
    retries: Number.MAX_SAFE_INTEGER, // Safe to retry forever
  },
});

await idempotentProducer.connect();

// If this send is retried due to network issues,
// the broker will detect and deduplicate
await idempotentProducer.send({
  topic: 'orders',
  messages: [{ value: JSON.stringify(order) }],
});

/**
 * How it works:
 *
 * Producer gets PID=5 on startup
 *
 * Send message with sequence=1 → Broker stores, ACK
 * ACK lost, producer retries
 * Send message with sequence=1 again → Broker detects duplicate,
 *   returns success without re-storing
 *
 * Result: Message appears exactly once in the log
 */
```
```typescript
/**
 * Transactional Producer
 *
 * Allows producing to multiple partitions/topics atomically.
 * Either ALL messages are committed or NONE are.
 *
 * This solves: Partial writes (some messages succeed, some fail)
 */
const transactionalProducer = kafka.producer({
  transactionalId: 'order-processor-tx', // Unique ID for this producer
  idempotent: true, // Required for transactions
});

await transactionalProducer.connect();

async function processOrderBatch(orders: Order[]): Promise<void> {
  // Begin the transaction
  const transaction = await transactionalProducer.transaction();

  try {
    // All these messages are part of the same transaction
    for (const order of orders) {
      await transaction.send({
        topic: 'processed-orders',
        messages: [{ value: JSON.stringify(order) }],
      });

      await transaction.send({
        topic: 'order-events',
        messages: [{
          value: JSON.stringify({ type: 'ORDER_PROCESSED', orderId: order.id })
        }],
      });
    }

    // Commit: ALL messages become visible atomically
    await transaction.commit();
  } catch (error) {
    // Abort: NONE of the messages become visible
    await transaction.abort();
    throw error;
  }
}

/**
 * Transactional guarantees:
 *
 * - All messages in a transaction are committed atomically
 * - Consumers with isolation.level=read_committed see only committed messages
 * - If the producer crashes mid-transaction, Kafka auto-aborts
 * - Recovery uses the transactionalId to resume or abort pending transactions
 */
```
```typescript
/**
 * The Complete Pattern: Consume-Transform-Produce
 *
 * This is where Kafka's exactly-once really shines.
 * We can consume, process, and produce in a single atomic transaction.
 */

const consumer = kafka.consumer({
  groupId: 'order-processor',
  readUncommitted: false, // Only read committed messages
});

const producer = kafka.producer({
  transactionalId: 'order-processor-tx',
  idempotent: true,
});

await consumer.connect();
await producer.connect();

await consumer.subscribe({ topic: 'raw-orders' });

await consumer.run({
  autoCommit: false, // CRITICAL: We handle commits in transactions
  eachMessage: async ({ topic, partition, message }) => {
    const transaction = await producer.transaction();

    try {
      // 1. Process the message
      const processedOrder = await processOrder(
        JSON.parse(message.value!.toString())
      );

      // 2. Produce the result to the output topic (part of the transaction)
      await transaction.send({
        topic: 'processed-orders',
        messages: [{ value: JSON.stringify(processedOrder) }],
      });

      // 3. Commit the consumer offset AS PART OF the transaction
      await transaction.sendOffsets({
        consumerGroupId: 'order-processor',
        topics: [{
          topic,
          partitions: [{
            partition,
            offset: (BigInt(message.offset) + 1n).toString(),
          }],
        }],
      });

      // 4. Commit everything atomically
      await transaction.commit();

      // At this point:
      // - The output message is visible
      // - The consumer offset is updated
      // Both or neither happen
    } catch (error) {
      await transaction.abort();
      // Nothing happened:
      // - No output message
      // - Consumer offset unchanged
      // - The message will be reprocessed
    }
  },
});

/**
 * EXACTLY-ONCE in Kafka-to-Kafka workflows:
 *
 * 1. Input message consumed
 * 2. Output message produced
 * 3. Input offset committed
 *
 * All three happen atomically.
 *
 * If crash after step 1: Offset not committed, message reprocessed
 * If crash after step 2: Transaction not committed, output not visible,
 *                        message reprocessed
 * If crash after step 3: Transaction committed, no reprocessing
 *
 * Result: Each input message produces exactly one output message
 */
```

Kafka's exactly-once applies ONLY within Kafka. If your consumer makes an HTTP call or writes to an external database, you're back to at-least-once. The external system doesn't participate in Kafka's transactions. For external side effects, you still need idempotency.
Most real-world systems don't just consume and produce messages—they also interact with external systems: databases, APIs, payment gateways, email services. These external side effects are where exactly-once becomes truly challenging.
The Core Problem:
External systems don't participate in your message broker's transaction protocol. When you make an API call or database write, you can't atomically commit that change along with your message acknowledgment.
```typescript
class PaymentProcessor {
  /**
   * This is the classic external side effect problem.
   *
   * We need to:
   * 1. Process a payment (external API call)
   * 2. Acknowledge the message (message broker)
   *
   * But we can't make these atomic!
   */
  async processPaymentMessage(message: PaymentMessage): Promise<void> {
    // STEP 1: Call the external payment API
    const paymentResult = await paymentGateway.charge({
      amount: message.amount,
      customerId: message.customerId,
      orderId: message.orderId,
    });

    // GAP: If we crash here, the payment was processed
    // but the message wasn't acknowledged.
    // The message will be redelivered.
    // The customer will be charged TWICE.

    // STEP 2: Acknowledge the message
    await message.ack();
  }
}

/**
 * The failure modes:
 *
 * Scenario A: Crash after payment, before ack
 * - Payment processed
 * - Message not acknowledged
 * - Message redelivered
 * - Payment processed AGAIN → DUPLICATE!
 *
 * Scenario B: Ack before payment (at-most-once approach)
 * - Message acknowledged
 * - Crash before payment
 * - Payment never processed → LOST!
 *
 * There's no ordering that makes this atomic.
 * The payment API doesn't know about our message acknowledgment.
 */
```

Solutions for External Side Effects:
There are several patterns to handle external side effects while maintaining effectively exactly-once semantics:
```typescript
class IdempotentPaymentProcessor {
  /**
   * Use idempotency keys to make external API calls safe.
   *
   * The payment gateway tracks idempotency keys and ensures
   * the same key results in the same response, without duplicate charges.
   */
  async processPaymentMessage(message: PaymentMessage): Promise<void> {
    // Use the message ID as the idempotency key.
    // Same message = same idempotency key = no duplicate charge
    const idempotencyKey = `payment-${message.messageId}`;

    // Call the payment API with the idempotency key
    const paymentResult = await paymentGateway.charge({
      amount: message.amount,
      customerId: message.customerId,
      orderId: message.orderId,
      idempotencyKey, // ← The crucial field
    });

    // Even if this message is redelivered and we call the API again,
    // the payment gateway will recognize the idempotency key and
    // return the original result without charging again.
    await message.ack();
  }
}

/**
 * How idempotency keys work at the payment gateway:
 *
 * First call with key "payment-msg-123":
 * - Process payment
 * - Store result keyed by "payment-msg-123"
 * - Return success
 *
 * Second call with key "payment-msg-123":
 * - Look up "payment-msg-123"
 * - Found! Return cached result
 * - No duplicate charge
 *
 * Popular APIs that support idempotency keys:
 * - Stripe (Idempotency-Key header)
 * - PayPal
 * - Square
 * - Many modern payment processors
 */
```
```typescript
class SafeOrderProcessor {
  private shippingService: ShippingService;

  /**
   * Check external state before acting.
   *
   * This pattern queries the external system to determine
   * if the action was already performed.
   */
  async processOrderMessage(message: OrderMessage): Promise<void> {
    // Check if the order was already processed.
    // This could be a local database or the external system.
    const existingShipment = await this.shippingService.findByOrderId(
      message.orderId
    );

    if (existingShipment) {
      // Already processed - this is a duplicate message
      console.log(`Order ${message.orderId} already shipped`);
      await message.ack();
      return;
    }

    // Not yet processed - safe to proceed
    await this.shippingService.createShipment({
      orderId: message.orderId,
      items: message.items,
      address: message.shippingAddress,
    });

    await message.ack();
  }
}

/**
 * Trade-offs of check-before-act:
 *
 * Pros:
 * - Simple to understand
 * - Works with any external system
 *
 * Cons:
 * - Race condition window between check and act
 * - Extra latency for the check call
 * - Requires the external system to support querying for duplicates
 */
```

Production systems often combine multiple patterns: idempotency keys for APIs that support them, check-before-act for others, and local deduplication tracking as a first line of defense. Defense in depth reduces the probability of duplicates slipping through any single layer.
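A minimal sketch of that layered approach follows. The `DedupStore` interface, the `op-` key prefix, and the method names are illustrative assumptions, not any specific library's API:

```typescript
/**
 * Defense-in-depth deduplication sketch: an in-memory cache (layer 1),
 * a durable store (layer 2), and an idempotency key passed to the
 * external call (layer 3). All names here are hypothetical.
 */
interface DedupStore {
  wasProcessed(id: string): Promise<boolean>;
  markProcessed(id: string): Promise<void>;
}

class LayeredDeduplicator {
  private seen = new Set<string>(); // Layer 1: fast, process-local

  constructor(private store: DedupStore) {}

  /** Returns true if `act` ran, false if the message was a duplicate. */
  async handle(
    messageId: string,
    act: (idempotencyKey: string) => Promise<void>
  ): Promise<boolean> {
    if (this.seen.has(messageId)) return false;     // Layer 1: memory
    if (await this.store.wasProcessed(messageId)) { // Layer 2: durable
      this.seen.add(messageId);
      return false;
    }
    // Layer 3: even if both checks race, the external system can
    // deduplicate on the idempotency key we pass it.
    await act(`op-${messageId}`);
    await this.store.markProcessed(messageId);
    this.seen.add(messageId);
    return true;
  }
}
```

Each layer catches duplicates the cheaper layers miss: the memory set avoids a store round-trip, the durable store survives restarts, and the idempotency key covers the residual race window.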
Achieving effectively exactly-once semantics comes with significant costs. Understanding these trade-offs is essential for making informed architectural decisions.
| Aspect | At-Least-Once | Effectively Exactly-Once | Impact |
|---|---|---|---|
| Latency | Low (simple ack) | Higher (dedup lookup + transaction) | 1.5-3x slower per message |
| Throughput | High | Lower (transaction overhead) | 30-50% reduction typical |
| Storage | Messages only | Messages + dedup records | Significant storage growth |
| Complexity | Consumer only | Consumer + dedup infrastructure | More moving parts |
| Consistency Window | N/A | Dedup retention period | Trade-off between storage and safety |
| Failure Modes | Duplicates | Rare duplicates possible | Not truly eliminated, just reduced |
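The storage row in the table above can be made concrete with back-of-envelope arithmetic. The numbers below (message rate, retention window, bytes per record) are illustrative assumptions, not measurements:

```typescript
/**
 * Back-of-envelope sizing for a deduplication store:
 * message rate × retention window × bytes per dedup record.
 */
function dedupStorageBytes(
  messagesPerSecond: number,
  retentionHours: number,
  bytesPerRecord: number
): number {
  return messagesPerSecond * retentionHours * 3600 * bytesPerRecord;
}

// Example: 1,000 msg/s, a 24-hour dedup window, and an assumed
// ~64 bytes per record (message ID, timestamp, index overhead).
const bytes = dedupStorageBytes(1000, 24, 64);
console.log(`${(bytes / 1e9).toFixed(1)} GB`); // ≈ 5.5 GB per 24h window
```

Shortening the retention window shrinks the store linearly, but also shrinks the window in which a late-arriving duplicate can be detected, which is the consistency-window trade-off the table names.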
Kafka EOS Performance Impact:
Kafka's exactly-once semantics have measurable performance costs:
```typescript
/**
 * Kafka Exactly-Once Semantics Performance Impact
 *
 * Based on real-world measurements:
 */
const performanceMetrics = {
  // Idempotent Producer (vs non-idempotent)
  idempotentProducer: {
    latencyIncrease: '5-10%',  // Near negligible
    throughputImpact: '~0%',   // No significant impact
    reason: 'Sequence number tracking is lightweight',
  },

  // Transactional Producer (vs non-transactional)
  transactionalProducer: {
    latencyIncrease: '20-50%',  // Transaction overhead
    throughputImpact: '20-40%', // Commit/abort protocol
    reason: 'Two-phase commit with transaction coordinator',
  },

  // Consume-Transform-Produce with EOS
  fullEOS: {
    latencyIncrease: '30-80%',  // Full transaction path
    throughputImpact: '30-50%', // Depends on message size
    additionalConsiderations: [
      'Transaction coordinator becomes a bottleneck at scale',
      'Longer processing times = longer transaction windows',
      'Broker resource usage increases',
      'Consumer group rebalancing becomes more complex',
    ],
  },
};

/**
 * When EOS overhead is acceptable:
 * - Correctness is more valuable than raw throughput
 * - Message value justifies the overhead
 * - System isn't at extreme scale
 *
 * When to reconsider EOS:
 * - Processing millions of messages/second
 * - Low-latency requirements (< 10ms)
 * - Messages are low-value (metrics, logs)
 */
```

Not every message needs exactly-once processing. Apply the strongest guarantee where it matters most (financial transactions, user-facing operations) and use simpler at-least-once with tolerance for duplicates where it's acceptable (analytics, logging, internal state updates).
Exactly-once delivery is a nuanced topic that requires careful understanding. Let's consolidate the key insights:

- True exactly-once delivery is impossible: the Two Generals Problem shows that the final message of any protocol can never be confirmed.
- Exactly-once processing is achievable at the application layer through deduplication and idempotent operations.
- Vendor claims of 'exactly-once' (as in Kafka or Pulsar) mean exactly-once processing within their ecosystem, not exactly-once delivery.
- External side effects require idempotency keys or check-before-act patterns, because external systems don't participate in the broker's transactions.
- These guarantees carry real costs in latency, throughput, and storage; apply them where correctness justifies the overhead.
You now understand the challenges of exactly-once delivery—why true exactly-once is impossible, what systems actually provide, and how to achieve effectively exactly-once semantics. In the next page, we'll dive deep into idempotent consumers—the critical pattern that makes exactly-once processing possible.