We've explored at-most-once (messages may be lost) and at-least-once (messages may be duplicated). The obvious next question is: Can we achieve exactly-once delivery—where every message is delivered precisely one time, with no losses and no duplicates?
The answer is nuanced—and understanding this nuance is critical for designing distributed systems correctly.
The short answer: True exactly-once delivery is impossible in a distributed system. This isn't a limitation of current technology—it's a fundamental property of distributed computing, proven mathematically. However, we can achieve exactly-once processing semantics through careful design, which for most practical purposes is equivalent.
This page explores why exactly-once delivery is impossible, what we can actually achieve, and the patterns that let us build systems that behave as if exactly-once were possible.
By the end of this page, you will understand the fundamental impossibility of exactly-once delivery, the Two Generals Problem that underlies this impossibility, the distinction between delivery and processing semantics, and the practical patterns that achieve effectively exactly-once behavior. You will be able to critically evaluate vendor claims of 'exactly-once delivery' and design systems accordingly.
To understand why exactly-once delivery is impossible, we must understand a fundamental problem in distributed systems: The Two Generals Problem.
The Scenario:
Two armies (let's call them Army A and Army B) are positioned on opposite sides of a valley, planning to attack an enemy city in the middle. They can only communicate by sending messengers through the valley—but messengers can be captured (lost) by the enemy. If both armies attack together, they win. If only one attacks, they lose.
For the attack to succeed:

1. Army A sends a messenger to Army B: "Attack at dawn."
2. Army B must confirm receipt, since Army A won't risk attacking alone.
3. Army A must confirm that it received Army B's confirmation, since Army B won't risk attacking alone either.
Here's the problem: How does Army A know that Army B received the confirmation that Army A received their confirmation?
The Infinite Regress:

1. Army A sends: "Attack at dawn."
2. Army B replies: "Confirmed."
3. Army A replies: "I received your confirmation."
4. Army B replies: "I received your confirmation of my confirmation."
5. ...

This continues infinitely. There is always a final message whose receipt cannot be confirmed. No matter how many acknowledgments you add, you can never reach certain agreement.
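The sender's dilemma behind this regress can be made concrete with a small Monte Carlo simulation (an illustration added here, not part of the formal argument): never retrying risks losing the message, while retrying until an acknowledgment arrives risks duplicating it, because a message that arrived but whose ACK was lost gets resent.

```typescript
/**
 * Illustrative simulation of the sender's dilemma on a lossy link.
 * "Never retry" loses messages; "retry until ACKed" duplicates them.
 */
function simulate(lossRate: number, retries: number, trials: number) {
  let lost = 0;
  let duplicated = 0;
  for (let t = 0; t < trials; t++) {
    let deliveries = 0;
    for (let attempt = 0; attempt <= retries; attempt++) {
      const delivered = Math.random() >= lossRate;  // message crosses the valley?
      if (delivered) deliveries++;
      const ackArrived = delivered && Math.random() >= lossRate; // ACK crosses back?
      if (ackArrived) break; // the sender stops ONLY when it sees an ACK
    }
    if (deliveries === 0) lost++;
    if (deliveries > 1) duplicated++;
  }
  return { lost: lost / trials, duplicated: duplicated / trials };
}

// No retries: some sends are simply lost, but never duplicated.
console.log(simulate(0.1, 0, 100_000));
// Aggressive retries: losses nearly vanish, but duplicates appear whenever
// a message arrived and only its ACK was lost.
console.log(simulate(0.1, 10, 100_000));
```

No retry budget makes both numbers zero at once: lowering `retries` trades duplicates for losses, and raising it trades losses for duplicates.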
The Mathematical Proof:
This was formally proven: no deterministic protocol can guarantee consensus in a system with even a single unreliable communication link. The proof uses a simple argument:

1. Suppose, for contradiction, that some protocol guarantees agreement, and consider the one that does so with the fewest messages.
2. The final message of that protocol can be lost, and its sender cannot act differently based on whether it arrived.
3. If the protocol still guarantees agreement when that final message is lost, the message was unnecessary, contradicting minimality.
4. If it does not, the protocol fails to guarantee agreement, contradicting the assumption.

Either way, no such protocol exists.
This proof applies directly to exactly-once delivery.
```typescript
/**
 * Why exactly-once delivery is impossible:
 * A code demonstration of the Two Generals Problem
 */

interface Message {
  id: string;
  payload: unknown;
}

interface MessageResult {
  delivered: boolean;
  ackReceived: boolean;
}

class UnreliableNetwork {
  // Messages have a chance of being lost
  private lossRate: number = 0.1; // 10% loss

  send(message: any): boolean {
    return Math.random() > this.lossRate;
  }
}

class Producer {
  private network = new UnreliableNetwork();

  async sendExactlyOnce(message: Message): Promise<boolean> {
    // Attempt 1: Send the message
    const sent = this.network.send(message);

    if (!sent) {
      // We don't know if it was delivered
      // Option A: Retry → might cause duplicate
      // Option B: Don't retry → might be lost
      return false;
    }

    // Assume the broker received and persisted
    // Now wait for the acknowledgment
    const ackReceived = this.network.send({ type: 'ack' }); // ACK also uses the network!

    if (!ackReceived) {
      // Broker received and persisted (we know this as omniscient observers)
      // But the ACK was lost
      //
      // From the producer's perspective: Did it work?
      // We don't know!
      //
      // If we retry: DUPLICATE
      // If we don't retry: MAYBE LOST (we don't know)
      //
      // THIS is why exactly-once is impossible.
      // No amount of ACKs can resolve this uncertainty.
    }

    return ackReceived;
  }
}

/**
 * The fundamental insight:
 *
 * At the last step of any protocol, there is always a final message.
 * The sender of that message can NEVER know if it was received.
 *
 * Therefore:
 * - If we assume success → might be wrong (message lost)
 * - If we assume failure and retry → might cause duplicate
 *
 * There is no third option that guarantees exactly-once.
 */
```

The impossibility of exactly-once delivery is not a limitation of current technology—it's a mathematical truth about distributed systems with unreliable communication. No amount of engineering excellence can overcome it. Any system claiming 'exactly-once delivery' is either using a very specific definition, or is misleading you.
Understanding the distinction between delivery semantics and processing semantics is crucial for navigating claims about exactly-once support.
Delivery Semantics: How many times a message is transported from producer to consumer—the network layer.
Processing Semantics: How many times a message's effects are applied to the system state—the application layer.
The key insight: While exactly-once delivery is impossible, exactly-once processing is achievable through idempotency and deduplication.
| Aspect | Delivery Semantics | Processing Semantics |
|---|---|---|
| Definition | How many times message transits the network | How many times message effects are applied |
| Exactly-once possible? | No (Two Generals Problem) | Yes (with idempotency) |
| Responsibility | Messaging infrastructure | Application layer |
| Implementation | Retries, acknowledgments, persistence | Deduplication, idempotent operations |
| Example | Consumer receives message 3 times | Order created exactly once despite 3 deliveries |
```typescript
/**
 * Achieving exactly-once PROCESSING semantics
 * despite at-least-once DELIVERY semantics
 */

class OrderProcessor {
  private db: Database;
  private processedMessageIds: Set<string> = new Set();

  /**
   * This handler may be called MULTIPLE times for the same message
   * (at-least-once delivery) but the ORDER will only be created ONCE
   * (exactly-once processing)
   */
  async handleOrderMessage(message: OrderMessage): Promise<void> {
    // STEP 1: Check if we've processed this message before
    // This is the DEDUPLICATION step
    if (this.processedMessageIds.has(message.messageId)) {
      console.log(`Message ${message.messageId} already processed, skipping`);
      return; // Idempotent: second processing is a no-op
    }

    // Check the database for more durable deduplication
    const existingOrder = await this.db.orders.findUnique({
      where: { orderId: message.orderId }
    });

    if (existingOrder) {
      console.log(`Order ${message.orderId} already exists, skipping`);
      // Mark as processed so we skip faster next time
      this.processedMessageIds.add(message.messageId);
      return;
    }

    // STEP 2: Create the order (first time only)
    await this.db.orders.create({
      data: {
        orderId: message.orderId,
        customerId: message.customerId,
        items: message.items,
        processedFromMessageId: message.messageId, // Track for debugging
      }
    });

    // STEP 3: Mark as processed
    this.processedMessageIds.add(message.messageId);
    console.log(`Order ${message.orderId} created successfully`);
  }
}

/**
 * From the MESSAGE perspective (delivery):
 * - Message might be delivered 1, 2, or 3 times
 * - This is AT-LEAST-ONCE delivery
 *
 * From the ORDER perspective (processing):
 * - Order is created exactly once
 * - This is EXACTLY-ONCE processing
 *
 * The combination gives us the behavior we want:
 * - No order is lost (at-least-once)
 * - No duplicate orders (deduplication)
 */
```

When Kafka, Pulsar, or other systems advertise 'exactly-once', they mean exactly-once processing semantics within their ecosystem—achieved through idempotent producers, transactional consumers, and deduplication. They do NOT mean exactly-once delivery, which remains impossible. Always verify what a vendor means by the term.
While true exactly-once delivery is impossible, we can build systems that behave as if they were exactly-once through a combination of three techniques:

1. Deduplication: track which messages have already been processed and skip repeats.
2. Idempotent operations: design state changes so that applying them twice has the same effect as applying them once.
3. Atomic processing: make the state change and the deduplication record succeed or fail together.
Together, these give us effectively exactly-once (sometimes called effectively-once or exactly-once processing semantics).
```typescript
/**
 * Complete pattern for effective exactly-once semantics
 */

interface ProcessedMessage {
  messageId: string;
  processedAt: Date;
  result: 'success' | 'failure';
}

class EffectivelyExactlyOnceProcessor {
  constructor(private db: Database) {}

  /**
   * Process a message with effectively exactly-once semantics.
   *
   * This is achieved by:
   * 1. Deduplication check (filter already-processed messages)
   * 2. Idempotent state change (atomic with deduplication record)
   * 3. Record processing (prevent future reprocessing)
   */
  async processExactlyOnce<T>(
    messageId: string,
    processFunc: () => Promise<T>
  ): Promise<{ processed: boolean; result?: T }> {
    // STEP 1: Check if already processed (deduplication)
    const existing = await this.db.processedMessages.findUnique({
      where: { messageId }
    });

    if (existing) {
      // Duplicate detected - skip processing
      // This is what makes it "exactly-once" from the state perspective
      console.log(`Duplicate message ${messageId} detected`);
      return { processed: false };
    }

    // STEP 2: Process and record atomically
    // Use a transaction to ensure atomicity
    return await this.db.$transaction(async (tx) => {
      // Double-check within the transaction (handle race conditions)
      const doubleCheck = await tx.processedMessages.findUnique({
        where: { messageId }
      });

      if (doubleCheck) {
        return { processed: false }; // Race condition, skip
      }

      // Execute the actual business logic
      const result = await processFunc();

      // Record that we processed this message
      await tx.processedMessages.create({
        data: {
          messageId,
          processedAt: new Date(),
          result: 'success',
        }
      });

      return { processed: true, result };
    });
  }
}

// Usage in a consumer
const processor = new EffectivelyExactlyOnceProcessor(db);

consumer.on('message', async (message) => {
  const { processed, result } = await processor.processExactlyOnce(
    message.id,
    async () => {
      // This function runs exactly once per unique message.id
      // even if the message is delivered multiple times
      return await orderService.createOrder({
        orderId: message.payload.orderId,
        customerId: message.payload.customerId,
        items: message.payload.items,
      });
    }
  );

  if (processed) {
    console.log('Order created:', result);
  } else {
    console.log('Duplicate message, order already created');
  }

  // Acknowledge the message (even if duplicate - it's handled)
  await message.ack();
});
```

Note that the state change and the deduplication record must be atomic (in the same transaction). If they're not atomic, there's a window where the state changes but the deduplication record isn't written, potentially allowing a duplicate to slip through if a crash occurs at exactly the wrong time.
Apache Kafka introduced "exactly-once semantics" (EOS) in version 0.11. Understanding how Kafka achieves this provides valuable insight into what 'exactly-once' really means in practice.
Important Disclaimer: Kafka's EOS applies to Kafka-to-Kafka workflows—consuming from one topic and producing to another. It does NOT provide exactly-once guarantees for external side effects (API calls, database writes, etc.).
Kafka's Three Pillars of EOS:

1. Idempotent producers: the broker tracks producer IDs and sequence numbers, so producer retries never create duplicates in the log.
2. Transactions: writes to multiple partitions and topics are committed or aborted atomically.
3. Transactional consume-transform-produce: consumer offsets are committed in the same transaction as the produced output.
```typescript
import { Kafka } from 'kafkajs';

const kafka = new Kafka({
  clientId: 'order-service',
  brokers: ['kafka:9092'],
});

/**
 * Idempotent Producer
 *
 * The broker assigns a Producer ID (PID) and tracks sequence numbers.
 * If the producer retries and sends the same (PID, sequence) pair,
 * the broker detects the duplicate and ignores it.
 *
 * This solves: Producer retry causing duplicate messages
 */
const idempotentProducer = kafka.producer({
  // Enable idempotency
  idempotent: true,

  // Required for idempotency
  maxInFlightRequests: 5, // Max 5 (Kafka requirement)

  // Retry settings become safer with idempotency
  retry: {
    retries: Number.MAX_SAFE_INTEGER, // Safe to retry forever
  },
});

await idempotentProducer.connect();

// If this send is retried due to network issues,
// the broker will detect and deduplicate
await idempotentProducer.send({
  topic: 'orders',
  messages: [{ value: JSON.stringify(order) }],
});

/**
 * How it works:
 *
 * Producer gets PID=5 on startup
 *
 * Send message with sequence=1 → Broker stores, ACK
 * ACK lost, producer retries
 * Send message with sequence=1 again → Broker detects duplicate,
 *   returns success without re-storing
 *
 * Result: Message appears exactly once in the log
 */
```
```typescript
/**
 * Transactional Producer
 *
 * Allows producing to multiple partitions/topics atomically.
 * Either ALL messages are committed or NONE are.
 *
 * This solves: Partial writes (some messages succeed, some fail)
 */
const transactionalProducer = kafka.producer({
  transactionalId: 'order-processor-tx', // Unique ID for this producer
  idempotent: true, // Required for transactions
});

await transactionalProducer.connect();

async function processOrderBatch(orders: Order[]): Promise<void> {
  // Begin the transaction
  const transaction = await transactionalProducer.transaction();

  try {
    // All these messages are part of the same transaction
    for (const order of orders) {
      await transaction.send({
        topic: 'processed-orders',
        messages: [{ value: JSON.stringify(order) }],
      });

      await transaction.send({
        topic: 'order-events',
        messages: [{
          value: JSON.stringify({ type: 'ORDER_PROCESSED', orderId: order.id })
        }],
      });
    }

    // Commit: ALL messages become visible atomically
    await transaction.commit();
  } catch (error) {
    // Abort: NONE of the messages become visible
    await transaction.abort();
    throw error;
  }
}

/**
 * Transactional guarantees:
 *
 * - All messages in a transaction are committed atomically
 * - Consumers with isolation.level=read_committed see only committed messages
 * - If the producer crashes mid-transaction, Kafka auto-aborts
 * - Recovery uses the transactionalId to resume or abort pending transactions
 */
```
```typescript
/**
 * The Complete Pattern: Consume-Transform-Produce
 *
 * This is where Kafka's exactly-once really shines.
 * We can consume, process, and produce in a single atomic transaction.
 */

const consumer = kafka.consumer({
  groupId: 'order-processor',
  readUncommitted: false, // Only read committed messages
});

const producer = kafka.producer({
  transactionalId: 'order-processor-tx',
  idempotent: true,
});

await consumer.connect();
await producer.connect();

await consumer.subscribe({ topic: 'raw-orders' });

await consumer.run({
  autoCommit: false, // CRITICAL: We handle commits in transactions
  eachMessage: async ({ topic, partition, message }) => {
    const transaction = await producer.transaction();

    try {
      // 1. Process the message
      const processedOrder = await processOrder(
        JSON.parse(message.value!.toString())
      );

      // 2. Produce the result to the output topic (part of the transaction)
      await transaction.send({
        topic: 'processed-orders',
        messages: [{ value: JSON.stringify(processedOrder) }],
      });

      // 3. Commit the consumer offset AS PART OF the transaction
      await transaction.sendOffsets({
        consumerGroupId: 'order-processor',
        topics: [{
          topic,
          partitions: [{
            partition,
            offset: (BigInt(message.offset) + 1n).toString(),
          }],
        }],
      });

      // 4. Commit everything atomically
      await transaction.commit();

      // At this point:
      // - The output message is visible
      // - The consumer offset is updated
      // Both or neither happen
    } catch (error) {
      await transaction.abort();
      // Nothing happened:
      // - No output message
      // - Consumer offset unchanged
      // - The message will be reprocessed
    }
  },
});

/**
 * EXACTLY-ONCE in Kafka-to-Kafka workflows:
 *
 * 1. Input message consumed
 * 2. Output message produced
 * 3. Input offset committed
 *
 * All three happen atomically.
 *
 * If crash after step 1: Offset not committed, message reprocessed
 * If crash after step 2: Transaction not committed, output not visible,
 *                        message reprocessed
 * If crash after step 3: Transaction committed, no reprocessing
 *
 * Result: Each input message produces exactly one output message
 */
```

Kafka's exactly-once applies ONLY within Kafka. If your consumer makes an HTTP call or writes to an external database, you're back to at-least-once. The external system doesn't participate in Kafka's transactions. For external side effects, you still need idempotency.
Most real-world systems don't just consume and produce messages—they also interact with external systems: databases, APIs, payment gateways, email services. These external side effects are where exactly-once becomes truly challenging.
The Core Problem:
External systems don't participate in your message broker's transaction protocol. When you make an API call or database write, you can't atomically commit that change along with your message acknowledgment.
```typescript
class PaymentProcessor {
  /**
   * This is the classic external side effect problem.
   *
   * We need to:
   * 1. Process a payment (external API call)
   * 2. Acknowledge the message (message broker)
   *
   * But we can't make these atomic!
   */
  async processPaymentMessage(message: PaymentMessage): Promise<void> {
    // STEP 1: Call the external payment API
    const paymentResult = await paymentGateway.charge({
      amount: message.amount,
      customerId: message.customerId,
      orderId: message.orderId,
    });

    // GAP: If we crash here, the payment was processed
    // but the message wasn't acknowledged.
    // The message will be redelivered.
    // The customer will be charged TWICE.

    // STEP 2: Acknowledge the message
    await message.ack();
  }
}

/**
 * The failure modes:
 *
 * Scenario A: Crash after payment, before ack
 * - Payment processed
 * - Message not acknowledged
 * - Message redelivered
 * - Payment processed AGAIN → DUPLICATE!
 *
 * Scenario B: Ack before payment (at-most-once approach)
 * - Message acknowledged
 * - Crash before payment
 * - Payment never processed → LOST!
 *
 * There's no ordering that makes this atomic.
 * The payment API doesn't know about our message acknowledgment.
 */
```

Solutions for External Side Effects:
There are several patterns to handle external side effects while maintaining effectively exactly-once semantics:
```typescript
class IdempotentPaymentProcessor {
  /**
   * Use idempotency keys to make external API calls safe.
   *
   * The payment gateway tracks idempotency keys and ensures
   * the same key results in the same response, without duplicate charges.
   */
  async processPaymentMessage(message: PaymentMessage): Promise<void> {
    // Use the message ID as the idempotency key.
    // Same message = same idempotency key = no duplicate charge
    const idempotencyKey = `payment-${message.messageId}`;

    // Call the payment API with the idempotency key
    const paymentResult = await paymentGateway.charge({
      amount: message.amount,
      customerId: message.customerId,
      orderId: message.orderId,
      idempotencyKey, // ← The crucial field
    });

    // Even if this message is redelivered and we call the API again,
    // the payment gateway will recognize the idempotency key and
    // return the original result without charging again.
    await message.ack();
  }
}

/**
 * How idempotency keys work at the payment gateway:
 *
 * First call with key "payment-msg-123":
 * - Process payment
 * - Store result keyed by "payment-msg-123"
 * - Return success
 *
 * Second call with key "payment-msg-123":
 * - Look up "payment-msg-123"
 * - Found! Return cached result
 * - No duplicate charge
 *
 * Popular APIs that support idempotency keys:
 * - Stripe (Idempotency-Key header)
 * - PayPal
 * - Square
 * - Many modern payment processors
 */
```
```typescript
class SafeOrderProcessor {
  private shippingService: ShippingService;

  /**
   * Check external state before acting.
   *
   * This pattern queries the external system to determine
   * if the action was already performed.
   */
  async processOrderMessage(message: OrderMessage): Promise<void> {
    // Check if the order was already processed.
    // This could be a local database or the external system.
    const existingShipment = await this.shippingService.findByOrderId(
      message.orderId
    );

    if (existingShipment) {
      // Already processed - this is a duplicate message
      console.log(`Order ${message.orderId} already shipped`);
      await message.ack();
      return;
    }

    // Not yet processed - safe to proceed
    await this.shippingService.createShipment({
      orderId: message.orderId,
      items: message.items,
      address: message.shippingAddress,
    });

    await message.ack();
  }
}

/**
 * Trade-offs of check-before-act:
 *
 * Pros:
 * - Simple to understand
 * - Works with any external system
 *
 * Cons:
 * - Race condition window between check and act
 * - Extra latency for the check call
 * - Requires the external system to support querying for duplicates
 */
```

Production systems often combine multiple patterns: idempotency keys for APIs that support them, check-before-act for others, and local deduplication tracking as a first line of defense. Defense in depth reduces the probability of duplicates slipping through any single layer.
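A minimal sketch of that layered approach follows. The `DedupStore` interface, the `op-` key prefix, and the method names are illustrative assumptions, not any specific library's API:

```typescript
/**
 * Defense-in-depth deduplication sketch: an in-memory cache (layer 1),
 * a durable store (layer 2), and an idempotency key passed to the
 * external call (layer 3). All names here are hypothetical.
 */
interface DedupStore {
  wasProcessed(id: string): Promise<boolean>;
  markProcessed(id: string): Promise<void>;
}

class LayeredDeduplicator {
  private seen = new Set<string>(); // Layer 1: fast, process-local

  constructor(private store: DedupStore) {}

  /** Returns true if `act` ran, false if the message was a duplicate. */
  async handle(
    messageId: string,
    act: (idempotencyKey: string) => Promise<void>
  ): Promise<boolean> {
    if (this.seen.has(messageId)) return false;     // Layer 1: memory
    if (await this.store.wasProcessed(messageId)) { // Layer 2: durable
      this.seen.add(messageId);
      return false;
    }
    // Layer 3: even if both checks race, the external system can
    // deduplicate on the idempotency key we pass it.
    await act(`op-${messageId}`);
    await this.store.markProcessed(messageId);
    this.seen.add(messageId);
    return true;
  }
}
```

Each layer catches duplicates the cheaper layers miss: the memory set avoids a store round-trip, the durable store survives restarts, and the idempotency key covers the residual race window.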
Achieving effectively exactly-once semantics comes with significant costs. Understanding these trade-offs is essential for making informed architectural decisions.
| Aspect | At-Least-Once | Effectively Exactly-Once | Impact |
|---|---|---|---|
| Latency | Low (simple ack) | Higher (dedup lookup + transaction) | 1.5-3x slower per message |
| Throughput | High | Lower (transaction overhead) | 30-50% reduction typical |
| Storage | Messages only | Messages + dedup records | Significant storage growth |
| Complexity | Consumer only | Consumer + dedup infrastructure | More moving parts |
| Consistency Window | N/A | Dedup retention period | Trade-off between storage and safety |
| Failure Modes | Duplicates | Rare duplicates possible | Not truly eliminated, just reduced |
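The storage row in the table above can be made concrete with back-of-envelope arithmetic. The numbers below (message rate, retention window, bytes per record) are illustrative assumptions, not measurements:

```typescript
/**
 * Back-of-envelope sizing for a deduplication store:
 * message rate × retention window × bytes per dedup record.
 */
function dedupStorageBytes(
  messagesPerSecond: number,
  retentionHours: number,
  bytesPerRecord: number
): number {
  return messagesPerSecond * retentionHours * 3600 * bytesPerRecord;
}

// Example: 1,000 msg/s, a 24-hour dedup window, and an assumed
// ~64 bytes per record (message ID, timestamp, index overhead).
const bytes = dedupStorageBytes(1000, 24, 64);
console.log(`${(bytes / 1e9).toFixed(1)} GB`); // ≈ 5.5 GB per 24h window
```

Shortening the retention window shrinks the store linearly, but also shrinks the window in which a late-arriving duplicate can be detected, which is the consistency-window trade-off the table names.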
Kafka EOS Performance Impact:
Kafka's exactly-once semantics have measurable performance costs:
```typescript
/**
 * Kafka Exactly-Once Semantics Performance Impact
 *
 * Based on real-world measurements:
 */
const performanceMetrics = {
  // Idempotent Producer (vs non-idempotent)
  idempotentProducer: {
    latencyIncrease: '5-10%',  // Near negligible
    throughputImpact: '~0%',   // No significant impact
    reason: 'Sequence number tracking is lightweight',
  },

  // Transactional Producer (vs non-transactional)
  transactionalProducer: {
    latencyIncrease: '20-50%',  // Transaction overhead
    throughputImpact: '20-40%', // Commit/abort protocol
    reason: 'Two-phase commit with transaction coordinator',
  },

  // Consume-Transform-Produce with EOS
  fullEOS: {
    latencyIncrease: '30-80%',  // Full transaction path
    throughputImpact: '30-50%', // Depends on message size
    additionalConsiderations: [
      'Transaction coordinator becomes a bottleneck at scale',
      'Longer processing times = longer transaction windows',
      'Broker resource usage increases',
      'Consumer group rebalancing becomes more complex',
    ],
  },
};

/**
 * When EOS overhead is acceptable:
 * - Correctness is more valuable than raw throughput
 * - Message value justifies the overhead
 * - System isn't at extreme scale
 *
 * When to reconsider EOS:
 * - Processing millions of messages/second
 * - Low-latency requirements (< 10ms)
 * - Messages are low-value (metrics, logs)
 */
```

Not every message needs exactly-once processing. Apply the strongest guarantee where it matters most (financial transactions, user-facing operations) and use simpler at-least-once with tolerance for duplicates where it's acceptable (analytics, logging, internal state updates).
Exactly-once delivery is a nuanced topic that requires careful understanding. Let's consolidate the key insights:

- True exactly-once delivery is impossible: the Two Generals Problem shows that the final message of any protocol can never be confirmed.
- Exactly-once processing is achievable at the application layer through deduplication and idempotent operations.
- Vendor claims of 'exactly-once' (as in Kafka or Pulsar) mean exactly-once processing within their ecosystem, not exactly-once delivery.
- External side effects require idempotency keys or check-before-act patterns, because external systems don't participate in the broker's transactions.
- These guarantees carry real costs in latency, throughput, and storage; apply them where correctness justifies the overhead.
You now understand the challenges of exactly-once delivery—why true exactly-once is impossible, what systems actually provide, and how to achieve effectively exactly-once semantics. In the next page, we'll dive deep into idempotent consumers—the critical pattern that makes exactly-once processing possible.