In distributed systems, one of the most fundamental questions we face is deceptively simple: When you send a message from one service to another, how many times will that message be delivered?
This question may seem trivial—after all, you send a message, and the recipient gets it, right? But in the presence of network failures, process crashes, and the inherent unreliability of distributed systems, the answer becomes far more nuanced. The properties you can guarantee around message delivery define the character of your entire system architecture.
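Delivery guarantees fall into three fundamental categories, each representing a distinct trade-off between reliability, complexity, and performance:

- At-most-once: each message is delivered zero or one time. It may be lost, but it is never duplicated.
- At-least-once: each message is delivered one or more times. It is never lost, but it may be duplicated.
- Exactly-once: each message is delivered exactly one time.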
This page focuses exclusively on at-most-once delivery—the simplest, fastest, but also the most 'lossy' of these guarantees.
By the end of this page, you will deeply understand the at-most-once delivery semantic—its guarantees, its failure modes, why it exists, when to use it, and how it relates to the broader landscape of distributed systems reliability. You will be able to articulate precisely why at-most-once is the right choice in specific scenarios and why it's dangerously inappropriate in others.
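At-most-once delivery provides a guarantee that every message will be delivered either zero or one time, never more. This means:

- A message may never be delivered at all (it can be lost).
- A message that is delivered is never delivered a second time (no duplicates).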
This semantic is sometimes called "fire-and-forget" because the producer sends the message and immediately moves on, without waiting for confirmation of successful processing.
```typescript
// At-most-once delivery: Producer sends without confirmation
class AtMostOnceProducer {
  private broker: MessageBroker;

  constructor(broker: MessageBroker) {
    this.broker = broker;
  }

  /**
   * Send a message with at-most-once semantics.
   *
   * Key characteristics:
   * - No acknowledgment awaited
   * - No retry on failure
   * - Message may be lost
   * - Extremely fast
   */
  send(topic: string, message: Message): void {
    // Fire and forget - no waiting for acknowledgment
    this.broker.publish(topic, message);

    // Producer immediately continues
    // No way to know if message was received
    // No retry mechanism
  }
}

// Usage - the producer doesn't know if this succeeded
producer.send("metrics-topic", {
  type: "cpu_usage",
  value: 45.2,
  timestamp: Date.now()
});

// Execution continues immediately regardless of delivery success
```

The formal definition of at-most-once semantics can be expressed as:
For any message M sent by producer P to consumer C, let D(M) represent the number of times M is delivered to C. At-most-once guarantees: 0 ≤ D(M) ≤ 1.
This is the weakest delivery guarantee possible. It makes no promises about successful delivery—only that if delivery happens, it happens at most once.
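The bound 0 ≤ D(M) ≤ 1 can be checked mechanically against a record of deliveries. The following is a minimal sketch, assuming a hypothetical delivery log that lists the ID of every message handed to the consumer; the names `countDeliveries` and `satisfiesAtMostOnce` are illustrative and do not come from any particular messaging library.

```typescript
// Hypothetical delivery log: one entry per delivery, holding the message ID.
type DeliveryLog = string[];

/** Compute D(M) for every sent message M. */
function countDeliveries(sent: string[], log: DeliveryLog): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const id of sent) counts[id] = 0;
  for (const id of log) counts[id] = (counts[id] ?? 0) + 1;
  return counts;
}

/** True when every message satisfies 0 <= D(M) <= 1. */
function satisfiesAtMostOnce(sent: string[], log: DeliveryLog): boolean {
  const counts = countDeliveries(sent, log);
  return Object.keys(counts).every((id) => counts[id] <= 1);
}

const sent = ["m1", "m2", "m3"];
const lossyLog = ["m1", "m3"];           // m2 was lost: still at-most-once
const duplicateLog = ["m1", "m2", "m2"]; // m2 delivered twice: violation

console.log(satisfiesAtMostOnce(sent, lossyLog));     // true
console.log(satisfiesAtMostOnce(sent, duplicateLog)); // false
```

Note that a log with no deliveries at all also passes the check, which is exactly why this guarantee says nothing about successful delivery.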
The naming convention 'at-most-once' emphasizes what the guarantee prevents (multiple deliveries) rather than what it ensures (any delivery at all). This phrasing is deliberate—it signals that the system is optimizing for preventing duplicates, accepting the possibility of message loss as a trade-off.
To understand why at-most-once delivery exists and when it's appropriate, we need to examine the mechanics of message transmission in distributed systems.
The message lifecycle in a typical messaging system involves several steps:
At-most-once delivery typically works by acknowledging the message before (or instead of) processing it, or by not requiring acknowledgment at all.
Implementation Pattern 1: Fire-and-Forget (No ACK)
The simplest at-most-once implementation uses UDP-style semantics—no acknowledgment mechanism exists at all. The producer sends the message and the broker (if present) delivers it without tracking whether the consumer received it.
```typescript
// Pattern 1: No acknowledgment at all
class FireAndForgetBroker {
  private consumers: Map<string, Consumer[]> = new Map();

  publish(topic: string, message: Message): void {
    const topicConsumers = this.consumers.get(topic) || [];

    for (const consumer of topicConsumers) {
      // Send and immediately forget
      // No tracking, no retry, no delivery confirmation
      try {
        consumer.receive(message);
      } catch (error) {
        // Log and continue - message is lost
        // (cast needed because catch variables are `unknown` under strict settings)
        console.warn(`Delivery failed: ${(error as Error).message}`);
      }
    }
    // Message is discarded after single delivery attempt
  }
}
```

Implementation Pattern 2: Pre-Processing Acknowledgment
In this pattern, the consumer acknowledges receipt before it begins processing. If the consumer crashes during processing, the message is lost because the broker already considers it delivered.
```typescript
// Pattern 2: Acknowledge before processing
class PreAckConsumer {
  private broker: MessageBroker;

  async consumeMessage(): Promise<void> {
    const message = await this.broker.receive();

    // CRITICAL: Acknowledge BEFORE processing
    // If we crash after this, message is lost
    await this.broker.acknowledge(message.id);

    // Now process the message
    // If this throws, the message is already acknowledged
    // and will NOT be redelivered
    try {
      await this.processMessage(message);
    } catch (error) {
      // Message processing failed but it's already ACK'd
      // Message is effectively lost - no retry will occur
      console.error(`Processing failed, message lost: ${(error as Error).message}`);
    }
  }

  private async processMessage(message: Message): Promise<void> {
    // Business logic here
    // If this crashes, message is gone forever
  }
}
```

The critical vulnerability in at-most-once delivery is the crash window: the period between acknowledgment and processing completion. Any failure during this window results in permanent message loss. The larger this window, the higher the probability of data loss.
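To make the crash window concrete, here is a small simulation. It is only a sketch: `InMemoryQueue` and `consumeWithPreAck` are hypothetical stand-ins for a real broker and consumer, and a thrown error plays the role of a crash during processing.

```typescript
// Hypothetical in-memory queue, used only for illustration.
class InMemoryQueue {
  private pending: string[];
  private unacked: string[] = [];

  constructor(messages: string[]) {
    this.pending = messages.slice();
  }

  receive(): string | undefined {
    const message = this.pending.shift();
    if (message !== undefined) this.unacked.push(message);
    return message;
  }

  acknowledge(message: string): void {
    this.unacked = this.unacked.filter((m) => m !== message);
  }

  /** Re-queue anything never acknowledged; returns how many were re-queued. */
  redeliverUnacked(): number {
    const count = this.unacked.length;
    this.pending.push(...this.unacked);
    this.unacked = [];
    return count;
  }
}

/** Drain the queue, acknowledging each message BEFORE handling it. */
function consumeWithPreAck(queue: InMemoryQueue, handler: (m: string) => void): string[] {
  const lost: string[] = [];
  let message = queue.receive();
  while (message !== undefined) {
    queue.acknowledge(message); // the crash window opens here
    try {
      handler(message);
    } catch {
      lost.push(message); // already acknowledged, so nothing will redeliver it
    }
    message = queue.receive();
  }
  return lost;
}

const processed: string[] = [];
const queue = new InMemoryQueue(["m1", "m2", "m3"]);
const lost = consumeWithPreAck(queue, (m) => {
  if (m === "m2") throw new Error("simulated crash during processing");
  processed.push(m);
});
const requeued = queue.redeliverUnacked();

console.log({ processed, lost, requeued });
// processed: ["m1", "m3"], lost: ["m2"], requeued: 0
```

Because `m2` was acknowledged before the handler failed, the queue has nothing left to redeliver. Moving the `acknowledge` call after the handler would close the window, at the price of possible duplicates.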
Understanding where and how messages can be lost in at-most-once systems is crucial for making informed architectural decisions. Let's examine each point of failure in the message pipeline.
| Failure Point | Failure Scenario | Message Outcome | Recovery Possibility |
|---|---|---|---|
| Producer→Broker Network | Network partition, timeout, packet loss | Message never reaches broker | None - producer doesn't know it failed |
| Broker Storage | Broker crash before persisting (if not durable) | Message vanishes | None - no persistence means no recovery |
| Broker→Consumer Network | Network failure after broker sends | Message in flight is lost | None - broker already discarded message |
| Consumer Pre-ACK Crash | Consumer crashes after ACK but before processing | Message marked delivered but unprocessed | None - broker believes delivery succeeded |
| Consumer Processing | Exception during business logic after ACK | Business state inconsistent | Manual intervention required |
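Because a message has to survive every stage in the table, the per-stage loss probabilities compound. The sketch below assumes the stages fail independently, which real systems only approximate, and the per-stage rates are made-up illustrative numbers rather than measurements.

```typescript
/**
 * Probability that a message is lost somewhere in the pipeline,
 * assuming each stage loses it independently with the given probability.
 */
function endToEndLossRate(stageLossRates: number[]): number {
  // The message survives only if it survives every stage.
  const survival = stageLossRates.reduce((acc, p) => acc * (1 - p), 1);
  return 1 - survival;
}

// Illustrative (assumed) loss rates, one per row of the table above.
const stages = [0.0002, 0.0001, 0.0002, 0.0003, 0.0002];
const total = endToEndLossRate(stages);

console.log(`End-to-end loss: ${(total * 100).toFixed(3)}%`);
```

For small rates the result is close to the simple sum of the stage rates, so each additional hop in the pipeline adds roughly its own loss rate to the total.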
Quantifying Loss Probability
In a well-designed system with reasonable network reliability, message loss rates for at-most-once delivery typically range from 0.01% to 1%, depending on:
This may sound acceptable, but at scale, the numbers become significant:
```typescript
/**
 * Calculate expected message loss for at-most-once delivery
 *
 * Assume 0.1% loss rate (optimistic for production systems)
 */
function calculateExpectedLoss(messagesPerDay: number, lossRate: number = 0.001): void {
  const dailyLoss = messagesPerDay * lossRate;
  const monthlyLoss = dailyLoss * 30;
  const yearlyLoss = dailyLoss * 365;

  console.log(`At ${messagesPerDay.toLocaleString()} messages/day with ${lossRate * 100}% loss rate:`);
  console.log(`  Daily loss: ${dailyLoss.toLocaleString()} messages`);
  console.log(`  Monthly loss: ${monthlyLoss.toLocaleString()} messages`);
  console.log(`  Yearly loss: ${yearlyLoss.toLocaleString()} messages`);
}

// Small system: 100K messages/day
calculateExpectedLoss(100_000);
// Daily loss: 100 messages
// Monthly loss: 3,000 messages
// Yearly loss: 36,500 messages

// Medium system: 10M messages/day
calculateExpectedLoss(10_000_000);
// Daily loss: 10,000 messages
// Monthly loss: 300,000 messages
// Yearly loss: 3,650,000 messages

// Large system: 1B messages/day
calculateExpectedLoss(1_000_000_000);
// Daily loss: 1,000,000 messages
// Monthly loss: 30,000,000 messages
// Yearly loss: 365,000,000 messages
```

A 0.1% loss rate means losing 10,000 messages per day at 10 million messages/day throughput. If those messages represent financial transactions, user data, or critical business events, at-most-once delivery is catastrophically inappropriate. The choice of delivery semantic must always be evaluated against the business impact of message loss.
Given the significant risk of data loss, why would anyone use at-most-once delivery? The answer lies in two compelling advantages: performance and simplicity.
Performance Advantages:
At-most-once delivery is dramatically faster than alternatives because it eliminates several expensive operations:
```typescript
// Latency comparison for different delivery semantics
interface LatencyBreakdown {
  networkLatency: number;      // One-way network time
  serializationTime: number;   // Message encoding
  brokerPersistence?: number;  // Disk write time
  ackRoundTrip?: number;       // Acknowledgment latency
  retryOverhead?: number;      // Average retry cost
}

// At-Most-Once Delivery
const atMostOnceLatency: LatencyBreakdown = {
  networkLatency: 1,           // 1ms
  serializationTime: 0.1,      // 0.1ms
  // No persistence
  // No acknowledgment
  // No retry
};
// Total: ~1.1ms per message

// At-Least-Once Delivery
const atLeastOnceLatency: LatencyBreakdown = {
  networkLatency: 1,           // 1ms
  serializationTime: 0.1,      // 0.1ms
  brokerPersistence: 5,        // 5ms (SSD fsync)
  ackRoundTrip: 2,             // 2ms (network round-trip)
  retryOverhead: 0.3,          // 0.3ms (amortized retry cost)
};
// Total: ~8.4ms per message

// At-most-once is approximately 7.6x faster in this example
const speedup = (1 + 0.1 + 5 + 2 + 0.3) / (1 + 0.1);
console.log(`At-most-once is ~${speedup.toFixed(1)}x faster`);
```

Simplicity Advantages:
Beyond raw performance, at-most-once systems are fundamentally simpler to implement, operate, and reason about:
At-most-once isn't a 'bad' delivery guarantee—it's the correct choice when message loss is acceptable and performance is critical. The key is understanding when those conditions apply.
At-most-once delivery is the correct choice in scenarios where:
Let's examine each category with concrete examples:
```typescript
// At-most-once is perfect for high-frequency metrics
class MetricsCollector {
  private broker: AtMostOnceBroker;
  private sampleRate: number = 0.01; // 1% sampling

  /**
   * Collect request latency metric.
   *
   * Perfect for at-most-once because:
   * 1. We sample anyway (don't need 100% of data)
   * 2. Aggregate statistics tolerant to missing points
   * 3. High volume (millions/sec) - can't afford ACK overhead
   * 4. Next measurement replaces value of lost one
   */
  recordLatency(endpoint: string, latencyMs: number): void {
    // Sample to reduce volume
    if (Math.random() > this.sampleRate) {
      return;
    }

    // Fire-and-forget - perfectly appropriate here
    this.broker.send("metrics.latency", {
      endpoint,
      latencyMs,
      timestamp: Date.now(),
    });

    // Don't wait, don't retry, continue serving requests
  }
}

// This is called on every request - performance is critical
// Losing 0.1% of already-sampled metrics is completely acceptable
```

All appropriate at-most-once use cases share a common property: the business value of complete delivery is less than the cost of ensuring it. When you can ask 'Would losing 1 in 1,000 of these messages cause business harm?' and the answer is 'no', at-most-once is likely appropriate.
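The "1 in 1,000" question can be tested numerically for aggregate metrics. The sketch below uses synthetic latency values and a deterministic drop pattern, both assumptions chosen so the example is reproducible; it illustrates why an average barely moves under that level of loss and is not a statement about any real workload.

```typescript
function mean(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Synthetic latencies in ms: a repeating pattern covering 20..119 (illustrative only).
const samples: number[] = [];
for (let i = 0; i < 100_000; i++) {
  samples.push(20 + ((i * 37) % 100));
}

// Simulate at-most-once loss by dropping roughly 1 message in 1,000.
const delivered = samples.filter((_, i) => i % 997 !== 0);

const relativeError = Math.abs(mean(delivered) - mean(samples)) / mean(samples);

console.log(`Dropped ${samples.length - delivered.length} of ${samples.length} samples`);
console.log(`Relative error in mean latency: ${(relativeError * 100).toFixed(4)}%`);
```

The same experiment on per-message business events, such as orders, would be meaningless: no aggregate hides the fact that one specific customer's message is gone.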
Equally important is understanding when at-most-once delivery is dangerous and inappropriate. Using at-most-once for critical messages is a common source of production incidents and data loss.
```typescript
// ❌ ANTI-PATTERN: At-most-once for critical business events
class PaymentProcessor {
  private broker: AtMostOnceBroker;
  private paymentGateway: PaymentGateway; // illustrative type, not defined here

  // THIS IS DANGEROUS!
  async processPayment(order: Order, payment: Payment): Promise<void> {
    // Charge the customer
    await this.paymentGateway.charge(payment);

    // Fire and forget the fulfillment event
    // ❌ If this message is lost:
    //    - Customer is charged
    //    - Order is never fulfilled
    //    - No automatic recovery
    //    - Manual investigation required
    this.broker.send("fulfillment.orders", {
      orderId: order.id,
      customerId: order.customerId,
      items: order.items,
    });

    // Customer is charged but may never receive their order!
  }
}

// ✅ CORRECT: Use at-least-once with idempotent consumer
class ReliablePaymentProcessor {
  private broker: AtLeastOnceBroker;
  private paymentGateway: PaymentGateway;           // illustrative type, not defined here
  private transactionalOutbox: TransactionalOutbox; // illustrative type, not defined here

  async processPayment(order: Order, payment: Payment): Promise<void> {
    // Use transactional outbox pattern (covered in Module 4)
    await this.transactionalOutbox.executeWithEvent(
      () => this.paymentGateway.charge(payment),
      {
        topic: "fulfillment.orders",
        message: { orderId: order.id, customerId: order.customerId, items: order.items }
      }
    );
    // Now either both succeed or both fail - no lost orders
  }
}
```

Real-world incident: A major e-commerce platform used at-most-once delivery for order fulfillment events. During a 3-hour network partition, approximately 15,000 orders were paid but never fulfilled. Manual reconciliation took 2 weeks, cost $500K in support labor, and resulted in significant customer churn. The 'savings' from simpler infrastructure were obliterated by a single incident.
Let's examine how at-most-once semantics are implemented in real-world messaging systems and protocols:
UDP (User Datagram Protocol)
UDP is the canonical at-most-once protocol. It provides no delivery confirmation, no ordering guarantees, and no retransmission mechanism. Packets are sent and forgotten.
```typescript
import * as dgram from 'dgram';

// UDP provides pure at-most-once semantics at the transport layer
class UDPMetricsSender {
  private socket: dgram.Socket;
  private metricsHost: string;
  private metricsPort: number;

  constructor(host: string, port: number) {
    this.socket = dgram.createSocket('udp4');
    this.metricsHost = host;
    this.metricsPort = port;
  }

  /**
   * Send a metric via UDP.
   *
   * - No connection establishment (faster)
   * - No delivery confirmation (lighter)
   * - Packet may be lost, duplicated, or reordered by network
   * - Perfect for high-frequency, loss-tolerant telemetry
   */
  sendMetric(name: string, value: number): void {
    const message = Buffer.from(`${name}:${value}|${Date.now()}`);

    // Send and forget - no callback for delivery confirmation
    this.socket.send(message, this.metricsPort, this.metricsHost);

    // Execution continues immediately
    // We have no idea if the packet arrived
  }
}

// StatsD and many similar metrics systems use UDP
const metrics = new UDPMetricsSender('statsd.internal', 8125);
metrics.sendMetric('api.requests.count', 1);
metrics.sendMetric('api.latency.p99', 145);
```

Apache Kafka with acks=0
Kafka supports at-most-once delivery when producers are configured with acks=0. This means the producer doesn't wait for any acknowledgment from brokers.
```typescript
import { Kafka } from 'kafkajs';

const kafka = new Kafka({
  clientId: 'metrics-producer',
  brokers: ['kafka-1:9092', 'kafka-2:9092'],
});

// At-most-once producer configuration
const producer = kafka.producer({
  // No point in retries without acks
  retry: {
    retries: 0, // No retries
  },
});

await producer.connect();

await producer.send({
  topic: 'metrics',
  // Don't wait for any acknowledgments
  // Message may be lost if broker hasn't received it
  acks: 0, // ← KEY SETTING for at-most-once (kafkajs takes acks per send call)
  messages: [{ value: JSON.stringify({ type: 'cpu_usage', value: 45.2 }) }],
});

// Consumer that commits the offset BEFORE processing (at-most-once on consumer side)
const consumer = kafka.consumer({ groupId: 'metrics-consumer' });

await consumer.connect();
await consumer.subscribe({ topic: 'metrics' });

await consumer.run({
  // kafkajs auto-commit only covers messages whose handler has finished,
  // so it is disabled here and the offset is committed up front instead
  autoCommit: false,
  eachMessage: async ({ topic, partition, message }) => {
    // Commit the NEXT offset first; if we crash after this,
    // this message is "lost" (won't be reprocessed)
    await consumer.commitOffsets([
      { topic, partition, offset: (Number(message.offset) + 1).toString() },
    ]);

    try {
      await processMetric(message.value); // processMetric: placeholder for business logic
    } catch (error) {
      // Swallow the error so the client does not retry the message
      console.error(`Processing failed, message lost: ${(error as Error).message}`);
    }
  },
});
```

MQTT QoS 0
MQTT (Message Queuing Telemetry Transport) explicitly defines three QoS (Quality of Service) levels. QoS 0 is 'at-most-once':
```typescript
import * as mqtt from 'mqtt';

const client = mqtt.connect('mqtt://broker.local');

// QoS 0: At-most-once delivery
// - Fastest delivery option
// - No delivery confirmation
// - Message may be lost
// - Perfect for sensor data with high frequency
client.publish(
  'sensors/temperature/room1',
  JSON.stringify({ value: 22.5, unit: 'celsius' }),
  { qos: 0 } // ← QoS level 0 = at-most-once
);

// MQTT QoS levels summary:
// QoS 0: At-most-once  - Fire and forget
// QoS 1: At-least-once - Delivered at least once, may duplicate
// QoS 2: Exactly-once  - Delivered exactly once (4-way handshake)
```

Notice that all these protocols/systems make at-most-once an explicit, opt-in choice. This is intentional: the designers understood that at-most-once is valuable for specific use cases but dangerous as a default. Always be deliberate when choosing this semantic.
At-most-once delivery represents the simplest and fastest delivery semantic, trading reliability for performance. Let's consolidate the key insights:
The Decision Framework:
Before choosing at-most-once delivery, answer these questions:
You now understand the at-most-once delivery semantic—its guarantees, implementation mechanics, appropriate use cases, and critical limitations. In the next page, we'll explore at-least-once delivery, which eliminates message loss at the cost of potential duplicates, fundamentally changing the design constraints for consumers.