In everyday language, unreliable is an insult. We avoid unreliable cars, distrust unreliable people, and complain about unreliable weather forecasts. Reliability is a virtue—something to maximize, always.
But in the engineering of transport protocols, unreliable is not a flaw to be eliminated. It's a deliberate design choice—a conscious decision to provide less in order to enable more.
When we say UDP is an unreliable protocol, we don't mean UDP is broken, unstable, or poorly designed. We mean that UDP makes no promises about delivery. A UDP datagram might arrive perfectly. It might arrive corrupted. It might arrive twice. It might never arrive at all. It might arrive out of order relative to other datagrams. UDP, as a protocol, is agnostic to all of these outcomes.
This isn't negligence—it's profound engineering pragmatism.
By the end of this page, you will understand exactly what 'unreliable' means technically, the specific ways UDP datagrams can fail, why UDP deliberately avoids reliability mechanisms, and how applications successfully build robust systems atop unreliable foundations.
In networking terminology, reliability has a specific technical meaning. A reliable protocol guarantees certain properties about data delivery:
Properties of Reliable Delivery:
- Delivery: data that is sent is eventually received, or the sender is notified of failure
- Ordering: data arrives in the order it was sent
- Integrity: data arrives uncorrupted
- No duplication: each piece of data is delivered exactly once
TCP provides all four guarantees. SCTP provides all four. QUIC provides all four.
UDP provides none of them.
| Property | TCP | UDP | UDP Reality |
|---|---|---|---|
| Delivery | ✓ Guaranteed (or error) | ✗ Not guaranteed | Datagrams may be silently lost |
| Ordering | ✓ Strict FIFO order | ✗ Not guaranteed | Datagrams may arrive out of order |
| Integrity | ✓ Mandatory checksum | ⚠ Optional checksum* | Corruption possible but detectable |
| No duplication | ✓ Sequence numbers detect | ✗ Not guaranteed | Same datagram may arrive twice |
*In IPv4, the UDP checksum is technically optional (can be set to zero to indicate 'no checksum'). In IPv6, the checksum is mandatory. In practice, all modern UDP implementations compute checksums.
What unreliability means in practice:
When an application calls sendto() with a UDP datagram, the kernel prepends the UDP header, wraps the result in an IP packet, and hands it to the network interface for transmission. That is the entire transaction.
At no point does any component report back to the sender whether the datagram arrived. The sender receives no acknowledgment, no confirmation, no error if the datagram vanishes into the network void.
From the sender's perspective, every sendto() succeeds—regardless of what actually happens to the data.
While network delivery failures are silent, local errors are reported. If the socket isn't bound, if the destination is unreachable according to the local routing table, or if the local network interface is down, sendto() will return an error. The unreliability is specifically about what happens after the datagram leaves the local system.
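A small sketch makes the fire-and-forget behavior concrete. Nothing below is listening on the destination port (the address and port are arbitrary choices for illustration), yet the send still reports success:

```python
import socket

# Create a UDP socket and send to a port where (we assume) nothing listens.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
payload = b"hello, void"

# 127.0.0.1:50000 is an arbitrary destination chosen for this sketch.
sent = sock.sendto(payload, ("127.0.0.1", 50000))

# sendto() only confirms the datagram left the local stack; it says
# nothing about whether anyone received it.
print(sent == len(payload))  # True
sock.close()
```

The return value counts bytes handed to the kernel, not bytes delivered; that is the entire feedback loop UDP offers.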
Understanding unreliability requires examining the specific ways datagrams can fail. Each failure mode has different causes, frequencies, and implications.
1. Packet Loss (Datagram Never Arrives)
This is the most common failure mode. A datagram is sent but never reaches its destination. Common causes include router queue overflow during congestion, link or hardware failures, routing changes mid-delivery, and deliberate drops by firewalls or rate limiters.
2. Corruption (Datagram Arrives Altered)
Data in the datagram changes between sending and receiving. Causes include electrical noise on physical links, faulty network hardware, and memory errors in routers or network interfaces.
UDP's checksum can detect most corruption, but not all: it is only 16 bits wide, so roughly 1 in 65,536 random corruptions slips through, and because the one's complement sum is order-independent, errors that swap whole 16-bit words go undetected.
3. Duplication (Same Datagram Arrives Multiple Times)
The same datagram is delivered more than once. This can happen due to link-layer retransmissions (when an acknowledgment is lost and the frame is resent even though the original arrived) or transient routing anomalies that deliver copies along multiple paths.
Without sequence numbers, the receiver has no way to detect duplicates—both copies appear to be valid, independent datagrams.
4. Reordering (Datagrams Arrive Out of Order)
Datagrams sent as A, B, C might arrive as B, A, C or A, C, B. Causes include multipath routing, per-packet load balancing across parallel links, and routing changes mid-stream.
Reordering is especially common on the internet where packets routinely take different paths.
5. Delay Variation (Jitter)
While not strictly a reliability issue, delay variation affects applications expecting consistent timing: datagrams sent at fixed intervals may arrive bunched together or with irregular gaps as queuing delays fluctuate along the path.
For real-time applications, this variation can be more disruptive than occasional loss.
The critical aspect is not that these failures occur—they occur for TCP too. The difference is that TCP detects and recovers from failures automatically and silently. With UDP, failures are not detected at the transport layer. Applications must detect and handle them if needed.
Given that reliability seems universally desirable, why would a protocol deliberately omit it? The answer lies in understanding the costs of reliability mechanisms and recognizing that these costs are not appropriate for all applications.
The Costs of Reliability:
- Latency: recovering a lost packet requires waiting at least one round-trip time for retransmission
- Memory: sent data must be buffered until acknowledged; received data must be buffered until it can be delivered in order
- Bandwidth: acknowledgments and retransmissions consume capacity
- Head-of-line blocking: one lost packet stalls delivery of everything queued behind it
- State and complexity: both endpoints must track sequence numbers, timers, and windows
When These Costs Are Unacceptable:
Real-time media: A video streaming application would rather skip a lost frame than pause playback waiting for retransmission. By the time a retransmitted frame arrives, the moment it should display has passed. TCP's reliability actively harms the user experience.
High-frequency updates: A game sending 60 position updates per second doesn't need every update to arrive. If update 42 is lost, update 43 supersedes it. Retransmitting 42 wastes bandwidth and might cause the stale position to be processed.
Resource-constrained systems: An IoT sensor with 32KB of RAM cannot afford TCP's buffer requirements. A fire-and-forget UDP message to a logging server works perfectly.
Idempotent operations: DNS resolution is idempotent—asking the same question twice yields the same answer. If a query is lost, requery. No state needed.
Application-specific reliability: Perhaps you need messages to be reliable, but not ordered. Or ordered, but not reliable. Or selectively reliable based on message importance. TCP's one-size-fits-all reliability prevents this customization.
This design follows the end-to-end principle: implement functionality at endpoints only if needed, not in the network. Reliability is needed by some applications, not all. By omitting reliability from UDP, applications have the freedom to implement exactly the reliability semantics they need—or none at all.
While UDP provides no delivery, ordering, or duplication guarantees, it does offer one reliability-adjacent feature: the checksum. This is UDP's single concession to data integrity.
What the UDP checksum covers:
- A pseudo-header borrowed from the IP layer: source IP address, destination IP address, protocol number, and UDP length
- The UDP header itself (source port, destination port, length)
- The entire payload
The pseudo-header inclusion is clever—it allows the checksum to verify that the datagram reached the correct host (not just the correct port), even though this information is in the IP header, not the UDP header.
How the checksum works:
The UDP checksum uses the same one's complement sum algorithm as TCP and IP: the covered data is split into 16-bit words, the words are summed with any carry wrapped back into the low bits, and the one's complement of the result is stored in the checksum field.
At the receiver, summing all words (including the checksum) should produce 0xFFFF. Any other result indicates corruption.
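As a sketch, the algorithm can be written in a few lines of Python (internet_checksum is a name chosen here, not a standard library function):

```python
def internet_checksum(data: bytes) -> int:
    """One's complement sum of 16-bit words, as used by UDP, TCP, and IP."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]     # add the next 16-bit word
        total = (total & 0xFFFF) + (total >> 16)  # fold any carry back in
    return ~total & 0xFFFF  # one's complement of the running sum

msg = b"\x12\x34\x56\x78"
csum = internet_checksum(msg)

# Receiver check: complementing the sum over message + checksum yields 0
# for an intact datagram (the raw sum itself is 0xFFFF).
assert internet_checksum(msg + csum.to_bytes(2, "big")) == 0
```

(UDP adds one wrinkle not shown here: in IPv4, a computed checksum of zero is transmitted as 0xFFFF, since zero means 'no checksum'.)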
The checksum's limitations:
- Only 16 bits: about 1 in 65,536 random corruptions produces the same checksum value
- Order-independent: swapping 16-bit words leaves the one's complement sum unchanged
- Detection only: a failed checksum drops the datagram; nothing is corrected, and the sender is never told
| Scenario | IPv4 Behavior | IPv6 Behavior |
|---|---|---|
| Checksum computation | Optional (but recommended) | Mandatory |
| Zero checksum value | Means 'not computed' | Invalid; such datagrams must be discarded |
| Failed checksum | Datagram silently discarded | Datagram silently discarded |
| Pseudo-header included | Yes (12 bytes) | Yes (40 bytes) |
If UDP is unreliable, why include a checksum at all? Because corruption and loss have different implications. A lost packet is simply gone—the application never sees it. A corrupted packet that's delivered could cause incorrect application behavior, security vulnerabilities, or data corruption. The checksum prevents silently delivering garbage.
Real-world corruption rates:
Studies have shown that end-to-end data corruption occurs at measurable rates even on packets that passed every link-layer CRC along the path:
Google's research found that approximately 1 in 10 billion packets experiences undetected corruption through their data centers. At Google scale, this means millions of corrupted packets per day. End-to-end checksums like UDP's are essential defense-in-depth.
When applications using UDP do need reliability, they must implement it themselves. This isn't as daunting as it sounds—the mechanisms are well-understood, and implementing only what's needed can be simpler than using TCP.
Basic Reliability Mechanisms:
1. Acknowledgments and Retransmission
The sender waits for an acknowledgment. If none arrives within a timeout, it retransmits:
```python
import socket

def reliable_send(sock, data, dest, max_retries=5, timeout=1.0):
    """Send data reliably with acknowledgment and retry."""
    seq_num = 0  # Simplified; a real implementation needs proper sequence tracking
    for attempt in range(max_retries):
        # Send with sequence number prefix
        packet = seq_num.to_bytes(4, 'big') + data
        sock.sendto(packet, dest)
        sock.settimeout(timeout)
        try:
            ack, addr = sock.recvfrom(4)
            ack_num = int.from_bytes(ack, 'big')
            if ack_num == seq_num:
                return True  # Success
        except socket.timeout:
            timeout *= 2  # Exponential backoff
            continue
    return False  # Failed after all retries

def reliable_receive(sock):
    """Receive data and send acknowledgment."""
    data, addr = sock.recvfrom(65535)
    seq_num = int.from_bytes(data[:4], 'big')
    payload = data[4:]
    # Send ACK
    ack = seq_num.to_bytes(4, 'big')
    sock.sendto(ack, addr)
    return payload, addr
```

2. Sequence Numbers for Ordering
Attach sequence numbers to datagrams. The receiver reorders based on sequence:
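A minimal receiver-side sketch, assuming each payload is prefixed with an integer sequence number (the class name and API here are illustrative):

```python
class ReorderBuffer:
    """Holds out-of-order datagrams and releases them in sequence order."""

    def __init__(self):
        self.next_seq = 0
        self.pending = {}  # seq -> payload, held until its turn comes

    def receive(self, seq, payload):
        """Accept one datagram; return the payloads now deliverable in order."""
        self.pending[seq] = payload
        delivered = []
        while self.next_seq in self.pending:
            delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return delivered

rb = ReorderBuffer()
rb.receive(1, b"B")   # early arrival: held back, nothing delivered yet
rb.receive(0, b"A")   # fills the gap: delivers A, then the buffered B
```

A production version would also bound the buffer and decide what to do about sequence numbers that never arrive (skip them, or request retransmission).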
3. Sequence Numbers for Deduplication
Same sequence numbers, different use: track which sequences have been received and ignore duplicates.
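A sketch of that idea, with illustrative names (a real implementation would bound the 'seen' set to a sliding window so memory stays finite):

```python
class Deduplicator:
    """Tracks which sequence numbers have been seen and rejects repeats."""

    def __init__(self):
        self.seen = set()  # unbounded here for simplicity

    def accept(self, seq):
        """Return True for a first-seen sequence number, False for a duplicate."""
        if seq in self.seen:
            return False  # already processed: drop this copy
        self.seen.add(seq)
        return True

dedup = Deduplicator()
dedup.accept(7)   # True: first time this sequence number is seen
dedup.accept(7)   # False: duplicate arrival of the same datagram
```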
4. Checksums for Integrity
UDP provides one, but applications can add stronger checksums (CRC-32, SHA-256) for critical data.
5. Forward Error Correction (FEC)
Send redundant data that allows recovery without retransmission:
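One of the simplest FEC schemes is XOR parity: for each group of equal-length data packets, send one extra packet that is their bitwise XOR. If any single packet in the group is lost, the receiver rebuilds it from the survivors. A sketch (function names are illustrative):

```python
def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def xor_parity(packets):
    """Parity packet for a group of equal-length data packets."""
    parity = bytes(len(packets[0]))
    for p in packets:
        parity = xor_bytes(parity, p)
    return parity

def recover_missing(received, parity):
    """XOR the surviving packets with the parity to rebuild the lost one."""
    missing = parity
    for p in received:
        missing = xor_bytes(missing, p)
    return missing

data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(data)
# Suppose the second packet is lost in transit:
restored = recover_missing([data[0], data[2]], parity)  # b"BBBB"
```

The cost is fixed overhead (one extra packet per group) in exchange for zero retransmission latency, which is exactly the trade real-time protocols want.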
QUIC demonstrates sophisticated reliability over UDP: per-stream ordering, selective acknowledgments, adaptive congestion control, and 0-RTT session resumption. It proves that UDP's unreliability isn't a limitation—it's a foundation for building exactly the reliability semantics your application needs.
Understanding theoretical unreliability is useful, but what does it mean in practice? What loss rates do applications actually experience?
Loss Rates by Network Type:
| Network Type | Typical Loss Rate | Notes |
|---|---|---|
| Data center (same rack) | < 0.0001% | Almost negligible; rare equipment failure |
| Data center (cross-rack) | 0.001% - 0.01% | Occasional congestion at aggregation switches |
| Corporate LAN | 0.01% - 0.1% | Depends on network quality and utilization |
| Home broadband (wired) | 0.1% - 1% | ISP congestion, last-mile issues |
| Home broadband (WiFi) | 1% - 5% | Interference, contention, range issues |
| Mobile (4G/5G) | 1% - 10% | Highly variable; cell handoff, congestion |
| Satellite/intercontinental | 2% - 15% | Long paths, multiple hops, congestion |
What These Rates Mean:
At 0.1% loss (decent broadband): roughly 1 datagram in 1,000 is lost. Most exchanges complete without any loss, and the occasional retry goes unnoticed.
At 1% loss (WiFi, mobile): roughly 1 in 100. A stream of 60 updates per second drops one every couple of seconds; applications must treat gaps as routine.
At 5% loss (congested mobile): 1 in 20. Loss is constant and visible; strategies that retransmit aggressively only deepen the congestion causing the loss.
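Assuming independent (non-burst) losses, these per-packet rates compound quickly across multi-packet exchanges. A quick sketch of the arithmetic:

```python
def p_any_lost(loss_rate, n_packets):
    """Probability that at least one of n independently-sent packets is lost."""
    return 1 - (1 - loss_rate) ** n_packets

# At 1% per-packet loss, a 100-packet exchange is missing at least
# one packet about 63% of the time.
print(round(p_any_lost(0.01, 100), 2))  # 0.63
```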
Burst loss is worse than random loss:
Loss rarely distributes uniformly. Network congestion causes burst losses—many packets lost in quick succession. This is worse for applications because:
- losses cluster, so redundancy schemes tuned for isolated drops (such as simple FEC) fail exactly when they are needed
- consecutive gaps are more noticeable: a single 100ms audio dropout is far more audible than five scattered 20ms ones
- retransmissions sent while the burst persists are likely to be lost as well
Design applications to function at 5-10% loss (mobile users exist), and optimize for the common case of under 1% loss. Applications that only work on perfect networks will fail for real users.
Despite—or because of—its unreliability, UDP powers some of the internet's most demanding applications. Let's examine how they succeed:
DNS: Embracing Simplicity
DNS queries are:
- small: a query and its response typically fit in single datagrams
- idempotent: asking the same question twice yields the same answer
- latency-sensitive: resolution delays every connection that follows
The reliability strategy:
- send the query and wait briefly for a response
- on silence, retry, possibly against a different server
- keep no connection and no per-query state beyond a timer
99%+ of DNS queries succeed without any retry. The rare failures are handled gracefully by retrying. TCP's overhead would double resolution time for no benefit.
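The retry-on-silence pattern can be sketched in a few lines (query_with_retry is an illustrative name; 512 bytes is the classic DNS-over-UDP response limit):

```python
import socket

def query_with_retry(sock, request, server, retries=3, timeout=1.0):
    """Send a datagram and wait for a reply; on silence, simply ask again."""
    for _ in range(retries):
        sock.sendto(request, server)
        sock.settimeout(timeout)
        try:
            reply, _addr = sock.recvfrom(512)  # classic DNS/UDP reply limit
            return reply
        except socket.timeout:
            continue  # idempotent query: retrying is always safe
    raise TimeoutError("no reply after %d attempts" % retries)
```

Real resolvers refine this with per-server timeouts and failover to alternate servers, but the core loop is just this.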
VoIP/Video Calling: Preferring Fresh Over Perfect
Voice and video streams are:
- continuous: new data is produced constantly
- time-sensitive: each packet is useful only at its scheduled playback moment
- loss-tolerant: small gaps can be concealed far more gracefully than pauses
The reliability strategy:
- never retransmit; a late packet is as useless as a lost one
- absorb timing variation with a small jitter buffer
- conceal losses through interpolation and forward error correction
A 20ms audio packet lost 200ms ago should NOT be retransmitted. By the time it arrives, 220ms of audio would have been buffered waiting for it. Better to interpolate 20ms of audio than pause for 220ms.
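The arithmetic behind that claim, using the numbers from the text:

```python
packet_ms = 20     # one audio packet covers 20 ms of sound
loss_age_ms = 200  # the loss is noticed ~200 ms after the packet was due

# Waiting for a retransmission means buffering everything that arrived
# in the meantime, plus the gap itself:
stall_ms = loss_age_ms + packet_ms
print(stall_ms)  # 220: a 220 ms pause to avoid a 20 ms gap
```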
Online Gaming: Prediction Over Perfection
Game state updates are:
- frequent: often 20-60 per second
- rapidly superseded: each update replaces the one before it
- individually disposable: losing any single update costs almost nothing
The reliability strategy:
- keep sending fresh state every tick; never retransmit stale state
- discard any update older than the newest one already received
- use client-side prediction to smooth over momentary gaps
If position update 42 is lost, update 43 supersedes it anyway. No retransmission needed.
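A sketch of that last-write-wins rule, with illustrative names:

```python
class EntityState:
    """Keeps only the newest update; stale arrivals are ignored."""

    def __init__(self):
        self.latest_seq = -1
        self.position = None

    def apply(self, seq, position):
        """Apply an update only if it is newer than everything seen so far."""
        if seq <= self.latest_seq:
            return False  # superseded by an update that already arrived
        self.latest_seq, self.position = seq, position
        return True

player = EntityState()
player.apply(42, (1, 2))   # applied
player.apply(41, (0, 0))   # late arrival: ignored, position stays (1, 2)
```

The same comparison discards duplicates for free, since a duplicate carries a sequence number that is no longer the newest.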
Notice that successful UDP applications share traits: tolerance for occasional loss, time-sensitivity that makes old data worthless, or idempotency that makes retry cheap. Applications lacking these traits typically use TCP or build reliability over UDP.
We've thoroughly explored what unreliability means for UDP. The essential insights:
- 'Unreliable' means no guarantees of delivery, ordering, or uniqueness, not 'broken' or 'poorly designed'
- Datagrams can be lost, corrupted, duplicated, reordered, or arbitrarily delayed, and the transport layer will not tell you
- The checksum (optional in IPv4, mandatory in IPv6) is UDP's single integrity mechanism
- Omitting reliability follows the end-to-end principle: applications add exactly the semantics they need, and pay only for what they use
The deeper principle:
UDP's unreliability is a feature, not a bug. It provides a foundation that applications can build upon according to their specific needs. Some applications need no reliability. Some need partial reliability. Some need full reliability but with different semantics than TCP provides. UDP accommodates them all by providing none—and letting applications add exactly what they need.
What's next:
Unreliability and connectionlessness combine to create UDP's best-effort delivery model. In the next page, we'll examine what 'best-effort' means in depth, how it compares to guaranteed delivery, and why best-effort is often the right choice.
You now understand UDP's unreliability—what it means technically, why it's a deliberate design choice, and how applications successfully operate on unreliable foundations. Next, we'll explore the best-effort delivery model that ties these concepts together.