UDP Header Format - Learning Module

Checksum: Detecting Transmission Errors

Ensuring Data Integrity

Data traveling across networks faces constant threats to its integrity. Electromagnetic interference, cosmic rays, hardware failures, and software bugs can flip bits, corrupt bytes, or mangle entire packets. Even on modern, reliable networks, these errors occur more often than you might expect.

The checksum field in the UDP header provides a mechanism to detect such corruption. It's a 16-bit value computed over the UDP header, data, and a "pseudo-header" derived from the IP layer. When a datagram arrives, the receiver performs the same calculation and compares results. Mismatches indicate corruption, and the datagram is silently discarded.

This page provides a comprehensive exploration of the UDP checksum—its calculation algorithm, the pseudo-header concept, the differences between IPv4 (optional) and IPv6 (mandatory), and the implications for application reliability.

What You Will Learn

By the end of this page, you will understand: (1) The precise structure and position of the checksum field, (2) The one's complement sum algorithm used for calculation, (3) Why and how the pseudo-header is incorporated, (4) Differences between IPv4 and IPv6 checksum requirements, (5) Performance considerations and hardware offloading, (6) The limitations of checksums as a security mechanism.

Anatomy of the Checksum Field

The checksum field occupies the final 16 bits (bytes 6-7) of the UDP header. This position is significant—it comes after all other header fields and immediately precedes the user data.

Field Structure:

udp_checksum_position.txt

UDP Header (8 bytes) with Checksum Field Highlighted
═══════════════════════════════════════════════════════════════════
 
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
├─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┤
│         Source Port           │       Destination Port        │
├───────────────────────────────┼───────────────────────────────┤
│            Length             │███████ CHECKSUM █████████████│
│          (16 bits)            │       (16 bits)               │
│          Bytes 4-5            │  Bytes 6-7, Bits 48-63        │
└───────────────────────────────┴───────────────────────────────┘
                                            ▲
                                            │
                                    Final header field
                                    
Checksum Field Details:
───────────────────────────────────────────────────────────────────
• Offset: 48 bits (6 bytes) from start of UDP header
• Size: 16 bits (2 bytes)
• Byte order: Network byte order (big-endian)
• Special value: 0x0000 = Checksum not computed (IPv4 only)
• Calculation: One's complement of one's complement sum
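As a minimal sketch of the layout above, the following Python reads the four 16-bit header fields in network byte order and shows that the checksum sits at byte offset 6. The function name `parse_udp_header` and the sample values are hypothetical, chosen only for illustration.

```python
import struct

def parse_udp_header(datagram: bytes) -> dict:
    """Split the fixed 8-byte UDP header into its four 16-bit fields."""
    if len(datagram) < 8:
        raise ValueError("a UDP header needs at least 8 bytes")
    # "!" selects network byte order (big-endian); each H is one 16-bit field.
    source_port, dest_port, length, checksum = struct.unpack_from("!HHHH", datagram, 0)
    return {
        "source_port": source_port,
        "dest_port": dest_port,
        "length": length,
        "checksum": checksum,
    }

# Hypothetical sample header: ports 52437 -> 53, length 8, checksum 0xBEEF
sample = struct.pack("!HHHH", 52437, 53, 8, 0xBEEF)
fields = parse_udp_header(sample)
print(hex(fields["checksum"]))  # 0xbeef
print(sample[6:8].hex())        # beef (bytes 6-7 hold the checksum)
```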

The Special Value: Zero

In IPv4, a checksum value of zero (0x0000) has special meaning: it indicates that the sender chose not to compute a checksum. RFC 768 explicitly permits this for UDP over IPv4.

However, there's a subtlety: what if the actual computed checksum happens to be zero? This is possible—the algorithm can produce 0x0000 as a valid result. To distinguish from "not computed," senders transmit 0xFFFF instead (the one's complement of zero, which is mathematically equivalent to zero in one's complement arithmetic but indicates the checksum was actually calculated).

IPv6 Requirement:

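In IPv6, the checksum is mandatory. A zero checksum is not permitted by default: if a sender transmits zero, standards-compliant receivers must drop the datagram. This is specified in RFC 8200, which also carves out a narrow exception (detailed in RFC 6935 and RFC 6936) for tunnel protocols that explicitly enable zero-checksum mode.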
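The receive-side consequence of the zero value can be sketched as a small decision function. This is an illustrative sketch rather than how any particular stack is written: the names `Action` and `checksum_policy` are hypothetical, and the function ignores the zero-checksum tunnel exception defined in RFC 6935 and RFC 6936.

```python
from enum import Enum

class Action(Enum):
    VERIFY = "verify checksum"
    ACCEPT_UNVERIFIED = "accept without verification"
    DROP = "drop datagram"

def checksum_policy(checksum_field: int, ip_version: int) -> Action:
    """Decide what a receiver does based on the raw checksum field."""
    if checksum_field != 0:
        # Includes 0xFFFF, which a sender transmits when the computed value was zero.
        return Action.VERIFY
    if ip_version == 4:
        # IPv4: zero means the sender did not compute a checksum.
        return Action.ACCEPT_UNVERIFIED
    # IPv6: a zero checksum is not allowed by default.
    return Action.DROP

print(checksum_policy(0x0000, 4).value)  # accept without verification
print(checksum_policy(0x0000, 6).value)  # drop datagram
print(checksum_policy(0xFFFF, 6).value)  # verify checksum
```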

IPv4 Checksum Disabling Is Dangerous

While IPv4 technically allows disabling the UDP checksum, this is strongly discouraged in modern networks. The IP header checksum covers only the IP header, not the payload, so without the UDP checksum corrupted data reaches applications silently. Disabling it was historically justified only for applications with their own integrity checking (like NFS over UDP on trusted LANs).

The One's Complement Sum Algorithm

The UDP checksum (like the IP, TCP, and ICMP checksums) uses a venerable technique called the one's complement sum. This algorithm dates back to early internet protocol design and has properties that make it particularly suitable for network checksums.

Algorithm Steps:

Divide the data into 16-bit words
Sum all 16-bit words, treating them as unsigned integers
Carry any overflow bits back into the sum (end-around carry)
Take the one's complement (bitwise NOT) of the final sum
The result is the checksum

Verification:

To verify, the receiver computes the same sum over the received data including the checksum field. If no errors occurred, the result should be all ones (0xFFFF). Any other result indicates corruption.

ones_complement_checksum.py
"""
One's Complement Checksum Implementation
 
This module provides a complete, educational implementation of
the one's complement checksum algorithm used by UDP, TCP, and IP.
"""
 
import struct
from typing import Union
 
def ones_complement_sum(data: bytes) -> int:
    """
    Compute the one's complement sum of data.
    
    This is the core algorithm used by UDP, TCP, and IP checksums.
    
    Args:
        data: Bytes to sum (padded with zero if odd length)
    
    Returns:
        16-bit one's complement sum (before final complement)
    """
    # Pad with zero byte if odd length
    if len(data) % 2:
        data = data + b'\x00'
    
    # Sum all 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        word = (data[i] << 8) + data[i + 1]  # Big-endian
        total += word
    
    # Fold 32-bit sum to 16 bits (handle carries)
    # This is the "end-around carry"
    while total > 0xFFFF:
        total = (total & 0xFFFF) + (total >> 16)
    
    return total
 
def compute_checksum(data: bytes) -> int:
    """
    Compute the one's complement checksum.
    
    Args:
        data: Data to checksum (pseudo-header + UDP header + payload)
    
    Returns:
        16-bit checksum value
    """
    s = ones_complement_sum(data)
    # Take one's complement (bitwise NOT, keeping 16 bits)
    checksum = ~s & 0xFFFF
    
    # Special case: if checksum is 0, use 0xFFFF instead
    # (to distinguish from "checksum not used")
    if checksum == 0:
        checksum = 0xFFFF
    
    return checksum
 
def verify_checksum(data: bytes) -> bool:
    """
    Verify a received datagram's checksum.
    
    Args:
        data: Complete data including existing checksum
    
    Returns:
        True if checksum is valid, False otherwise
    """
    s = ones_complement_sum(data)
    # If valid, result should be all ones (0xFFFF)
    return s == 0xFFFF
 
def demonstrate_algorithm():
    """Step-by-step demonstration of the algorithm."""
    
    print("=== One's Complement Checksum Demonstration ===\n")
    
    # Example data
    test_data = bytes([0x00, 0x01, 0xf2, 0x03, 0xf4, 0xf5, 0xf6, 0xf7])
    
    print(f"Input data (hex): {test_data.hex()}")
    print(f"Input length: {len(test_data)} bytes\n")
    
    # Step 1: Split into 16-bit words
    words = []
    for i in range(0, len(test_data), 2):
        word = (test_data[i] << 8) + test_data[i + 1]
        words.append(word)
        print(f"Word {i//2}: 0x{word:04x} ({word})")
    
    # Step 2: Sum the words
    print("\nSumming words:")
    total = 0
    for i, word in enumerate(words):
        total += word
        print(f"  After word {i}: 0x{total:05x} ({total})")
    
    # Step 3: Fold carries
    print("\nFolding carries:")
    while total > 0xFFFF:
        carry = total >> 16
        lower = total & 0xFFFF
        total = lower + carry
        print(f"  {lower:#06x} + {carry:#06x} = {total:#06x}")
    
    # Step 4: One's complement
    checksum = ~total & 0xFFFF
    print(f"\nOne's complement of {total:#06x}: {checksum:#06x}")
    
    # Verification
    print("\n=== Verification ===")
    data_with_checksum = test_data + struct.pack("!H", checksum)
    print(f"Data with checksum: {data_with_checksum.hex()}")
    
    verification_sum = ones_complement_sum(data_with_checksum)
    print(f"Verification sum: {verification_sum:#06x}")
    print(f"Valid: {verification_sum == 0xFFFF}")
 
if __name__ == "__main__":
    demonstrate_algorithm()

Properties of One's Complement Checksums:

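Byte-order independent: Because of the end-around carry, summing the 16-bit words in the opposite byte order yields the byte-swapped sum. A little-endian machine can therefore add words in its native order and store the result back without conversion, and the bytes land correctly in the packet (see RFC 1071). Careful implementation is still required.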
Incremental updates: The checksum can be updated when a single field changes without recomputing from scratch—useful for NAT devices changing addresses.
Simple to compute: No multiplication or division—just addition and bit operations. This was important for 1970s hardware and remains efficient today.
Detects many errors: Catches all single-bit errors and most multi-bit errors, though not cryptographically secure.
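The incremental-update property listed above can be sketched with the formula from RFC 1624, HC' = ~(~HC + ~m + m'), where HC is the old checksum, m the old 16-bit word, and m' its replacement. The helper names below are hypothetical, and the sample words are arbitrary illustration data.

```python
def fold(total: int) -> int:
    """Apply the end-around carry until the value fits in 16 bits."""
    while total > 0xFFFF:
        total = (total & 0xFFFF) + (total >> 16)
    return total

def internet_checksum(words: list) -> int:
    """Full recomputation: one's complement of the one's complement sum."""
    return ~fold(sum(words)) & 0xFFFF

def incremental_update(old_checksum: int, old_word: int, new_word: int) -> int:
    """RFC 1624 style update: HC' = ~(~HC + ~m + m')."""
    total = (~old_checksum & 0xFFFF) + (~old_word & 0xFFFF) + new_word
    return ~fold(total) & 0xFFFF

words = [0x0001, 0xF203, 0xF4F5, 0xF6F7]
old_checksum = internet_checksum(words)

# A NAT-like rewrite: one 16-bit word changes from 0xF203 to 0x1234.
rewritten = [0x0001, 0x1234, 0xF4F5, 0xF6F7]
updated = incremental_update(old_checksum, 0xF203, 0x1234)

print(f"{old_checksum:#06x}")                   # 0x220d
print(f"{updated:#06x}")                        # 0x01dd
print(updated == internet_checksum(rewritten))  # True
```

The update touches three values regardless of packet size, which is why devices that rewrite a single address or port field tend to prefer it over a full recomputation.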

The Pseudo-Header: Cross-Layer Protection

A unique aspect of UDP (and TCP) checksums is that they include information from the IP layer in their calculation. This information is called the pseudo-header—it's not transmitted on the wire but is constructed from IP header fields for checksum purposes.

Why Include IP Information?

The UDP header alone doesn't contain the source and destination IP addresses. If UDP's checksum only covered the UDP header and data, a corrupted IP address wouldn't be detected. A datagram could be delivered to the wrong host without any error indication.

The pseudo-header ensures that:

Source and destination IP addresses are integrity-protected
The protocol number is verified (preventing TCP/UDP confusion)
The length is double-checked against the IP layer

IPv4 Pseudo-Header Structure (12 bytes)
Field	Size (bytes)	Description
Source Address	4	IPv4 source address
Destination Address	4	IPv4 destination address
Zero	1	Reserved, must be zero
Protocol	1	IP protocol number (17 for UDP)
UDP Length	2	Length of UDP header + data
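To illustrate why the addresses are folded in, the sketch below builds a datagram whose checksum covers the 12-byte IPv4 pseudo-header from the table above, then verifies it twice: once at the intended destination and once at a host with a different address. All function names and addresses are hypothetical; the code assumes IPv4 and a nonzero checksum field.

```python
import socket
import struct

UDP_PROTOCOL = 17

def ones_complement_sum(data: bytes) -> int:
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total > 0xFFFF:
        total = (total & 0xFFFF) + (total >> 16)
    return total

def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    return struct.pack("!4s4sBBH", socket.inet_aton(src_ip),
                       socket.inet_aton(dst_ip), 0, UDP_PROTOCOL, udp_length)

def build_datagram(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> bytes:
    length = 8 + len(payload)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    total = ones_complement_sum(pseudo_header(src_ip, dst_ip, length) + header + payload)
    checksum = (~total & 0xFFFF) or 0xFFFF  # transmit 0xFFFF if the result is zero
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload

def checksum_ok(src_ip: str, dst_ip: str, datagram: bytes) -> bool:
    # The receiver rebuilds the pseudo-header from the IP header it actually saw.
    total = ones_complement_sum(pseudo_header(src_ip, dst_ip, len(datagram)) + datagram)
    return total == 0xFFFF

datagram = build_datagram("192.168.1.100", "192.168.1.7", 52437, 53, b"hello")
print(checksum_ok("192.168.1.100", "192.168.1.7", datagram))  # True
print(checksum_ok("192.168.1.100", "192.168.1.6", datagram))  # False: wrong host
```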

pseudo_header_construction.py
Python
"""
Pseudo-Header Construction for UDP Checksum
 
Demonstrates building the pseudo-header for both IPv4 and IPv6.
"""
 
import struct
import socket
from dataclasses import dataclass
 
@dataclass
class IPv4PseudoHeader:
    """
    IPv4 pseudo-header for UDP checksum calculation.
    
    Structure (12 bytes):
    - Source IP (4 bytes)
    - Destination IP (4 bytes)
    - Zero (1 byte)
    - Protocol (1 byte) = 17 for UDP
    - UDP Length (2 bytes)
    """
    source_ip: str
    dest_ip: str
    udp_length: int
    protocol: int = 17  # UDP protocol number
    
    def to_bytes(self) -> bytes:
        """Construct pseudo-header bytes."""
        src = socket.inet_aton(self.source_ip)  # 4 bytes
        dst = socket.inet_aton(self.dest_ip)    # 4 bytes
        
        return struct.pack(
            "!4s4sBBH",
            src,               # Source IP
            dst,               # Destination IP
            0,                 # Zero padding
            self.protocol,     # Protocol (17 = UDP)
            self.udp_length    # UDP length
        )
 
@dataclass
class IPv6PseudoHeader:
    """
    IPv6 pseudo-header for UDP checksum calculation.
    
    Structure (40 bytes):
    - Source IP (16 bytes)
    - Destination IP (16 bytes)
    - UDP Length (4 bytes) - note: 32-bit in IPv6!
    - Zero (3 bytes)
    - Next Header (1 byte) = 17 for UDP
    """
    source_ip: str
    dest_ip: str
    udp_length: int
    next_header: int = 17  # UDP
    
    def to_bytes(self) -> bytes:
        """Construct pseudo-header bytes."""
        src = socket.inet_pton(socket.AF_INET6, self.source_ip)
        dst = socket.inet_pton(socket.AF_INET6, self.dest_ip)
        
        return struct.pack(
            "!16s16sIBBBB",
            src,                # Source IP (16 bytes)
            dst,                # Destination IP (16 bytes)
            self.udp_length,    # UDP length (4 bytes)
            0, 0, 0,            # Zero padding (3 bytes)
            self.next_header    # Next header (1 byte)
        )
 
def build_checksum_input(
    source_ip: str,
    dest_ip: str,
    source_port: int,
    dest_port: int,
    payload: bytes,
    ip_version: int = 4
) -> bytes:
    """
    Build complete data for checksum calculation.
    
    Returns pseudo-header + UDP header + payload (checksum field zeroed)
    """
    udp_length = 8 + len(payload)
    
    # Build pseudo-header
    if ip_version == 4:
        pseudo = IPv4PseudoHeader(source_ip, dest_ip, udp_length)
    else:
        pseudo = IPv6PseudoHeader(source_ip, dest_ip, udp_length)
    
    # Build UDP header (checksum field = 0 for calculation)
    udp_header = struct.pack(
        "!HHHH",
        source_port,
        dest_port,
        udp_length,
        0  # Checksum = 0 during calculation
    )
    
    # Combine and pad if necessary
    data = pseudo.to_bytes() + udp_header + payload
    if len(data) % 2:
        data += b'\x00'  # Pad to even length
    
    return data
 
# Complete checksum calculation example
def calculate_udp_checksum(
    source_ip: str,
    dest_ip: str,
    source_port: int,
    dest_port: int,
    payload: bytes,
    ip_version: int = 4
) -> int:
    """
    Calculate complete UDP checksum.
    
    Returns the 16-bit checksum value.
    """
    data = build_checksum_input(
        source_ip, dest_ip,
        source_port, dest_port,
        payload, ip_version
    )
    
    # Compute one's complement sum
    s = 0
    for i in range(0, len(data), 2):
        word = (data[i] << 8) + data[i + 1]
        s += word
    
    while s > 0xFFFF:
        s = (s & 0xFFFF) + (s >> 16)
    
    checksum = ~s & 0xFFFF
    
    # Convert 0 to 0xFFFF
    return checksum if checksum != 0 else 0xFFFF
 
# Demonstration
if __name__ == "__main__":
    # Example: DNS query
    source_ip = "192.168.1.100"
    dest_ip = "8.8.8.8"
    source_port = 52437
    dest_port = 53
    
    # Sample DNS query payload
    dns_payload = bytes([
        0x12, 0x34,  # Transaction ID
        0x01, 0x00,  # Flags
        0x00, 0x01,  # Questions
        0x00, 0x00,  # Answers
        0x00, 0x00,  # Authority
        0x00, 0x00,  # Additional
    ])
    
    checksum = calculate_udp_checksum(
        source_ip, dest_ip,
        source_port, dest_port,
        dns_payload
    )
    
    print(f"=== UDP Checksum Calculation ===")
    print(f"Source: {source_ip}:{source_port}")
    print(f"Destination: {dest_ip}:{dest_port}")
    print(f"Payload length: {len(dns_payload)} bytes")
    print(f"UDP length: {8 + len(dns_payload)} bytes")
    print(f"Calculated checksum: 0x{checksum:04x}")

IPv6 Pseudo-Header Differences

The IPv6 pseudo-header is 40 bytes (vs 12 for IPv4) due to larger addresses. It uses a 32-bit length field (supporting jumbograms) and places the protocol number (Next Header) at the end. When implementing, carefully match the structure to the IP version in use.
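A quick way to sanity-check the two layouts is to compare their packed sizes and inspect where each field lands. The sketch below uses only `struct` and `socket`; the function name `ipv6_pseudo_header` and the documentation-range addresses are assumptions made for illustration.

```python
import socket
import struct

IPV4_PSEUDO_FORMAT = "!4s4sBBH"     # src, dst, zero, protocol, 16-bit length
IPV6_PSEUDO_FORMAT = "!16s16sI3xB"  # src, dst, 32-bit length, 3 zero bytes, next header

def ipv6_pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    return struct.pack(
        IPV6_PSEUDO_FORMAT,
        socket.inet_pton(socket.AF_INET6, src_ip),
        socket.inet_pton(socket.AF_INET6, dst_ip),
        udp_length,
        17,  # Next Header value for UDP
    )

print(struct.calcsize(IPV4_PSEUDO_FORMAT))  # 12
print(struct.calcsize(IPV6_PSEUDO_FORMAT))  # 40

header = ipv6_pseudo_header("2001:db8::1", "2001:db8::2", 20)
print(header[32:36].hex())  # 00000014 (32-bit upper-layer length)
print(header[36:].hex())    # 00000011 (three zero bytes, then Next Header = 17)
```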

IPv4 Optional vs IPv6 Mandatory

One of the most important differences in UDP behavior between IPv4 and IPv6 is the checksum requirement:

IPv4: Checksum Optional (RFC 768)

The original UDP specification (RFC 768, 1980) made the checksum optional:

"If the computed checksum is zero, it is transmitted as all ones (the equivalent in one's complement arithmetic). An all zero transmitted checksum value means that the transmitter generated no checksum."

This was a pragmatic decision for 1980s hardware, where checksum computation was expensive. Applications that already verified integrity at a higher layer could skip the UDP checksum and save CPU cycles.

IPv6: Checksum Mandatory (RFC 8200)

IPv6 removed the IP header checksum (present in IPv4) because:

Link layers already provide CRC checking
Transport layers (TCP/UDP) provide checksums
Removing redundancy improved router performance

However, this means no IP-layer corruption detection. To compensate, IPv6 requires UDP to compute a checksum. Zero indicates no checksum was computed, and such datagrams must be discarded.

UDP Checksum Requirements by IP Version
Aspect	IPv4	IPv6
Checksum Computation	Optional	Mandatory
Zero Value Meaning	Not computed (allowed)	Not computed (drop packet)
Pseudo-Header Size	12 bytes	40 bytes
Length Field in Pseudo	16 bits	32 bits (jumbogram support)
IP Header Checksum	Present (covers IP header)	Absent
Rationale	Historical (CPU savings)	Required (no IP checksum)

Modern Practice:

Even on IPv4, virtually all modern implementations compute the UDP checksum:

Integrity: Disabling checksums allows corrupted data to reach applications undetected.
Performance: With hardware offloading, checksum computation is essentially free.
Dual-Stack Compatibility: Code that works on IPv4 should work on IPv6, requiring checksums.
NAT Traversal: NAT devices may modify packets; checksums help detect errors in this processing.

Exception: UDP-Lite (RFC 3828)

UDP-Lite is a variant that allows partial checksum coverage. Instead of all-or-nothing, applications specify how many bytes are checksum-protected. This is useful for multimedia codecs where partial corruption is acceptable—better to play slightly corrupted audio than lose the entire packet.
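The idea of partial coverage can be sketched as follows. This is a simplified illustration, not a conforming RFC 3828 implementation: it omits the pseudo-header, places the coverage value in the header slot that UDP uses for length, and uses hypothetical function names and payload contents.

```python
import struct

def ones_complement_sum(data: bytes) -> int:
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total > 0xFFFF:
        total = (total & 0xFFFF) + (total >> 16)
    return total

def udplite_build(src_port: int, dst_port: int, payload: bytes, coverage: int) -> bytes:
    """Checksum only the first `coverage` bytes, counted from the header start."""
    if not 8 <= coverage <= 8 + len(payload):
        raise ValueError("coverage must include the 8-byte header")
    packet = struct.pack("!HHHH", src_port, dst_port, coverage, 0) + payload
    checksum = (~ones_complement_sum(packet[:coverage]) & 0xFFFF) or 0xFFFF
    return packet[:6] + struct.pack("!H", checksum) + packet[8:]

def udplite_covered_ok(packet: bytes) -> bool:
    (coverage,) = struct.unpack_from("!H", packet, 4)
    return ones_complement_sum(packet[:coverage]) == 0xFFFF

def flip_byte(data: bytes, index: int) -> bytes:
    return data[:index] + bytes([data[index] ^ 0xFF]) + data[index + 1:]

sensitive = b"frame-header"   # must arrive intact
tolerant = b"audio-samples"   # corruption here is acceptable
packet = udplite_build(5004, 5004, sensitive + tolerant, coverage=8 + len(sensitive))

print(udplite_covered_ok(packet))                              # True
print(udplite_covered_ok(flip_byte(packet, len(packet) - 1)))  # True: damage outside coverage
print(udplite_covered_ok(flip_byte(packet, 8)))                # False: damage inside coverage
```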

Don't Disable Checksums

Some socket APIs (particularly on Linux) allow disabling UDP checksums via socket options. This is almost never appropriate. The performance impact is negligible with modern hardware, and the integrity protection is essential. Only consider disabling in highly controlled environments with application-layer integrity verification.

Hardware Checksum Offloading

Modern network interface cards (NICs) can compute checksums in hardware, eliminating CPU overhead. This technology, called checksum offloading, is ubiquitous in current hardware and has important implications for packet capture and analysis.

Types of Offloading:

Checksum Offload Modes
Mode	Direction	Description
TX Checksum Offload	Outgoing	NIC computes checksum before transmitting. OS leaves checksum field zero/placeholder.
RX Checksum Offload	Incoming	NIC verifies checksum, reports pass/fail to OS. OS skips verification.
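Generic Receive Offload (GRO)	Incoming	Kernel software merges received packets of one flow into larger ones before they climb the stack. The NIC-hardware counterpart is Large Receive Offload (LRO).
Generic Segmentation Offload (GSO)	Outgoing	Stack handles one large packet and defers segmentation until just before the driver. With hardware segmentation support (e.g., TSO), the NIC splits it and fills in per-segment checksums.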

Implications for Packet Capture:

When capturing packets with tools like Wireshark or tcpdump, you may see zero or incorrect checksums on outgoing packets. This is not corruption—it's normal behavior when TX offloading is enabled:

Application writes data to socket
OS creates the UDP datagram with a placeholder in the checksum field (zero or, on Linux, typically a partial sum over the pseudo-header)
Packet capture tool sees this pre-checksum data
NIC computes correct checksum and transmits
On the wire, the checksum is correct

This confuses users who see Wireshark report "incorrect checksum" on their own outgoing traffic. There are three ways to deal with it:

Disable checksum verification in Wireshark settings
Disable TX offload on the interface (for debugging only)
Understand that capture shows pre-NIC-processing state

check_offload_settings.sh
Shell
#!/bin/bash
# Check and configure checksum offload settings (Linux)
 
# View current offload settings
ethtool -k eth0 | grep checksum
 
# Typical output:
# rx-checksumming: on
# tx-checksumming: on
#     tx-checksum-ipv4: off [fixed]
#     tx-checksum-ip-generic: on
#     tx-checksum-ipv6: off [fixed]
# tx-checksum-fcoe-crc: off [fixed]
# tx-checksum-sctp: off [fixed]
 
# Disable TX offload (for debugging/packet capture)
# WARNING: Increases CPU usage, use only for testing
# sudo ethtool -K eth0 tx off
 
# Re-enable (recommended for normal operation)
# sudo ethtool -K eth0 tx on
 
# View UDP-related NIC statistics (-S shows counters; names vary by driver)
ethtool -S eth0 | grep -i udp
 
# Check for checksum errors reported by NIC
ethtool -S eth0 | grep -i "csum"

Wireshark Configuration

In Wireshark, go to Edit → Preferences → Protocols → UDP and uncheck 'Validate the UDP checksum if possible'. This prevents false-positive error reports when analyzing traffic from interfaces with TX offload enabled.

Limitations of the UDP Checksum

While the UDP checksum provides valuable error detection, it has significant limitations that applications should understand.

Detection Capabilities:

Checksum Error Detection Capabilities
Error Type	Detection	Notes
Single-bit error	100% detected	Any single bit flip changes checksum
Two-bit errors (adjacent)	100% detected	Adjacent bits definitely caught
Two-bit errors (far apart)	Mostly detected	Missed only when both flips hit the same bit position in different 16-bit words in opposite directions (0→1 and 1→0)
Burst errors (< 16 bits)	100% detected	Affects at most two 16-bit words
Burst errors (> 16 bits)	~99.998% detected	Still very likely to detect
Malicious modification	NOT secure	Attacker can recompute valid checksum

Known Weaknesses:

1. Not Cryptographically Secure

The UDP checksum is a simple arithmetic check, not a cryptographic hash. Anyone who can modify a packet can also recompute a valid checksum. It provides no protection against active attacks.

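2. Word Reordering Blindness

Because addition is commutative, swapping two 16-bit words (or reordering them in any other way) produces the same checksum. While rare in practice, this is a theoretical weakness.

3. Compensating Errors

Certain patterns of corruption cancel out in the sum. For example, adding 1 to one word and subtracting 1 from another preserves the sum. Likewise, 0x0000 and 0xFFFF both represent zero in one's complement arithmetic, so a word that changes from one to the other generally goes unnoticed.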
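Both arithmetic blind spots described above (reordered words and offsetting changes) are easy to reproduce. The sketch below uses arbitrary sample words and a hypothetical helper name; it shows the checksum staying the same in both cases while a single-bit flip is still caught.

```python
def internet_checksum(words: list) -> int:
    total = sum(words)
    while total > 0xFFFF:
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return ~total & 0xFFFF

original    = [0x1234, 0xABCD, 0x0042, 0x7F00]
reordered   = [0xABCD, 0x1234, 0x0042, 0x7F00]  # first two words swapped
compensated = [0x1235, 0xABCC, 0x0042, 0x7F00]  # +1 in one word, -1 in another
single_flip = [0x1234, 0xABCD, 0x0043, 0x7F00]  # one bit flipped

baseline = internet_checksum(original)
print(internet_checksum(reordered) == baseline)    # True: undetected
print(internet_checksum(compensated) == baseline)  # True: undetected
print(internet_checksum(single_flip) == baseline)  # False: detected
```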

4. Limited to Corruption Detection

The checksum detects random corruption but provides no:

Replay protection (same packet can be sent again)
Ordering guarantees (packets can arrive out of order)
Source authentication (attacker can forge source address)

When You Need More:

Beyond UDP's Checksum

• Integrity + Authentication: Use HMAC or authenticated encryption (AES-GCM). DTLS and QUIC provide this for UDP.
• Replay Protection: Include sequence numbers and track received sequences. IPsec AH and ESP do this.
• Source Verification: Digital signatures or challenge-response protocols. DNSSEC adds this to DNS.
• Ordering: Application-layer sequence numbers with reassembly logic. RTP does this over UDP.
• Strong Hashing: If you need to detect any modification, use SHA-256 or similar in your application protocol.
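As one hedged illustration of application-layer integrity and authentication, the sketch below appends an HMAC-SHA256 tag to a UDP payload using Python's standard `hmac` module. The function names, the key, and the message framing are assumptions for illustration only; a real protocol would also need key management and replay protection, which DTLS or QUIC provide.

```python
import hashlib
import hmac

TAG_LENGTH = 32  # bytes in an HMAC-SHA256 tag

def seal(key: bytes, payload: bytes) -> bytes:
    """Return payload followed by its authentication tag."""
    return payload + hmac.new(key, payload, hashlib.sha256).digest()

def unseal(key: bytes, message: bytes):
    """Return the payload if the tag verifies, otherwise None."""
    if len(message) < TAG_LENGTH:
        return None
    payload, tag = message[:-TAG_LENGTH], message[-TAG_LENGTH:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched.
    return payload if hmac.compare_digest(tag, expected) else None

key = b"demo-key-not-for-production"  # hypothetical shared secret
message = seal(key, b"sensor reading: 21.5C")

# An attacker who alters the payload cannot produce a matching tag without the key.
tampered = b"sensor reading: 99.9C" + message[-TAG_LENGTH:]

print(unseal(key, message))   # b'sensor reading: 21.5C'
print(unseal(key, tampered))  # None
```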

Defense in Depth

The UDP checksum is one layer of error detection. Link layers provide CRC checks. Applications can add stronger integrity mechanisms. Security-critical applications should always use authenticated encryption (DTLS, IPsec) rather than relying on transport checksums.

Summary: The Checksum Field

The UDP checksum is the final field in the 8-byte header, providing error detection for the entire datagram including IP addressing information. While simple compared to modern cryptographic integrity mechanisms, it effectively catches accidental corruption.

Let's consolidate the key insights:

Key Takeaways

• Position and Structure: The checksum occupies bytes 6-7 (bits 48-63), computed using the one's complement sum of the pseudo-header, UDP header (with checksum zeroed), and data.
• One's Complement Algorithm: Sum all 16-bit words, fold carries back in, take the bitwise NOT. A result of 0xFFFF upon verification indicates validity.
• Pseudo-Header Protection: The checksum covers IP addresses and protocol number, even though these are in the IP header, not UDP. This prevents misdelivery of corrupted packets.
• IPv4 vs IPv6: Optional in IPv4 (zero means not computed), mandatory in IPv6 (zero means drop). Modern practice: always compute.
• Hardware Offloading: Modern NICs compute checksums in hardware. Packet captures may show incorrect values for outgoing traffic (pre-offload state).
• Security Limitations: Not cryptographically secure. Detects random corruption but not malicious modification. Use DTLS, IPsec, or application-layer authentication for security.
• Performance Impact: Minimal with hardware offload. Almost never a reason to disable, and disabling exposes applications to undetected corruption.

What's Next:

With all four UDP header fields now thoroughly understood—source port, destination port, length, and checksum—the next page synthesizes this knowledge into a complete picture of UDP header simplicity. We'll explore why this minimalist design is UDP's greatest strength and how it enables the protocol's unique capabilities.

Page Complete

You now have a comprehensive understanding of the UDP checksum—from its calculation algorithm to its security limitations. This knowledge is essential for implementing custom protocols, debugging network issues, and understanding why UDP-based applications layer additional integrity mechanisms.