We've explored the mathematical foundations—the checksum concept, Internet checksum specification, and one's complement arithmetic. Now it's time to put this knowledge into practice.
In this page, we'll walk through checksum calculation step by step, from raw bytes to verified integrity. You'll see exactly how checksums are computed for IPv4 headers, calculate TCP and UDP checksums with pseudo-headers, understand the intricacies of partial computations, and learn to debug checksum failures.
By the end, you'll be able to compute Internet checksums by hand, implement them in any programming language, and diagnose checksum-related issues in network debugging.
By the end of this page, you will master the complete checksum calculation process: dividing data into words, computing one's complement sums, handling odd bytes, applying the final complement, and verifying received data. You'll work through detailed examples for IP, TCP, and UDP.
Before diving into protocol-specific details, let's establish the universal algorithm that applies to all Internet checksums.
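Step-by-Step Algorithm:
1. Set the checksum field to zero.
2. Divide the data into 16-bit words in network (big-endian) byte order. If the length is odd, pad with one zero byte for the calculation only.
3. Add all words using one's complement addition: any carry out of bit 15 is folded back into the low 16 bits.
4. Take the one's complement (bitwise NOT) of the 16-bit sum. This is the checksum.
5. Store the checksum in the checksum field.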
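Verification Algorithm:
1. Sum all 16-bit words of the received data, including the checksum field, using the same one's complement addition.
2. If the folded sum is 0xFFFF, accept the data. Any other value indicates corruption.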
Why Verification Works:
When computing the checksum, we calculated: checksum = ~sum(data_with_zero_checksum)
When verifying, we compute: sum(data_with_checksum) = sum(other_words) + checksum
Since checksum = ~sum(other_words), we get:
sum(other_words) + ~sum(other_words) = 0xFFFF (in one's complement)
This beautiful algebraic identity makes verification simple and fast.
Think of the checksum as the 'missing piece' that makes the sum complete. Just like adding a number and its negative gives zero, adding a sum and its complement gives all-ones. The receiver checks if all pieces fit together perfectly.
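The algorithm and the verification identity can be sketched in a few lines. This is a minimal illustration, not a reference implementation; the function names `ones_complement_sum`, `internet_checksum`, and `verify` are chosen here for clarity, and the sample bytes are the worked example from RFC 1071.

```python
def ones_complement_sum(data: bytes) -> int:
    """Sum 16-bit big-endian words and fold the carries back in."""
    if len(data) % 2:
        data += b"\x00"  # pad for calculation only; never transmitted
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total > 0xFFFF:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def internet_checksum(data: bytes) -> int:
    """Checksum = one's complement of the folded sum."""
    return ~ones_complement_sum(data) & 0xFFFF


def verify(data_with_checksum: bytes) -> bool:
    """Valid data, checksum included, sums to all ones."""
    return ones_complement_sum(data_with_checksum) == 0xFFFF


message = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(message)
packet = message + checksum.to_bytes(2, "big")
corrupted = b"\x01" + packet[1:]

print(f"checksum: 0x{checksum:04X}")           # 0x220D
print(f"intact packet valid:    {verify(packet)}")
print(f"corrupted packet valid: {verify(corrupted)}")
```

Appending the checksum works here because the message has an even length; in a real protocol the checksum sits in a fixed header field instead.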
The IPv4 header checksum is the simplest Internet checksum to compute—it covers only the header with no pseudo-header involved. Let's work through a complete example.
Sample IPv4 Header (20 bytes, no options):
| Byte Offset | Field | Value (Hex) | Value (Binary/Meaning) |
|---|---|---|---|
| 0 | Version + IHL | 0x45 | Version 4, IHL 5 (20 bytes) |
| 1 | DSCP + ECN | 0x00 | Default service |
| 2-3 | Total Length | 0x00 0x3C | 60 bytes total packet |
| 4-5 | Identification | 0x1C 0x46 | ID: 7238 |
| 6-7 | Flags + Fragment | 0x40 0x00 | Don't fragment, offset 0 |
| 8 | TTL | 0x40 | 64 hops |
| 9 | Protocol | 0x06 | TCP (6) |
| 10-11 | Header Checksum | 0x00 0x00 | ← Zero for calculation |
| 12-15 | Source IP | 0xAC 0x10 0x0A 0x63 | 172.16.10.99 |
| 16-19 | Dest IP | 0xAC 0x10 0x0A 0x0C | 172.16.10.12 |
```python
"""Step-by-step IPv4 header checksum calculation."""

# The IPv4 header as bytes (with checksum field set to zero)
ip_header = bytes([
    0x45, 0x00,  # Word 0: Version/IHL + DSCP/ECN
    0x00, 0x3C,  # Word 1: Total Length (60 bytes)
    0x1C, 0x46,  # Word 2: Identification
    0x40, 0x00,  # Word 3: Flags + Fragment Offset
    0x40, 0x06,  # Word 4: TTL (64) + Protocol (TCP=6)
    0x00, 0x00,  # Word 5: Header Checksum (zero for calculation)
    0xAC, 0x10,  # Word 6: Source IP (first half)  172.16.x.x
    0x0A, 0x63,  # Word 7: Source IP (second half) x.x.10.99
    0xAC, 0x10,  # Word 8: Dest IP (first half)    172.16.x.x
    0x0A, 0x0C   # Word 9: Dest IP (second half)   x.x.10.12
])

print("=== IPv4 Header Checksum Calculation ===")
print(f"Header bytes: {ip_header.hex()}")
print()

# Step 1: Divide into 16-bit words
words = []
for i in range(0, len(ip_header), 2):
    word = (ip_header[i] << 8) | ip_header[i + 1]
    words.append(word)

print("Step 1: Divide into 16-bit words (big-endian)")
for i, word in enumerate(words):
    print(f"  Word {i}: 0x{word:04X} ({word})")
print()

# Step 2: Sum all words
total = 0
print("Step 2: Sum all words")
for i, word in enumerate(words):
    old_total = total
    total += word
    # Flag only the additions that produce a new carry out of bit 15
    if (total >> 16) > (old_total >> 16):
        print(f"  {old_total:05X} + {word:04X} = {total:05X} (carry!)")
    else:
        print(f"  {old_total:05X} + {word:04X} = {total:05X}")

print(f"  Raw sum: 0x{total:05X}")
print()

# Step 3: Fold to 16 bits (end-around carry)
print("Step 3: Fold 32-bit sum to 16 bits")
while total > 0xFFFF:
    high = total >> 16
    low = total & 0xFFFF
    print(f"  0x{high:04X} + 0x{low:04X} = 0x{high + low:04X}")
    total = high + low

print(f"  Folded sum: 0x{total:04X}")
print()

# Step 4: Take one's complement
checksum = (~total) & 0xFFFF
print("Step 4: Take one's complement")
print(f"  ~0x{total:04X} = 0x{checksum:04X}")
print()

print(f"=== FINAL CHECKSUM: 0x{checksum:04X} ===")

# Verification
print()
print("=== Verification ===")
# Insert checksum and recompute
header_with_cs = bytearray(ip_header)
header_with_cs[10] = (checksum >> 8) & 0xFF
header_with_cs[11] = checksum & 0xFF

verify_sum = 0
for i in range(0, len(header_with_cs), 2):
    word = (header_with_cs[i] << 8) | header_with_cs[i + 1]
    verify_sum += word

while verify_sum > 0xFFFF:
    verify_sum = (verify_sum >> 16) + (verify_sum & 0xFFFF)

print(f"Sum with checksum included: 0x{verify_sum:04X}")
print(f"Valid (== 0xFFFF): {verify_sum == 0xFFFF}")
```

The IPv4 header checksum covers ONLY the IP header (20-60 bytes), not the payload. Each router along the path must update this checksum because it decrements the TTL field. The payload is protected by transport-layer checksums (TCP/UDP).
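Because a forwarding router changes only the TTL, it does not have to re-sum the whole header. The sketch below applies the incremental update from RFC 1624, HC' = ~(~HC + ~m + m'), to the sample header with its checksum 0xB1E6 filled in. The helper names `ones_add`, `full_checksum`, and `decrement_ttl` are illustrative, and the code assumes a TTL greater than zero.

```python
def ones_add(a: int, b: int) -> int:
    """One's complement addition of two 16-bit values."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)


def full_checksum(header: bytes) -> int:
    """Checksum over an even-length header whose checksum field is zero."""
    total = 0
    for i in range(0, len(header), 2):
        total = ones_add(total, (header[i] << 8) | header[i + 1])
    return ~total & 0xFFFF


def decrement_ttl(header: bytes) -> bytes:
    """Return a copy with TTL - 1 and the checksum updated incrementally."""
    hdr = bytearray(header)
    old_word = (hdr[8] << 8) | hdr[9]   # TTL + Protocol word before the change
    hdr[8] -= 1                         # assumes TTL > 0
    new_word = (hdr[8] << 8) | hdr[9]
    old_cs = (hdr[10] << 8) | hdr[11]
    # RFC 1624: HC' = ~(~HC + ~m + m')
    total = ones_add(~old_cs & 0xFFFF, ~old_word & 0xFFFF)
    total = ones_add(total, new_word)
    new_cs = ~total & 0xFFFF
    hdr[10], hdr[11] = new_cs >> 8, new_cs & 0xFF
    return bytes(hdr)


header = bytes.fromhex("4500003c1c4640004006b1e6ac100a63ac100a0c")
forwarded = decrement_ttl(header)

# Cross-check against a full recomputation with the checksum field zeroed
zeroed = bytearray(forwarded)
zeroed[10:12] = b"\x00\x00"
recomputed = full_checksum(bytes(zeroed))

print(f"Incremental: 0x{forwarded[10:12].hex().upper()}")
print(f"Recomputed:  0x{recomputed:04X}")
```

Both approaches give the same value; the incremental form touches three 16-bit quantities instead of the whole header.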
TCP checksums are more complex because they include a pseudo-header containing information from the IP layer. This provides end-to-end protection that catches certain routing errors.
The TCP Pseudo-Header (12 bytes for IPv4):
| Bytes | Field | Description |
|---|---|---|
| 0-3 | Source IP Address | From IP header |
| 4-7 | Destination IP Address | From IP header |
| 8 | Zero | Reserved (must be zero) |
| 9 | Protocol | TCP = 6 |
| 10-11 | TCP Length | TCP header + data length in bytes |
The calculation covers:
Pseudo-Header (12 bytes) + TCP Header (20+ bytes) + TCP Payload (variable)
The pseudo-header is never transmitted—it's synthesized from the IP header for checksum purposes only.
```python
"""Complete TCP checksum calculation with pseudo-header."""
import socket
import struct


def folded_sum(data: bytes) -> int:
    """One's complement sum of 16-bit words, folded to 16 bits (no final complement)."""
    if len(data) % 2 == 1:
        data = data + b'\x00'

    total = 0
    for i in range(0, len(data), 2):
        word = (data[i] << 8) | data[i + 1]
        total += word

    while total > 0xFFFF:
        total = (total >> 16) + (total & 0xFFFF)

    return total


def internet_checksum(data: bytes) -> int:
    """Compute Internet checksum over byte sequence."""
    return (~folded_sum(data)) & 0xFFFF


# Example: TCP SYN packet
print("=== TCP Checksum Calculation ===")
print()

# IP addresses
src_ip = "192.168.1.100"
dst_ip = "10.0.0.50"

# Parse IPs to bytes
src_ip_bytes = socket.inet_aton(src_ip)
dst_ip_bytes = socket.inet_aton(dst_ip)

print(f"Source IP: {src_ip} -> {src_ip_bytes.hex()}")
print(f"Destination IP: {dst_ip} -> {dst_ip_bytes.hex()}")
print()

# TCP Header (20 bytes, no options)
tcp_header = bytes([
    0x04, 0xD2,              # Source Port: 1234
    0x00, 0x50,              # Dest Port: 80
    0x00, 0x00, 0x01, 0x00,  # Sequence Number: 256
    0x00, 0x00, 0x00, 0x00,  # Ack Number: 0
    0x50, 0x02,              # Data Offset (5 words=20 bytes), Flags (SYN)
    0x72, 0x10,              # Window Size: 29200
    0x00, 0x00,              # Checksum: 0 (to be calculated)
    0x00, 0x00               # Urgent Pointer: 0
])

# TCP Payload
tcp_payload = b"Hello, World!"

print("TCP Header (with checksum zeroed):")
print(f"  {tcp_header.hex()}")
print(f"TCP Payload: {tcp_payload}")
print()

# Build pseudo-header
tcp_length = len(tcp_header) + len(tcp_payload)
pseudo_header = struct.pack(
    "!4s4sBBH",
    src_ip_bytes,  # Source IP (4 bytes)
    dst_ip_bytes,  # Dest IP (4 bytes)
    0,             # Reserved (1 byte)
    6,             # Protocol: TCP (1 byte)
    tcp_length     # TCP Length (2 bytes)
)

print("Pseudo-Header:")
print(f"  {pseudo_header.hex()}")
print(f"  - Source IP: {src_ip_bytes.hex()}")
print(f"  - Dest IP: {dst_ip_bytes.hex()}")
print(f"  - Zero: 00")
print(f"  - Protocol: 06 (TCP)")
print(f"  - TCP Length: {tcp_length} bytes (0x{tcp_length:04X})")
print()

# Concatenate all: pseudo-header + TCP header + payload
checksum_data = pseudo_header + tcp_header + tcp_payload

print(f"Total checksum data: {len(checksum_data)} bytes")
print()

# Compute checksum
checksum = internet_checksum(checksum_data)
print(f"=== COMPUTED CHECKSUM: 0x{checksum:04X} ===")
print()

# Verification: insert the checksum and re-sum everything
tcp_header_with_cs = bytearray(tcp_header)
tcp_header_with_cs[16] = (checksum >> 8) & 0xFF
tcp_header_with_cs[17] = checksum & 0xFF

verify_data = pseudo_header + bytes(tcp_header_with_cs) + tcp_payload

# The folded sum (before the final complement) must equal 0xFFFF
total = folded_sum(verify_data)

print("=== Verification ===")
print(f"Sum with checksum included: 0x{total:04X}")
print(f"Valid (== 0xFFFF): {total == 0xFFFF}")
```

Forgetting to include the pseudo-header is a common TCP checksum bug. The checksum MUST cover: pseudo-header + TCP header (with checksum field zeroed) + TCP payload. Missing any component produces invalid packets that receivers will drop.
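UDP checksum calculation is nearly identical to TCP, with a few differences: the pseudo-header carries protocol number 17 and the UDP length, and a transmitted checksum of 0x0000 is reserved to mean "no checksum computed." The UDP pseudo-header (12 bytes for IPv4):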
| Bytes | Field | Description |
|---|---|---|
| 0-3 | Source IP Address | From IP header |
| 4-7 | Destination IP Address | From IP header |
| 8 | Zero | Reserved |
| 9 | Protocol | UDP = 17 (0x11) |
| 10-11 | UDP Length | UDP header (8) + data length |
```python
"""UDP checksum calculation with all edge cases handled."""
import struct
import socket


def internet_checksum(data: bytes) -> int:
    """Compute Internet checksum."""
    if len(data) % 2 == 1:
        data = data + b'\x00'

    total = 0
    for i in range(0, len(data), 2):
        word = (data[i] << 8) | data[i + 1]
        total += word

    while total > 0xFFFF:
        total = (total >> 16) + (total & 0xFFFF)

    result = (~total) & 0xFFFF

    # Special case: if checksum is 0, use 0xFFFF instead
    # (Both represent zero in one's complement, but 0x0000
    # means "no checksum computed" in UDP)
    if result == 0x0000:
        result = 0xFFFF

    return result


def compute_udp_checksum(
    src_ip: str,
    dst_ip: str,
    src_port: int,
    dst_port: int,
    payload: bytes
) -> int:
    """
    Compute complete UDP checksum.

    Returns the 16-bit checksum value.
    """
    # Build UDP header (8 bytes)
    udp_length = 8 + len(payload)
    udp_header = struct.pack(
        "!HHHH",
        src_port,    # Source port (2 bytes)
        dst_port,    # Dest port (2 bytes)
        udp_length,  # Length (2 bytes)
        0            # Checksum = 0 for calculation (2 bytes)
    )

    # Build pseudo-header (12 bytes)
    src_ip_bytes = socket.inet_aton(src_ip)
    dst_ip_bytes = socket.inet_aton(dst_ip)

    pseudo_header = struct.pack(
        "!4s4sBBH",
        src_ip_bytes,
        dst_ip_bytes,
        0,           # Reserved
        17,          # Protocol: UDP
        udp_length   # UDP length
    )

    # Concatenate and compute
    checksum_data = pseudo_header + udp_header + payload
    return internet_checksum(checksum_data)


# Example: DNS query packet
print("=== UDP Checksum Calculation (DNS Query) ===")
print()

src_ip = "192.168.1.50"
dst_ip = "8.8.8.8"
src_port = 54321
dst_port = 53  # DNS

# Simple DNS query payload (asking for google.com A record)
# This is a simplified representation
dns_payload = bytes([
    0x12, 0x34,  # Transaction ID
    0x01, 0x00,  # Flags: Standard query
    0x00, 0x01,  # Questions: 1
    0x00, 0x00,  # Answer RRs: 0
    0x00, 0x00,  # Authority RRs: 0
    0x00, 0x00,  # Additional RRs: 0
    # Query: google.com
    0x06, 0x67, 0x6f, 0x6f, 0x67, 0x6c, 0x65,  # "google"
    0x03, 0x63, 0x6f, 0x6d,                    # "com"
    0x00,        # Root label
    0x00, 0x01,  # Type: A
    0x00, 0x01   # Class: IN
])

print(f"Source: {src_ip}:{src_port}")
print(f"Destination: {dst_ip}:{dst_port}")
print(f"Payload ({len(dns_payload)} bytes): {dns_payload.hex()}")
print()

checksum = compute_udp_checksum(src_ip, dst_ip, src_port, dst_port, dns_payload)
print(f"=== COMPUTED CHECKSUM: 0x{checksum:04X} ===")
print()

# Build complete UDP packet
udp_length = 8 + len(dns_payload)
udp_packet = struct.pack(
    "!HHHH",
    src_port,
    dst_port,
    udp_length,
    checksum
) + dns_payload

print("Complete UDP packet:")
print(f"  {udp_packet.hex()}")
```

In UDP, a checksum field of 0x0000 means 'checksum not computed.' But what if the actual computed checksum equals zero? The solution: transmit 0xFFFF instead. In one's complement, 0x0000 and 0xFFFF both represent zero, so verification still works. This clever trick resolves the ambiguity.
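On the receiving side, the two meanings of zero have to be told apart before summing. The following is a minimal sketch of a receiver check, assuming IPv4 (where a zero checksum field is permitted) and assuming the UDP length equals the length of the received datagram. The names `folded_sum`, `udp_pseudo_header`, and `verify_udp` are hypothetical helpers, not part of any library.

```python
import socket
import struct


def folded_sum(data: bytes) -> int:
    """One's complement sum folded to 16 bits, without the final complement."""
    if len(data) % 2:
        data += b"\x00"
    total = sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2))
    while total > 0xFFFF:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp_pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    return struct.pack("!4s4sBBH", socket.inet_aton(src_ip),
                       socket.inet_aton(dst_ip), 0, 17, udp_length)


def verify_udp(src_ip: str, dst_ip: str, datagram: bytes) -> bool:
    """datagram is the UDP header plus payload exactly as received."""
    (received,) = struct.unpack("!H", datagram[6:8])
    if received == 0x0000:
        return True  # sender did not compute a checksum; nothing to verify
    pseudo = udp_pseudo_header(src_ip, dst_ip, len(datagram))
    return folded_sum(pseudo + datagram) == 0xFFFF


# Build a small datagram the way a sender would, then check it
src, dst, payload = "192.168.1.50", "8.8.8.8", b"hi"
length = 8 + len(payload)
header_zero = struct.pack("!HHHH", 54321, 53, length, 0)
pseudo = udp_pseudo_header(src, dst, length)
cs = (~folded_sum(pseudo + header_zero + payload) & 0xFFFF) or 0xFFFF
datagram = struct.pack("!HHHH", 54321, 53, length, cs) + payload
tampered = datagram[:-1] + bytes([datagram[-1] ^ 0x01])

print(f"intact:   {verify_udp(src, dst, datagram)}")
print(f"tampered: {verify_udp(src, dst, tampered)}")
print(f"no checksum: {verify_udp(src, dst, header_zero + payload)}")
```

A transmitted 0xFFFF needs no special handling at the receiver: adding it to the sum of the other words still folds to 0xFFFF.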
Real-world checksum implementation must handle several edge cases that don't appear in textbook descriptions.
Edge Case 1: Odd-Length Data
When the data length is odd, you can't form a complete final 16-bit word. The standard approach: conceptually pad with a zero byte, but only for calculation—the padding is never transmitted.
```python
"""Handling odd-length data in checksum calculation."""


def checksum_with_odd_handling(data: bytes) -> int:
    """
    Proper handling of odd-length data.
    """
    total = 0
    length = len(data)

    # Process complete 16-bit words
    for i in range(0, length - 1, 2):
        word = (data[i] << 8) | data[i + 1]
        total += word

    # Handle trailing odd byte
    if length % 2 == 1:
        # Treat as high byte with implicit zero low byte
        # Network byte order: the byte is MSB
        total += data[-1] << 8
        print(f"Odd byte 0x{data[-1]:02X} treated as 0x{data[-1]:02X}00")

    # Fold and complement
    while total > 0xFFFF:
        total = (total >> 16) + (total & 0xFFFF)

    return (~total) & 0xFFFF


# Test with odd-length data
odd_data = bytes([0x45, 0x00, 0x00, 0x3C, 0x1C])  # 5 bytes
print(f"Data (5 bytes): {odd_data.hex()}")
checksum = checksum_with_odd_handling(odd_data)
print(f"Checksum: 0x{checksum:04X}")
```

Edge Case 2: Empty Data
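What's the checksum of zero bytes? The sum over no words is 0x0000, and its complement is 0xFFFF.
So the checksum of empty data is 0xFFFF.
Edge Case 3: All-Zero Data
Similarly, if data consists entirely of zero bytes, the sum is 0x0000 and the checksum is 0xFFFF.
Edge Case 4: All-Ones Data
If every word is 0xFFFF (all ones), the end-around carries fold the sum back to 0xFFFF, so the checksum is 0x0000.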
This is why verification checks for 0xFFFF: correct data sums to 0xFFFF before complement.
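These edge cases are easy to confirm directly. The sketch below uses a plain `internet_checksum` helper (same algorithm as in the earlier listings, redefined here so the snippet is self-contained).

```python
def internet_checksum(data: bytes) -> int:
    """Internet checksum with zero padding for odd-length input."""
    if len(data) % 2:
        data += b"\x00"
    total = sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2))
    while total > 0xFFFF:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


empty = internet_checksum(b"")
all_zero = internet_checksum(bytes(8))
all_ones = internet_checksum(b"\xff\xff" * 4)

print(f"Empty data:    0x{empty:04X}")     # 0xFFFF
print(f"All-zero data: 0x{all_zero:04X}")  # 0xFFFF
print(f"All-ones data: 0x{all_ones:04X}")  # 0x0000
```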
On some architectures, accessing 16-bit values at odd addresses causes exceptions or performance penalties. High-performance implementations often use byte-by-byte processing to avoid alignment issues, or ensure data is aligned before word-based processing.
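One way to avoid unaligned 16-bit loads is to accumulate individual bytes and assemble the word sum afterwards. Alignment is not a concern in Python, so the sketch below only illustrates the arithmetic that a byte-oriented implementation in a lower-level language might use; `checksum_bytewise` is an illustrative name.

```python
def checksum_bytewise(data: bytes) -> int:
    """Internet checksum computed from separate high-byte and low-byte sums."""
    high = sum(data[0::2])  # bytes at even offsets are the high byte of each word
    low = sum(data[1::2])   # bytes at odd offsets are the low byte
    total = (high << 8) + low
    while total > 0xFFFF:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


ip_header = bytes.fromhex("4500003c1c46400040060000ac100a63ac100a0c")
odd_data = bytes([0x45, 0x00, 0x00, 0x3C, 0x1C])

print(f"IPv4 header: 0x{checksum_bytewise(ip_header):04X}")  # 0xB1E6
print(f"Odd length:  0x{checksum_bytewise(odd_data):04X}")   # 0x9EC3
```

A trailing odd byte lands in the high-byte sum automatically, which matches the implicit zero padding described under Edge Case 1.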
In many scenarios, we need to compute checksums over non-contiguous data or combine previously computed partial checksums. The mathematical properties of one's complement addition make this possible.
Combining Partial Sums:
If you have partial sums computed over different data regions, you can combine them:
final_sum = partial_sum_1 ⊕ partial_sum_2 ⊕ ... ⊕ partial_sum_n
where ⊕ is one's complement addition.
Application: Scatter-Gather I/O
Network stacks often use scatter-gather I/O where packet data is in non-contiguous memory buffers. Rather than copying everything together, compute partial checksums per buffer and combine.
```python
"""Computing and combining partial checksums.

Useful for:
- Scatter-gather I/O (non-contiguous buffers)
- Incremental checksum updates
- Pre-computing pseudo-header contribution
"""


def partial_sum(data: bytes) -> int:
    """
    Compute the one's complement sum (before final complement).
    Returns the sum, not the checksum!
    """
    if len(data) % 2 == 1:
        data = data + b'\x00'

    total = 0
    for i in range(0, len(data), 2):
        word = (data[i] << 8) | data[i + 1]
        total += word

    # Fold but don't complement
    while total > 0xFFFF:
        total = (total >> 16) + (total & 0xFFFF)

    return total


def combine_partial_sums(*sums: int) -> int:
    """
    Combine multiple partial sums using one's complement addition.

    Note: this is only correct when every buffer except the last has
    an even length, so that each buffer starts on a word boundary.
    """
    total = 0
    for s in sums:
        total += s
        while total > 0xFFFF:
            total = (total >> 16) + (total & 0xFFFF)
    return total


def finalize_checksum(sum_value: int) -> int:
    """
    Apply final complement to get checksum from sum.
    """
    return (~sum_value) & 0xFFFF


# Example: TCP checksum with partial computation
print("=== Partial Checksum Computation ===")

# Pseudo-header contribution. The address and protocol words are fixed
# for a connection; the TCP length word changes with each segment.
pseudo_header = bytes([
    0xC0, 0xA8, 0x01, 0x64,  # Source: 192.168.1.100
    0x0A, 0x00, 0x00, 0x32,  # Dest: 10.0.0.50
    0x00, 0x06,              # Reserved + Protocol (TCP)
    0x00, 0x21               # TCP Length: 33 bytes
])

pseudo_sum = partial_sum(pseudo_header)
print(f"Pseudo-header sum: 0x{pseudo_sum:04X}")

# TCP header (can be in one buffer)
tcp_header = bytes([
    0x04, 0xD2,              # Source Port: 1234
    0x00, 0x50,              # Dest Port: 80
    0x00, 0x00, 0x01, 0x00,  # Seq
    0x00, 0x00, 0x00, 0x00,  # Ack
    0x50, 0x02, 0x72, 0x10,  # Flags, Window
    0x00, 0x00, 0x00, 0x00   # Checksum (0), Urgent
])

header_sum = partial_sum(tcp_header)
print(f"TCP header sum: 0x{header_sum:04X}")

# TCP payload (might be in another buffer - scatter-gather)
tcp_payload = b"Hello, World!"  # 13 bytes (odd)

payload_sum = partial_sum(tcp_payload)
print(f"Payload sum: 0x{payload_sum:04X}")

# Combine all partial sums
combined = combine_partial_sums(pseudo_sum, header_sum, payload_sum)
print(f"Combined sum: 0x{combined:04X}")

# Finalize
checksum = finalize_checksum(combined)
print(f"Final checksum: 0x{checksum:04X}")

# Verify by computing directly
direct_data = pseudo_header + tcp_header + tcp_payload
direct_sum = partial_sum(direct_data)
direct_checksum = finalize_checksum(direct_sum)
print(f"Direct checksum: 0x{direct_checksum:04X}")
print(f"Match: {checksum == direct_checksum}")
```

For long-lived TCP connections, the address and protocol portion of the pseudo-header sum can be pre-computed once when the connection is established. The TCP length word changes with each segment, so it is added per packet along with the TCP header and payload sums. This saves a little work on every packet, which matters most for small packets.
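Simple addition of partial sums assumes each buffer begins on a 16-bit word boundary. When a buffer starts at an odd byte offset, its bytes occupy the opposite halves of each word; RFC 1071 notes that the one's complement sum is byte-order independent, so byte-swapping that buffer's partial sum corrects for this. The sketch below illustrates the idea under that assumption; `swap16` and `checksum_scattered` are illustrative names.

```python
from typing import Iterable


def fold(total: int) -> int:
    """Fold carries above bit 15 back into the low 16 bits."""
    while total > 0xFFFF:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def partial_sum(data: bytes) -> int:
    """One's complement sum of a buffer, treating its first byte as a high byte."""
    if len(data) % 2:
        data += b"\x00"
    return fold(sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2)))


def swap16(value: int) -> int:
    """Exchange the two bytes of a 16-bit value."""
    return ((value & 0xFF) << 8) | (value >> 8)


def checksum_scattered(buffers: Iterable[bytes]) -> int:
    """Checksum over buffers of arbitrary length without concatenating them."""
    total = 0
    offset = 0  # byte offset of the current buffer within the whole message
    for buf in buffers:
        s = partial_sum(buf)
        if offset % 2 == 1:
            s = swap16(s)  # buffer starts mid-word, so its bytes are swapped
        total = fold(total + s)
        offset += len(buf)
    return ~total & 0xFFFF


data = b"Hello, World!" * 3                   # 39 bytes
buffers = [data[:5], data[5:12], data[12:]]   # odd-length pieces
direct = ~partial_sum(data) & 0xFFFF

print(f"Scattered: 0x{checksum_scattered(buffers):04X}")
print(f"Direct:    0x{direct:04X}")
```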
Checksum failures are frustrating because they indicate something is wrong but not what. Here's a systematic approach to debugging.
```python
"""Checksum debugging utilities."""


def _checksum(data: bytes) -> int:
    """Plain Internet checksum, used to test alternative hypotheses."""
    if len(data) % 2 == 1:
        data = data + b'\x00'
    total = sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2))
    while total > 0xFFFF:
        total = (total >> 16) + (total & 0xFFFF)
    return (~total) & 0xFFFF


def debug_checksum_failure(
    received_data: bytes,
    checksum_offset: int,
    pseudo_header: bytes = b''
) -> None:
    """
    Analyze a failed checksum to help identify the problem.

    Args:
        received_data: The complete received packet/segment
        checksum_offset: Byte offset where checksum field starts
        pseudo_header: Pseudo-header bytes (for TCP/UDP)
    """
    print("=== Checksum Failure Analysis ===")

    # Extract the received checksum
    received_checksum = (received_data[checksum_offset] << 8) | \
        received_data[checksum_offset + 1]
    print(f"Received checksum: 0x{received_checksum:04X}")

    # Zero out the checksum field for recomputation
    data_zeroed = bytearray(received_data)
    data_zeroed[checksum_offset] = 0
    data_zeroed[checksum_offset + 1] = 0

    # Recompute expected checksum
    full_data = pseudo_header + bytes(data_zeroed)
    if len(full_data) % 2 == 1:
        full_data = full_data + b'\x00'

    total = 0
    print("Word-by-word breakdown:")
    for i in range(0, len(full_data), 2):
        word = (full_data[i] << 8) | full_data[i + 1]
        print(f"  Offset {i:3d}: 0x{word:04X} ({full_data[i]:02X} {full_data[i+1]:02X})")
        total += word

    while total > 0xFFFF:
        total = (total >> 16) + (total & 0xFFFF)

    expected_checksum = (~total) & 0xFFFF
    print(f"Expected checksum: 0x{expected_checksum:04X}")
    print(f"Received checksum: 0x{received_checksum:04X}")
    print(f"Match: {expected_checksum == received_checksum}")

    if expected_checksum != received_checksum:
        # Try to identify the discrepancy
        diff = received_checksum ^ expected_checksum
        print(f"XOR difference: 0x{diff:04X}")
        print(f"Binary diff: {diff:016b}")

        # Common error patterns
        swapped = ((received_checksum & 0xFF) << 8) | (received_checksum >> 8)
        if received_checksum == 0x0000:
            print("Hint: Checksum field is zero - was it ever computed?")
        elif swapped == expected_checksum:
            print("Hint: Possible byte-swap issue")
        elif pseudo_header and received_checksum == _checksum(bytes(data_zeroed)):
            print("Hint: Checksum matches the segment alone - did the sender forget the pseudo-header?")


# Example usage
print("Example: Analyzing a TCP packet with checksum failure")

# Simulated received TCP segment (with "wrong" checksum)
tcp_segment = bytes([
    0x04, 0xD2, 0x00, 0x50,  # Ports
    0x00, 0x00, 0x01, 0x00,  # Seq
    0x00, 0x00, 0x00, 0x00,  # Ack
    0x50, 0x02, 0x72, 0x10,  # Flags
    0xDE, 0xAD,              # Checksum (intentionally wrong)
    0x00, 0x00               # Urgent
])

pseudo_header = bytes([
    0xC0, 0xA8, 0x01, 0x64,
    0x0A, 0x00, 0x00, 0x32,
    0x00, 0x06, 0x00, 0x14
])

debug_checksum_failure(tcp_segment, 16, pseudo_header)
```

When analyzing captures with Wireshark, go to Edit → Preferences → Protocols → TCP and uncheck 'Validate the TCP checksum if possible.' Otherwise, Wireshark may mark outgoing packets captured on the sending host, whose checksums are offloaded to the network card and not yet computed, as errors, causing confusion.
We've walked through the complete checksum calculation process from start to finish. Let's consolidate the essential knowledge:
Looking Ahead:
With the calculation process mastered, we're ready to explore the broader applications of checksums across networking and computing. The next page examines where checksums are used, why certain applications choose checksums over CRCs, and the role of checksums in modern high-speed networks.
You now have complete mastery of checksum calculation: the algorithm, protocol-specific details, edge case handling, partial checksums, and debugging techniques. You can implement correct checksums in any language and diagnose failures in real network traffic.