Every transport protocol imposes overhead—additional bytes, processing cycles, and memory that don't directly serve your application's data. This overhead might seem negligible for a single packet, but in systems processing billions of packets daily, even a few bytes or microseconds per packet translates to significant infrastructure costs.
Understanding overhead isn't just academic curiosity. It's the difference between a system that scales gracefully and one that hits unexpected walls. Between infrastructure that costs $10,000/month and infrastructure that costs $100,000/month. Between latency that users tolerate and latency that drives them away.
In this page, we'll dissect the overhead of UDP and TCP with surgical precision—examining every byte, every CPU cycle, and every memory allocation.
By the end of this page, you will understand exactly what overhead each protocol imposes, where that overhead comes from, and how to calculate the efficiency impact for any given workload. You'll be able to quantify the cost difference when choosing between UDP and TCP for your applications.
The most visible overhead is the protocol header—bytes prepended to every packet that serve protocol functions rather than carrying application data.
UDP Header: The Minimalist Approach
UDP's header is among the simplest in networking—just 8 bytes containing four 16-bit fields:
```
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Length             |           Checksum            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```

Field Breakdown:

| Field | Size | Purpose |
|---|---|---|
| Source Port | 16 bits | Sender's port for demultiplexing |
| Destination Port | 16 bits | Receiver's port for demultiplexing |
| Length | 16 bits | Total datagram size (header + data) |
| Checksum | 16 bits | Error detection (optional in IPv4) |
| **Total** | **64 bits** | **= 8 bytes** |

TCP Header: The Feature-Rich Approach
TCP's header is significantly larger—20 bytes minimum, up to 60 bytes with options:
```
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Acknowledgment Number                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Data |       |C|E|U|A|P|R|S|F|                               |
| Offset| Rsrvd |W|C|R|C|S|S|Y|I|            Window             |
|       |       |R|E|G|K|H|T|N|N|                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Checksum            |        Urgent Pointer         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 Options (if Data Offset > 5)                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```

Field Breakdown:

| Field | Size | Purpose |
|---|---|---|
| Source Port | 16 bits | Sender's port for demultiplexing |
| Destination Port | 16 bits | Receiver's port for demultiplexing |
| Sequence Number | 32 bits | Position of first byte in this segment |
| Acknowledgment Number | 32 bits | Next expected byte from peer |
| Data Offset | 4 bits | Header length in 32-bit words |
| Reserved | 4 bits | Reserved for future use |
| Flags | 8 bits | Control flags (SYN, ACK, FIN, etc.) |
| Window | 16 bits | Receive window size for flow control |
| Checksum | 16 bits | Error detection (mandatory) |
| Urgent Pointer | 16 bits | Offset to urgent data |
| Options | 0-40 bytes | Variable: timestamps, window scaling, SACK |
| **Minimum total** | **160 bits** | **= 20 bytes** |
| **Maximum total** | **480 bits** | **= 60 bytes** |
| **Typical total** | **256 bits** | **= 32 bytes (with timestamps)** |

| Metric | UDP | TCP (minimum) | TCP (typical) | TCP (maximum) |
|---|---|---|---|---|
| Header size | 8 bytes | 20 bytes | 32 bytes | 60 bytes |
| vs UDP | — | +12 bytes (150%) | +24 bytes (300%) | +52 bytes (650%) |
| Field count | 4 | 10 | 10 + options | 10 + options |
| Variable size? | No | Yes (options) | Yes | Yes |
Modern TCP nearly always uses options: Timestamps (10 bytes) for RTT measurement, Window Scaling (3 bytes) for large windows, and SACK (variable) for selective acknowledgment. A realistic TCP header is 32 bytes, not 20. This means TCP's header is typically 4× larger than UDP's.
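A quick way to make the size difference concrete is to pack both headers yourself. This is a minimal Python sketch using the `struct` module; the port, sequence, and window values are arbitrary placeholders:

```python
import struct

# Pack a UDP header (RFC 768): four 16-bit fields, network byte order.
udp_header = struct.pack("!HHHH",
                         5353,    # source port
                         53,      # destination port
                         8 + 64,  # length: header + payload bytes
                         0)       # checksum (0 = disabled, IPv4 only)
assert len(udp_header) == 8

# Pack a minimal TCP header: 20 bytes, no options.
tcp_header = struct.pack("!HHIIBBHHH",
                         50000,    # source port
                         443,      # destination port
                         1000,     # sequence number
                         2000,     # acknowledgment number
                         5 << 4,   # data offset (5 words) + reserved bits
                         0x10,     # flags: ACK
                         65535,    # window
                         0,        # checksum (computed over the segment in practice)
                         0)        # urgent pointer
assert len(tcp_header) == 20
```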
Header overhead directly impacts goodput—the amount of useful application data transferred versus total bytes transmitted. The impact varies dramatically based on payload size.
Efficiency Formula:
Efficiency = Payload Size / (Payload Size + Header Size) × 100%
Adding IP and Ethernet overhead for the complete picture:
| Layer | Size |
|---|---|
| Ethernet Frame Header | 14 bytes |
| IP Header (IPv4, no options) | 20 bytes |
| Transport Header | 8-60 bytes |
| Ethernet CRC | 4 bytes |
| Total Overhead | 46-98 bytes |
| Payload Size | UDP Efficiency | TCP Efficiency | Efficiency Difference |
|---|---|---|---|
| 1 byte | 2.1% | 1.0% | UDP 2.1× better |
| 10 bytes | 17.2% | 9.6% | UDP 1.8× better |
| 64 bytes | 54.2% | 39.0% | UDP 1.4× better |
| 128 bytes | 70.3% | 56.6% | UDP 1.2× better |
| 256 bytes | 82.6% | 72.7% | UDP 1.1× better |
| 512 bytes | 90.4% | 84.2% | UDP 1.07× better |
| 1024 bytes | 94.9% | 91.4% | UDP 1.04× better |
| 1460 bytes (typical MSS at 1500-byte MTU) | 96.4% | 94.0% | UDP 1.03× better |
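These figures can be approximated with a few lines of Python. Note that the exact outputs depend on the assumed TCP header size and on which framing bytes (preamble, minimum-frame padding) are counted, so treat this sketch as tracking the table rather than reproducing it digit for digit:

```python
def goodput_efficiency(payload: int, transport_header: int) -> float:
    """Payload bytes as a percentage of total bytes on the wire."""
    # Fixed lower-layer overhead: 14 (Ethernet) + 20 (IPv4) + 4 (CRC).
    overhead = 14 + 20 + 4 + transport_header
    return payload / (payload + overhead) * 100

for payload in (1, 10, 64, 128, 256, 512, 1024, 1460):
    udp = goodput_efficiency(payload, 8)    # 8-byte UDP header
    tcp = goodput_efficiency(payload, 32)   # typical TCP header with options
    print(f"{payload:>5} B  UDP {udp:5.1f}%  TCP {tcp:5.1f}%  ratio {udp/tcp:.2f}x")
```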
Critical Insight: Small Payloads Magnify Overhead
For small payloads (under 64 bytes), the overhead difference is substantial. This pattern is common in real-world traffic such as game state updates, IoT sensor readings, and market data ticks.
When small payloads are unavoidable, batching multiple logical messages into a single packet amortizes header overhead. Nagle's algorithm (TCP) and application-layer coalescing can help, though both add latency—the familiar trade-off between efficiency and delay. A sketch of application-layer coalescing follows.
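The hypothetical `DatagramBatcher` below illustrates one way to coalesce: length-prefix each small message and flush the batch when it nears the payload limit or when the oldest message has waited too long. It's a minimal sketch, not a production implementation (a real one would also flush from a background timer):

```python
import time

class DatagramBatcher:
    """Coalesce small messages into one UDP datagram (hypothetical sketch)."""

    def __init__(self, sock, dest, max_bytes=1400, max_delay=0.005):
        self.sock = sock
        self.dest = dest
        self.max_bytes = max_bytes    # keep the batch under one MTU
        self.max_delay = max_delay    # latency budget in seconds
        self.buf = bytearray()
        self.oldest = None            # monotonic time of first queued message

    def send(self, msg: bytes):
        # Length-prefix each message so the receiver can split the batch.
        record = len(msg).to_bytes(2, "big") + msg
        if len(self.buf) + len(record) > self.max_bytes:
            self.flush()
        self.buf += record
        if self.oldest is None:
            self.oldest = time.monotonic()
        if time.monotonic() - self.oldest >= self.max_delay:
            self.flush()

    def flush(self):
        if self.buf:
            self.sock.sendto(bytes(self.buf), self.dest)
            self.buf.clear()
        self.oldest = None

# Usage (illustrative address):
#   import socket
#   sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   batcher = DatagramBatcher(sock, ("192.0.2.1", 9000))
#   batcher.send(b"tick")
```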
Beyond per-packet header overhead, TCP requires maintaining connection state for every active connection. This state consumes memory and processing resources on both endpoints.
UDP Connection State: Essentially Zero
UDP is stateless from the protocol's perspective: each datagram is independent, and the only 'state' is the socket binding (port allocation), which consumes minimal resources.
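One consequence is that a single unconnected socket can exchange datagrams with any number of peers. A minimal sketch (the addresses come from the 192.0.2.0/24 documentation range and are purely illustrative):

```python
import socket

# One unconnected UDP socket can reach any number of peers;
# the kernel keeps no per-destination state.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 9000))

peers = [("192.0.2.10", 9000), ("192.0.2.11", 9000), ("192.0.2.12", 9000)]
for peer in peers:
    sock.sendto(b"heartbeat", peer)  # no handshake, no connection table entry
```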
The per-socket state behind that binding:

| UDP Socket State (per socket, not per 'connection') | Size |
|---|---|
| Local IP Address | 4-16 bytes (IPv4/IPv6) |
| Local Port | 2 bytes |
| Receive Buffer | Configurable (typically 64 KB-2 MB) |
| Send Buffer | Configurable (typically 64 KB-2 MB) |

Total state per socket: ~6-18 bytes plus buffer memory. State per datagram: 0 bytes (no tracking).

Key Insight: One UDP socket can send to ANY number of destinations without additional state per destination.

TCP Connection State: Substantial Per-Connection
Every TCP connection requires tracking extensive state for reliability, flow control, and congestion control:
```
TCP Connection Block (TCB) - State Per Connection:

CONNECTION IDENTIFICATION
  Local IP Address                   4-16 bytes
  Local Port                         2 bytes
  Remote IP Address                  4-16 bytes
  Remote Port                        2 bytes
  Connection State                   1 byte (ESTABLISHED, TIME_WAIT, etc.)

SEQUENCE NUMBER MANAGEMENT
  Send Sequence Number (SND.NXT)     4 bytes
  Send Unacknowledged (SND.UNA)      4 bytes
  Send Window Size (SND.WND)         4 bytes
  Send Window Scale                  1 byte
  Initial Send Sequence (ISS)        4 bytes
  Receive Next (RCV.NXT)             4 bytes
  Receive Window (RCV.WND)           4 bytes
  Receive Window Scale               1 byte
  Initial Receive Sequence (IRS)     4 bytes

CONGESTION CONTROL
  Congestion Window (cwnd)           4 bytes
  Slow Start Threshold (ssthresh)    4 bytes
  RTT Estimate                       8 bytes
  RTT Variance                       8 bytes
  Retransmission Timeout (RTO)       4 bytes
  Duplicate ACK Count                2 bytes

TIMERS AND TIMESTAMPS
  Retransmission Timer               timer structure (~16 bytes)
  Delayed ACK Timer                  timer structure (~16 bytes)
  Keepalive Timer                    timer structure (~16 bytes)
  TIME_WAIT Timer                    timer structure (~16 bytes)
  Persist Timer                      timer structure (~16 bytes)
  Last ACK Sent Timestamp            8 bytes
  Last Data Received Timestamp       8 bytes

BUFFERS
  Send Buffer                        configurable (typically 16 KB-16 MB)
  Receive Buffer                     configurable (typically 16 KB-16 MB)
  Out-of-Order Queue                 variable (holds out-of-order segments)
  Retransmit Queue                   variable (holds unacknowledged data)

SACK INFORMATION
  SACK Blocks                        up to 4 blocks x 8 bytes = 32 bytes
  SACK Permitted Flag                1 bit
```

Estimated TCB size (without buffers): 200-500 bytes per connection. With typical buffers: 32 KB-32 MB per connection.

| Concurrent Connections | UDP Memory | TCP Memory (min) | TCP Memory (typical) |
|---|---|---|---|
| 100 | ~20 KB | ~50 KB | ~6.4 MB |
| 1,000 | ~20 KB | ~500 KB | ~64 MB |
| 10,000 | ~20 KB | ~5 MB | ~640 MB |
| 100,000 | ~20 KB | ~50 MB | ~6.4 GB |
| 1,000,000 | ~20 KB | ~500 MB | ~64 GB |
For servers handling millions of concurrent connections (the 'C10M' problem targets 10 million), TCP's per-connection state is a significant challenge. UDP's stateless nature handles this easily, but applications using UDP must implement their own application-layer state if reliability is needed—often ending up with similar memory requirements.
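A back-of-the-envelope estimator reproduces the 'typical' column above, using the assumed figures from this section (~500-byte TCB plus ~64 KB of combined buffer space per connection):

```python
def tcp_memory_bytes(connections: int, tcb: int = 500,
                     buffers: int = 64 * 1024) -> int:
    """Rough TCP memory footprint: per-connection TCB plus buffers."""
    return connections * (tcb + buffers)

for n in (100, 1_000, 10_000, 100_000, 1_000_000):
    print(f"{n:>9} connections: ~{tcp_memory_bytes(n) / 2**20:,.0f} MiB")
```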
Beyond memory, each packet requires CPU cycles for processing. The complexity difference between UDP and TCP processing is substantial.
UDP Processing: Minimal Path
```
UDP Receive Path (approximate CPU operations):
  1. Receive interrupt from NIC                 ~500 cycles
  2. DMA packet data to memory                  ~200 cycles
  3. IP header validation                       ~100 cycles
  4. UDP header extraction (8 bytes)            ~50 cycles
  5. Checksum verification (if enabled)         ~500 cycles
  6. Port lookup (hash table)                   ~100 cycles
  7. Queue to socket receive buffer             ~200 cycles
  8. Wake waiting application                   ~300 cycles
  ------------------------------------------------------------
  TOTAL: ~2,000 CPU cycles per packet

UDP Send Path (approximate):
  1. Application system call                    ~500 cycles
  2. Socket lookup                              ~100 cycles
  3. Buffer allocation                          ~200 cycles
  4. UDP header construction (8 bytes)          ~50 cycles
  5. Checksum calculation (optional)            ~500 cycles
  6. IP header construction                     ~100 cycles
  7. Queue to NIC                               ~200 cycles
  ------------------------------------------------------------
  TOTAL: ~1,650 CPU cycles per packet
```

TCP Processing: Extended Path
```
TCP Receive Path (approximate CPU operations):
   1. Receive interrupt from NIC                     ~500 cycles
   2. DMA packet data to memory                      ~200 cycles
   3. IP header validation                           ~100 cycles
   4. TCP header extraction (20-60 bytes)            ~150 cycles
   5. Checksum verification (mandatory)              ~500 cycles
   6. Connection lookup (4-tuple hash)               ~200 cycles
   7. State machine validation                       ~300 cycles
   8. Sequence number validation                     ~200 cycles
   9. ACK processing                                 ~500 cycles
      - update send window
      - release acknowledged data from retransmit queue
      - update RTT estimate if timestamp present
  10. Window update                                  ~150 cycles
  11. Out-of-order handling                          ~300 cycles
      - check if segment fits in sequence
      - buffer if out of order
      - coalesce if it fills a gap
  12. Congestion control update                      ~200 cycles
  13. Queue to socket receive buffer                 ~200 cycles
  14. SACK block management                          ~200 cycles
  15. Delayed ACK timer management                   ~150 cycles
  16. Generate ACK (if needed)                       ~500 cycles
  17. Wake waiting application                       ~300 cycles
  --------------------------------------------------------------
  TOTAL: ~4,650+ CPU cycles per packet

TCP Send Path (per segment):
   1. Application write system call                  ~500 cycles
   2. Socket/connection lookup                       ~200 cycles
   3. Available window check                         ~150 cycles
   4. Segment size determination (MSS, cwnd, rwnd)   ~300 cycles
   5. Sequence number assignment                     ~100 cycles
   6. TCP header construction (20-60 bytes)          ~200 cycles
   7. Timestamp option insertion                     ~100 cycles
   8. Checksum calculation                           ~500 cycles
   9. Copy to retransmit queue                       ~300 cycles
  10. Start/reset retransmission timer               ~200 cycles
  11. Nagle algorithm check                          ~100 cycles
  12. IP header construction                         ~100 cycles
  13. Queue to NIC                                   ~200 cycles
  --------------------------------------------------------------
  TOTAL: ~2,950+ CPU cycles per segment
```

| Operation | UDP Cycles | TCP Cycles | TCP Overhead |
|---|---|---|---|
| Receive processing | ~2,000 | ~4,650 | +132% |
| Send processing | ~1,650 | ~2,950 | +79% |
| Round-trip (send+receive) | ~3,650 | ~7,600 | +108% |
| At 1M packets/sec | 3.65 Gcycles/s | 7.6 Gcycles/s | +3.95 Gcycles/s |
| CPU cores needed @ 3GHz | ~1.2 cores | ~2.5 cores | +1.3 cores |
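The cycle counts are rough figures, but the arithmetic is easy to redo for your own packet rates and clock speeds. A minimal sketch using the estimates above:

```python
def cores_needed(cycles_per_packet: float, packets_per_sec: float,
                 clock_hz: float = 3e9) -> float:
    """CPU cores consumed by protocol processing alone."""
    return cycles_per_packet * packets_per_sec / clock_hz

rate = 1_000_000  # packets per second
print(f"UDP: {cores_needed(3_650, rate):.1f} cores")  # send + receive, ~1.2
print(f"TCP: {cores_needed(7_600, rate):.1f} cores")  # send + receive, ~2.5
```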
Modern NICs provide TCP Offload Engine (TOE), Large Receive Offload (LRO), and TCP Segmentation Offload (TSO), reducing CPU overhead significantly. However, UDP also benefits from equivalent offloads (UDP Fragmentation Offload). The relative difference remains, though absolute numbers decrease.
One of TCP's most significant hidden costs is acknowledgment traffic. Every data segment (with some exceptions) generates a corresponding acknowledgment, consuming bandwidth in the reverse direction.
The ACK Amplification Problem
Scenario: download a 100 MB file.

```
UDP (no application ACKs):
  Download direction:  100 MB of data
  Upload direction:    0 bytes (no protocol acknowledgments)
  Total traffic:       100 MB
  Overhead:            0%

TCP with typical delayed ACKs (1 ACK per 2 segments):
  Segments:            ~68,500 (1460-byte payload per segment)
  ACKs generated:      ~34,250
  ACK packet size:     ~54 bytes (Ethernet + IP + bare TCP header, no data)

  Download direction:  100 MB + (68,500 x 32 bytes TCP overhead) = ~102.2 MB
  Upload direction:    34,250 x 54 bytes = ~1.85 MB
  Total traffic:       ~104 MB
  ACK overhead:        ~4%

TCP with immediate 1:1 ACKs (worst case):
  ACKs generated:      ~68,500 (one per segment)
  Upload direction:    68,500 x 54 bytes = ~3.7 MB
  Total traffic:       ~106 MB
  ACK overhead:        ~6%
```

The Asymmetric Connection Problem
ACK traffic is particularly problematic on asymmetric connections where upload bandwidth is limited:
| Connection Type | Download | Upload | ACK Traffic at Full Download | Share of Upload |
|---|---|---|---|---|
| ADSL 24/1 Mbps | 24 Mbps | 1 Mbps | ~0.43 Mbps | ~43% |
| Cable 100/10 Mbps | 100 Mbps | 10 Mbps | ~1.8 Mbps | ~18% |
| Cable 1000/35 Mbps | 1000 Mbps | 35 Mbps | ~18 Mbps | ~51% |
| Satellite 25/3 Mbps | 25 Mbps | 3 Mbps | ~0.45 Mbps | ~15% |
Figures assume delayed ACKs—one 54-byte ACK frame per two 1514-byte data frames, or roughly 1.8% of the download rate; immediate 1:1 ACKs double them.
On highly asymmetric connections, ACK traffic claims a substantial share of the limited uplink: half of it on a gigabit/35 Mbps link, and effectively all of it with 1:1 ACKs. Worse, as soon as the uplink also carries user traffic (an upload, a video call), ACKs queue behind that traffic, arrive late, and stall the sender, limiting effective download speed to a fraction of the theoretical maximum. UDP avoids this entirely.
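A sketch of the delayed-ACK arithmetic behind the table, under the stated assumptions (54-byte ACK frames, 1514-byte data frames, one ACK per two segments):

```python
def ack_share_of_upload(download_mbps: float, upload_mbps: float,
                        ack_bytes: int = 54, data_frame: int = 1514,
                        segments_per_ack: int = 2) -> float:
    """Fraction of the uplink consumed by ACKs at full download rate.

    Assumes delayed ACKs; immediate 1:1 ACKs double the result.
    """
    ack_mbps = download_mbps * ack_bytes / (segments_per_ack * data_frame)
    return ack_mbps / upload_mbps

for down, up in ((24, 1), (100, 10), (1000, 35), (25, 3)):
    share = ack_share_of_upload(down, up)
    print(f"{down}/{up} Mbps: ACKs use {share:.0%} of upload")
```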
Bidirectional Traffic: Where ACKs Hide
When traffic flows in both directions, TCP's delayed ACK mechanism 'piggybacks' acknowledgments on data packets that would be sent anyway, so the ACK overhead largely disappears into the application's own traffic.
TCP's three-way handshake adds both latency overhead (covered in Page 1) and bandwidth/processing overhead worth examining.
TCP Handshake Packet Analysis
```
TCP Connection Establishment Packets:

Packet 1: SYN (Client -> Server)
  Ethernet:    14 bytes
  IP header:   20 bytes
  TCP header:  40 bytes (20 base + 20 options: MSS, Window Scale,
               SACK Permitted, Timestamps, padding)
  TCP data:    0 bytes
  Total:       74 bytes on wire

Packet 2: SYN-ACK (Server -> Client)
  Ethernet:    14 bytes
  IP header:   20 bytes
  TCP header:  40 bytes (same options echoed)
  TCP data:    0 bytes
  Total:       74 bytes on wire

Packet 3: ACK (Client -> Server)
  Ethernet:    14 bytes
  IP header:   20 bytes
  TCP header:  32 bytes (only timestamps remain after the handshake)
  TCP data:    0 bytes (the first data segment is often combined: "ACK+DATA")
  Total:       66 bytes on wire

TOTAL HANDSHAKE OVERHEAD: ~214 bytes / 3 packets / 1.5 RTT

TCP Connection Termination (four-way handshake):
  FIN (A -> B):  66 bytes
  ACK (B -> A):  66 bytes
  FIN (B -> A):  66 bytes (often combined with the preceding ACK)
  ACK (A -> B):  66 bytes
  Total:         ~198-264 bytes / 3-4 packets / ~2 RTT
```

Impact on Short-Lived Connections
For long-lived connections transferring megabytes, 200 bytes of handshake overhead is negligible. For short request-response patterns, it dominates:
| Data Transferred | UDP Total | TCP Total | Connection Overhead (% of TCP) |
|---|---|---|---|
| 64 bytes (DNS query) | 64 B | ~460 B | 86% overhead |
| 256 bytes (small API) | 256 B | ~656 B | 61% overhead |
| 1 KB | 1 KB | ~1.4 KB | 29% overhead |
| 10 KB | 10 KB | ~10.4 KB | 4% overhead |
| 100 KB | 100 KB | ~100.4 KB | 0.4% overhead |
| 1 MB | 1 MB | ~1.0004 MB | 0.04% overhead |
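The table's percentages follow from a single assumption: roughly 400 bytes of setup-plus-teardown traffic per connection (the handshake and termination totals above land in the 400-480 byte range). A minimal sketch:

```python
def connection_overhead(data_bytes: int, handshake_bytes: int = 400) -> float:
    """Setup/teardown bytes as a share of total bytes transferred.

    ~400 bytes approximates the three-way handshake plus FIN/ACK
    teardown; per-segment header overhead is excluded here.
    """
    return handshake_bytes / (handshake_bytes + data_bytes)

for size in (64, 256, 1024, 10_240, 102_400, 1_048_576):
    print(f"{size:>8} B: {connection_overhead(size):.1%} connection overhead")
```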
HTTP/1.1 keep-alive, HTTP/2 multiplexing, and connection pooling exist specifically to amortize TCP handshake overhead across many requests. A single TCP connection can serve thousands of HTTP requests, making the initial handshake negligible. Without connection reuse, TCP overhead is devastating for small requests.
When packets are lost, TCP must retransmit them. This creates additional overhead that doesn't exist in UDP (where lost data simply stays lost).
Quantifying Retransmission Overhead
| Packet Loss Rate | Effective Retransmit Ratio | Bandwidth Overhead | Notes |
|---|---|---|---|
| 0% | 1.000× | 0% | Ideal network |
| 0.1% | 1.001× | 0.1% | Typical good connection |
| 1% | 1.010× | 1% | Acceptable quality |
| 2% | 1.020× | 2% | Noticeable quality issues |
| 5% | 1.053× | 5.3% | Poor connection |
| 10% | 1.111× | 11.1% | Very poor connection |
| 20% | 1.250× | 25% | Barely usable |
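The ratios in this table follow from a geometric distribution: with loss rate p, each packet needs on average 1/(1-p) transmissions to get through. A one-function sketch:

```python
def retransmit_ratio(loss: float) -> float:
    """Expected transmissions per delivered packet at loss rate p.

    Each attempt succeeds with probability 1 - p, so the expected
    number of attempts (geometric distribution) is 1 / (1 - p).
    """
    return 1 / (1 - loss)

for p in (0.001, 0.01, 0.02, 0.05, 0.10, 0.20):
    print(f"{p:.1%} loss -> {retransmit_ratio(p):.3f}x transmissions")
```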
Retransmission Interactions with Congestion Control
The true cost of retransmission isn't just the extra bytes sent—it's the cascade effect on TCP's congestion control:
Scenario: 100 Mbps link, 50 ms RTT, 2% random packet loss. Theoretical maximum: 100 Mbps.

```
TCP throughput estimate (simplified Mathis formula):

  Throughput ≈ (MSS / RTT) x (1 / sqrt(loss_rate))
             ≈ (1460 bytes / 0.050 s) x (1 / sqrt(0.02))
             ≈ 29,200 bytes/s x 7.07
             ≈ 206,500 bytes/s
             ≈ 1.65 Mbps

Achieved:    1.65 Mbps out of 100 Mbps available
Efficiency:  ~1.65%

This is NOT the retransmission bytes (2% extra). It is the
CONGESTION CONTROL RESPONSE to those losses. TCP's politeness
in the face of loss destroys throughput.

UDP comparison:
  Sent:        100 Mbps
  Received:    ~98 Mbps (2% lost)
  Efficiency:  98%

On this lossy link, TCP achieves roughly 1/60th of UDP's throughput.
```

Retransmission overhead is often described as 'sending the same data twice.' The reality is far worse: TCP's congestion control response to loss reduces throughput by 10-100× more than the actual bytes retransmitted. This is why lossy links (wireless, satellite) perform so much worse with TCP than the raw packet loss rate would suggest.
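For reference, here is the simplified Mathis estimate from the worked example as a function. The full formula includes a constant factor of roughly 1.22, which is omitted here, as in the calculation above:

```python
from math import sqrt

def mathis_throughput_mbps(mss_bytes: float, rtt_s: float,
                           loss: float) -> float:
    """Simplified Mathis estimate: throughput ~ MSS / (RTT * sqrt(p))."""
    return (mss_bytes / rtt_s) * (1 / sqrt(loss)) * 8 / 1e6

# Scenario from above: 1460-byte MSS, 50 ms RTT, 2% loss.
print(f"{mathis_throughput_mbps(1460, 0.050, 0.02):.2f} Mbps")  # ~1.65
```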
We've dissected the overhead of UDP and TCP across multiple dimensions. The key insights: TCP's typical header is 4× larger than UDP's (32 vs 8 bytes), which matters most for small payloads; TCP keeps hundreds of bytes of state plus buffers per connection while UDP keeps essentially none; TCP packet processing costs roughly twice the CPU cycles; ACK traffic consumes reverse-path bandwidth that UDP never generates; handshake overhead dominates short-lived connections; and TCP's congestion response to loss costs far more throughput than the retransmitted bytes themselves.
What's Next:
Now that we understand the overhead differences, we'll examine connection handling—how UDP's connectionless nature and TCP's connection-oriented service fundamentally shape application architecture and behavior.
You can now quantify the overhead costs of choosing UDP vs TCP for any workload. You understand where overhead comes from, how it scales, and which scenarios magnify or diminish the differences. This knowledge enables informed protocol selection and performance optimization.