When Netflix streams video to 200 million subscribers simultaneously, when Google handles millions of search queries per second, or when AWS routes traffic across global regions, Layer 4 load balancing forms the invisible backbone that makes such scale possible. Operating at the transport layer of the networking stack, Layer 4 load balancers make routing decisions based on network information alone—IP addresses and port numbers—without any awareness of the payload they're carrying.
This seeming limitation is actually its greatest strength. By remaining agnostic to application content, Layer 4 load balancers achieve extraordinary throughput with minimal latency overhead. They're the high-performance workhorses of modern infrastructure, handling millions of connections per second on commodity hardware.
By the end of this page, you will understand how Layer 4 load balancing operates at the transport layer, the mechanics of TCP and UDP connection routing, the architectural patterns that enable massive scalability, and the precise scenarios where Layer 4 is the optimal choice over application-aware alternatives.
To fully understand Layer 4 load balancing, we must first establish where it sits in the networking stack and what information is available at this layer.
The OSI (Open Systems Interconnection) model defines seven layers of network abstraction, each building upon the previous: Physical, Data Link, Network, Transport, Session, Presentation, and Application.
Layer 4—the Transport Layer—is where Layer 4 load balancers operate. At this level, the load balancer sees source and destination IP addresses, source and destination ports, the transport protocol (TCP or UDP), and TCP control flags such as SYN and FIN.
Critically, the load balancer does not see the actual content of the request—no HTTP headers, no URLs, no cookies, no request bodies. It operates on metadata alone.
| Layer | Name | Information Available | Load Balancing Capability |
|---|---|---|---|
| Layer 3 | Network | IP addresses only | Basic routing, geographic distribution |
| Layer 4 | Transport | IP + Port + Protocol | Connection/session distribution |
| Layer 7 | Application | Full request content | Content-based routing, header inspection |
In practice, most engineers use the simplified TCP/IP model (4 layers) rather than OSI (7 layers). In TCP/IP terminology, Layer 4 corresponds to the Transport layer, and Layer 7 corresponds to the Application layer. The concepts remain identical regardless of which model you reference.
Layer 4 load balancing operates through one of two fundamental mechanisms: NAT-based routing or Direct Server Return (DSR). Each has distinct operational characteristics that determine when it's appropriate.
In NAT (Network Address Translation) mode, the load balancer acts as a full proxy at the network level: it accepts packets addressed to the virtual IP (VIP), rewrites the destination address and port to a chosen backend, and forwards them; when the backend replies, the load balancer rewrites the source back to the VIP before returning the response to the client.
The key characteristic: all traffic (both request and response) flows through the load balancer. This enables connection tracking, health checking, and consistent routing, but the load balancer becomes a potential bottleneck.
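To make the NAT-mode flow concrete, here is a minimal Python sketch of the bookkeeping it implies; the `Flow` type, field names, and hash-based backend choice are illustrative assumptions rather than any particular product's behavior.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Flow:
    """Addressing fields of a single packet (illustrative)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

# Client-side flow (destination = VIP) -> backend address it was NATed to.
nat_table = {}

def forward_inbound(flow: Flow, backends: list) -> Flow:
    """Pick (or recall) a backend and rewrite the packet's destination."""
    backend = nat_table.setdefault(flow, backends[hash(flow) % len(backends)])
    return Flow(flow.src_ip, flow.src_port, backend[0], backend[1])

def forward_outbound(reply: Flow, vip: tuple) -> Flow:
    """Rewrite a backend's reply so the client sees it coming from the VIP."""
    return Flow(vip[0], vip[1], reply.dst_ip, reply.dst_port)
```

Because both directions pass through the balancer, the reverse rewrite is always possible, which is exactly what makes NAT mode simple to deploy and also what makes it a potential bottleneck.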
DSR (also called Direct Routing or Triangulation) is a high-performance technique in which the load balancer forwards each inbound packet to a chosen backend without rewriting the destination IP (typically by changing only the destination MAC address); the backend, which is configured to accept traffic for the VIP, then replies directly to the client with the VIP as its source address.
The key characteristic: only inbound traffic flows through the load balancer; responses go directly from backend to client. This dramatically increases throughput, as response traffic (typically larger than requests) doesn't consume load balancer resources.
DSR requires all backend servers to be on the same Layer 2 network (same broadcast domain) as the load balancer. Additionally, each backend must accept packets destined for the VIP, typically by configuring the VIP on a loopback interface. These constraints make DSR more complex to deploy, but it delivers superior performance for asymmetric traffic patterns.
| Characteristic | NAT Mode | Direct Server Return (DSR) |
|---|---|---|
| Traffic flow | Symmetric (all through LB) | Asymmetric (responses bypass LB) |
| Load balancer as bottleneck | Yes (for response traffic) | No (handles only requests) |
| Backend network requirements | Any (can span networks) | Same Layer 2 segment |
| Backend configuration | Standard | VIP on loopback interface |
| Connection tracking | Full support | Limited (no response visibility) |
| Health checking | Straightforward | Requires additional mechanisms |
| Typical use case | General purpose | High-bandwidth services (video, downloads) |
TCP connection handling is central to Layer 4 load balancing performance and behavior. Understanding the nuances of TCP at this layer is essential for production deployments.
Every TCP connection begins with a three-way handshake: the client sends a SYN, the server responds with a SYN-ACK, and the client completes the handshake with an ACK.
The load balancer must make its routing decision at the SYN packet—before any data is exchanged. This is why Layer 4 balancers cannot route based on content; routing happens before content exists.
Layer 4 load balancers identify connections using the 5-tuple:
(Source IP, Source Port, Destination IP, Destination Port, Protocol)
Once a routing decision is made for a connection, subsequent packets with the same 5-tuple must go to the same backend server. This is called connection affinity or connection persistence.
The load balancer maintains a connection table mapping 5-tuples to backend servers. For high-traffic systems, this table can contain millions of entries, requiring careful memory management.
```
# Conceptual Layer 4 Connection Table

+----------------------+----------------------+-------------------+
| Client Connection    | Backend Server       | State             |
+----------------------+----------------------+-------------------+
| 10.0.1.5:45123       | 192.168.1.10:8080    | ESTABLISHED       |
| 10.0.1.7:52891       | 192.168.1.11:8080    | ESTABLISHED       |
| 10.0.1.5:45124       | 192.168.1.12:8080    | TIME_WAIT         |
| 10.0.1.9:38442       | 192.168.1.10:8080    | SYN_RECEIVED      |
+----------------------+----------------------+-------------------+

# Note: Same client (10.0.1.5) can have multiple connections
# routed to different backends based on source port
```

Layer 4 load balancers must track connection state to ensure packets are routed correctly. This creates several operational considerations:
TCP Connection States: a connection-table entry typically moves through SYN_RECEIVED while the handshake completes, ESTABLISHED while data flows, and FIN_WAIT or TIME_WAIT during teardown; each state determines how long the entry must be retained.
The TIME_WAIT problem: When a connection closes, TCP requires maintaining state for a period (TIME_WAIT) to handle delayed packets. With millions of short-lived connections, TIME_WAIT entries can exhaust connection table memory. Production load balancers implement aggressive connection reaping, reduced TIME_WAIT durations, or connection table compression.
State synchronization in HA pairs: When running load balancers in high-availability pairs, connection state must be synchronized between the active and standby units. This is complex at high throughput and represents a key engineering challenge.
Layer 4 load balancers are exposed to SYN flood attacks—malicious actors send millions of SYN packets to exhaust connection table memory without completing handshakes. Production systems implement SYN cookies, connection rate limiting, and stateless packet filtering to mitigate this attack vector.
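The idea behind SYN cookies can be sketched briefly. This is a simplified illustration, not the kernel's actual encoding (real implementations also pack an MSS index into the value and tolerate the previous time window): the flow's identity is folded into the initial sequence number, so no table entry exists until the client's ACK proves the handshake completed.

```python
import hashlib
import time

SECRET = b"per-load-balancer secret, rotated periodically"  # hypothetical value

def syn_cookie(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> int:
    """Derive a 32-bit initial sequence number from the flow and a timestamp."""
    window = int(time.time()) // 60  # coarse time window to limit replay
    material = f"{src_ip}:{src_port}:{dst_ip}:{dst_port}:{window}".encode()
    digest = hashlib.sha256(SECRET + material).digest()
    return int.from_bytes(digest[:4], "big")

def ack_is_valid(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
                 ack_number: int) -> bool:
    """The final ACK acknowledges cookie + 1; recompute and compare."""
    expected = (syn_cookie(src_ip, src_port, dst_ip, dst_port) + 1) & 0xFFFFFFFF
    return ack_number == expected
```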
While TCP dominates web traffic, UDP is critical for real-time applications: video streaming, voice over IP, gaming, and DNS. Layer 4 load balancing for UDP presents unique challenges due to the protocol's connectionless nature.
Unlike TCP, UDP has no connection establishment or teardown—packets are fired independently with no guaranteed delivery or ordering. This creates fundamental difficulties: there is no handshake on which to anchor a routing decision, no packet that marks the end of a flow, and therefore no natural point at which the load balancer can create or release per-flow state.
Despite being connectionless, many UDP-based protocols require session affinity—all packets from a given source must reach the same backend. This is critical for protocols such as SIP and RTP (voice and video), online gaming, and QUIC, where mid-session rebalancing would break the application.
Layer 4 load balancers implement pseudo-sessions for UDP: the first packet from a new source creates a table entry mapping that flow to a backend, subsequent packets matching the entry follow the same path, and the entry is discarded after a configurable idle timeout.
The timeout value is critical: too short causes session breakage, too long exhausts memory.
| Application | Protocol | Recommended Timeout | Rationale |
|---|---|---|---|
| DNS | UDP/53 | 30 seconds | Short queries, stateless |
| VoIP/SIP | UDP/5060 | 180 seconds | Call setup, registration |
| RTP (Voice/Video) | UDP dynamic | 60 seconds | Active streams, packet loss acceptable |
| Gaming | UDP custom | 120 seconds | Session persistence, reconnection |
| QUIC/HTTP3 | UDP/443 | 300 seconds | Long-lived connections |
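A minimal sketch of the pseudo-session mechanism, assuming a 30-second idle timeout (names and structure are illustrative): the first packet of a flow claims a backend, later packets refresh the entry, and a periodic sweep reclaims idle entries so the table cannot grow without bound.

```python
import time

IDLE_TIMEOUT_SECONDS = 30.0  # illustrative; see the table above for typical values

# (src_ip, src_port, dst_ip, dst_port) -> (backend, last_seen)
sessions = {}

def route_udp_packet(src_ip, src_port, dst_ip, dst_port, backends):
    """Route a UDP packet, creating or refreshing its pseudo-session."""
    now = time.monotonic()
    key = (src_ip, src_port, dst_ip, dst_port)
    entry = sessions.get(key)
    if entry is not None and now - entry[1] < IDLE_TIMEOUT_SECONDS:
        backend = entry[0]                              # keep affinity
    else:
        backend = backends[hash(key) % len(backends)]   # new pseudo-session
    sessions[key] = (backend, now)                      # refresh the idle timer
    return backend

def reap_expired_sessions():
    """Drop entries idle past the timeout so the table cannot grow unbounded."""
    now = time.monotonic()
    for key, (_, last_seen) in list(sessions.items()):
        if now - last_seen >= IDLE_TIMEOUT_SECONDS:
            del sessions[key]

# Example usage
backends = ["backend-1", "backend-2", "backend-3"]
print(route_udp_packet("10.0.1.5", 53124, "203.0.113.9", 53, backends))
```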
The emergence of QUIC (Quick UDP Internet Connections)—the protocol underlying HTTP/3—has transformed UDP load balancing requirements. QUIC implements connection semantics over UDP: each connection is identified by connection IDs carried in the packet header rather than by the IP/port tuple, which allows a connection to survive changes in the client's address.
Traditional Layer 4 load balancing based on the 5-tuple breaks when clients migrate networks (e.g., a phone moving from WiFi to cellular). Advanced Layer 4 load balancers now support QUIC-aware routing: they read the destination connection ID from the UDP payload and route on it, keeping a session pinned to its backend even when the client's IP address and port change.
This represents an evolution of Layer 4 balancing—extracting just enough information from the packet to maintain sessions without full Layer 7 inspection.
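As an illustration of "just enough information," the sketch below routes on the QUIC Destination Connection ID. It assumes the deployment uses fixed 8-byte connection IDs for short-header packets; production QUIC-aware balancers often encode a server identifier directly in the connection ID rather than hashing it.

```python
def extract_quic_dcid(payload: bytes, short_header_cid_len: int = 8) -> bytes:
    """Extract the Destination Connection ID from a QUIC packet.

    Per the QUIC invariants: long-header packets (first bit set) carry an
    explicit DCID length byte after the 4-byte version field; short-header
    packets carry a DCID whose length the deployment fixes in advance
    (assumed to be 8 bytes here).
    """
    if payload[0] & 0x80:                        # long header
        dcid_len = payload[5]
        return payload[6:6 + dcid_len]
    return payload[1:1 + short_header_cid_len]   # short header

def route_quic(payload: bytes, backends: list) -> str:
    """Route on the connection ID so the session survives address changes."""
    dcid = extract_quic_dcid(payload)
    return backends[int.from_bytes(dcid, "big") % len(backends)]
```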
Unlike TCP, UDP provides no built-in acknowledgment mechanism. Health checking UDP services requires application-specific probes—sending a valid request and expecting a valid response. For DNS, this might mean sending a query and expecting a response. Many environments fall back to TCP health checks even for UDP services when the service supports both protocols.
The primary advantage of Layer 4 load balancing is performance. By avoiding application-layer inspection, Layer 4 balancers achieve throughput and latency figures that Layer 7 balancers cannot match.
Modern Layer 4 load balancers handle extraordinary traffic volumes; as the technology comparison below shows, single-node throughput ranges from a few gigabits per second for general-purpose software to well over 100 Gbps with kernel bypass, hardware offload, or managed cloud load balancers.
The key enabler is minimal per-packet processing: hash or look up the 5-tuple, choose a backend, rewrite addresses if required, and forward the packet.
No protocol parsing, no content inspection, no connection termination/re-establishment.
Latency at Layer 4 is dominated by network propagation time, not processing overhead: the balancer itself typically adds only tens of microseconds or less (see the technology comparison below).
For comparison, Layer 7 processing typically adds 0.5-5 milliseconds—orders of magnitude more.
Layer 4 load balancers scale through several patterns:
Horizontal scaling with ECMP: Deploy multiple independent Layer 4 balancers behind a router using Equal-Cost Multi-Path (ECMP) routing. The router distributes traffic across balancers based on packet hashes.
Kernel bypass: Technologies like DPDK (Data Plane Development Kit) and XDP (eXpress Data Path) allow packet processing in user space or early kernel stages, avoiding kernel network stack overhead entirely.
Hardware offload: Specialized NICs and ASICs implement load balancing in hardware, achieving throughput impossible in software.
| Technology | Throughput | Latency | Complexity | Use Case |
|---|---|---|---|---|
| Linux IPVS | 1-10 Gbps | 50-100 µs | Low | General purpose, small-medium scale |
| HAProxy (L4 mode) | 1-5 Gbps | 100-200 µs | Low | Flexible configuration, visibility |
| DPDK-based | 10-40 Gbps | <10 µs | High | High-performance, telecom grade |
| XDP/eBPF | 10-40 Gbps | <10 µs | Medium | Programmable, cloud native |
| Hardware ASIC | 100+ Gbps | <5 µs | Medium | Enterprise, carrier grade |
| Cloud LB (AWS NLB) | 100+ Gbps | ~50 µs | Low | Cloud-native, managed |
Layer 4 load balancers use various algorithms to distribute connections across backend servers. The choice of algorithm affects load distribution, cache efficiency, and connection persistence.
The simplest algorithm: connections are distributed to backends in circular order.
Advantages: Perfect distribution if all connections are equal.
Disadvantages: Ignores connection duration, backend capacity, or current load.
Servers receive connections in proportion to assigned weights.
Example: Server A (weight 3), Server B (weight 1) → A receives 75% of connections.
Use case: Heterogeneous server capacities, gradual rollouts.
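A minimal sketch of weighted round robin (with every weight set to 1 it reduces to plain round robin); production schedulers usually interleave picks smoothly rather than emitting each backend's share in a burst.

```python
import itertools

def weighted_round_robin(weights: dict):
    """Cycle through backends in proportion to their weights."""
    expanded = [backend for backend, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)

# Server A (weight 3) receives 3 of every 4 new connections.
scheduler = weighted_round_robin({"server-a": 3, "server-b": 1})
print([next(scheduler) for _ in range(8)])
# ['server-a', 'server-a', 'server-a', 'server-b', 'server-a', ...]
```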
New connections go to the backend with the fewest active connections.
Advantages: Automatically adapts to varying connection durations.
Disadvantages: Requires real-time connection counting, which adds overhead in high-volume systems.
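A sketch of the selection step, assuming the balancer maintains a per-backend count of active connections that it increments on setup and decrements on teardown.

```python
def least_connections(active: dict) -> str:
    """Return the backend currently holding the fewest active connections."""
    return min(active, key=active.get)

active = {"backend-1": 120, "backend-2": 87, "backend-3": 203}
chosen = least_connections(active)
active[chosen] += 1  # the new connection is now counted against it
print(chosen)        # backend-2
```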
The source IP address is hashed to determine the backend. Same source IP always routes to the same backend.
Advantages: Natural session affinity without connection tracking.
Disadvantages: Uneven distribution if source IPs are clustered (e.g., behind NAT).
```python
def source_ip_hash(source_ip: str, backends: list[str]) -> str:
    """
    Route based on hash of source IP address.
    Same source IP always maps to same backend
    (assuming stable backend list).
    """
    # Simple hash-based routing
    hash_value = hash(source_ip)
    backend_index = hash_value % len(backends)
    return backends[backend_index]

# Example usage
backends = ["backend-1", "backend-2", "backend-3"]
print(source_ip_hash("10.0.1.5", backends))  # Consistent result
print(source_ip_hash("10.0.1.5", backends))  # Same result
print(source_ip_hash("10.0.1.6", backends))  # May differ
```

Consistent hashing minimizes redistribution when backends are added or removed. Instead of modular hashing, backends are placed on a hash ring, and connections map to the nearest backend clockwise.
Advantages: Adding/removing a backend affects only 1/N of connections (where N = number of backends).
Use case: Cache servers, stateful backends where connection redistribution is costly.
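A compact sketch of a hash ring with virtual nodes; the replica count and SHA-256-based placement are illustrative choices.

```python
import bisect
import hashlib

def _point(key: str) -> int:
    """Stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

class ConsistentHashRing:
    """Minimal hash ring; virtual nodes (replicas) smooth the distribution."""

    def __init__(self, backends: list, replicas: int = 100):
        points = []
        for backend in backends:
            for r in range(replicas):
                points.append((_point(f"{backend}#{r}"), backend))
        points.sort()
        self._points = points
        self._keys = [p for p, _ in points]

    def route(self, connection_key: str) -> str:
        """Return the first backend clockwise from the key's ring position."""
        index = bisect.bisect(self._keys, _point(connection_key)) % len(self._points)
        return self._points[index][1]

ring = ConsistentHashRing(["backend-1", "backend-2", "backend-3"])
print(ring.route("10.0.1.5:45123->203.0.113.9:443/TCP"))
```

Removing a backend reassigns only the keys that landed on its ring segments; every other mapping is untouched.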
Google's Maglev paper introduced a consistent hashing algorithm specifically designed for load balancers. It provides near-uniform distribution of connections across backends, minimal disruption when the backend set changes, and constant-time lookups via a precomputed table.
Maglev hashing is used in Google's production load balancing and several open-source implementations.
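The heart of the algorithm is how the lookup table is populated. The sketch below captures that idea in simplified form; the hash functions and table size are illustrative, and the production implementation described in the paper differs in detail.

```python
import hashlib

def _h(value: str, seed: int) -> int:
    """Stable 64-bit hash with a seed (illustrative choice of SHA-256)."""
    digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def build_maglev_table(backends: list, table_size: int = 65537) -> list:
    """Populate a Maglev-style lookup table.

    table_size should be prime and much larger than the backend count.
    Each backend walks its own permutation of table slots and claims free
    slots in round-robin turns, so shares come out nearly equal.
    """
    n = len(backends)
    offsets = [_h(b, 1) % table_size for b in backends]
    skips = [_h(b, 2) % (table_size - 1) + 1 for b in backends]
    next_idx = [0] * n
    table = [None] * table_size
    filled = 0
    while filled < table_size:
        for i, backend in enumerate(backends):
            # Advance along backend i's permutation to its next free slot.
            while True:
                slot = (offsets[i] + next_idx[i] * skips[i]) % table_size
                next_idx[i] += 1
                if table[slot] is None:
                    table[slot] = backend
                    filled += 1
                    break
            if filled == table_size:
                break
    return table

def lookup(table: list, connection_key: str) -> str:
    """Map a serialized 5-tuple to a backend via the lookup table."""
    return table[_h(connection_key, 3) % len(table)]

# Example with a small prime table; real tables are much larger.
table = build_maglev_table(["backend-1", "backend-2", "backend-3"], table_size=251)
print(lookup(table, "10.0.1.5:45123->203.0.113.9:443/TCP"))
```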
At large scale, algorithm choice matters less than expected—with millions of connections per second, even imperfect algorithms converge toward even distribution. The critical factors become: connection tracking overhead, memory usage for hash tables, and behavior during backend changes. Maglev hashing excels at all three.
Understanding real-world implementations helps contextualize Layer 4 concepts. Here are the most significant production systems:
IPVS is the Linux kernel's native Layer 4 load balancing implementation, part of the LVS (Linux Virtual Server) project. It operates within the kernel's netfilter framework.
Capabilities: TCP, UDP, and SCTP balancing; NAT, direct routing (DSR), and IP-in-IP tunneling forwarding modes; and a range of scheduling algorithms including round robin, weighted round robin, and least connections.
Use case: Kubernetes kube-proxy (in IPVS mode), traditional datacenter load balancing
AWS NLB is a managed Layer 4 service designed for extreme scale:
Capabilities: millions of requests per second per load balancer, static IP addresses (one per Availability Zone), preservation of the client source IP, and TCP, UDP, and TLS listeners.
Use case: High-performance TCP/UDP services, gaming, IoT, non-HTTP protocols
Modern cloud-native load balancing increasingly uses eBPF (extended Berkeley Packet Filter) technology:
How it works: small eBPF programs are attached to kernel hooks (such as XDP at the driver level or the socket layer), where they select a backend and redirect packets without traversing the full kernel network stack or iptables rule chains.
Advantages: routing decisions happen in the kernel with constant-time hash-map lookups, performance stays flat as the number of services grows (unlike long iptables chains), and the logic is programmable and can be updated on the fly.
Implementation: Cilium uses eBPF to implement kube-proxy replacement, achieving 10x better performance than iptables-based implementations.
Historically, high-performance Layer 4 load balancing required expensive hardware (F5, Citrix). Modern software implementations—particularly those using DPDK or eBPF—can match or exceed hardware performance on commodity servers. This has democratized Layer 4 load balancing, making it accessible to any organization.
Layer 4 load balancing represents the foundational layer of network traffic distribution. By operating at the transport layer, it achieves performance characteristics impossible at higher layers.
What's next:
Layer 4 load balancing excels at raw performance but lacks application awareness. In the next page, we'll explore Layer 7 load balancing, which trades some performance for the ability to make routing decisions based on HTTP headers, URLs, cookies, and request content—enabling sophisticated traffic management that Layer 4 cannot achieve.
You now understand how Layer 4 load balancing operates at the transport layer, handling TCP and UDP connections with minimal overhead. This foundational knowledge prepares you to understand Layer 7 load balancing and, critically, when to choose each approach.