ESP can protect network traffic in two fundamentally different ways, each suited to different network architectures and security requirements. Transport mode protects only the payload of IP packets, preserving the original IP addresses for routing. Tunnel mode encapsulates entire IP packets within new IP packets, hiding the original endpoints and enabling gateway-to-gateway VPNs.
Choosing the correct mode is not merely a configuration detail—it determines what gets protected, who can see endpoint addresses, how packets traverse NAT devices, and ultimately whether ESP provides the security properties your deployment requires.
This page explores both modes in depth, explaining their packet structures, typical deployment scenarios, security implications, and the factors that should guide your mode selection.
By the end of this page, you will understand the structural differences between transport and tunnel mode, how each mode affects packet flow through networks, the security properties and limitations of each mode, NAT traversal challenges and solutions for ESP, and decision criteria for selecting the appropriate mode for different deployment scenarios.
Transport mode is the simpler of the two ESP modes, designed for end-to-end protection between two hosts. In transport mode, ESP protects only the upper-layer protocol (TCP, UDP, ICMP, etc.) while leaving the original IP header intact and visible.
Transport Mode Packet Structure:
Original Packet: [IP Header][TCP/UDP Header][Payload Data]
Protected by ESP: [TCP/UDP Header][Payload Data]
ESP Transport: [IP Header][ESP Header][IV][TCP/UDP Header][Payload][ESP Trailer][ICV]
Encrypted: [TCP/UDP Header][Payload][ESP Trailer]
Authenticated by the ICV: [ESP Header][IV][TCP/UDP Header][Payload][ESP Trailer]
Key Characteristics:
| Field | Original Value | After ESP Transport Mode |
|---|---|---|
| IP Source Address | Original source | Unchanged (visible) |
| IP Destination Address | Original destination | Unchanged (visible) |
| IP Protocol | 6 (TCP), 17 (UDP), etc. | 50 (ESP) |
| IP Total Length | Original length | Increased (ESP overhead) |
| Upper-layer header | Visible | Encrypted |
| Payload data | Visible | Encrypted |
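The header changes in the table can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it handles a 20-byte IPv4 header without options, the helper names `ipv4_checksum` and `to_transport_mode_header` are hypothetical, and the 36-byte overhead is an example value rather than a fixed property of ESP.

```python
import struct

ESP_PROTOCOL = 50  # IP protocol number assigned to ESP
IPV4_FORMAT = "!BBHHHBBH4s4s"  # 20-byte IPv4 header without options


def ipv4_checksum(header: bytes) -> int:
    """Internet checksum over an IPv4 header whose checksum field is zero."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def to_transport_mode_header(header: bytes, esp_overhead: int) -> bytes:
    """Return the IPv4 header as it would look after ESP transport mode."""
    fields = list(struct.unpack(IPV4_FORMAT, header))
    fields[2] += esp_overhead   # Total Length grows by the ESP overhead
    fields[6] = ESP_PROTOCOL    # Protocol changes from TCP/UDP to 50
    fields[7] = 0               # zero the checksum before recomputing it
    fields[7] = ipv4_checksum(struct.pack(IPV4_FORMAT, *fields))
    return struct.pack(IPV4_FORMAT, *fields)


# Example header: 10.0.0.1 -> 10.0.0.2, TCP (6), total length 60 bytes.
# Its checksum is left at 0 because only the rewritten header matters here.
original = struct.pack(IPV4_FORMAT, 0x45, 0, 60, 1, 0, 64, 6, 0,
                       bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
protected = to_transport_mode_header(original, esp_overhead=36)
```

Bytes 12 through 19 of `protected` (the source and destination addresses) are identical to those of `original`, which is exactly why the endpoints stay visible in transport mode.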
Transport Mode Use Cases:
Host-to-Host Protection: Direct security between two endpoints (e.g., server to server within a data center)
L2TP/IPSec: Layer 2 Tunneling Protocol uses transport mode ESP to encrypt L2TP packets
Application-Level VPNs: Protecting specific application traffic between known hosts
Same-Subnet Encryption: Encrypting traffic on shared network segments
Limitations:
When transport mode is used with Authentication Header (AH) instead of or alongside ESP, AH authenticates the immutable fields of the IP header, that is, everything except fields that change in transit such as DSCP/ECN, flags, fragment offset, TTL, and the header checksum. This protects the source and destination addresses, but at the cost of NAT incompatibility. ESP does not authenticate the outer IP header at all, which is one reason ESP, unlike AH, can be made to work through NAT.
Tunnel mode is the more common ESP mode, designed for network-to-network or host-to-network VPNs. In tunnel mode, ESP encapsulates the entire original IP packet—including its IP header—within a new IP packet. The original addresses become hidden (encrypted) inside the ESP payload.
Tunnel Mode Packet Structure:
Original Packet: [Original IP Header][TCP/UDP Header][Payload Data]
The entire original packet becomes the ESP payload.
ESP Tunnel: [New IP Header][ESP Header][IV][Original IP][TCP/UDP][Payload][Trailer][ICV]
Encrypted: [Original IP][TCP/UDP][Payload][Trailer]
The "Tunneling" Concept:
Imagine the original IP packet being placed inside an envelope (ESP). That envelope is then given a new address label (outer IP header) for delivery through the postal system (untrusted network). Upon arrival, the recipient opens the envelope to retrieve the original letter (decrypting ESP to recover the inner packet).
| Header Type | IP Addresses | Visibility | Purpose |
|---|---|---|---|
| Outer IP Header | VPN Gateway A ↔ VPN Gateway B | Visible to all | Routing through untrusted network |
| Inner IP Header | Actual Source ↔ Actual Destination | Encrypted (hidden) | Original packet delivery after decryption |
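To make the outer/inner distinction concrete, here is a minimal Python sketch of what an on-path observer can read from a tunnel-mode packet. Everything in it is illustrative: the function names are hypothetical, the gateway addresses come from documentation ranges, the outer checksum is left at zero for brevity, and the ESP body is an opaque placeholder rather than real ciphertext.

```python
import ipaddress
import struct

ESP_PROTOCOL = 50


def build_outer_header(gw_src: str, gw_dst: str, esp_len: int) -> bytes:
    """Build a minimal outer IPv4 header (checksum left at 0 for brevity)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + esp_len, 0, 0, 64, ESP_PROTOCOL, 0,
        ipaddress.IPv4Address(gw_src).packed,
        ipaddress.IPv4Address(gw_dst).packed,
    )


def visible_addresses(packet: bytes) -> tuple:
    """Return the (source, destination) pair readable without any keys."""
    src, dst = struct.unpack("!4s4s", packet[12:20])
    return str(ipaddress.IPv4Address(src)), str(ipaddress.IPv4Address(dst))


# Placeholder for ESP header + IV + encrypted inner packet + ICV.
# The inner header (for example 10.1.0.5 -> 10.2.0.9) would sit inside it.
esp_body = bytes(64)
packet = build_outer_header("203.0.113.1", "198.51.100.1", len(esp_body)) + esp_body
observed = visible_addresses(packet)
```

Here `observed` contains only the two gateway addresses; the actual source and destination are recoverable only after decryption at the receiving gateway.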
Key Characteristics:
Tunnel Mode Use Cases:
When configuring IPSec VPNs, tunnel mode is typically the default and recommended choice. It provides more complete protection (including IP headers), enables gateway-based deployment, and hides the actual endpoints from network observers. Use transport mode only when you specifically need end-to-end host protection with visible addresses.
Understanding the differences between transport and tunnel mode requires examining multiple dimensions: packet structure, security properties, performance implications, and deployment flexibility.
| Aspect | Transport Mode | Tunnel Mode |
|---|---|---|
| Protection Scope | Upper-layer protocol only | Entire original IP packet |
| IP Header | Original header preserved | Original hidden; new outer header |
| Endpoint Visibility | Source/destination visible | Only tunnel endpoints visible |
| Overhead | Roughly 30-50 bytes, depending on algorithms (ESP only) | Roughly 50-70 bytes over IPv4 (ESP + outer IP) |
| MTU Impact | Moderate reduction | Greater reduction (extra IP header) |
| Typical Deployment | Host-to-host | Gateway-to-gateway, host-to-gateway |
| NAT Compatibility | Requires NAT-T | Works well with NAT-T |
| Implementation Location | Endpoints (hosts) | Gateways or endpoints |
| Address Translation | Cannot translate inner addresses | Inner addresses can differ from outer |
| Traffic Analysis Resistance | Low (endpoints visible) | Higher (only tunnel endpoints visible) |
Overhead Analysis:
Let's calculate the overhead for each mode with AES-256-GCM:
Transport Mode Overhead:
ESP Header: 8 bytes (SPI + Sequence)
IV: 8 bytes (explicit AES-GCM IV; the 4-byte salt is not transmitted)
ESP Trailer: 2-5 bytes (0-3 bytes padding for 4-byte alignment + pad_length + next_header)
ICV: 16 bytes (AES-GCM tag)
─────────────────────────────
Total: 34-37 bytes (depending on padding)
Tunnel Mode Overhead:
Outer IP Header: 20 bytes (IPv4) or 40 bytes (IPv6)
ESP Header: 8 bytes
IV: 8 bytes
ESP Trailer: 2-5 bytes
ICV: 16 bytes
─────────────────────────────
Total IPv4: 54-57 bytes
Total IPv6: 74-77 bytes
With a block cipher such as AES-CBC, padding aligns to the 16-byte cipher block instead, so the trailer grows to 2-17 bytes and the IV to 16 bytes.
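The arithmetic above can be reproduced for any payload size. The sketch below is a hypothetical helper, not part of any IPSec implementation; its defaults assume AES-GCM (8-byte IV, 16-byte ICV) and 4-byte trailer alignment, and passing `align=16` models block-cipher style padding, where the trailer can reach 17 bytes.

```python
def esp_overhead(payload_len: int, *, tunnel: bool = False,
                 outer_ip_len: int = 20, iv_len: int = 8,
                 icv_len: int = 16, align: int = 4) -> int:
    """Bytes ESP adds around a protected payload of payload_len bytes.

    In tunnel mode, payload_len is the size of the whole inner IP packet.
    """
    # Pad so that payload + padding + pad_length (1) + next_header (1)
    # ends on a multiple of `align`.
    padding = -(payload_len + 2) % align
    trailer = padding + 2
    overhead = 8 + iv_len + trailer + icv_len  # 8 = SPI + sequence number
    if tunnel:
        overhead += outer_ip_len               # new outer IP header
    return overhead


transport = esp_overhead(1000)             # 1000-byte TCP segment
tunnel_v4 = esp_overhead(1000, tunnel=True)
tunnel_v6 = esp_overhead(1000, tunnel=True, outer_ip_len=40)
```

Because the padding depends on the payload length, the overhead is a small range rather than a single number, which is why MTU planning usually uses the worst case.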
MTU Considerations:
With a standard 1500-byte MTU:
This reduction can cause fragmentation issues if not properly handled.
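As a rough guide to how much room is left, the following sketch computes the largest inner packet that still fits in one outer packet. It is an estimate under assumptions, not a statement about any particular product: IPv4 outer header, AES-GCM sizes, 4-byte alignment, and an optional 8-byte UDP header for NAT-T. The function name is hypothetical.

```python
def max_inner_packet(mtu: int = 1500, *, outer_ip_len: int = 20,
                     iv_len: int = 8, icv_len: int = 16,
                     align: int = 4, udp_encap: bool = False) -> int:
    """Largest inner IP packet that fits in `mtu` after tunnel-mode ESP."""
    fixed = outer_ip_len + 8 + iv_len + icv_len  # outer IP + SPI/seq + IV + ICV
    if udp_encap:
        fixed += 8                               # UDP header added by NAT-T
    # Space left for inner packet + padding + pad_length + next_header.
    room = mtu - fixed
    room -= room % align   # that region must end on an `align` boundary
    return room - 2        # subtract pad_length and next_header bytes


plain = max_inner_packet()                  # ESP directly over IPv4
with_nat_t = max_inner_packet(udp_encap=True)
```

Operators typically configure a value somewhat below this theoretical maximum to leave headroom for paths with smaller MTUs.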
Use Transport Mode when: Both endpoints are the actual communicating hosts, NAT is not present (or NAT-T is acceptable), and you want minimal overhead. Use Tunnel Mode when: Gateways protect traffic for entire subnets, you need to hide internal addresses, or you're implementing site-to-site or remote access VPNs.
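Network Address Translation (NAT) poses significant challenges for ESP. NAT devices rewrite IP addresses and usually rely on TCP or UDP port numbers to tell flows apart, but ESP (IP protocol 50) has no ports, and in transport mode the encrypted TCP/UDP checksum still covers the original addresses. NAT Traversal (NAT-T) solves this by encapsulating ESP packets within UDP, allowing them to pass through NAT devices.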
The NAT Problem:
NAT-T Solution (RFC 3948):
NAT-T wraps ESP packets in UDP datagrams:
Without NAT-T: [IP Header (Proto=50)][ESP Header][Encrypted Payload][ICV]
↑
NAT can't handle this
With NAT-T: [IP Header (Proto=17)][UDP Header :4500][ESP Header][Encrypted Payload][ICV]
↑
NAT-friendly
| Aspect | Without NAT-T | With NAT-T |
|---|---|---|
| IP Protocol | 50 (ESP) | 17 (UDP) |
| Port Numbers | None | Source: 4500 (may be rewritten by NAT), Dest: 4500 |
| NAT Compatibility | Fails (most NATs) | Works through NAT |
| Overhead | 0 bytes extra | 8 bytes (UDP header) |
| Detection | N/A | IKE NAT detection payloads |
| Keepalives | Not required | NAT keepalive packets every 20-30s |
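The encapsulation itself is small enough to sketch directly. The helper below is hypothetical and only prepends the UDP header; it assumes IPv4, where RFC 3948 says the UDP checksum should be transmitted as zero, and it treats the ESP packet as opaque bytes.

```python
import struct

NAT_T_PORT = 4500


def udp_encapsulate(esp_packet: bytes, src_port: int = NAT_T_PORT,
                    dst_port: int = NAT_T_PORT) -> bytes:
    """Prepend the 8-byte UDP header that NAT-T places in front of ESP."""
    length = 8 + len(esp_packet)  # UDP length covers header plus data
    checksum = 0                  # sent as zero over IPv4 (RFC 3948)
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + esp_packet


# Stand-in ESP packet: SPI, sequence number, then 32 opaque bytes.
esp = struct.pack("!II", 0x1234, 1) + bytes(32)
datagram = udp_encapsulate(esp)
```

The resulting datagram is 8 bytes longer than the ESP packet, matching the overhead row in the table, and a NAT device can now track the flow by its UDP ports.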
NAT-T Operation:
NAT-T and Modes:
Port 4500:
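Port 4500 is the IANA-assigned port for IPSec NAT-T. Both source and destination typically use 4500, though the source may be translated by NAT. The receiver distinguishes IKE from ESP by checking the first 4 bytes of the UDP payload: IKE messages on this port start with a 4-byte Non-ESP Marker of all zeros, whereas ESP packets start with the SPI, which is never zero.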
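A receiver-side check along these lines might look like the sketch below. It follows the RFC 3948 conventions (four zero bytes mark IKE, a single 0xFF byte is a NAT keepalive, anything else starts with an ESP SPI); the function name and the returned labels are illustrative only.

```python
def classify_port_4500_payload(data: bytes) -> str:
    """Classify a UDP payload received on port 4500."""
    if data == b"\xff":
        return "nat-keepalive"   # one-byte keepalive, silently dropped
    if len(data) < 4:
        return "invalid"         # too short to hold a marker or an SPI
    if data[:4] == b"\x00\x00\x00\x00":
        return "ike"             # Non-ESP Marker, IKE message follows
    return "esp"                 # first 4 bytes are a nonzero SPI


ike_msg = b"\x00\x00\x00\x00" + bytes(28)
esp_msg = (0x1234).to_bytes(4, "big") + bytes(28)
kinds = [classify_port_4500_payload(m) for m in (ike_msg, esp_msg, b"\xff")]
```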
When deploying IPSec through firewalls, ensure UDP port 4500 is permitted in both directions, and that UDP port 500 (IKE) is open as well. Without NAT-T, you would need to permit IP protocol 50 directly, which most NAT devices do not support. Modern deployments should assume NAT-T will be required and plan accordingly.
Selecting the appropriate ESP mode requires analyzing your deployment scenario across multiple dimensions. Use the following criteria to guide your decision.
Decision Factor 1: Endpoint Nature
Who are the IPSec peers?
Decision Factor 2: Address Visibility Requirements
Do you need to hide internal addresses?
Decision Factor 3: NAT Environment
Is NAT present on the path?
| Scenario | Recommended Mode | Reason |
|---|---|---|
| Site-to-site VPN | Tunnel | Gateways protect entire subnets; hides internal addresses |
| Remote access VPN | Tunnel | Client accesses entire remote network; gateway implementation |
| Data center east-west | Transport or Tunnel | Transport if direct host-to-host; tunnel if through a gateway |
| L2TP/IPSec | Transport | IPSec protects L2TP; L2TP provides tunneling |
| Database replication | Transport or Tunnel | Transport for known hosts; tunnel if through gateways |
| Cloud connectivity | Tunnel | Standard for AWS/Azure/GCP VPN connections |
| Mobile device VPN | Tunnel | Device accesses corporate resources via gateway |
Decision Factor 4: Performance Requirements
Decision Factor 5: QoS and Traffic Management
Decision Factor 6: Routing Complexity
If you're unsure which mode to select, tunnel mode is almost always the safer choice. It provides stronger security properties (hiding internal addresses), works better with NAT, and is the standard for most VPN products. Transport mode is primarily useful for specific host-to-host scenarios where minimal overhead is critical and address visibility is acceptable.
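The rule of thumb above can be written down as a tiny decision helper. This is a deliberately simplified sketch with hypothetical parameter names; it encodes only the criteria stated on this page (endpoint nature, address hiding, subnet protection) and ignores factors such as performance or QoS.

```python
def recommend_esp_mode(peers_are_end_hosts: bool,
                       hide_inner_addresses: bool,
                       protects_subnets: bool) -> str:
    """Return "tunnel" unless every condition for transport mode holds."""
    if protects_subnets or hide_inner_addresses or not peers_are_end_hosts:
        return "tunnel"
    return "transport"


site_to_site = recommend_esp_mode(False, True, True)
host_to_host = recommend_esp_mode(True, False, False)
```

Note that the function defaults to tunnel mode whenever any doubt exists, mirroring the guidance that tunnel mode is the safer choice.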
Successfully deploying ESP requires attention to several implementation details that affect both modes, though with different implications.
MTU and Fragmentation:
ESP overhead reduces the available payload size, potentially requiring fragmentation:
Pre-encryption Fragmentation (Preferred):
Post-encryption Fragmentation:
| Configuration | Inner MTU (IPv4 Tunnel) | Notes |
|---|---|---|
| Standard Internet path | 1400 bytes | Conservative; works through most paths |
| Enterprise network | 1420-1440 bytes | Controlled environment; may optimize |
| Jumbo frames enabled | 8900-8950 bytes | 9000 byte MTU path required end-to-end |
| NAT-T enabled | 1380-1400 bytes | Additional UDP overhead |
Path MTU Discovery (PMTUD):
With tunnel mode, PMTUD can be problematic:
Solutions:
DSCP/QoS Marking:
In tunnel mode, the inner DSCP is encrypted. Options for outer header:
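Two common choices for the outer header are copying the inner DSCP or setting a fixed value. The sketch below illustrates both under simplifying assumptions: the function name and policy labels are hypothetical, and the two ECN bits are left at zero, whereas real stacks apply separate ECN tunneling rules.

```python
def outer_tos_byte(inner_tos: int, policy: str = "copy",
                   fixed_dscp: int = 0) -> int:
    """Compute the outer TOS/Traffic Class byte for a tunnel-mode packet.

    DSCP occupies the upper 6 bits of the byte; the lower 2 bits are ECN.
    """
    if policy == "copy":
        dscp = inner_tos >> 2   # reuse the inner packet's DSCP
    elif policy == "set":
        dscp = fixed_dscp       # mark all tunnel traffic identically
    else:
        raise ValueError(f"unknown policy: {policy}")
    return dscp << 2            # ECN bits left at zero in this sketch


ef_inner = 46 << 2              # inner packet marked Expedited Forwarding
copied = outer_tos_byte(ef_inner, "copy")
fixed = outer_tos_byte(ef_inner, "set", fixed_dscp=0)
```

Copying preserves per-flow QoS treatment across the untrusted network, at the price of revealing the traffic class of otherwise encrypted packets.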
Routing and Selectors:
IPSec traffic selectors define what traffic enters the tunnel:
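Conceptually, a selector is a pair of address ranges that a packet must match. The following sketch models only that part; the `TrafficSelector` class is hypothetical, the subnets are example values, and real selectors can also match on protocol and port ranges.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class TrafficSelector:
    """Simplified selector: local and remote IPv4 subnets only."""
    local: ipaddress.IPv4Network
    remote: ipaddress.IPv4Network

    def matches(self, src: str, dst: str) -> bool:
        """True if a packet from src to dst should enter the tunnel."""
        return (ipaddress.ip_address(src) in self.local
                and ipaddress.ip_address(dst) in self.remote)


selector = TrafficSelector(
    local=ipaddress.ip_network("10.1.0.0/16"),
    remote=ipaddress.ip_network("10.2.0.0/16"),
)
to_remote_site = selector.matches("10.1.4.20", "10.2.8.1")
to_internet = selector.matches("10.1.4.20", "8.8.8.8")
```

Traffic that matches no selector, like the second example, is handled outside the tunnel according to local policy.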
For TCP traffic through IPSec tunnels, enable MSS clamping to avoid fragmentation. This adjusts the TCP Maximum Segment Size in SYN packets to account for tunnel overhead. Set the MSS to the tunnel MTU minus 40 bytes (a 20-byte IPv4 header plus a 20-byte TCP header), typically 1360-1380 bytes for standard Internet paths with ESP tunnel mode.
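The calculation is simple enough to check by hand, and the sketch below just makes the assumptions explicit. The function name is hypothetical, and the defaults assume IPv4 and a TCP header without options; for an IPv6 inner packet, the IP header is 40 bytes.

```python
def clamped_mss(tunnel_mtu: int, ip_header_len: int = 20,
                tcp_header_len: int = 20) -> int:
    """MSS that lets a full TCP segment fit inside the tunnel MTU."""
    return tunnel_mtu - ip_header_len - tcp_header_len


internet_path = clamped_mss(1400)
enterprise_path = clamped_mss(1420)
ipv6_inner = clamped_mss(1400, ip_header_len=40)
```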
Beyond basic mode selection, several advanced scenarios require specialized mode configurations.
Nested Tunnels:
Multiple layers of ESP protection for defense-in-depth:
[Outer IP][ESP₂ Tunnel][Inner IP][ESP₁ Tunnel][Original Packet]
Use cases:
Transport within Tunnel:
Host-to-host transport mode inside a gateway tunnel:
[Outer IP][ESP₂ Tunnel][Inner IP][ESP₁ Transport][TCP/Data]
Provides end-to-end encryption on top of site-to-site tunneling.
BEET Mode (Bound End-to-End Tunnel):
A hybrid mode (non-standard, Linux-specific) that:
| Configuration | Description | Use Case |
|---|---|---|
| Tunnel-over-Tunnel | ESP tunnel encapsulated in another ESP tunnel | Multi-hop security, transit provider isolation |
| Transport-over-Tunnel | ESP transport inside ESP tunnel | End-to-end inside site-to-site |
| GRE-over-IPSec | GRE tunnel encrypted by ESP | Multicast support, routing protocol transport |
| IPSec-over-GRE | ESP packets carried in GRE | Legacy scenarios, specific routing requirements |
| VTI (Virtual Tunnel Interface) | Route-based tunnel mode | Simplified routing, dynamic peer selection |
Virtual Tunnel Interfaces (VTI):
VTIs provide a routable interface for IPSec tunnels, simplifying configuration:
Traditional Policy-Based:
VTI Route-Based:
DMVPN (Dynamic Multipoint VPN):
Combines GRE with IPSec for spoke-to-spoke dynamic tunnels:
While these advanced configurations exist, simpler setups are generally more reliable and easier to troubleshoot. Before implementing nested tunnels or complex overlays, ensure the added complexity is justified by concrete security or functionality requirements. Most deployments work well with straightforward tunnel or transport mode configurations.
The choice between transport and tunnel mode fundamentally shapes how ESP protects your traffic. Transport mode offers simplicity and efficiency for host-to-host communication, while tunnel mode provides comprehensive protection and flexibility for network-wide VPN deployments.
Module Summary:
You have now completed a comprehensive exploration of ESP—the Encapsulating Security Payload protocol that secures network layer communications across the Internet. From ESP's purpose and design philosophy through packet formats, encryption mechanisms, authentication services, and operational modes, you've gained the deep understanding necessary to implement, configure, and troubleshoot IPSec deployments.
ESP is the foundation of VPN technologies worldwide, protecting everything from corporate site-to-site links to remote worker connections to cloud infrastructure connectivity. This knowledge prepares you to work with real-world IPSec implementations and understand the security guarantees they provide.