Every second, a core Internet router may process hundreds of millions of packets. For each one, the router must examine the destination address, consult its tables, determine the correct output port, and transmit the packet—all in mere nanoseconds. This lightning-fast, per-packet operation is called forwarding.
If routing is the strategic planning that determines which paths exist, forwarding is the tactical execution that moves packets along those paths. It's the data plane workhorse of the network—the function that routers perform more than any other, and the one that most directly impacts network throughput and latency.
By the end of this page, you will understand the precise definition of forwarding, how it differs from routing, the mechanics of forwarding table lookups, and the hardware optimizations that enable routers to forward packets at line rate.
Forwarding is the router-local process of transferring a packet from an input interface to the appropriate output interface. It's the physical movement of data within a single network device.
More formally:
Forwarding is the data plane action of moving a packet from input port to output port based on the destination address and the forwarding table.
Unlike routing, which is concerned with network-wide path computation, forwarding is entirely local to a single router. The router receives a packet, looks up the destination in its forwarding table, and sends the packet out the corresponding interface. No communication with other routers is required—the forwarding table contains all the information needed.
| Characteristic | Description | Design Implication |
|---|---|---|
| Scope | Local to single router | Must be self-contained; no external queries |
| Frequency | Every packet | Billions of operations per second on core routers |
| Latency | Nanoseconds to microseconds | Hardware acceleration essential |
| Computation | Simple table lookup | Optimized data structures (tries, CAM) |
| State | Uses forwarding table | Pre-computed by routing; rarely changes |
| Plane | Data plane function | Implemented in fast path, often in hardware |
Forwarding must happen at 'line rate'—fast enough that no packets are dropped due to processing delays. On a 100 Gbps interface with minimum-size 64-byte packets, this means ~148 million forwarding decisions per second per port.
When a packet arrives at a router, it undergoes a precisely defined sequence of operations. Understanding this flow is essential for grasping how forwarding actually works:
Step 1: Reception The packet arrives on an input interface. The physical layer (PHY) converts signals to bits. The link layer verifies the frame and extracts the payload.
Step 2: Packet Extraction The link-layer frame is decapsulated, revealing the network-layer packet (typically IP). The router reads the header, extracting critical fields—especially the destination IP address.
Step 3: Forwarding Table Lookup The destination IP address is used as a key to look up the next-hop information in the forwarding table. This is typically a longest-prefix match operation.
Step 4: Header Modification The router modifies certain header fields: decrements TTL (Time-to-Live), recalculates the header checksum, possibly modifies options.
Step 5: Switching The packet is transferred from the input port to the appropriate output port through the router's switching fabric.
Step 6: Queuing At the output port, the packet is placed in a queue, waiting for transmission on the outgoing link.
Step 7: Transmission When the link becomes available, the packet is encapsulated in a new link-layer frame (with appropriate destination MAC) and transmitted.
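Step 4 above (header modification) can be sketched in Python. The byte offsets follow the standard IPv4 header layout (TTL at byte 8, checksum at bytes 10-11); `forward_rewrite` and `ipv4_checksum` are hypothetical helper names used here for illustration only.

```python
import struct

def ipv4_checksum(header: bytes) -> int:
    """16-bit ones'-complement checksum over an IPv4 header."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total > 0xFFFF:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def forward_rewrite(header: bytearray) -> bytearray:
    """Step 4 of the fast path: decrement TTL, recompute the checksum."""
    if header[8] == 0:
        raise ValueError("TTL expired: punt to slow path")
    header[8] -= 1                              # TTL is byte 8 of the header
    header[10:12] = b"\x00\x00"                 # zero checksum before recomputing
    header[10:12] = struct.pack("!H", ipv4_checksum(bytes(header)))
    return header
```

A checksum computed over a header whose checksum field is already correct sums to zero, which is how receivers (and this sketch's test) verify integrity. Real routers perform this rewrite in hardware, often using an incremental checksum update rather than a full recomputation.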
Not all packets follow this fast path. Packets carrying IP options, packets whose TTL has expired, packets addressed to the router itself, or packets matching special rules may be sent to the 'slow path' (CPU processing). This is called 'punting', and for good performance it should be rare.
The forwarding table lookup isn't a simple exact-match operation. Because IP addresses are hierarchically structured and routes are aggregated, routers use longest prefix matching.
The Problem: A forwarding table might contain entries like:
10.0.0.0/8 → Port 1
10.1.0.0/16 → Port 2
10.1.2.0/24 → Port 3

A packet destined for 10.1.2.5 matches all three entries. Which one should be used?
The Solution: Longest Prefix Match
The router selects the entry with the longest matching prefix (most specific route). In this case, /24 is longer than /16, which is longer than /8, so the packet goes to Port 3.
This approach enables route aggregation—instead of storing individual host routes (billions of entries), routers store summarized prefixes while still allowing more specific exceptions when needed.
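The selection rule above can be demonstrated with Python's standard `ipaddress` module. This is a deliberately naive O(n) linear scan, chosen for clarity; the table entries and port names mirror the example in the text.

```python
import ipaddress

# Example forwarding table from the text: (prefix, output port)
table = [
    (ipaddress.ip_network("10.0.0.0/8"),  "Port 1"),
    (ipaddress.ip_network("10.1.0.0/16"), "Port 2"),
    (ipaddress.ip_network("10.1.2.0/24"), "Port 3"),
]

def lookup(dst: str, default: str = "Drop") -> str:
    """Longest-prefix match by linear scan (clear but O(n);
    real routers use tries or TCAM for this)."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, port) for net, port in table if addr in net]
    if not matches:
        return default
    # The most specific route is the one with the longest prefix
    return max(matches, key=lambda m: m[0].prefixlen)[1]

print(lookup("10.1.2.5"))     # Port 3  (matches all three, /24 wins)
print(lookup("10.1.3.1"))     # Port 2  (matches /8 and /16, /16 wins)
print(lookup("192.168.1.1"))  # Drop    (no match, no default route)
```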
| Destination IP | Matches | Selected Route | Output Port |
|---|---|---|---|
| 10.1.2.5 | 10.0.0.0/8, 10.1.0.0/16, 10.1.2.0/24 | 10.1.2.0/24 (longest) | Port 3 |
| 10.1.3.1 | 10.0.0.0/8, 10.1.0.0/16 | 10.1.0.0/16 (longest) | Port 2 |
| 10.5.0.1 | 10.0.0.0/8 | 10.0.0.0/8 (only match) | Port 1 |
| 192.168.1.1 | None (unless default route) | Default route or drop | Port 0 or Drop |
Implementation Challenges:
Longest prefix match is computationally challenging at high speeds. Unlike exact-match (which a hash table solves in O(1)), LPM requires comparing against variable-length prefixes. Common implementations include:
Tries (Prefix Trees): Binary tries where each bit in the address determines left/right traversal. Match is found at the deepest node reached. O(W) where W is address width (32 for IPv4).
Multi-bit Tries: Process multiple bits per step, reducing tree depth at the cost of memory.
TCAM (Ternary Content-Addressable Memory): Hardware that searches all entries in parallel in a single clock cycle. Entries are ordered by prefix length; first match wins.
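A unibit trie of the kind described above can be sketched as follows. Each address bit selects a left or right child; the lookup remembers the deepest node that carried a route, which is exactly the longest matching prefix. Class and function names are illustrative.

```python
class TrieNode:
    __slots__ = ("children", "port")
    def __init__(self):
        self.children = [None, None]  # index 0 = bit 0, index 1 = bit 1
        self.port = None              # set only at nodes where a prefix ends

def insert(root: TrieNode, prefix: int, length: int, port: str) -> None:
    """Insert a route; prefix is the network address as a 32-bit int."""
    node = root
    for i in range(length):
        bit = (prefix >> (31 - i)) & 1
        if node.children[bit] is None:
            node.children[bit] = TrieNode()
        node = node.children[bit]
    node.port = port

def lpm(root: TrieNode, addr: int):
    """Walk bit by bit, remembering the deepest route seen (O(W), W=32)."""
    node, best = root, None
    for i in range(32):
        if node.port is not None:
            best = node.port
        node = node.children[(addr >> (31 - i)) & 1]
        if node is None:
            return best
    return node.port if node.port is not None else best

def ip2int(s: str) -> int:
    a, b, c, d = map(int, s.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

root = TrieNode()
for net, plen, port in [("10.0.0.0", 8, "Port 1"),
                        ("10.1.0.0", 16, "Port 2"),
                        ("10.1.2.0", 24, "Port 3")]:
    insert(root, ip2int(net), plen, port)

print(lpm(root, ip2int("10.1.2.5")))  # Port 3 (deepest matching prefix)
```

A multi-bit trie would consume, say, 4 or 8 bits per step instead of 1, trading memory for fewer traversal steps.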
With 128-bit addresses versus 32-bit, IPv6 significantly increases trie depth and TCAM entry width requirements. This has driven innovations in hierarchical lookup schemes and compressed trie representations.
The forwarding table is the essential data structure that enables fast packet forwarding. While the routing table may contain rich policy information, the forwarding table is optimized for lookup speed.
Key Fields in a Forwarding Table Entry:
+------------------+--------+--------------+------------+-------------------+
| Destination      | Prefix | Next-Hop     | Output     | Next-Hop MAC      |
| Prefix           | Length | IP           | Interface  | (if resolved)     |
+------------------+--------+--------------+------------+-------------------+
| 10.0.0.0         | /8     | 192.168.1.2  | Eth0       | aa:bb:cc:dd:ee:01 |
| 10.1.0.0         | /16    | 192.168.1.3  | Eth0       | aa:bb:cc:dd:ee:02 |
| 10.1.2.0         | /24    | 192.168.2.1  | Eth1       | aa:bb:cc:dd:ee:03 |
| 172.16.0.0       | /12    | 192.168.2.2  | Eth1       | aa:bb:cc:dd:ee:04 |
| 0.0.0.0          | /0     | 10.0.0.1     | Eth0       | aa:bb:cc:dd:ee:00 |
+------------------+--------+--------------+------------+-------------------+

The Routing Information Base (RIB) is the full routing table with all routes and policies. The Forwarding Information Base (FIB) is the optimized, hardware-friendly version used for actual forwarding. Routers maintain both: the RIB for routing decisions, the FIB for forwarding.
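The RIB-to-FIB relationship can be sketched as a compilation step: for each prefix, keep only the best candidate route and strip everything forwarding doesn't need. The data layout, field names, and the rule used here (lower administrative distance wins, then lower metric) are illustrative assumptions, not a specific vendor's implementation.

```python
# Hypothetical RIB: each prefix may have routes from several protocols,
# each with an administrative distance (protocol trust) and a metric.
rib = {
    "10.1.2.0/24": [
        {"proto": "OSPF",   "ad": 110, "metric": 20,
         "next_hop": "192.168.2.1", "iface": "Eth1"},
        {"proto": "static", "ad": 1,   "metric": 0,
         "next_hop": "192.168.2.9", "iface": "Eth1"},
    ],
    "10.1.0.0/16": [
        {"proto": "OSPF", "ad": 110, "metric": 10,
         "next_hop": "192.168.1.3", "iface": "Eth0"},
    ],
}

def compile_fib(rib: dict) -> dict:
    """Keep only the best route per prefix; the FIB stores just what
    forwarding needs: next hop and output interface."""
    fib = {}
    for prefix, routes in rib.items():
        best = min(routes, key=lambda r: (r["ad"], r["metric"]))
        fib[prefix] = (best["next_hop"], best["iface"])
    return fib

print(compile_fib(rib)["10.1.2.0/24"])  # static wins: ('192.168.2.9', 'Eth1')
```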
Modern routers are sophisticated systems engineered for maximum forwarding throughput. Understanding the architecture reveals how line-rate forwarding is achieved:
Input Ports: Each input port contains line termination (physical-layer signal handling), link-layer processing (framing and error checking), and a lookup-and-forwarding module, typically with a local copy of the forwarding table plus input queuing.
Switching Fabric: The switching fabric transfers packets from input to output ports. Three main architectures exist:
Memory-based: CPU copies packets between I/O ports via shared memory. Bandwidth limited by memory speed. Used in older/low-end routers.
Bus-based: Shared bus connecting all ports. Bandwidth limited by bus speed. Contention requires arbitration.
Crossbar: Direct connections between any input-output pair. Non-blocking; can achieve N × line rate in an N-port router. Used in high-performance routers.
Output Ports: Each output port contains queuing and buffer management (where congestion-related drops occur), link-layer processing for outgoing frames, and line termination for transmission.
In distributed forwarding, each line card has its own forwarding table copy and processor—lookups happen in parallel across all ports. In centralized forwarding, a single lookup engine serves all ports—simpler but becomes a bottleneck. High-end routers use distributed forwarding.
Line-rate forwarding means the router can forward packets as fast as they arrive—no packets dropped due to processing limitations. This is the gold standard for router performance.
The Math of Line Rate:
Consider a 100 Gbps interface. A minimum-size Ethernet frame is 64 bytes, but on the wire it also carries an 8-byte preamble and a 12-byte inter-frame gap: 84 bytes, or 672 bits, per packet. At 100 × 10⁹ bits/sec ÷ 672 bits/packet, that is ~148.8 million packets per second.
The router has ~6.7 nanoseconds to make a forwarding decision for each packet. At this speed, even memory access latency (~100ns for DRAM) is too slow!
| Interface Speed | Min-Size Pkts/sec | Time Budget per Packet | Memory Tech Required |
|---|---|---|---|
| 1 Gbps | 1.49 million | 672 ns | DRAM viable |
| 10 Gbps | 14.9 million | 67.2 ns | SRAM or cached DRAM |
| 100 Gbps | 148.8 million | 6.72 ns | SRAM, TCAM essential |
| 400 Gbps | 595.2 million | 1.68 ns | Multiple parallel TCAMs |
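The figures in the table above follow directly from the wire-level frame size. A quick sketch, assuming minimum-size 64-byte frames plus the standard 20 bytes of Ethernet preamble and inter-frame gap:

```python
def pps_and_budget(rate_bps: float, frame_bytes: int = 64,
                   overhead_bytes: int = 20):
    """Packets/sec and per-packet time budget at a given line rate.
    On the wire each minimum frame occupies 64 + 8 (preamble)
    + 12 (inter-frame gap) = 84 bytes = 672 bits."""
    bits_on_wire = (frame_bytes + overhead_bytes) * 8
    pps = rate_bps / bits_on_wire
    budget_ns = 1e9 / pps
    return pps, budget_ns

for label, rate in [("1G", 1e9), ("10G", 10e9),
                    ("100G", 100e9), ("400G", 400e9)]:
    pps, ns = pps_and_budget(rate)
    print(f"{label}: {pps / 1e6:.1f} Mpps, {ns:.2f} ns per packet")
```

Running this reproduces the table: 1.5 Mpps at 672 ns for 1 Gbps, up to ~595 Mpps at 1.68 ns for 400 Gbps.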
Technologies Enabling Line Rate:
TCAM (Ternary CAM): Searches entire forwarding table in one clock cycle. Each entry can match 0, 1, or 'don't care' for each bit—perfect for prefix matching.
Pipelining: Break the forwarding process into stages; different packets in different stages simultaneously. Increases throughput without faster clocks.
Parallelism: Multiple lookup engines operating in parallel. Packets are distributed across engines, multiplying effective throughput.
Prefetching and Caching: Anticipate likely lookups; keep hot prefixes in fast SRAM cache.
Custom ASICs: Application-specific chips designed solely for packet processing. Far faster than general-purpose CPUs.
Every feature added to the forwarding path—ACLs, QoS marking, NAT, encryption—consumes nanoseconds. Feature-rich forwarding is harder to achieve at line rate. Network engineers must balance functionality against performance.
Network devices can forward packets using different methods, each with distinct trade-offs between latency and error handling:
Store-and-Forward: The device receives the entire packet, validates it (FCS check), then begins forwarding. If the packet is corrupted, it's discarded here.
Cut-Through (Fast Forward): The device begins forwarding as soon as the destination address is read—before the packet is fully received.
Fragment-Free (Modified Cut-Through): The device reads the first 64 bytes before forwarding. This catches runt frames (collision fragments) while minimizing latency.
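The latency difference between the three modes is dominated by how many bytes the device must receive before it can start transmitting. A rough model, assuming cut-through forwards after reading the 14-byte Ethernet header and a 1500-byte frame on a 1 Gbps link:

```python
def serialization_delay_us(bytes_received: float, rate_bps: float) -> float:
    """Time to receive a given number of bytes at a given line rate,
    in microseconds."""
    return bytes_received * 8 / rate_bps * 1e6

frame, rate = 1500, 1e9  # 1500-byte frame, 1 Gbps link
# Store-and-forward waits for the whole frame; cut-through only for
# the 14-byte Ethernet header; fragment-free for the first 64 bytes.
print(f"store-and-forward: {serialization_delay_us(frame, rate):.2f} us")
print(f"cut-through:       {serialization_delay_us(14, rate):.3f} us")
print(f"fragment-free:     {serialization_delay_us(64, rate):.3f} us")
```

The gap (12 µs vs. ~0.1 µs per hop in this example) is why latency-sensitive environments such as high-frequency trading favor cut-through switching.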
Some switches use adaptive forwarding: start with cut-through, but if error rate exceeds a threshold, automatically switch to store-and-forward. This balances low latency in healthy networks with error protection when problems arise.
We've established a comprehensive understanding of forwarding as the data plane function that actually moves packets through the router.
You now understand forwarding as the fast, per-packet, data-plane operation that moves packets through routers. Next, we'll examine the routing table in detail—the data structure that routing populates and forwarding consumes.