When a segment arrives at a receiving host, it carries a payload destined for one specific application among potentially hundreds running on that machine. The network layer has done its job—it delivered the IP datagram to the correct host based on the destination IP address. But the transport layer now faces a critical question:
Which application process should receive this segment's data?
This question is answered through demultiplexing—the process of examining the transport layer header fields in an incoming segment and directing that segment to the correct socket (and therefore, the correct application process). Demultiplexing is the inverse of multiplexing: where multiplexing aggregates data from multiple sources into a single stream, demultiplexing distributes incoming data to its proper destinations.
By the end of this page, you will understand the complete demultiplexing process at the transport layer receiver side—how incoming segments are examined, how port numbers and connection information enable correct routing, and how the transport layer ensures that every byte of data reaches its intended application process. You'll master both connectionless (UDP) and connection-oriented (TCP) demultiplexing mechanisms.
Demultiplexing, in the context of the transport layer, refers to the process of examining the header fields in a received segment to identify the correct receiving socket, and then directing that segment's payload to that socket for delivery to the associated application process.
Formal Definition:
Transport Layer Demultiplexing is the process by which the transport layer at the receiver uses the port number fields (and, for TCP, the IP address fields as well) in a segment's header to deliver the segment's data to the correct socket—and thereby to the correct application process—on the receiving host.
The Demultiplexing Challenge:
Consider a web server handling requests from thousands of clients simultaneously. Each incoming TCP segment needs to be routed to the correct connection-specific socket. The transport layer must identify which connection each segment belongs to, locate that connection's socket quickly, and deliver each payload without ever mixing data between clients.
Think of demultiplexing like a post office sorting incoming mail. The building address (IP address) got the mail to the post office. Now, the apartment number (port number) or the full name and apartment (4-tuple for TCP) must be examined to place each envelope in the correct mailbox. Without this sorting process, all mail would pile up uselessly in the lobby.
Why Demultiplexing is Non-Trivial:
Demultiplexing might seem straightforward—just look at the destination port and deliver to the matching socket. But several complexities arise:
Socket Lookup Efficiency: With potentially millions of active sockets, finding the right one must be fast (hash tables, not linear search)
Protocol Differentiation: TCP and UDP use separate port namespaces, so the protocol field must be considered
Connection State: For TCP, the arriving segment must match an established connection or be a new connection request
Security Validation: Incoming segments should be validated before delivery (checksums, connection state)
Error Handling: What happens when a segment arrives for a non-existent socket? (ICMP port unreachable, TCP RST)
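The "Protocol Differentiation" point above can be observed directly with Python's standard library: because TCP and UDP have separate port namespaces, a UDP socket can bind the same port number a TCP socket already holds. A minimal demonstration (loopback only):

```python
import socket

# TCP and UDP have separate port namespaces: binding a TCP socket to a port
# does not stop a UDP socket from binding the same port number, because the
# protocol field is part of every demultiplexing decision.
tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_sock.bind(("127.0.0.1", 0))            # let the OS pick a free port
tcp_port = tcp_sock.getsockname()[1]

udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_sock.bind(("127.0.0.1", tcp_port))     # same number, separate namespace
udp_port = udp_sock.getsockname()[1]

print(tcp_port == udp_port)                # True: one port number, two protocols
tcp_sock.close()
udp_sock.close()
```

An incoming segment for that port number is routed to one socket or the other based on the IP header's protocol field, never by the port number alone.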
Let's trace the complete journey of a segment from network interface to application process. Understanding this flow reveals the elegant design of transport layer demultiplexing.
Step 1: Segment Arrival at Network Interface
An IP datagram arrives at the network interface. The network layer verifies that the destination IP address matches this host, validates and strips the IP header, and hands the enclosed segment up the stack.
Step 2: Transport Layer Reception
The segment is passed to the appropriate transport layer protocol handler, selected by the protocol field in the IP header (6 for TCP, 17 for UDP).
Step 3: Header Validation
The transport layer validates the segment: the checksum is verified and header fields are checked for sanity (valid lengths and port values). Corrupted segments are discarded silently.
Step 4: Socket Lookup
This is the core demultiplexing step: the header fields are used to search the socket table for a matching socket—the 2-tuple for UDP, the full 4-tuple for TCP.
Step 5: Payload Delivery
The segment's payload is placed in the matching socket's receive buffer, where it waits until the application reads it.
| Step | Layer | Action | Key Decision Point |
|---|---|---|---|
| 1 | Network | Receive IP datagram | Is destination IP this host? |
| 2 | Network→Transport | Pass segment to protocol handler | Which protocol (TCP/UDP)? |
| 3 | Transport | Validate segment headers | Is checksum valid? |
| 4 | Transport | Socket lookup | Which socket does this segment belong to? |
| 5 | Transport | Deliver to receive buffer | Is buffer space available? |
| 6 | Application | Application reads data | Is application ready to receive? |
Socket lookup happens for every incoming segment. On a busy server receiving millions of segments per second, even microsecond delays in lookup become significant. This is why production systems use highly optimized hash tables with O(1) average lookup time, often with specialized kernel data structures like Linux's socket hash tables.
UDP demultiplexing is straightforward because UDP is connectionless. There's no connection state to consider—each datagram is handled independently based solely on the destination port.
UDP Socket Identification:
A UDP socket is fully identified by a 2-tuple: (destination IP address, destination port number).
Note: The source IP and port are not used for demultiplexing in UDP. The same UDP socket receives datagrams from any source.
UDP Demultiplexing Algorithm:
1. Extract destination port from UDP header
2. Extract destination IP from IP header
3. Search socket table for entry matching (dest_IP, dest_port)
4. If found:
a. Place datagram in socket's receive queue
b. Include source IP:port so application knows where it came from
5. If not found:
a. Generate ICMP Port Unreachable message
b. Discard the datagram
```
function udp_demultiplex(ip_header, udp_header, payload):
    # Extract identification fields
    dest_ip = ip_header.destination_address
    dest_port = udp_header.destination_port
    src_ip = ip_header.source_address
    src_port = udp_header.source_port

    # Lookup socket (only dest_ip and dest_port matter)
    socket = socket_table.lookup(dest_ip, dest_port, protocol=UDP)

    if socket is None:
        # No application listening on this port
        send_icmp_port_unreachable(src_ip, src_port)
        drop_datagram()
        return

    # Create a receive record with source info for recvfrom()
    receive_record = {
        source_address: (src_ip, src_port),
        data: payload
    }

    # Deliver to socket buffer
    if socket.receive_buffer.has_space():
        socket.receive_buffer.enqueue(receive_record)
        wake_waiting_application(socket)
    else:
        # Buffer full - datagram dropped silently
        increment_counter(socket.receive_drops)
        drop_datagram()
```

Key Characteristics of UDP Demultiplexing:
Single Socket, Multiple Sources: A UDP socket bound to port 53 receives DNS queries from all clients—there's no concept of separate connections
Source Information Preserved: While not used for demultiplexing, source IP:port is passed to the application so it can send responses
No Connection Rejection: UDP cannot reject a specific sender while accepting others at the socket level—this must be done at the application layer
Stateless Lookup: No connection state is consulted; each datagram is demultiplexed independently
Fast but Inflexible: Very quick socket lookup, but no way to have multiple sockets for different clients on the same port
UDP's 2-tuple matching reflects its connectionless nature. A DNS server doesn't need separate connections for each client—it just receives queries and sends responses. The simpler demultiplexing makes UDP faster and requires less kernel memory (no per-connection state), which is ideal for high-volume, stateless protocols.
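The 2-tuple behavior can be sketched as a small in-memory model (the table and function names here are illustrative, not kernel APIs): lookup consults only (dest_ip, dest_port), and the source address is attached to the delivered datagram but never used in the lookup itself.

```python
# Toy UDP socket table: one receive queue per bound 2-tuple.
udp_sockets = {
    ("0.0.0.0", 53): [],       # e.g. a DNS server bound to the wildcard address
}

def udp_demultiplex(src, dest, payload):
    """Deliver a datagram; return False when no socket matches (-> ICMP)."""
    queue = udp_sockets.get(dest)
    if queue is None:
        queue = udp_sockets.get(("0.0.0.0", dest[1]))   # wildcard fallback
    if queue is None:
        return False              # real stack: ICMP Port Unreachable + drop
    queue.append((src, payload))  # source preserved for recvfrom()
    return True

# Queries from two different clients land in the SAME port-53 queue.
udp_demultiplex(("198.51.100.1", 40001), ("10.0.0.5", 53), b"query-1")
udp_demultiplex(("203.0.113.9", 40002), ("10.0.0.5", 53), b"query-2")
print(len(udp_sockets[("0.0.0.0", 53)]))   # 2
```

Note how both clients share one queue: the application, not the transport layer, is responsible for telling them apart.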
TCP demultiplexing is more sophisticated than UDP because TCP maintains connection state. Each TCP connection is a distinct entity with its own socket, and demultiplexing must route segments to the correct connection-specific socket.
TCP Socket Identification:
A TCP connection socket is identified by the full 4-tuple: (source IP address, source port, destination IP address, destination port).
This means two segments with the same destination port but different source addresses go to completely different sockets.
TCP Socket Types:
TCP demultiplexing must handle two types of sockets:
Listening Socket: Bound to (any_IP, local_port), waiting for new connection requests (SYN segments)
Connection Socket: Created after connection establishment, uniquely identified by the 4-tuple
TCP Demultiplexing Algorithm:
1. Extract 4-tuple: (src_IP, src_port, dest_IP, dest_port)
2. Search for connection socket matching the full 4-tuple
3. If found:
- Deliver segment to that connection's socket
4. If not found:
- Search for listening socket matching (any, dest_port)
- If found AND segment is SYN:
* Begin connection establishment
- If not found OR not SYN:
* Send TCP RST to sender
* Discard segment
```
function tcp_demultiplex(ip_header, tcp_header, payload):
    # Build the 4-tuple
    src_ip = ip_header.source_address
    dest_ip = ip_header.destination_address
    src_port = tcp_header.source_port
    dest_port = tcp_header.destination_port
    four_tuple = (src_ip, src_port, dest_ip, dest_port)

    # FIRST: Try to find established connection socket
    socket = connection_table.lookup(four_tuple)
    if socket is not None:
        # Found connection - deliver segment to it
        deliver_to_connection(socket, tcp_header, payload)
        return

    # SECOND: Look for listening socket
    listen_socket = listening_table.lookup(dest_ip, dest_port)
    if listen_socket is not None:
        if tcp_header.flags.SYN and not tcp_header.flags.ACK:
            # New connection request - begin handshake
            initiate_connection(listen_socket, four_tuple, tcp_header)
            return
        else:
            # Non-SYN to listening socket - invalid
            send_tcp_rst(src_ip, src_port, dest_port)
            return

    # THIRD: No matching socket
    if not tcp_header.flags.RST:
        # Don't RST an RST
        send_tcp_rst(src_ip, src_port, dest_port)
    drop_segment()

function deliver_to_connection(socket, header, payload):
    # Verify sequence numbers, process flags
    if not valid_sequence(socket, header):
        handle_out_of_order(socket, header, payload)
        return

    # Add data to receive buffer
    socket.receive_buffer.add(payload, header.sequence_number)

    # Process flags (FIN, ACK, etc.)
    process_tcp_flags(socket, header)

    # Notify waiting application
    if socket.has_waiting_reader():
        wake_reader(socket)
```

Why TCP Needs the Full 4-Tuple:
Consider a web server on port 443:
Server IP: 203.0.113.50, Port: 443
Incoming segments:
Segment A: From 10.0.0.1:52341 → 203.0.113.50:443
Segment B: From 10.0.0.1:52342 → 203.0.113.50:443
Segment C: From 10.0.0.2:52341 → 203.0.113.50:443
All three segments have the same destination (203.0.113.50:443), but they belong to three different TCP connections: Segments A and B come from the same client host on different client ports (two separate connections from one machine), while Segment C comes from a different host that happens to reuse the same source port number.
Without the full 4-tuple, the server couldn't distinguish between these connections, and TCP's guarantees (ordering, reliability) would be impossible.
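The example above can be sketched in a few lines of Python (a toy model, not a kernel API): keying a table on the full 4-tuple yields a distinct socket per connection, even when the destination half is identical.

```python
# One socket object per established connection, keyed by the full 4-tuple.
connections = {}

def connection_socket(src_ip, src_port, dest_ip, dest_port):
    four_tuple = (src_ip, src_port, dest_ip, dest_port)
    # setdefault models "create the connection socket once, then reuse it"
    return connections.setdefault(four_tuple, object())

a = connection_socket("10.0.0.1", 52341, "203.0.113.50", 443)
b = connection_socket("10.0.0.1", 52342, "203.0.113.50", 443)  # same host, new port
c = connection_socket("10.0.0.2", 52341, "203.0.113.50", 443)  # new host, same port

print(len({id(a), id(b), id(c)}))  # 3 -- three distinct sockets
print(connection_socket("10.0.0.1", 52341, "203.0.113.50", 443) is a)  # True
```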
The listening socket is special—it receives only connection requests (SYN segments). When a new connection is established (3-way handshake completes), a new connection socket is created for that specific 4-tuple. The listening socket continues waiting for more connection requests. This is why a single server can accept thousands of connections on the same port.
Understanding the differences between UDP and TCP demultiplexing illuminates the fundamental design philosophies of these two protocols.
Side-by-Side Comparison:
| Aspect | UDP | TCP |
|---|---|---|
| Lookup Key | (dest_IP, dest_port) | (src_IP, src_port, dest_IP, dest_port) |
| Sockets Required | 1 per local port | 1 per connection |
| Memory Scaling | O(ports) | O(connections) |
| New Connection Handling | N/A (connectionless) | Listening socket → new connection socket |
| Client Isolation | None at transport layer | Complete isolation |
| Lookup Complexity | O(1) hash lookup | O(1) hash lookup + fallback |
| Error Response | ICMP Port Unreachable | TCP RST segment |
When Each Approach Excels:
UDP Demultiplexing Excels For: high-volume, stateless request/response services (DNS, NTP), streaming and real-time traffic, and servers that must handle many senders with minimal per-client memory.
TCP Demultiplexing Excels For: services that need per-client isolation, ordering, and reliability—web servers, databases, file transfers—where each client gets its own buffers and connection state.
Key Insight:
The complexity of TCP demultiplexing is the price paid for TCP's connection semantics. Each connection gets its own socket with dedicated buffers, sequence numbers, and congestion state. UDP's simpler demultiplexing trades this isolation for efficiency and simplicity.
Efficient demultiplexing depends on fast socket lookup. Modern operating systems use sophisticated data structures to achieve O(1) average-case lookup times even with millions of active sockets.
The Socket Table:
The kernel maintains socket tables—data structures mapping from identification tuples to socket structures. Different implementations exist:
1. Hash Table Implementation:
Hash Function:
hash = hash_function(src_ip, src_port, dest_ip, dest_port) % table_size
Table Structure:
socket_table[hash] → linked_list of sockets with same hash
Lookup:
1. Compute hash from 4-tuple
2. Search linked list for exact match
3. Average O(1) with good hash function
2. Multi-Level Hash (Linux Approach):
Linux uses separate hash tables for different purposes:
```
# Linux-style socket hash table (simplified)
struct inet_hashinfo {
    # Hash table for established TCP connections
    # Key: full 4-tuple
    struct hlist_head established_hash[HASH_SIZE];

    # Hash table for listening sockets
    # Key: (local_addr, local_port)
    struct hlist_head listening_hash[LISTENING_HASH_SIZE];

    # Lock per hash bucket for concurrent access
    spinlock_t bucket_locks[HASH_SIZE];
}

# Hash function for 4-tuple lookup
function inet_ehashfn(src_ip, src_port, dest_ip, dest_port):
    # Mix all four values for uniform distribution
    h = src_ip ^ src_port ^ dest_ip ^ dest_port
    h = jenkins_hash(h)   # Better mixing
    return h % HASH_SIZE

# Lookup procedure
function inet_lookup(protocol, src_ip, src_port, dest_ip, dest_port):
    # First try established connections
    h = inet_ehashfn(src_ip, src_port, dest_ip, dest_port)
    for socket in established_hash[h]:
        if socket.matches(src_ip, src_port, dest_ip, dest_port):
            return socket

    # Fall back to listening sockets
    listen_h = hash(dest_ip, dest_port) % LISTENING_HASH_SIZE
    for socket in listening_hash[listen_h]:
        if socket.matches_local(dest_ip, dest_port):
            return socket

    return None   # No matching socket
```

Hash Table Sizing:
The hash table size affects lookup performance:
| Active Connections | Ideal Hash Size | Avg Chain Length |
|---|---|---|
| 1,000 | 1,024 | ~1 |
| 10,000 | 16,384 | ~0.6 |
| 100,000 | 131,072 | ~0.76 |
| 1,000,000 | 1,048,576 | ~0.95 |
Modern systems use dynamically-sized hash tables that grow with connection count.
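The "Avg Chain Length" column above is simply the load factor: with a uniform hash over n active connections and a table of m buckets, the expected chain length is n / m. A quick check of the table's rows:

```python
# Reproduce the sizing table: expected chain length = connections / buckets.
rows = [
    (1_000, 1_024),
    (10_000, 16_384),
    (100_000, 131_072),
    (1_000_000, 1_048_576),
]
for n, m in rows:
    print(f"{n:>9} connections / {m:>9} buckets -> avg chain {n / m:.2f}")
```

The printed load factors (0.98, 0.61, 0.76, 0.95) match the table's approximate chain lengths.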
Concurrent Access:
Multiple CPU cores may demultiplex segments simultaneously, so the socket table must handle concurrent access—typically with per-bucket locks (lookups on different buckets never contend) or, in Linux, RCU-based lookups that let readers proceed without locking at all.
Attackers can craft packets with source/destination values that hash to the same bucket, creating long chains and degrading lookup performance. Modern systems use keyed hash functions (SipHash) where the key is randomized at boot time, making collision attacks computationally infeasible.
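A keyed bucket hash in the spirit of SipHash can be sketched with Python's standard library. (Real kernels use SipHash itself; HMAC-SHA-256 here is only a stand-in, and the names are illustrative.) Because the key is randomized once per "boot", an attacker who cannot read it cannot predict which 4-tuples collide.

```python
import hashlib
import hmac
import os

BOOT_KEY = os.urandom(16)   # secret, randomized once at startup
TABLE_SIZE = 1 << 16

def bucket_for(src_ip, src_port, dest_ip, dest_port):
    """Map a 4-tuple to a bucket index using a keyed (unpredictable) hash."""
    flow = f"{src_ip}:{src_port}->{dest_ip}:{dest_port}".encode()
    digest = hmac.new(BOOT_KEY, flow, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") % TABLE_SIZE

# Deterministic for a given flow and key, unpredictable without the key.
b1 = bucket_for("10.0.0.1", 52341, "203.0.113.50", 443)
b2 = bucket_for("10.0.0.1", 52341, "203.0.113.50", 443)
print(b1 == b2, 0 <= b1 < TABLE_SIZE)   # True True
```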
Robust demultiplexing must handle numerous edge cases and error conditions. Understanding these scenarios helps explain observed network behavior.
Edge Case 1: No Matching Socket
When a segment arrives for a port with no listening application, the receiver reports the failure: UDP triggers an ICMP Port Unreachable message back to the sender, while TCP responds with a RST segment.
Edge Case 2: Buffer Full
When the socket's receive buffer cannot accept more data, UDP silently drops the datagram, while TCP advertises a zero receive window so the sender pauses until space frees up.
Edge Case 3: Invalid Checksum
When checksum verification fails, the segment is silently discarded by both protocols—no error is sent, because the headers themselves cannot be trusted.
Edge Case 4: TCP Half-Open Connection
When a segment arrives for a connection that one side thinks is closed, the side without state responds with a RST, forcing the peer to tear down its half-open connection.
| Condition | UDP Response | TCP Response |
|---|---|---|
| Port closed | ICMP Port Unreachable | TCP RST |
| Buffer full | Silently drop | Advertise zero window; sender probes |
| Invalid checksum | Silently drop | Silently drop |
| Malformed header | Silently drop | Silently drop |
| Connection in TIME_WAIT | N/A | Depends on sequence number |
| SYN to non-listening | N/A | TCP RST |
| Protocol unreachable | ICMP Protocol Unreachable | N/A |
Edge Case 5: Wildcards and Specificity
Sockets can bind to specific addresses or wildcards:
Socket A: Bound to (192.168.1.100, 80) - Specific IP
Socket B: Bound to (0.0.0.0, 80) - Any IP (wildcard)
When a segment arrives for 192.168.1.100:80, both sockets match the port, but the segment is delivered to Socket A.
When a segment arrives for 10.0.0.1:80 (a different IP on the same host), only the wildcard matches, so Socket B receives it.
Lookup Priority: an exact local-address match is always preferred over a wildcard match; the wildcard socket acts as a catch-all for addresses with no specific binding.
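A minimal model of wildcard-versus-specific lookup priority (the names are illustrative, not kernel APIs): check for an exact local-address match first, then fall back to the wildcard.

```python
WILDCARD = "0.0.0.0"
bound_sockets = {
    ("192.168.1.100", 80): "socket_A",   # specific binding
    (WILDCARD, 80): "socket_B",          # wildcard binding
}

def lookup(dest_ip, dest_port):
    """Prefer an exact local-address match; fall back to the wildcard."""
    sock = bound_sockets.get((dest_ip, dest_port))       # 1. exact match
    if sock is None:
        sock = bound_sockets.get((WILDCARD, dest_port))  # 2. wildcard fallback
    return sock

print(lookup("192.168.1.100", 80))   # socket_A (specific wins)
print(lookup("10.0.0.1", 80))        # socket_B (wildcard catches the rest)
print(lookup("10.0.0.1", 22))        # None -> RST / ICMP unreachable
```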
Socket options like SO_REUSEADDR and SO_REUSEPORT allow multiple sockets to bind to the same address/port combination. In these cases, demultiplexing may distribute incoming connections across sockets (for load balancing) or use other rules to select the target socket. This is common in high-performance servers.
Let's trace through a real-world demultiplexing scenario to solidify our understanding.
Scenario: Web Server with Multiple Clients
Server: 203.0.113.50, running web server on port 443. Active sockets: one listening socket bound to (0.0.0.0, 443), plus three established connection sockets—A, B, and C—whose 4-tuples appear in the table below.
Incoming Segments:
| Segment | Src IP | Src Port | Dst IP | Dst Port | Flags | Demux Result |
|---|---|---|---|---|---|---|
| 1 | 198.51.100.10 | 52341 | 203.0.113.50 | 443 | ACK,PSH | → Connection A |
| 2 | 198.51.100.10 | 52342 | 203.0.113.50 | 443 | ACK | → Connection B |
| 3 | 192.0.2.50 | 48001 | 203.0.113.50 | 443 | FIN,ACK | → Connection C |
| 4 | 10.0.0.99 | 55555 | 203.0.113.50 | 443 | SYN | → Listening Socket (new conn) |
| 5 | 10.0.0.100 | 44444 | 203.0.113.50 | 80 | SYN | → RST (port 80 closed) |
| 6 | 198.51.100.10 | 52341 | 203.0.113.50 | 443 | ACK | → Connection A |
Detailed Trace of Segment 4 (New Connection):
1. IP layer receives datagram from 10.0.0.99
→ Validates IP header, destination matches our IP
→ Extracts TCP segment, passes to TCP handler
2. TCP handler receives segment
→ Extracts 4-tuple: (10.0.0.99, 55555, 203.0.113.50, 443)
→ Validates TCP checksum: OK
→ Segment has SYN flag set
3. Socket lookup (established connections)
→ Hash lookup for (10.0.0.99, 55555, 203.0.113.50, 443)
→ No match found (new client, new connection)
4. Socket lookup (listening sockets)
→ Search for socket listening on 443
→ Found: listening socket bound to (0.0.0.0, 443)
5. SYN handling
→ This is a new connection request
→ Create embryonic connection in SYN_RCVD state
→ Send SYN-ACK response
→ Add to pending connections queue
6. Eventually (when ACK received)
→ Move to ESTABLISHED state
→ Create new connection socket for 4-tuple
→ Accept() in application returns this socket
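The scenario's lookup order can be condensed into a toy classifier (a sketch mirroring the scenario, not a real stack): established connections first, then the listening socket for SYNs, then RST.

```python
# Connection table and listening set mirror the scenario above.
connections = {
    ("198.51.100.10", 52341, "203.0.113.50", 443): "Connection A",
    ("198.51.100.10", 52342, "203.0.113.50", 443): "Connection B",
    ("192.0.2.50", 48001, "203.0.113.50", 443): "Connection C",
}
listening = {("203.0.113.50", 443)}

def tcp_demultiplex(src_ip, src_port, dest_ip, dest_port, syn=False):
    sock = connections.get((src_ip, src_port, dest_ip, dest_port))
    if sock is not None:
        return sock                              # established connection wins
    if (dest_ip, dest_port) in listening and syn:
        return "Listening Socket (new conn)"     # SYN -> begin handshake
    return "RST"                                 # nothing matches

print(tcp_demultiplex("198.51.100.10", 52341, "203.0.113.50", 443))       # Connection A
print(tcp_demultiplex("10.0.0.99", 55555, "203.0.113.50", 443, syn=True)) # new conn
print(tcp_demultiplex("10.0.0.100", 44444, "203.0.113.50", 80, syn=True)) # RST
```

Each print matches the "Demux Result" column for the corresponding segment.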
Performance Metrics:
On a modern server, this entire demultiplexing process takes well under a microsecond per segment.
At 10 million segments/second, demultiplexing consumes less than 1 CPU core.
You can observe demultiplexing in action using tools like netstat (shows socket table), ss (socket statistics on Linux), or tcpdump/Wireshark (shows segments with their port numbers). The socket table state directly reflects the demultiplexing mappings.
We've thoroughly explored how the transport layer receiver performs demultiplexing—the essential process that delivers incoming data to the correct application processes. Let's consolidate the key concepts: UDP demultiplexes on the 2-tuple (destination IP, destination port) and ignores the source; TCP demultiplexes on the full 4-tuple, with listening sockets catching new SYNs; both rely on hash-based socket tables for O(1) lookup; and unmatched segments draw ICMP Port Unreachable (UDP) or RST (TCP) responses.
What's Next:
Now that we understand both multiplexing (at the sender) and demultiplexing (at the receiver), we need to explore the identification mechanism that makes both possible: port identification. The next page examines how ports are assigned, structured, and used to enable the efficient matching we've described.
You now understand transport layer demultiplexing at the receiver. You can explain how incoming segments are routed to correct sockets, how UDP and TCP differ in their demultiplexing strategies, and how operating systems implement efficient socket lookup for high-performance networking.