We've traced the journey of network data through multiplexing at the sender, demultiplexing at the receiver, port identification, and connection identification. But there's one final step in this journey: process mapping—the mechanism by which the operating system delivers demultiplexed data to the correct application process.
When a TCP segment arrives and is matched to a socket, what happens next? The data doesn't magically appear in the application's memory. The operating system must identify which process owns the socket, hold the data in a kernel buffer, notify or wake that process, and finally copy the data into the application's own memory.
This page explores the complete process mapping system—how sockets, file descriptors, processes, and the I/O subsystem work together to complete data delivery.
By the end of this page, you will understand how sockets connect to processes through file descriptors, how the operating system kernel manages socket-to-process mapping, how processes are notified of incoming data, and how various I/O models (blocking, non-blocking, multiplexed) affect network programming. You'll see the complete picture of data flow from network interface to application memory.
In Unix-like operating systems (Linux, macOS, BSD), sockets are represented as file descriptors. This elegant design means that network I/O uses the same interface as file I/O—read(), write(), close()—providing a unified programming model.
What is a File Descriptor?
A file descriptor (fd) is a small non-negative integer that serves as a handle to an open I/O resource. When a process opens a file, creates a socket, or opens a pipe, the kernel assigns a file descriptor and returns it to the process.
Process File Descriptor Table:
┌────┬──────────────────────────────┐
│ FD │ Resource                     │
├────┼──────────────────────────────┤
│ 0  │ stdin (standard input)       │
│ 1  │ stdout (standard output)     │
│ 2  │ stderr (standard error)      │
│ 3  │ /var/log/app.log (file)      │
│ 4  │ TCP socket (listen :8080)    │
│ 5  │ TCP socket (connected)       │
│ 6  │ UDP socket                   │
└────┴──────────────────────────────┘
The first three descriptors (0, 1, 2) are reserved by convention. New resources get the lowest available number.
The Unix philosophy 'everything is a file' extends to network sockets. This means tools like 'read' and 'write' work on sockets, shell redirection can involve sockets, and file-oriented utilities can often work with network data. This unification simplifies programming and enables powerful composition.
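As a small illustration of this unification, here is a minimal sketch (the socket is assumed to already be connected; error handling omitted) that uses the same read() and write() calls on a TCP socket that you would use on an ordinary file:
// Sketch: 'fd' is assumed to be an already-connected TCP socket.
// Because a socket is a file descriptor, generic file I/O works on it.
#include <string.h>
#include <unistd.h>
void echo_request(int fd) {
    const char *req = "GET / HTTP/1.0\r\n\r\n";
    char buf[4096];
    write(fd, req, strlen(req));             // same call used for files
    ssize_t n = read(fd, buf, sizeof(buf));  // same call used for files
    if (n > 0)
        write(STDOUT_FILENO, buf, n);        // fd 1 (stdout) is just another fd
}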
Socket Creation and File Descriptor Assignment:
// Create a TCP socket
int sockfd = socket(AF_INET, SOCK_STREAM, 0);
// sockfd now holds a file descriptor, e.g., 3
// Bind to local address
bind(sockfd, (struct sockaddr*)&addr, sizeof(addr));
// Listen for connections
listen(sockfd, backlog);
// Accept a connection - creates NEW file descriptor
int clientfd = accept(sockfd, NULL, NULL);
// clientfd is a different fd, e.g., 4
// Now we have:
// sockfd (3) - listening socket
// clientfd (4) - connected socket for specific client
Windows Socket Handles:
Windows uses a different model—sockets are handles (SOCKET type), not file descriptors. They use different API calls (recv/send vs read/write) but the conceptual mapping is similar.
// Windows
SOCKET sock = socket(AF_INET, SOCK_STREAM, 0);
recv(sock, buffer, length, 0); // Not read()
send(sock, buffer, length, 0); // Not write()
closesocket(sock); // Not close()
When demultiplexing delivers data to a socket, it actually delivers to a kernel data structure. Understanding this structure reveals how process mapping works.
The Socket Kernel Object:
Each socket file descriptor references a kernel socket structure that contains:
# Simplified socket kernel structure
struct socket {
    # Identification and state
    int family;                          # AF_INET, AF_INET6
    int type;                            # SOCK_STREAM, SOCK_DGRAM
    int protocol;                        # IPPROTO_TCP, IPPROTO_UDP

    # Process ownership
    struct file *file;                   # Pointer to file structure
    pid_t owner_pid;                     # Process ID of owner
    uid_t owner_uid;                     # User ID of owner

    # Address information
    struct sockaddr_in local_addr;       # Local IP:port
    struct sockaddr_in remote_addr;      # Remote IP:port (TCP)

    # Protocol-specific state
    union {
        struct tcp_sock *tcp;            # TCP-specific data
        struct udp_sock *udp;            # UDP-specific data
    } protocol_data;

    # I/O buffers
    struct sk_buff_head receive_queue;   # Incoming data
    struct sk_buff_head send_queue;      # Outgoing data

    # Wait queues for blocking I/O
    wait_queue_head_t wait;

    # Socket options
    struct socket_options opts;
}

struct file {
    # Links file descriptor to socket
    struct socket *socket;
    struct inode *inode;
    unsigned int f_flags;                # O_NONBLOCK, etc.

    # Reference counting
    atomic_t f_count;
}

The Path from File Descriptor to Socket:
Process calls recv(fd=5, buffer, len)
│
▼
Kernel looks up fd 5 in process's file descriptor table
│
▼
File descriptor table entry points to struct file
│
▼
struct file contains pointer to struct socket
│
▼
struct socket contains receive_queue with buffered data
│
▼
Data is copied from receive_queue to user buffer
Key Insight:
The file descriptor is just an index. The actual socket state (buffers, addresses, etc.) lives in kernel memory, protected from direct user access. System calls like recv() provide the controlled interface to access this kernel state.
Kernel buffers allow data to arrive even when the application isn't ready. If a TCP segment arrives but the application hasn't called recv() yet, the data waits in the kernel's receive queue. This decouples network timing from application timing, essential for reliable operation.
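One quick way to see this decoupling in practice is the FIONREAD ioctl, which asks the kernel how many bytes are already queued on a socket before any recv() call. A minimal sketch (illustrative only; 'fd' is assumed to be a connected TCP socket):
#include <stdio.h>
#include <sys/ioctl.h>
// Sketch: query how much data the kernel has buffered for this socket.
void show_queued_bytes(int fd) {
    int queued = 0;
    if (ioctl(fd, FIONREAD, &queued) == 0)
        printf("%d bytes waiting in the kernel receive queue\n", queued);
    // The data stays in kernel memory until the process calls recv()/read().
}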
Sockets are owned by processes. This ownership determines which process receives data, which process can send, and what happens when the process terminates.
Ownership Establishment:
A process owns a socket if it created the socket with socket(), obtained it from accept(), or inherited it from a parent process across fork().
Ownership and Fork:
int sockfd = socket(AF_INET, SOCK_STREAM, 0);
bind(sockfd, ...);
listen(sockfd, ...);
pid_t child = fork();
if (child == 0) {
// Child process
// Inherits sockfd - can accept() connections
int clientfd = accept(sockfd, NULL, NULL);
} else {
// Parent process
// Also has sockfd - both can accept()
}
After fork(), both parent and child share the same socket. This is how multi-process servers (like Apache's prefork MPM) work.
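A minimal prefork-style sketch of this pattern (illustrative only; error handling is omitted and listenfd is assumed to be a bound, listening socket; handle_client is a hypothetical application routine):
#include <unistd.h>
#include <sys/socket.h>
// Sketch: prefork workers sharing one inherited listening socket.
void prefork_workers(int listenfd, int nworkers) {
    for (int i = 0; i < nworkers; i++) {
        if (fork() == 0) {                    // child worker
            for (;;) {
                int clientfd = accept(listenfd, NULL, NULL);
                if (clientfd < 0)
                    continue;
                // handle_client(clientfd);   // application logic (assumed)
                close(clientfd);
            }
        }
    }
    // Parent typically wait()s on its children or accepts as well.
}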
Process Termination and Sockets:
When a process terminates, all its open file descriptors are closed:
Process exits with open TCP connections:
│
▼
Kernel closes all file descriptors
│
▼
For each TCP socket:
  ├── If empty send buffer: Send FIN, enter FIN_WAIT_1
  ├── If data in send buffer: Send data, then FIN
  └── Socket enters TIME_WAIT after full close
│
▼
Eventually, socket resources are freed
Orphaned Connections:
If a process is killed abruptly (kill -9), the kernel still closes its sockets and completes the TCP close sequence. However, the application never gets a chance to flush its own buffered data, so peers may receive incomplete messages. This is why applications should handle shutdown gracefully when possible.
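One common graceful-close pattern (a sketch, not the only option) is to signal end-of-output with shutdown() and drain the peer's remaining data before releasing the descriptor:
#include <unistd.h>
#include <sys/socket.h>
// Sketch: graceful TCP close — tell the peer we are done sending,
// read until it closes its side, then free the file descriptor.
void graceful_close(int fd) {
    char buf[1024];
    shutdown(fd, SHUT_WR);                    // send FIN; we will write no more
    while (read(fd, buf, sizeof(buf)) > 0)    // drain whatever the peer still sends
        ;
    close(fd);
}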
Viewing Process-Socket Relationships:
# Linux: Show sockets with process info
ss -tnp
# State  Recv-Q  Send-Q  Local Address:Port  Peer Address:Port  Process
# ESTAB  0       0       192.0.2.1:443       10.0.0.1:52341     users:(("nginx",pid=1234,fd=10))
# lsof: List open files (including sockets)
lsof -i :443
# COMMAND  PID   USER  FD   TYPE  DEVICE  SIZE/OFF  NODE  NAME
# nginx    1234  www   10u  IPv4  12345   0t0       TCP   *:443 (LISTEN)
When demultiplexing routes a segment to a socket, how does the data actually reach the application process? This involves several steps within the kernel.
Step 1: Kernel Receives Segment
Network card receives frame
│
▼
Interrupt handler copies frame to kernel memory
│
▼
IP layer processes, validates, passes to transport
│
▼
Transport layer demultiplexes to specific socket
Step 2: Data Enqueued on Socket
TCP receives validated segment
│
▼
Sequence number checking, reassembly if needed
│
▼
Data added to socket's receive_queue (sk_buff chain)
│
▼
Socket buffer counter updated
Step 3: Process Notification
Socket checks if process is waiting for data
│
├── If waiting: Wake process from sleep
│
├── If using epoll/select: Mark fd as readable
│
└── If using signals (SIGIO): Send signal to process
Step 4: Process Reads Data
Process calls recv(sockfd, buffer, len, flags)
│
▼
Kernel copies data from socket receive_queue to user buffer
│
▼
receive_queue freed, socket buffer counters updated
│
▼
Kernel sends window update to peer (more space in buffer)
Network Interface Card (NIC)
              │
              │  Interrupt: "Frame arrived!"
              ▼
┌─────────────────────────────┐
│      Interrupt Handler      │  Kernel Space
│   (softirq/NAPI context)    │
│                             │
│  1. Copy frame to sk_buff   │
│  2. Pass up network stack   │
└──────────────┬──────────────┘
               │
               ▼
┌─────────────────────────────┐
│          IP Layer           │
│                             │
│  1. Parse IP header         │
│  2. Check destination IP    │
│  3. Pass to transport       │
└──────────────┬──────────────┘
               │
               ▼
┌─────────────────────────────┐
│          TCP Layer          │
│                             │
│  1. Parse TCP header        │
│  2. Lookup socket (4-tuple) │
│  3. Validate sequence num   │
│  4. Add to receive_queue    │
└──────────────┬──────────────┘
               │
               ▼
┌─────────────────────────────┐
│        Socket Layer         │
│                             │
│  1. Check for waiters       │
│  2. Wake blocked process    │
│  3. Update poll status      │
└──────────────┬──────────────┘
               │
═══════════════╪═══════════════  User/Kernel Boundary
               │
               ▼
┌─────────────────────────────┐
│     Application Process     │  User Space
│                             │
│  recv() returns with data   │
│  Data now in user buffer    │
└─────────────────────────────┘

Notice that data is copied at least twice: once from NIC to kernel buffer, once from kernel to user buffer. This copying has CPU cost. High-performance systems use techniques like zero-copy (sendfile, splice) or kernel bypass (DPDK, XDP) to reduce copies.
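As an example of the zero-copy techniques mentioned above, the sketch below (Linux-specific; the descriptors are assumed to be a connected socket and an open regular file) uses sendfile() so the file contents never pass through a user-space buffer:
#include <sys/sendfile.h>
#include <sys/stat.h>
// Sketch: send a whole file over a connected socket with one copy less.
long send_file_zero_copy(int sockfd, int filefd) {
    struct stat st;
    if (fstat(filefd, &st) < 0)
        return -1;
    off_t offset = 0;
    long total = 0;
    while (offset < st.st_size) {
        // Data moves kernel-to-kernel; it never enters a user-space buffer.
        ssize_t sent = sendfile(sockfd, filefd, &offset, st.st_size - offset);
        if (sent <= 0)
            break;
        total += sent;
    }
    return total;
}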
How a process waits for data significantly affects process mapping behavior. There are two fundamental modes: blocking and non-blocking.
Blocking I/O (Default):
When a process calls recv() on a blocking socket:
// Blocking mode (default)
int bytes = recv(sockfd, buffer, 1024, 0);
// Process sleeps here until:
// - Data arrives (bytes > 0)
// - Connection closed (bytes = 0)
// - Error occurs (bytes = -1)
Internally, the kernel places the calling process on the socket's wait queue and puts it to sleep; the process is woken only when the socket becomes readable (the wait-queue mechanics are detailed below).
Non-Blocking I/O:
When a socket is set to non-blocking:
// Set non-blocking mode
int flags = fcntl(sockfd, F_GETFL, 0);
fcntl(sockfd, F_SETFL, flags | O_NONBLOCK);
int bytes = recv(sockfd, buffer, 1024, 0);
// Returns immediately!
// If data available: bytes > 0
// If no data: bytes = -1, errno = EWOULDBLOCK/EAGAIN
// If closed: bytes = 0
The process is never put to sleep; it must poll or use I/O multiplexing.
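A minimal sketch of how an application typically reacts to EWOULDBLOCK/EAGAIN (illustrative; in real code the "come back later" case is usually handled by an event loop rather than a busy retry):
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
// Sketch: one non-blocking read attempt, distinguishing the outcomes.
// Returns bytes read, 0 on orderly close, -1 for "try again later", -2 on error.
ssize_t try_read(int fd, char *buf, size_t len) {
    ssize_t n = recv(fd, buf, len, 0);
    if (n > 0)
        return n;                              // data was already buffered in the kernel
    if (n == 0)
        return 0;                              // peer closed the connection
    if (errno == EWOULDBLOCK || errno == EAGAIN)
        return -1;                             // nothing buffered yet — come back later
    return -2;                                 // genuine error
}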
| Aspect | Blocking I/O | Non-Blocking I/O |
|---|---|---|
| recv() with no data | Process sleeps | Returns -1, EWOULDBLOCK |
| CPU usage | Efficient (sleeping uses no CPU) | Busy-wait wastes CPU if looped |
| Programming model | Simple, synchronous | Complex, must handle EWOULDBLOCK |
| Multiple sockets | Need threads per socket | Can handle many with one thread |
| Latency | Wake-up delay after data arrives | Minimal if polling frequently |
| Common usage | Simple clients, thread-per-connection | High-performance servers |
Wait Queue Mechanics:
Blocking I/O uses kernel wait queues:
Process A calls recv() on empty socket:
│
▼
Process A added to socket->wait_queue
│
▼
Process A state = TASK_INTERRUPTIBLE (sleeping)
│
▼
Scheduler switches to another process
... later, data arrives ...
Network stack adds data to socket receive_queue
│
▼
Kernel walks socket->wait_queue, wakes all waiters
│
▼
Process A state = TASK_RUNNING (runnable)
│
▼
Scheduler eventually runs Process A
│
▼
recv() copies data and returns
If multiple processes wait on the same socket (e.g., multiple workers calling accept()), all are woken when a connection arrives, but only one can accept it. This 'thundering herd' wastes CPU. Modern kernels use flags like EPOLLEXCLUSIVE or SO_REUSEPORT to mitigate this.
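For example, on Linux 4.5 and later a worker can register the shared listening socket with the EPOLLEXCLUSIVE flag (the epoll API itself is covered in the next section), so an incoming connection wakes only one sleeping worker. A sketch, assuming listenfd is the shared listening socket:
#include <sys/epoll.h>
// Sketch: each worker registers the shared listener exclusively in its own
// epoll instance, so a new connection wakes only one of the waiting workers.
int register_listener_exclusive(int listenfd) {
    int epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;
    struct epoll_event ev = {0};
    ev.events = EPOLLIN | EPOLLEXCLUSIVE;      // EPOLLEXCLUSIVE: Linux 4.5+
    ev.data.fd = listenfd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, listenfd, &ev) < 0)
        return -1;
    return epfd;                               // caller then loops on epoll_wait()
}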
A server handling thousands of connections needs an efficient way to know which sockets have data ready. I/O multiplexing APIs solve this problem.
The Problem:
Server has 10,000 connected sockets
How to efficiently wait for data on any of them?
Option A: Thread-per-socket (10,000 threads - expensive!)
Option B: Busy-poll all sockets (100% CPU wasteful)
Option C: I/O multiplexing (efficient solution ✓)
I/O Multiplexing APIs:
1. select() - The Classic:
fd_set readfds;
FD_ZERO(&readfds);
FD_SET(sock1, &readfds);
FD_SET(sock2, &readfds);
int ready = select(maxfd + 1, &readfds, NULL, NULL, &timeout);
if (FD_ISSET(sock1, &readfds)) {
// sock1 has data
}
Limitations: O(n) work on every call, and descriptors are limited to FD_SETSIZE (typically 1024).
2. poll() - Slightly Better:
struct pollfd fds[2];
fds[0] = (struct pollfd){.fd = sock1, .events = POLLIN};
fds[1] = (struct pollfd){.fd = sock2, .events = POLLIN};
int ready = poll(fds, 2, timeout_ms);
if (fds[0].revents & POLLIN) {
// sock1 has data
}
No fd limit, still O(n) per call.
3. epoll() - Linux High Performance:
int epfd = epoll_create1(0);
struct epoll_event ev = {.events = EPOLLIN, .data.fd = sock1};
epoll_ctl(epfd, EPOLL_CTL_ADD, sock1, &ev);
struct epoll_event events[100];
int n = epoll_wait(epfd, events, 100, timeout_ms);
for (int i = 0; i < n; i++) {
// events[i].data.fd has data
}
O(1) for adding/removing, O(ready) for wait - scales to millions of fds.
| API | Add/Remove | Wait | fd Limit | Platform |
|---|---|---|---|---|
| select() | O(n) | O(n) | ~1024 | All Unix, Windows |
| poll() | O(n) | O(n) | Unlimited | All Unix |
| epoll() | O(1) | O(ready) | Unlimited | Linux only |
| kqueue() | O(1) | O(ready) | Unlimited | BSD, macOS |
| IOCP | O(1) | O(ready) | Unlimited | Windows |
How epoll Works Internally:
epoll_ctl(ADD) registers socket:
│
▼
Socket's wait_queue gets special epoll callback
│
▼
When data arrives on socket:
└── Callback adds socket to epoll's ready list
│
▼
epoll_wait() returns only ready sockets
└── No need to scan all registered sockets
This is why epoll scales: instead of asking "is socket ready?" for each socket, it gets told when any socket becomes ready.
Event-Driven Architecture:
Modern servers combine non-blocking I/O with epoll:
while (running) {
int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
for (int i = 0; i < n; i++) {
if (events[i].data.fd == listen_sock) {
accept_new_connection();
} else {
handle_client_data(events[i].data.fd);
}
}
}
A single thread can handle tens of thousands of connections efficiently.
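A sketch of what the accept_new_connection() step in the loop above might look like (the helper name comes from the pseudocode; on Linux, accept4() lets the new descriptor be created non-blocking in a single call):
#define _GNU_SOURCE
#include <sys/epoll.h>
#include <sys/socket.h>
// Sketch: accept all pending connections non-blocking and register each
// new descriptor with the same epoll instance the event loop waits on.
void accept_new_connection(int epfd, int listen_sock) {
    for (;;) {
        int clientfd = accept4(listen_sock, NULL, NULL, SOCK_NONBLOCK);
        if (clientfd < 0)
            break;                             // EAGAIN: no more pending connections
        struct epoll_event ev = {0};
        ev.events = EPOLLIN;
        ev.data.fd = clientfd;
        epoll_ctl(epfd, EPOLL_CTL_ADD, clientfd, &ev);
    }
}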
How sockets are distributed across threads significantly impacts process mapping and performance. Several models exist.
Model 1: Thread-Per-Connection
Main Thread:
while (true):
clientfd = accept(listenfd)
spawn_thread(handle_client, clientfd)
Worker Thread:
while (connection_open):
data = recv(clientfd) # Blocking OK
response = process(data)
send(clientfd, response)
close(clientfd)
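In C, Model 1 is typically implemented with POSIX threads. A minimal echo-style sketch (error handling and real request processing are assumed to live elsewhere):
#include <pthread.h>
#include <unistd.h>
#include <sys/socket.h>
// Sketch: one detached thread per accepted connection.
static void *handle_client(void *arg) {
    int clientfd = (int)(long)arg;
    char buf[4096];
    ssize_t n;
    while ((n = recv(clientfd, buf, sizeof(buf), 0)) > 0)   // blocking is fine here
        send(clientfd, buf, n, 0);                          // echo back; real servers process instead
    close(clientfd);
    return NULL;
}

void serve_thread_per_connection(int listenfd) {
    for (;;) {
        int clientfd = accept(listenfd, NULL, NULL);
        if (clientfd < 0)
            continue;
        pthread_t tid;
        pthread_create(&tid, NULL, handle_client, (void *)(long)clientfd);
        pthread_detach(tid);                                // thread cleans up after itself
    }
}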
Model 2: Event Loop (Single-Threaded)
Single Thread:
while (true):
ready_fds = epoll_wait(epfd)
for fd in ready_fds:
if fd is listen_socket:
new_client = accept(fd)
epoll_add(new_client)
else:
data = recv(fd) # Non-blocking
response = process(data)
send(fd, response)
Model 1: Thread-Per-Connection
─────────────────────────────────────────
                               ┌─────────┐
                          ┌───►│Thread 1 │──► Client A
┌─────────┐   accept()    │    └─────────┘
│  Main   │───────────────┤    ┌─────────┐
│ Thread  │               ├───►│Thread 2 │──► Client B
│(listen) │               │    └─────────┘
└─────────┘               │    ┌─────────┐
                          └───►│Thread N │──► Client N
                               └─────────┘

Model 2: Single-Threaded Event Loop
─────────────────────────────────────────
┌────────────────────────────────────────┐
│             Single Thread              │
│  ┌──────────────────────────────────┐  │
│  │            Event Loop            │  │
│  │                                  │  │
│  │   ┌─────────┐   ┌─────────┐      │  │
│  │   │Client A │   │Client B │ ...  │  │
│  │   │  fd=5   │   │  fd=6   │      │  │
│  │   └─────────┘   └─────────┘      │  │
│  │           epoll_wait()           │  │
│  └──────────────────────────────────┘  │
└────────────────────────────────────────┘

Model 3: Thread Pool with Work Stealing
─────────────────────────────────────────
                                ┌─────────┐
┌───────────┐                   │Worker 1 │
│  Accept   │    Work Queue     │         │
│  Thread   │──────────────────►│  epoll  │
│           │                   └─────────┘
└───────────┘                   ┌─────────┐
                                │Worker 2 │   All sockets
                                │         │   distributed
                                │  epoll  │   across workers
                                └─────────┘

Model 3: Multi-Threaded with I/O Multiplexing
Acceptor Thread:
while (true):
clientfd = accept(listenfd)
worker = select_worker() # Round-robin or least-loaded
assign_to_worker(worker, clientfd)
Worker Thread (one per CPU core):
while (true):
ready_fds = epoll_wait(my_epfd)
for fd in ready_fds:
handle_event(fd)
This hybrid model (used by nginx and by Node.js in cluster mode) combines the strengths of the earlier two: each worker runs its own event loop, typically one worker per CPU core, giving event-loop efficiency plus multi-core parallelism.
SO_REUSEPORT: Kernel-Level Load Balancing
// Multiple processes can bind to same port
setsockopt(sockfd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));
bind(sockfd, ...);
listen(sockfd, ...);
// Kernel distributes incoming connections across all listeners
With SO_REUSEPORT, the kernel handles distribution—each worker has its own listening socket, and the kernel load-balances incoming connections.
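A sketch of the per-worker setup (illustrative; each forked worker would call this to create its own independent listener on the same port, with error handling omitted):
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
// Sketch: each worker process creates, binds, and listens on its OWN socket
// for the same port; the kernel spreads incoming connections across them.
int make_reuseport_listener(unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int one = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    bind(fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(fd, 128);
    return fd;                                 // worker then runs its own accept/epoll loop
}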
While multiple threads can read from the same socket, each recv() returns an arbitrary chunk of the byte stream. For TCP streams this is problematic: the threads collectively see scrambled data. Each TCP connection should be owned by one thread at a time, or access should be coordinated with a mutex.
Let's trace a complete example from packet arrival to application processing, showing every step of process mapping.
Scenario: an nginx server is listening on 192.0.2.1:443; over an established connection, the client at 10.0.0.1:52341 sends a 500-byte TCP segment carrying an HTTP request.
Timeline:
Time    Component              Action
═════   ═════════════════════  ═══════════════════════════════════
T+0ms   Network Card           Receives Ethernet frame

T+0.01  NIC DMA                Copies frame to kernel ring buffer

T+0.02  Interrupt              Hardware interrupt signals CPU

T+0.03  Interrupt Handler      Acknowledges interrupt, schedules softirq

T+0.10  Softirq Handler        Pulls frame from ring buffer

T+0.12  Ethernet Layer         Strips Ethernet header, identifies IP

T+0.15  IP Layer               Validates IP header, destination is local
                               Protocol field = 6 (TCP)

T+0.18  TCP Layer              Extracts 4-tuple:
                               (10.0.0.1:52341, 192.0.2.1:443)

T+0.20  Socket Lookup          hash(4-tuple) → lookup in connection table
                               Found: socket fd=12, owned by nginx pid=1234

T+0.22  TCP Processing         Validate sequence number: 12000 in window ✓
                               Add 500 bytes to socket receive queue

T+0.25  Socket Layer           Check socket->wait_queue for waiters
                               nginx worker (pid=1235) is in epoll_wait()

T+0.27  Epoll                  Add fd=12 to ready list for epfd=3

T+0.28  Scheduler              Mark nginx worker (pid=1235) as RUNNABLE

T+0.50  Scheduler              Context switch to nginx worker

T+0.52  nginx (user space)     epoll_wait() returns: fd=12 is readable

T+0.55  nginx (user space)     Calls recv(12, buffer, 4096, 0)

T+0.56  Kernel                 sys_recvfrom() system call entry

T+0.58  Kernel                 Look up fd=12 → socket structure

T+0.60  Kernel                 Copy 500 bytes from socket receive_queue
                               to user buffer at 0x7fff1234abc0

T+0.62  Kernel                 Update socket buffer counters
                               Schedule TCP window update

T+0.64  Kernel                 System call returns, bytes=500

T+0.65  nginx (user space)     recv() returns 500

T+1.00  nginx (user space)     Parses HTTP request: "GET / HTTP/1.1"

T+2.00  nginx (user space)     Prepares response, calls send()

═══════════════════════════════════════════════════════════════════
Total time from NIC to application: ~0.65ms (typical modern system)

Key Observations:
The in-kernel path from the NIC to the socket's receive queue completes in well under a millisecond. The single largest delay is the scheduling gap between waking the nginx worker (T+0.28) and actually running it (T+0.50). The 500-byte payload is copied twice: once into kernel memory and once into the user-space buffer.
What Affects Latency:
| Factor | Impact | Mitigation |
|---|---|---|
| Interrupt handling | ~microseconds | Interrupt coalescing, NAPI |
| Socket lookup | ~nanoseconds | Efficient hash tables |
| Process wake-up | ~microseconds | Busy polling, DPDK |
| Memory copy | ~microseconds | Zero-copy techniques |
| Scheduler latency | ~microseconds | Real-time scheduling |
Ultra-Low-Latency Approaches:
Systems that cannot tolerate even these sub-millisecond costs bypass parts of the normal path. Kernel-bypass frameworks such as DPDK poll the NIC directly from user space, XDP processes packets before the regular stack, busy polling avoids interrupt and wake-up latency, and zero-copy techniques remove the kernel-to-user copy.
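One option from this family that stays within the normal socket API is busy polling. A sketch (Linux-specific, requires kernel and driver support; the 50-microsecond budget is an arbitrary example value):
#include <sys/socket.h>
// Sketch: ask the kernel to busy-poll the device queue for up to 'usecs'
// microseconds when this socket has no data, trading CPU for lower latency.
int enable_busy_poll(int fd, int usecs) {
    return setsockopt(fd, SOL_SOCKET, SO_BUSY_POLL, &usecs, sizeof(usecs));
}
// Example: enable_busy_poll(sockfd, 50);
This removes much of the interrupt-and-wake-up path from the latency budget at the cost of a spinning CPU.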
We've completed our exploration of multiplexing and demultiplexing with process mapping—the final link connecting network segments to application processes. Let's consolidate the key concepts: sockets are exposed to applications as file descriptors; the kernel socket structure holds the buffers, addresses, and ownership information; wait queues, notification mechanisms, and I/O multiplexing determine how a process learns that data has arrived; and the threading model determines how connections are distributed across threads and cores.
Module Complete:
You've now mastered the complete multiplexing and demultiplexing system: multiplexing at the sender, demultiplexing at the receiver, port and connection identification, and process mapping from socket to application process.
This knowledge forms the foundation for understanding how all networked applications function—from simple clients to high-performance servers handling millions of connections.
Congratulations! You've completed the Multiplexing and Demultiplexing module. You now understand the complete data path from application to network and back—the fundamental mechanism enabling all Internet communication. This knowledge is essential for network programming, system administration, and understanding distributed systems.