When you call read() on a file descriptor and the data isn't immediately available, what happens? Does your process continue executing? Does it spin in a loop? Does it yield the CPU? The answer to these questions defines the most fundamental I/O model in computing: blocking I/O.
Blocking I/O is the default behavior for virtually all I/O operations in Unix-like systems and Windows. It's the model that most programmers encounter first, and it shapes how we think about program execution. Understanding blocking I/O deeply is essential—not just to use it correctly, but to understand why other I/O models exist and when they're necessary.
By the end of this page, you will understand the precise semantics of blocking I/O, how the kernel manages blocked processes, the relationship between blocking and process scheduling, and the implications for application design. You'll see exactly what happens in the kernel when a process blocks, and why this seemingly simple model has profound implications for system architecture.
Blocking I/O is an I/O model where a system call does not return to the calling process until the requested operation is complete—meaning the data is available for a read or the data has been accepted for a write. During this waiting period, the process is said to be blocked or sleeping, and the CPU is available for other processes.
The formal definition:
A blocking I/O operation suspends the execution of the calling thread until:

- For a read: at least one byte of data is available, end-of-file is reached, or an error occurs
- For a write: at least one byte of data has been accepted into a kernel buffer, or an error occurs
This is in contrast to other models where the system call returns immediately, regardless of whether the operation is complete.
When you open a file, socket, pipe, or any I/O resource, it's opened in blocking mode by default. This means every read() and write() can potentially block. Non-blocking behavior must be explicitly requested using flags like O_NONBLOCK.
The mental model:
Think of blocking I/O like ordering at a restaurant with table service. You place your order (make a system call), and then you wait at your table (process is suspended). You can't do anything else until your food arrives (data becomes available). The waiter (kernel) serves other tables (runs other processes) while your food is being prepared.
This model is synchronous—the caller and the I/O operation move in lockstep. The caller's thread of execution cannot proceed until the I/O completes.
| Operation | Blocks When | Returns When | Error Behavior |
|---|---|---|---|
| read(fd, buf, n) | No data available in buffer | At least 1 byte available or EOF | Returns -1 with errno set |
| write(fd, buf, n) | Kernel buffer full | At least 1 byte accepted | Returns -1 with errno set |
| accept(sockfd, ...) | No pending connections | Connection available | Returns -1 with errno set |
| connect(sockfd, ...) | TCP handshake in progress | Connection established | Returns -1 with errno set |
| recv(sockfd, ...) | No data in socket buffer | Data received or connection closed | Returns -1 with errno set |
| send(sockfd, ...) | Send buffer full | Data queued for transmission | Returns -1 with errno set |
When a process makes a blocking I/O call, a sophisticated dance occurs between user space and kernel space. Understanding this dance is crucial for system programmers and anyone diagnosing I/O performance issues.
The blocking sequence in detail:
```c
// Simplified kernel code demonstrating blocking I/O implementation
// This is representative of how wait queues work in Linux

struct wait_queue_entry {
    struct task_struct *task;
    struct list_head list;
    unsigned int flags;
};

// The wait queue associated with a socket's receive buffer
struct wait_queue_head socket_wait_queue;

// Inside the read() system call implementation
ssize_t socket_read(struct socket *sock, char __user *buf, size_t count) {
    struct sock *sk = sock->sk;
    DEFINE_WAIT(wait);  // Declare a wait queue entry for current process

    // Lock the socket to check buffer state
    lock_sock(sk);

    while (1) {
        // Check if data is available
        if (skb_queue_len(&sk->sk_receive_queue) > 0) {
            // Data available! Copy to user space and return
            ssize_t copied = copy_data_to_user(sk, buf, count);
            release_sock(sk);
            return copied;
        }

        // No data available - prepare to sleep
        // Add ourselves to the socket's wait queue
        prepare_to_wait(&sk->sk_wq->wait, &wait, TASK_INTERRUPTIBLE);

        // Release lock before sleeping (important for deadlock prevention)
        release_sock(sk);

        // Check for signals before actually sleeping
        if (signal_pending(current)) {
            finish_wait(&sk->sk_wq->wait, &wait);
            return -EINTR;  // Interrupted by signal
        }

        // Actually sleep - scheduler takes over
        // CPU is now free for other processes
        schedule();

        // We've been woken up! Clean up wait queue entry
        finish_wait(&sk->sk_wq->wait, &wait);

        // Re-acquire lock and loop to check for data
        lock_sock(sk);
    }
}

// Called by network stack when data arrives (softirq context)
void tcp_data_ready(struct sock *sk) {
    // Wake up any processes waiting for data on this socket
    wake_up_interruptible(&sk->sk_wq->wait);
}
```

Wait queues are one of the most important kernel data structures. Every blockable resource—files, sockets, pipes, devices, IPC mechanisms—has associated wait queues. When you see a system where blocking I/O performs poorly, the investigation often leads to wait queue behavior and wake-up patterns.
Blocking I/O directly affects process scheduling. When a process blocks, its state changes in ways that the scheduler understands. Let's examine the relevant process states in Linux:
TASK_RUNNING (R)

The process is either currently executing on a CPU or is waiting in the run queue to be scheduled. This is the only state from which a process can be selected by the scheduler.
TASK_INTERRUPTIBLE (S)

The process is sleeping, waiting for some condition (like I/O completion). It can be awakened by either:

- The awaited condition becoming true (e.g., data arrives and the kernel wakes the wait queue)
- A signal being delivered to the process
Most blocking I/O uses this state. When you see "S" in ps output, this is what it means.
TASK_UNINTERRUPTIBLE (D)
The process is sleeping and cannot be interrupted by signals. This is used when the kernel cannot safely stop waiting—typically for disk I/O where the operation must complete to maintain filesystem consistency. Processes in this state show as "D" in ps and are sometimes called "uninterruptible sleep."
Processes in TASK_UNINTERRUPTIBLE cannot be killed—not even with SIGKILL. If you see processes stuck in D state, it typically indicates a kernel-level issue: a hung NFS mount, a malfunctioning device driver, or disk I/O stuck waiting for unresponsive hardware. These processes will remain until the I/O completes or the system is rebooted.
```shell
# Observing process states during I/O operations

# Create a named pipe for controlled blocking
mkfifo /tmp/test_pipe

# In one terminal, start a blocking read (will block until data arrives)
cat /tmp/test_pipe &
READER_PID=$!

# Check the process state - should show 'S' (interruptible sleep)
ps -o pid,state,comm -p $READER_PID
# Output:
#   PID S COMMAND
# 12345 S cat

# The process is sleeping, waiting for data on the pipe
# Let's look at where it's blocked:
cat /proc/$READER_PID/wchan
# Output: pipe_read (or similar kernel function name)

# More detailed view with stack trace:
cat /proc/$READER_PID/stack
# Output shows kernel stack - blocked in pipe_read waiting for data

# Now let's send data to unblock:
echo "hello" > /tmp/test_pipe

# The cat process wakes up, prints "hello", and exits

# Clean up
rm /tmp/test_pipe

# For disk I/O, we might see TASK_UNINTERRUPTIBLE:
# dd if=/dev/sda of=/dev/null bs=1M count=100 &
# ps aux | grep dd
# The 'D' state may appear briefly during actual disk I/O
```

| Event | State Transition | Scheduler Action | CPU Impact |
|---|---|---|---|
| read() called, no data | RUNNING → INTERRUPTIBLE | Remove from run queue | CPU freed for other processes |
| Data arrives (interrupt) | INTERRUPTIBLE → RUNNING | Add to run queue | Will be scheduled when selected |
| Signal received while blocked | INTERRUPTIBLE → RUNNING | Add to run queue | Returns -EINTR to user space |
| disk read() initiated | RUNNING → UNINTERRUPTIBLE | Remove from run queue | Cannot be interrupted |
| Disk I/O completes | UNINTERRUPTIBLE → RUNNING | Add to run queue | Data available in page cache |
Understanding the precise semantics of blocking I/O calls is crucial for writing correct programs. Many subtle bugs arise from misunderstanding what these calls guarantee.
read() System Call

Signature: ssize_t read(int fd, void *buf, size_t count);
Blocking behavior:
- If no data is available, the call blocks until at least 1 byte arrives
- The call returns as soon as any data is available—it does not wait until count bytes are available
- The return value is the number of bytes actually read (between 1 and count)
- Returns 0 at end-of-file

Critical insight: A successful read() may return fewer bytes than requested. This is called a short read and is perfectly normal behavior, especially for network sockets, pipes, and terminals. You must loop to read exactly count bytes.
```c
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <stdio.h>

/**
 * Read exactly 'count' bytes from file descriptor.
 * Handles short reads by looping until all bytes are read.
 *
 * Returns:
 *   count on success (all bytes read)
 *   < count if EOF reached
 *   -1 on error
 */
ssize_t read_exact(int fd, void *buf, size_t count) {
    size_t total_read = 0;
    char *ptr = (char *)buf;

    while (total_read < count) {
        ssize_t n = read(fd, ptr + total_read, count - total_read);

        if (n < 0) {
            // Error occurred
            if (errno == EINTR) {
                // Interrupted by signal - retry
                continue;
            }
            // Actual error
            return -1;
        }

        if (n == 0) {
            // EOF reached before reading 'count' bytes
            // Return what we got
            break;
        }

        total_read += n;
    }

    return total_read;
}

/**
 * INCORRECT version - common bug!
 * Assumes read() always returns 'count' bytes.
 */
ssize_t read_exact_WRONG(int fd, void *buf, size_t count) {
    // BUG: read() may return fewer bytes than requested!
    return read(fd, buf, count);
}

// Example usage
int main() {
    char buffer[1024];
    int socket_fd = -1;  // placeholder: assume a connected socket descriptor

    // Suppose we're reading from a network socket

    // WRONG: Assumes we get all 1024 bytes
    if (read(socket_fd, buffer, 1024) == 1024) {
        // This check often fails on sockets!
    }

    // CORRECT: Loop until we have all data
    ssize_t bytes = read_exact(socket_fd, buffer, 1024);
    if (bytes < 0) {
        perror("read failed");
    } else if (bytes < 1024) {
        printf("EOF: got only %zd bytes\n", bytes);
    } else {
        // Full message received
    }

    return 0;
}
```

write() System Call

Signature: ssize_t write(int fd, const void *buf, size_t count);
Blocking behavior:
- If the kernel's output buffer is full, the call blocks until space becomes available
- The call may return before all count bytes are written
- The return value is the number of bytes actually accepted (between 1 and count)

Critical insight: Short writes are less common than short reads for regular files, but they're very common for sockets. Always check the return value and loop if necessary.
```c
#include <unistd.h>
#include <errno.h>
#include <stdio.h>

/**
 * Write exactly 'count' bytes to file descriptor.
 * Handles short writes and EINTR by looping.
 *
 * Returns:
 *   count on success
 *   -1 on error (partial writes may have occurred!)
 */
ssize_t write_exact(int fd, const void *buf, size_t count) {
    size_t total_written = 0;
    const char *ptr = (const char *)buf;

    while (total_written < count) {
        ssize_t n = write(fd, ptr + total_written, count - total_written);

        if (n < 0) {
            if (errno == EINTR) {
                // Interrupted by signal - retry
                continue;
            }
            // Actual error - note that some bytes may have been written!
            return -1;
        }

        // Note: write() returning 0 is unusual but possible
        // for some special files. For regular files and sockets,
        // it indicates an error condition.
        if (n == 0) {
            // This shouldn't happen for blocking writes
            // to regular files or sockets, but handle defensively
            continue;
        }

        total_written += n;
    }

    return total_written;
}

// Example: Sending a complete message over a socket
int send_message(int sock, const char *message, size_t len) {
    ssize_t written = write_exact(sock, message, len);
    if (written < 0) {
        perror("send_message failed");
        return -1;
    }
    // All bytes sent
    return 0;
}
```

When a process receives a signal while blocked in a system call, the call may return early with errno set to EINTR. This is NOT an error—it's the kernel telling you "something else needs your attention." Proper code must check for EINTR and retry the operation. Alternatively, use the SA_RESTART flag when installing signal handlers to have the kernel automatically restart interrupted calls.
Different I/O resources exhibit different blocking behaviors. Understanding these nuances is essential for building reliable systems.
Blocking on regular files is typically brief—data is either in the page cache (instant), or it must be read from disk (milliseconds to seconds).
Key characteristics:

- read() always finds the data "available" eventually—a regular file cannot block indefinitely waiting for data to exist (barring hardware faults or hung network filesystems)
- write() usually returns quickly after copying data into the page cache; the actual disk write happens asynchronously later
- Block durations are bounded by storage latency, not by another party's behavior
Network sockets exhibit the most variable blocking behavior—from microseconds to indefinitely.
Key characteristics:

- read()/recv() blocks until the peer sends data—potentially forever if the peer goes silent
- write()/send() blocks when the socket's send buffer is full, typically because the receiver is slow (TCP flow control)
- accept() blocks until a client connects; connect() blocks for the duration of the TCP handshake
- Block duration depends on network conditions and peer behavior, not just the local machine
Critical consideration: TCP sockets can block indefinitely if the peer becomes unreachable. Always use timeouts for production code.
```c
#include <sys/socket.h>
#include <sys/time.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/**
 * Set socket receive timeout.
 * After timeout expires, read() will return -1 with errno EAGAIN or EWOULDBLOCK.
 */
int set_socket_timeout(int sockfd, int timeout_seconds) {
    struct timeval tv;
    tv.tv_sec = timeout_seconds;
    tv.tv_usec = 0;

    // SO_RCVTIMEO: Receive timeout
    if (setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0) {
        perror("setsockopt SO_RCVTIMEO");
        return -1;
    }

    // SO_SNDTIMEO: Send timeout
    if (setsockopt(sockfd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) < 0) {
        perror("setsockopt SO_SNDTIMEO");
        return -1;
    }

    return 0;
}

// Usage example
int main() {
    int server_sock = -1;  // placeholder: a listening socket created elsewhere
    int client_sock = accept(server_sock, NULL, NULL);

    // Set 30-second timeout for all I/O on this socket
    set_socket_timeout(client_sock, 30);

    char buffer[1024];
    ssize_t n = read(client_sock, buffer, sizeof(buffer));

    if (n < 0) {
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            // Timeout occurred - client too slow
            printf("Read timed out\n");
        } else {
            perror("read failed");
        }
    }

    return 0;
}
```

Pipes exhibit blocking behavior based on their internal buffer state.
Key characteristics:

- Reading from an empty pipe blocks until a writer provides data
- Writing to a full pipe blocks until a reader drains it (pipe capacity is typically 64 KB on Linux)
- Reading from a pipe whose write end is closed returns 0 (EOF) immediately rather than blocking
- Writing to a pipe whose read end is closed delivers SIGPIPE (the write fails with EPIPE)
| Resource | Typical Block Duration | Maximum Duration | Notes |
|---|---|---|---|
| Regular file (cached) | Microseconds | Milliseconds | Page cache hit, memory copy only |
| Regular file (disk) | 1-100 milliseconds | Seconds | Depends on disk speed, queue depth |
| SSD random read | 50-200 microseconds | Milliseconds | Flash latency plus software overhead |
| HDD random read | 5-15 milliseconds | Hundreds of ms | Dominated by seek time |
| Local socket | Microseconds | Depends on peer | Loopback is fast, but peer can be slow |
| Remote socket (LAN) | Microseconds to milliseconds | Indefinite | Network latency plus peer processing |
| Remote socket (WAN) | Tens to hundreds of ms | Indefinite | Wide variation, packet loss, congestion |
| Pipe (data available) | Microseconds | Microseconds | Memory-to-memory copy |
| Pipe (empty) | Depends on writer | Indefinite | Blocks until writer provides data |
| Terminal read | Human scale | Indefinite | Waiting for user input |
Blocking I/O is the default for good reasons. Despite its limitations, it offers significant advantages that make it the right choice for many scenarios.
Blocking I/O is ideal for: command-line tools, batch processing, scripts, applications with one or few I/O sources, CPU-bound programs with occasional I/O, and situations where simplicity is more valuable than maximum concurrency.
Blocking I/O's simplicity comes with significant limitations that motivate the development of alternative I/O models.
The fundamental tension:
Blocking I/O provides an elegant per-operation model but struggles with concurrent operations. For a program that reads from one file and writes to another, blocking I/O is perfect. For a web server handling thousands of simultaneous connections, blocking I/O requires thousands of threads—and threads don't scale well to those numbers.
This tension drove the development of non-blocking I/O, I/O multiplexing, and asynchronous I/O, which we'll explore in subsequent sections.
```c
#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>
#include <stdlib.h>

/**
 * Classic thread-per-connection server.
 * Simple to understand but doesn't scale.
 */
void *handle_client(void *arg) {
    int client_fd = *(int *)arg;
    free(arg);

    char buffer[1024];
    ssize_t n;

    // Blocking read from client - simple!
    while ((n = read(client_fd, buffer, sizeof(buffer))) > 0) {
        // Echo back - also blocking
        write(client_fd, buffer, n);
    }

    close(client_fd);
    return NULL;
}

int main() {
    int server_fd = socket(AF_INET, SOCK_STREAM, 0);
    // ... bind and listen ...

    while (1) {
        int *client_fd = malloc(sizeof(int));
        *client_fd = accept(server_fd, NULL, NULL);  // Blocking!

        pthread_t thread;
        pthread_create(&thread, NULL, handle_client, client_fd);
        pthread_detach(thread);

        // Problem: With 10,000 clients, we have 10,000 threads!
        // Each thread consumes:
        //   - ~8KB-8MB stack space (platform dependent)
        //   - Kernel resources for thread scheduling
        //   - Context switch overhead when switching threads
        //
        // At some point, thread creation fails or performance degrades
        // severely due to memory pressure and scheduling overhead.
    }

    return 0;
}
```

Blocking I/O is the foundational I/O model that every systems programmer must understand. Let's consolidate the key concepts:

- A blocking call suspends the calling thread until the operation completes; the CPU is freed for other processes
- The kernel implements blocking with wait queues and the TASK_INTERRUPTIBLE / TASK_UNINTERRUPTIBLE sleep states
- read() and write() can return short counts; correct code loops and handles EINTR
- Blocking duration varies enormously by resource—from microseconds (cached files) to indefinite (sockets, pipes, terminals)
- The model is simple and efficient for few I/O sources, but scales poorly to thousands of concurrent connections
What's next:
Now that we understand blocking I/O's semantics and limitations, we'll explore non-blocking I/O—a model where system calls return immediately even when the operation cannot complete. Non-blocking I/O addresses some of blocking's limitations but introduces new challenges around polling and state management.
You now have a deep understanding of blocking I/O—the synchronous model that underlies most I/O programming. You understand the kernel mechanisms (wait queues, process states), the system call semantics (short reads, EINTR), and the fundamental tradeoffs. Next, we'll explore how non-blocking I/O changes the programming model.