Imagine a restaurant where the chef prepares each dish one at a time—taking an order, cooking it completely, plating it, and serving it—before even looking at the next order. During a busy evening, customers would wait hours as the chef blocks on each individual dish.
Now imagine a different model: the chef takes multiple orders, starts several dishes simultaneously, checks on cooking timers, delegates prep tasks to sous chefs, and orchestrates completion across many concurrent activities. The throughput increases dramatically without adding more chefs.
This is the fundamental distinction between synchronous and asynchronous operation. In computing, the synchronous model where operations block until completion creates the same bottleneck as our first chef. Asynchronous operations allow systems to initiate multiple activities, continue working on other tasks, and respond when those activities complete—dramatically improving throughput and responsiveness.
By the end of this page, you will understand what asynchronous operations are at the operating system level, how they differ from synchronous operations, the distinction between blocking and non-blocking I/O, and why async programming has become essential for building scalable, responsive systems.
At the most fundamental level, synchronous and asynchronous describe how operations relate to the flow of program execution.
Synchronous execution means the caller initiates an operation and waits—the calling thread is suspended until the operation completes. The word 'synchronous' derives from Greek roots meaning 'same time'—the caller and the operation proceed together, in lockstep. The caller cannot proceed until the operation returns.
Asynchronous execution means the caller initiates an operation and continues immediately—the operation proceeds independently while the caller does other work. The word 'asynchronous' means 'not at the same time'—the caller and the operation proceed on separate timelines. The caller is notified of completion through some mechanism (callback, polling, or signal).
| Characteristic | Synchronous | Asynchronous |
|---|---|---|
| Caller Behavior | Blocks until operation completes | Continues immediately after initiating |
| Thread Utilization | Thread is idle during wait | Thread remains productive |
| Control Flow | Linear, sequential | Non-linear, requires coordination |
| Result Retrieval | Return value from function call | Callback, future, or polling |
| Error Handling | Traditional try-catch | Callback parameters or future inspection |
| Code Complexity | Simple, intuitive flow | More complex, requires explicit coordination |
| Scalability | Limited by thread availability | Can handle many concurrent operations |
A concrete example—reading a file:
Synchronous approach:
```
data = read_file("/path/to/file")    // Thread blocks here
process(data)                        // Executes after read completes
```
Asynchronous approach:
```
read_file_async("/path/to/file", callback=process)   // Returns immediately
do_other_work()                                      // Executes while read proceeds
// Later: process() is called when read completes
```
In the synchronous model, do_other_work() cannot execute until read_file completes. In the asynchronous model, do_other_work() executes immediately, and process() is invoked later when the file read finishes.
Think of synchronous as placing an order and standing at the counter until it's ready (blocking). Asynchronous is placing an order, getting a buzzer, and sitting down to chat with friends—when the buzzer goes off, you pick up your food. The total preparation time is the same, but your time utilization is dramatically different.
To truly understand asynchronous programming, we must first understand what blocking means at the operating system level.
When a process or thread makes a blocking system call, the kernel moves that thread from the ready queue to a wait queue associated with the resource it's waiting for. The thread is no longer scheduled for CPU time—it's entirely suspended until the awaited event occurs.
Common blocking operations include:

- read() from a socket, pipe, or terminal when no data is available
- write() when the output buffer is full
- accept() when no connection is pending
- connect() while the TCP handshake is in progress
- Disk reads and fsync() waiting on the storage device
- Lock acquisition (mutexes, file locks) when the lock is held elsewhere
```c
// Example of a blocking system call in C
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

int main() {
    char buffer[1024];
    ssize_t bytes_read;

    // This call BLOCKS the thread until data is available
    // The thread is moved to a wait queue by the kernel
    printf("About to block on read...\n");
    bytes_read = read(STDIN_FILENO, buffer, sizeof(buffer));

    // Execution resumes here ONLY after read() completes
    // Thread was not consuming CPU during the wait
    printf("Read %zd bytes\n", bytes_read);

    return 0;
}

/*
 * What happens at the OS level during the blocking read():
 *
 * 1. User process calls read() system call
 * 2. Kernel checks if data is available in the input buffer
 * 3. If no data: kernel moves thread to TASK_INTERRUPTIBLE state
 * 4. Thread is added to a wait queue for the terminal device
 * 5. Scheduler selects another thread to run (context switch)
 * 6. When data arrives: device driver signals the wait queue
 * 7. Kernel moves thread back to TASK_RUNNING state
 * 8. Thread is added back to the run queue
 * 9. Eventually scheduler runs the thread again
 * 10. read() returns with the data
 */
```

The kernel wait queue mechanism:
The kernel maintains wait queues for every resource that can cause blocking. When a thread blocks:
- The thread's state changes from RUNNING to INTERRUPTIBLE (or UNINTERRUPTIBLE for non-abortable waits)
- The thread is removed from the run queue and added to the wait queue for the resource
- The scheduler picks another runnable thread (a context switch)

When the awaited event occurs (e.g., I/O completion):

- The thread's state changes from INTERRUPTIBLE back to RUNNING
- The thread is moved back to the run queue, and the scheduler will eventually run it again

Blocking operations are efficient from a CPU perspective—a blocked thread consumes zero CPU cycles. The problem isn't CPU waste but thread waste. Each blocked thread ties up kernel resources (stack, scheduling structures, memory mappings). With limited threads available, blocking limits concurrency.
Non-blocking I/O takes a different approach: instead of suspending the caller when an operation cannot complete immediately, the system call returns immediately with an indication that the operation is not yet complete.
The caller can then:

- Retry the operation later (polling)
- Do other useful work in the meantime
- Ask to be notified when the descriptor becomes ready (via select, poll, or epoll) and retry then
The key insight: Non-blocking I/O separates the initiation of an operation from the completion. The thread remains schedulable and can make progress on other tasks.
```c
// Non-blocking read example in C
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>

// Placeholder for useful work done while no data is available
void do_other_computation(void) {
    printf("Doing other useful work...\n");
}

int main() {
    char buffer[1024];
    ssize_t bytes_read;

    // Set stdin to non-blocking mode
    int flags = fcntl(STDIN_FILENO, F_GETFL, 0);
    fcntl(STDIN_FILENO, F_SETFL, flags | O_NONBLOCK);

    // Now read() returns immediately even if no data is available
    bytes_read = read(STDIN_FILENO, buffer, sizeof(buffer));

    if (bytes_read == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        // EAGAIN means "no data available, try again later"
        // The thread is NOT blocked - we can do other work!
        printf("No data available - would have blocked\n");
        printf("But thread continues executing...\n");

        // Do some other useful work here
        do_other_computation();

        // Maybe try again later
    } else if (bytes_read > 0) {
        printf("Read %zd bytes immediately\n", bytes_read);
    }

    return 0;
}

/*
 * Non-blocking I/O characteristics:
 *
 * - Returns immediately, never suspends the thread
 * - Returns EAGAIN/EWOULDBLOCK if operation would block
 * - Caller is responsible for retrying at appropriate times
 * - Enables single thread to handle multiple I/O streams
 * - Foundation for event-driven architectures
 */
```

EAGAIN and EWOULDBLOCK:
These error codes are central to non-blocking I/O. They indicate that the operation cannot complete right now but is not an error—the caller should try again later when conditions change.
On most modern systems, these are the same value. The semantics: "What you asked for isn't ready, but it's not your fault and nothing's broken."
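A minimal sketch of the canonical retry pattern (the helper name `read_when_ready` is illustrative; real code would retry from an event loop rather than spin):

```c
#include <errno.h>
#include <unistd.h>

// Sketch: read from a non-blocking descriptor, treating
// EAGAIN/EWOULDBLOCK as "not ready" rather than as failure.
ssize_t read_when_ready(int fd, char *buf, size_t len) {
    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0)
            return n;                 // data (or EOF) arrived
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            continue;                 // not ready yet - busy-wait (demo only)
        if (errno == EINTR)
            continue;                 // interrupted by a signal, retry
        return -1;                    // a real error
    }
}
```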
Non-blocking connects vs reads:
Different operations have different non-blocking behaviors:
| Operation | Blocking Behavior | Non-Blocking Behavior |
|---|---|---|
| `read()` | Waits for data | Returns EAGAIN if no data |
| `write()` | Waits for buffer space | Returns EAGAIN if buffer full |
| `accept()` | Waits for connection | Returns EAGAIN if no pending connections |
| `connect()` | Waits for handshake | Returns EINPROGRESS, must poll for completion |
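To make the connect() row concrete, here is a minimal sketch of a non-blocking connect (the address 127.0.0.1:8080 is a placeholder and error handling is trimmed): initiate, wait for writability, then check SO_ERROR.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <fcntl.h>
#include <poll.h>
#include <errno.h>

int connect_nonblocking(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(8080);
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1
            && errno != EINPROGRESS) {
        return -1;                     // immediate failure
    }

    // EINPROGRESS: the handshake continues in the background.
    // Wait until the socket becomes writable...
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    poll(&pfd, 1, 5000);               // 5 second timeout

    // ...then check whether the connect actually succeeded.
    int err = 0;
    socklen_t len = sizeof(err);
    getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len);
    return (err == 0) ? fd : -1;
}
```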
Naive non-blocking I/O with constant polling (busy-waiting) wastes CPU cycles checking if operations are ready. This is worse than blocking! The solution is event-driven I/O multiplexing (select, poll, epoll, kqueue) which we cover in the Event Loops page.
There's an important distinction between non-blocking I/O and true asynchronous I/O (AIO). They're often conflated, but the mechanisms differ significantly.
Non-blocking I/O tells you that an operation would block, returning immediately. You must still perform the actual I/O operation yourself, typically when an event notifies you that the descriptor is ready.
Asynchronous I/O initiates the entire operation in the kernel, which completes it in the background. The kernel notifies you when the data is already transferred—no additional read/write call is needed.
Think of it this way:
| Model | Initiation | Completion | Notification | CPU Efficiency |
|---|---|---|---|---|
| Blocking | Synchronous | After wait | Function return | Good (thread sleeps) |
| Non-blocking | Immediate | Polling required | Poll result or event | Depends on polling strategy |
| Async I/O | Submit to kernel | Kernel performs I/O | Signal, callback, or queue | Excellent |
```c
#include <stdio.h>
#include <stdlib.h>
#include <aio.h>
#include <fcntl.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define BUFFER_SIZE 65536

// Placeholder helpers standing in for real application logic
void process_data(char *buf, ssize_t len) { /* consume the data */ }
void do_computation(int i) { /* unrelated CPU work */ }

// Callback function invoked when AIO completes
void aio_completion_handler(sigval_t sigval) {
    struct aiocb *req = (struct aiocb *)sigval.sival_ptr;

    // Check the result of the async operation
    int status = aio_error(req);
    if (status == 0) {
        ssize_t bytes = aio_return(req);
        printf("AIO completed: read %zd bytes\n", bytes);

        // Data is NOW in the buffer - kernel already did the transfer
        process_data((char *)req->aio_buf, bytes);
    } else {
        printf("AIO error: %s\n", strerror(status));
    }
}

int main() {
    int fd = open("largefile.dat", O_RDONLY);
    if (fd < 0) {
        perror("open");
        exit(1);
    }

    // Allocate buffer for the data
    char *buffer = malloc(BUFFER_SIZE);

    // Set up the async I/O control block
    struct aiocb cb;
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;            // File descriptor
    cb.aio_buf = buffer;           // Buffer for data
    cb.aio_nbytes = BUFFER_SIZE;   // How much to read
    cb.aio_offset = 0;             // Where in file to read

    // Set up notification via thread callback
    cb.aio_sigevent.sigev_notify = SIGEV_THREAD;
    cb.aio_sigevent.sigev_notify_function = aio_completion_handler;
    cb.aio_sigevent.sigev_notify_attributes = NULL;
    cb.aio_sigevent.sigev_value.sival_ptr = &cb;

    // Submit the async read - THIS RETURNS IMMEDIATELY
    // The kernel will perform the read in the background
    if (aio_read(&cb) == -1) {
        perror("aio_read");
        exit(1);
    }

    printf("AIO submitted - doing other work...\n");

    // We can now do OTHER work while the kernel reads the file
    // This is the key advantage of true async I/O
    for (int i = 0; i < 10; i++) {
        printf("Working on task %d while read progresses...\n", i);
        do_computation(i);
    }

    // Callback will fire automatically when I/O completes
    // In production: proper event loop or main thread coordination
    sleep(2); // Simplified: wait for completion

    free(buffer);
    close(fd);
    return 0;
}
```

Linux AIO implementations:
Linux has evolved through several async I/O implementations:
1. POSIX AIO (aio_read/aio_write): portable, but implemented largely in user space by glibc, which emulates asynchrony with a pool of worker threads
2. Native Linux AIO (io_submit/io_getevents): true kernel-side async I/O, but historically reliable only for unbuffered (O_DIRECT) disk access
3. io_uring (Linux 5.1+): a modern interface built around shared submission and completion rings, covering both file and network I/O
io_uring represents a paradigm shift in Linux I/O. By using ring buffers in shared memory between user and kernel space, it eliminates system call overhead for submissions and completions. A single io_uring can handle thousands of concurrent operations with minimal CPU usage.
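A minimal sketch of that submit/complete cycle, assuming the liburing helper library is available (the filename and queue depth are placeholders):

```c
#include <liburing.h>
#include <fcntl.h>
#include <stdio.h>

int main() {
    struct io_uring ring;
    io_uring_queue_init(8, &ring, 0);       // ring with 8 entries

    int fd = open("data.txt", O_RDONLY);
    char buf[4096];

    // Fill a submission queue entry (SQE) describing the read...
    struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
    io_uring_prep_read(sqe, fd, buf, sizeof(buf), 0);

    // ...and hand it to the kernel. Many SQEs can be batched
    // into a single submit call.
    io_uring_submit(&ring);

    // Later: reap the completion queue entry (CQE).
    struct io_uring_cqe *cqe;
    io_uring_wait_cqe(&ring, &cqe);
    printf("read returned %d bytes\n", cqe->res);
    io_uring_cqe_seen(&ring, cqe);          // mark CQE as consumed

    io_uring_queue_exit(&ring);
    return 0;
}
```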
The importance of asynchronous programming crystallized around the C10K problem—the challenge of handling 10,000 concurrent connections on a single server. This problem, articulated by Dan Kegel in 1999, exposed the fundamental limitations of thread-per-connection architectures.
The thread-per-connection model:
In traditional server designs, each client connection gets its own thread:
```
while (true) {
    client = accept(server_socket);    // Block waiting for connection
    spawn_thread(handle_client, client);
}

void handle_client(socket) {
    while (true) {
        request = read(socket);        // BLOCKS waiting for client
        response = process(request);
        write(socket, response);       // BLOCKS until buffer available
    }
}
```
Why this breaks at scale:

- Every connection consumes a full thread stack, so memory grows linearly with connection count
- The kernel must schedule and context-switch among thousands of threads
- Most of those threads are blocked at any given moment, holding resources while waiting on slow network I/O
The math of thread limits:
| Threads | Stack Memory (8 MB per thread) | Approximate Total | Kernel Overhead (~6 KB per thread) |
|---|---|---|---|
| 100 | 800 MB | ~1 GB | 600 KB |
| 1,000 | 8 GB | ~10 GB | 6 MB |
| 10,000 | 80 GB | ~100 GB | 60 MB |
| 100,000 | 800 GB | Impossible | 600 MB |
Even with reduced stack sizes (256KB), 10,000 threads consume 2.5 GB of stack memory. More critically, most of these threads are blocked most of the time, doing nothing but waiting for slow network I/O.
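For reference, a reduced stack is set per thread at creation time. A minimal pthreads sketch (the 256 KB figure mirrors the text; this mitigates, but does not solve, the scaling problem):

```c
#include <pthread.h>

void *handle_client(void *arg) {
    // ... per-connection work would go here ...
    return NULL;
}

int main() {
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setstacksize(&attr, 256 * 1024);  // 256 KB stack

    pthread_t tid;
    pthread_create(&tid, &attr, handle_client, NULL);
    pthread_join(tid, NULL);

    pthread_attr_destroy(&attr);
    return 0;
}
```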
The async solution:
Asynchronous I/O inverts the model. Instead of one thread per connection, a small pool of threads (often equal to CPU cores) handles all connections:
```
while (true) {
    events = wait_for_events();   // Single syscall monitors ALL connections

    for each event in events {
        if (event.type == READ_READY) {
            data = non_blocking_read(event.socket);
            process_data(event.socket, data);
        }
        if (event.type == WRITE_READY) {
            send_pending_data(event.socket);
        }
    }
}
```
This event-driven approach means:

- A small, fixed pool of threads can service tens of thousands of connections
- Memory scales with per-connection state (a small struct), not with thread stacks
- Threads sleep only when there is genuinely nothing to do, eliminating the context-switch storm
Today's challenge isn't C10K but C10M—10 million concurrent connections. Systems like nginx and HAProxy achieve this through aggressive use of async I/O, event loops, and careful memory management. A single modern server can handle more connections than entire data centers could in 1999.
Different programming languages provide varying levels of async support, from manual callback management to sophisticated runtime systems. Understanding these differences helps you choose the right tool and understand what's happening beneath abstractions.
```c
// C: Manual async with epoll (Linux)
// Explicit state machine management required

#include <sys/epoll.h>
#include <fcntl.h>
#include <unistd.h>

#define MAX_EVENTS 1024

typedef enum { STATE_READING, STATE_PROCESSING, STATE_WRITING } State;

typedef struct {
    int fd;
    State state;          // Manual state tracking
    char buffer[4096];
    size_t bytes_read;
} Connection;

// Handlers for each state (bodies stubbed in this skeleton)
void handle_read(Connection *conn)    { /* read into conn->buffer */ }
void handle_process(Connection *conn) { /* act on the request */ }
void handle_write(Connection *conn)   { /* write the response */ }

int main() {
    int epoll_fd = epoll_create1(0);
    struct epoll_event events[MAX_EVENTS];

    // (Registration of sockets with epoll_ctl() omitted in this skeleton)

    while (1) {
        int nfds = epoll_wait(epoll_fd, events, MAX_EVENTS, -1);

        for (int i = 0; i < nfds; i++) {
            Connection *conn = events[i].data.ptr;

            // Manual state machine for each connection
            switch (conn->state) {
                case STATE_READING:    handle_read(conn);    break;
                case STATE_PROCESSING: handle_process(conn); break;
                case STATE_WRITING:    handle_write(conn);   break;
            }
        }
    }
}

// Characteristics:
// - Maximum control and performance
// - Significant boilerplate and complexity
// - Manual memory and state management
// - Used in: nginx, Redis, high-performance servers
```

All these language-level async features ultimately rely on OS-level async primitives (epoll, kqueue, IOCP). The language runtime translates high-level async constructs into efficient system calls. Understanding the OS layer helps you debug performance issues that transcend any single language.
A common source of confusion is conflating asynchronous with parallel. These are related but distinct concepts.
Asynchronous: Operations that don't block the calling thread. The operations may or may not run simultaneously—the key property is that the caller continues without waiting.
Parallel: Operations that run simultaneously on multiple CPU cores. This requires multiple threads or processes executing at the same instant.
The key insight: Async is about waiting efficiently, while parallel is about computing simultaneously.
Single-threaded async example:
```
Time:     |--0ms--|--1ms--|--2ms--|--3ms--|--4ms--|--5ms--|--6ms--|--7ms--|
Thread 1: [Start A] [Start B] [Start C]   [A done] [B done] [C done]
              |         |         |           |        |        |
              +---------|---------|-----------+        |        |
                        +---------|--------------------+        |
                                  +------------------------------+

  All I/O operations overlap in wall-clock time
  But only ONE thread is executing at any moment
```
Multi-threaded parallel example:
```
Time:     |--0ms--|--1ms--|--2ms--|--3ms--|
Thread 1: [=====COMPUTE A=====]
Thread 2: [=====COMPUTE B=====]
Thread 3: [=====COMPUTE C=====]

  All three threads execute SIMULTANEOUSLY
  Uses 3 CPU cores at once
```
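A minimal pthreads sketch of the parallel picture above; `compute` is a stand-in for genuinely CPU-bound work:

```c
#include <pthread.h>
#include <stdio.h>

// Stand-in for CPU-bound work (e.g., image processing, hashing)
void *compute(void *arg) {
    long id = (long)arg;
    volatile double x = 0;
    for (long i = 0; i < 100000000; i++)
        x += i * 0.5;                  // burn CPU cycles
    printf("compute %ld done\n", id);
    return NULL;
}

int main() {
    pthread_t t[3];

    // Three threads run simultaneously on (up to) three cores
    for (long i = 0; i < 3; i++)
        pthread_create(&t[i], NULL, compute, (void *)i);
    for (int i = 0; i < 3; i++)
        pthread_join(t[i], NULL);

    return 0;
}
```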
Combined async + parallel:
Modern systems often combine both:

- One event loop per CPU core (as in nginx worker processes), each handling many connections asynchronously
- Async I/O inside each loop for network- and disk-bound work
- Worker thread pools for CPU-bound tasks, so heavy computation never stalls the event loop
Use async when your bottleneck is waiting (I/O-bound): network requests, database queries, file operations. Use parallelism when your bottleneck is computation (CPU-bound): image processing, cryptography, simulations. Use both when you have mixed workloads.
We've established the foundation of asynchronous programming at the operating system level. Let's consolidate the key concepts:

- Synchronous calls block the caller until completion; asynchronous calls return immediately and signal completion later
- Blocking parks a thread on a kernel wait queue: it costs no CPU, but it ties up thread resources
- Non-blocking I/O returns EAGAIN instead of waiting; the caller still performs the I/O itself
- True asynchronous I/O (POSIX AIO, io_uring) hands the whole operation to the kernel, which notifies the caller after the data has been transferred
- The C10K problem showed that thread-per-connection does not scale; event-driven async I/O does
- Async is about waiting efficiently; parallelism is about computing simultaneously
What's next:
With async operations understood, we need a mechanism to be notified when operations complete. The oldest and most fundamental pattern is the callback—a function passed to an async operation that's invoked upon completion. The next page explores callbacks in depth: their mechanics, patterns, and the problems they introduce.
You now understand asynchronous operations at the OS level—the distinction between sync/async, blocking/non-blocking, and why async programming enables the high-concurrency systems that power the modern web. Next, we'll explore the callback pattern that makes async orchestration possible.