What if you could tell the kernel "read 1KB from this file into this buffer," and then immediately continue with other work—no waiting, no polling? When the data is ready, the kernel would signal you. This is asynchronous I/O (AIO)—the most complete decoupling of I/O initiation from I/O completion.
Unlike blocking I/O (where you wait) or non-blocking I/O (where you poll), asynchronous I/O is truly fire-and-forget. You initiate operations and receive notifications when they're done. Your thread is never blocked or occupied checking status—it can do completely unrelated work until the I/O completes.
By the end of this page, you will understand the conceptual model of asynchronous I/O, the POSIX AIO interface and its limitations, Linux's native io_uring interface, and when asynchronous I/O provides advantages over multiplexed non-blocking I/O. You'll see both the promise and the practical realities of async I/O.
Asynchronous I/O is an I/O model in which operations are submitted to the kernel and complete independently of the submitting thread; the thread receives a notification when an operation finishes.
An asynchronous I/O operation works as follows:

1. Submit: the application describes the operation (descriptor, buffer, length, offset) and hands it to the kernel; the call returns immediately.
2. Proceed: the thread continues with unrelated work while the kernel performs the I/O in the background.
3. Notify: when the operation completes, the kernel informs the application via a signal, a callback thread, or a completion-queue entry.
4. Retrieve: the application collects the result (a byte count or an error code).
This model is fundamentally different from both blocking I/O (thread waits for completion) and non-blocking I/O (thread polls for completion).
| Model | Submit | Wait Mechanism | Thread During I/O | Notification |
|---|---|---|---|---|
| Blocking | Implicit | Kernel blocks thread | Suspended | Return from syscall |
| Non-blocking | Implicit | Application polls | Free but must poll | Successful read/write |
| Multiplexed | With select/poll/epoll | Kernel tracks readiness | Blocked in select | Readiness event |
| Asynchronous | Explicit submit | None required | Completely free | Signal/callback/queue |
The mental model:
Think of asynchronous I/O like a restaurant with a buzzer system. You place your order (submit I/O), receive a buzzer (request handle), and go about your business—shop, sit outside, chat with friends. When your food is ready, the buzzer vibrates (notification). You collect your food (retrieve result) and enjoy.
You're never standing in line, never repeatedly checking "is it ready?"—you're free until the buzzer goes off.
The term 'asynchronous' is often misused. In true async I/O, the I/O operation runs completely independently of your thread. Non-blocking I/O with epoll is sometimes called 'async' but technically it's synchronous non-blocking with multiplexing. In true async I/O, you don't call read()—you call aio_read() which returns before any reading happens.
POSIX defines a standardized interface for asynchronous I/O that's available on most Unix-like systems. While not always the most efficient implementation, it provides a portable API.
Core structures and functions:
```c
#include <aio.h>

/**
 * The aiocb (AIO Control Block) structure describes an async operation.
 * It's the central data structure of POSIX AIO.
 */
struct aiocb {
    int             aio_fildes;     // File descriptor
    off_t           aio_offset;     // File offset
    volatile void  *aio_buf;        // Buffer for data
    size_t          aio_nbytes;     // Number of bytes
    int             aio_reqprio;    // Request priority
    struct sigevent aio_sigevent;   // Notification method
    int             aio_lio_opcode; // Operation type (for lio_listio)
    /* Implementation-specific fields follow */
};

/**
 * Notification methods (via aio_sigevent):
 * - SIGEV_NONE:   No notification (application polls)
 * - SIGEV_SIGNAL: Deliver a signal when complete
 * - SIGEV_THREAD: Call a function in a new thread
 */

/* Core POSIX AIO functions */

// Submit async read
int aio_read(struct aiocb *aiocbp);

// Submit async write
int aio_write(struct aiocb *aiocbp);

// Submit multiple operations at once
int lio_listio(int mode, struct aiocb *const list[],
               int nent, struct sigevent *sig);

// Check if operation is complete (polling)
int aio_error(const struct aiocb *aiocbp);

// Get result of completed operation
ssize_t aio_return(struct aiocb *aiocbp);

// Wait for completion with timeout
int aio_suspend(const struct aiocb *const list[], int nent,
                const struct timespec *timeout);

// Cancel pending operation
int aio_cancel(int fd, struct aiocb *aiocbp);
```

Complete POSIX AIO example:
```c
#include <aio.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <stdlib.h>

#define BUFFER_SIZE 4096

/**
 * Asynchronous file read using POSIX AIO
 */
int main() {
    // Open file
    int fd = open("data.txt", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    // Allocate buffer (must remain valid until I/O completes!)
    char *buffer = malloc(BUFFER_SIZE);

    // Initialize the AIO control block
    struct aiocb cb;
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;           // File descriptor
    cb.aio_offset = 0;            // Read from start
    cb.aio_buf = buffer;          // Destination buffer
    cb.aio_nbytes = BUFFER_SIZE;  // Bytes to read

    // No notification - we'll poll for completion
    cb.aio_sigevent.sigev_notify = SIGEV_NONE;

    // Submit the async read request
    if (aio_read(&cb) < 0) {
        perror("aio_read");
        free(buffer);
        close(fd);
        return 1;
    }

    printf("Async read submitted, doing other work...\n");

    // We can do other work here!
    // The I/O is happening in the background
    for (int i = 0; i < 5; i++) {
        printf("  Working... iteration %d\n", i);
        usleep(100000);  // 100ms
    }

    // Poll for completion (in real code, you might use aio_suspend
    // or signal-based notification)
    int status;
    while ((status = aio_error(&cb)) == EINPROGRESS) {
        printf("  Still waiting...\n");
        usleep(10000);  // 10ms
    }

    if (status != 0) {
        printf("AIO error: %s\n", strerror(status));
        free(buffer);
        close(fd);
        return 1;
    }

    // Get the result
    ssize_t bytes_read = aio_return(&cb);
    if (bytes_read < 0) {
        perror("aio_return");
        free(buffer);
        close(fd);
        return 1;
    }

    printf("Read complete: %zd bytes\n", bytes_read);
    printf("Content: %.100s...\n", buffer);  // First 100 chars

    free(buffer);
    close(fd);
    return 0;
}
```

On Linux, glibc's POSIX AIO implementation uses a thread pool internally—it's not true kernel-level async I/O. Each aio_read() may actually block a thread in the pool. This means it doesn't scale well and can have surprising latency characteristics. For high-performance applications on Linux, io_uring is the preferred alternative.
Linux provides kernel-level asynchronous I/O through the io_submit/io_getevents interface, often accessed via libaio. Unlike glibc's POSIX AIO, this is true kernel async I/O—no hidden threads.
However, Linux native AIO has significant limitations:

- It requires O_DIRECT, bypassing the page cache entirely.
- Buffers, offsets, and sizes must be aligned to the block size.
- It works only on regular files; sockets are not supported.
- io_submit() can still block silently in some situations (for example, when filesystem metadata must be fetched).
These limitations make it useful primarily for database engines that manage their own caching.
```c
#include <libaio.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>

#define BLOCK_SIZE 4096
#define MAX_EVENTS 64

/**
 * Linux native AIO example.
 * Note: Requires O_DIRECT and aligned buffers!
 */
int main() {
    io_context_t ctx = 0;
    struct io_event events[MAX_EVENTS];
    struct iocb cb;
    struct iocb *cbs[1] = {&cb};

    // Create AIO context (libaio returns a negative errno value on failure)
    int ret = io_setup(MAX_EVENTS, &ctx);
    if (ret < 0) {
        fprintf(stderr, "io_setup: %s\n", strerror(-ret));
        return 1;
    }

    // Open file with O_DIRECT (required for native AIO)
    int fd = open("data.bin", O_RDONLY | O_DIRECT);
    if (fd < 0) {
        perror("open");
        io_destroy(ctx);
        return 1;
    }

    // Allocate aligned buffer (required for O_DIRECT)
    void *buffer;
    if (posix_memalign(&buffer, BLOCK_SIZE, BLOCK_SIZE) != 0) {
        perror("posix_memalign");
        close(fd);
        io_destroy(ctx);
        return 1;
    }

    // Prepare the I/O control block
    io_prep_pread(&cb, fd, buffer, BLOCK_SIZE, 0);
    cb.data = (void *)"my_request";  // User data for identification

    // Submit the request
    int submitted = io_submit(ctx, 1, cbs);
    if (submitted < 0) {
        fprintf(stderr, "io_submit: %s\n", strerror(-submitted));
        free(buffer);
        close(fd);
        io_destroy(ctx);
        return 1;
    }

    printf("Submitted %d requests\n", submitted);
    printf("Doing other work while I/O completes...\n");
    // Do other work here...

    // Wait for completion (blocking, or use timeout)
    int completed = io_getevents(ctx, 1, MAX_EVENTS, events, NULL);
    if (completed < 0) {
        fprintf(stderr, "io_getevents: %s\n", strerror(-completed));
    } else {
        for (int i = 0; i < completed; i++) {
            struct io_event *e = &events[i];
            printf("Event %d: res=%lld, user_data=%s\n",
                   i, (long long)e->res, (char *)e->obj->data);
            if (e->res < 0) {
                printf("  Error: %s\n", strerror(-e->res));
            } else {
                printf("  Success: read %lld bytes\n", (long long)e->res);
            }
        }
    }

    // Cleanup
    free(buffer);
    close(fd);
    io_destroy(ctx);
    return 0;
}
```

Databases like PostgreSQL, MySQL/InnoDB, and RocksDB use Linux native AIO for O_DIRECT disk I/O. They manage their own buffer pools and need precise control over when data hits disk.
For most applications, the page cache is beneficial, making native AIO less useful.
Introduced in Linux 5.1 (2019), io_uring is a revolutionary new interface that provides true asynchronous I/O for all types of operations—files, sockets, and more. It addresses all the limitations of previous Linux async I/O mechanisms.
Why io_uring is transformative:

- One interface covers files, sockets, and dozens of other operation types.
- It works with normal buffered I/O; O_DIRECT is optional, not required.
- Submissions and completions travel through shared ring buffers, so many operations can be batched per system call.
- Completions are true completion events, not mere readiness notifications.
io_uring architecture:
io_uring uses two ring buffers shared between user space and kernel space:

- The Submission Queue (SQ), where the application places requests for the kernel to pick up.
- The Completion Queue (CQ), where the kernel posts results for the application to reap.
Both queues are memory-mapped, allowing lock-free communication without system calls in the best case.
```c
/*
 * io_uring Conceptual Overview
 *
 * ┌─────────────────────────────────────────────────────┐
 * │                     User Space                      │
 * │  ┌──────────────┐          ┌──────────────────────┐ │
 * │  │ Application  │ ───────▶ │   Submission Queue   │ │
 * │  │              │          │  ┌─────┬─────┬─────┐ │ │
 * │  │  Submit ops  │          │  │ op1 │ op2 │ op3 │ │ │
 * │  │  Read results│ ◀─────── │  └─────┴─────┴─────┘ │ │
 * │  └──────────────┘          ├──────────────────────┤ │
 * │                            │   Completion Queue   │ │
 * │                            │  ┌─────┬─────┬─────┐ │ │
 * │                            │  │res1 │res2 │     │ │ │
 * │                            │  └─────┴─────┴─────┘ │ │
 * │                            └──────────────────────┘ │
 * ├─────────────────────────────────────────────────────┤
 * │                    Kernel Space                     │
 * │               │                    ▲                │
 * │               ▼                    │                │
 * │  ┌───────────────────────────────────────────────┐  │
 * │  │     io_uring worker threads / interrupt       │  │
 * │  │     Process SQ entries, post to CQ            │  │
 * │  └───────────────────────────────────────────────┘  │
 * └─────────────────────────────────────────────────────┘
 */
```
```c
#include <liburing.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define QUEUE_DEPTH 256
#define BUFFER_SIZE 4096

/**
 * io_uring example: async file read
 * Compile: gcc -o io_uring_example io_uring_example.c -luring
 */
int main() {
    struct io_uring ring;
    struct io_uring_sqe *sqe;
    struct io_uring_cqe *cqe;

    // Initialize io_uring with queue depth
    if (io_uring_queue_init(QUEUE_DEPTH, &ring, 0) < 0) {
        perror("io_uring_queue_init");
        return 1;
    }

    // Open file (works with buffered I/O - no O_DIRECT needed!)
    int fd = open("data.txt", O_RDONLY);
    if (fd < 0) {
        perror("open");
        io_uring_queue_exit(&ring);
        return 1;
    }

    // Allocate buffer
    char *buffer = malloc(BUFFER_SIZE);

    // Get a submission queue entry (SQE)
    sqe = io_uring_get_sqe(&ring);
    if (!sqe) {
        fprintf(stderr, "Could not get SQE\n");
        return 1;
    }

    // Prepare a read operation
    io_uring_prep_read(sqe, fd, buffer, BUFFER_SIZE, 0);

    // Set user data for identifying this request later
    io_uring_sqe_set_data(sqe, buffer);

    // Submit the request to the kernel
    int submitted = io_uring_submit(&ring);
    printf("Submitted %d requests\n", submitted);

    // Do other work here - the I/O is truly async!
    printf("Doing other work while I/O completes...\n");

    // Wait for completion
    int ret = io_uring_wait_cqe(&ring, &cqe);
    if (ret < 0) {
        fprintf(stderr, "io_uring_wait_cqe: %s\n", strerror(-ret));
        return 1;
    }

    // Check result
    if (cqe->res < 0) {
        fprintf(stderr, "Read failed: %s\n", strerror(-cqe->res));
    } else {
        printf("Read %d bytes\n", cqe->res);
        char *buf = io_uring_cqe_get_data(cqe);
        printf("Content: %.100s...\n", buf);
    }

    // Mark CQE as consumed
    io_uring_cqe_seen(&ring, cqe);

    // Cleanup
    free(buffer);
    close(fd);
    io_uring_queue_exit(&ring);
    return 0;
}
```

io_uring is rapidly being adopted by high-performance applications. Nginx, RocksDB, and many other projects now have io_uring backends. Its ability to batch operations and minimize system calls makes it significantly faster than epoll for I/O-heavy workloads.
If you're building performance-critical software on Linux, io_uring should be on your radar.
Asynchronous I/O isn't Linux-specific. Other platforms have their own implementations, each with unique characteristics.
Windows has a highly mature async I/O system called I/O Completion Ports (IOCP), arguably the most complete async I/O implementation across major platforms:

- It is completion-based: you learn when an operation has finished, not merely when a handle is ready.
- It covers files, sockets, and pipes through one mechanism.
- A completion port doubles as a scheduler, limiting how many worker threads run concurrently.
- It has been stable and battle-tested since the Windows NT era.
```c
// Windows I/O Completion Ports - Conceptual Example
#include <windows.h>

// Create completion port
HANDLE iocp = CreateIoCompletionPort(
    INVALID_HANDLE_VALUE,  // No file yet
    NULL,                  // New port
    0,                     // Completion key
    0                      // Number of concurrent threads (0 = system default)
);

// Associate a socket with the completion port
CreateIoCompletionPort(
    (HANDLE)socket,
    iocp,
    (ULONG_PTR)myContext,  // Your data pointer
    0
);

// Initiate async read
OVERLAPPED overlapped = {0};
ReadFile(
    (HANDLE)socket,
    buffer,
    bufferSize,
    NULL,          // No immediate bytes read count
    &overlapped    // Async operation tracking
);

// In worker thread: wait for completions
DWORD bytesTransferred;
ULONG_PTR completionKey;
OVERLAPPED *pOverlapped;

while (GetQueuedCompletionStatus(
        iocp,
        &bytesTransferred,
        &completionKey,
        &pOverlapped,
        INFINITE)) {
    // Handle completed I/O
    MyContext *ctx = (MyContext *)completionKey;
    ProcessCompletion(ctx, bytesTransferred);
}
```

kqueue is primarily an event notification mechanism (similar to epoll for multiplexing), but it also supports asynchronous I/O signaling. It's not true async I/O like IOCP or io_uring—the actual I/O still happens synchronously, but readiness notification is efficient.
Due to platform differences, most applications use abstraction libraries:
| Platform | Mechanism | True Async | Socket Support | File Support |
|---|---|---|---|---|
| Linux (modern) | io_uring | Yes | Full | Full (buffered & direct) |
| Linux (legacy) | libaio | Yes | No | O_DIRECT only |
| Linux (portable) | POSIX AIO (glibc) | No (threads) | No | Yes |
| Windows | IOCP | Yes | Full | Full |
| macOS/BSD | kqueue + AIO | Partial | Readiness only | Yes (limited) |
| Cross-platform | libuv | Varies | Full | Thread pool |
A common question: How does true async I/O compare to non-blocking I/O with epoll? Both allow handling many concurrent I/O operations, but they differ in fundamental ways.
When does async I/O win?
High-throughput file I/O — For applications doing many disk operations (databases, log processing), io_uring can batch operations and reduce system call overhead dramatically.
Mixed file + socket workloads — io_uring can handle both in a unified interface; epoll is only for sockets/pipes.
Deep operation pipelines — When you can submit many operations before needing results, batching provides huge benefits.
CPU-bound applications — When your application has useful work to do while I/O completes, true async frees the thread entirely.
When is epoll sufficient?
Network-only servers — For pure socket I/O, epoll is already extremely efficient.
Low latency requirements — Epoll's immediate notification of readiness can be faster than waiting for completion.
Simpler applications — Epoll's programming model is more widely understood.
Portability — Epoll-style abstractions exist everywhere; io_uring is Linux-specific.
io_uring can actually work together with epoll. You can use io_uring for file I/O and epoll for sockets, or use io_uring exclusively. Many applications are migrating socket handling to io_uring for the batching benefits while keeping familiar epoll-style patterns.
While asynchronous I/O provides performance benefits, it introduces significant complexity that developers must manage carefully.
```c
// DANGER: Buffer lifetime issues with async I/O

void dangerous_async_read(int fd, struct io_uring *ring) {
    char buffer[4096];  // STACK buffer - DANGER!

    struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
    io_uring_prep_read(sqe, fd, buffer, sizeof(buffer), 0);
    io_uring_submit(ring);

    // BUG: This function returns, stack is unwound,
    // 'buffer' becomes invalid while kernel still reads into it!
}  // buffer destroyed here, kernel writes to garbage memory

void safe_async_read(int fd, struct io_uring *ring) {
    // HEAP buffer - lives until explicitly freed
    char *buffer = malloc(4096);

    struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
    io_uring_prep_read(sqe, fd, buffer, 4096, 0);
    io_uring_sqe_set_data(sqe, buffer);  // Track buffer
    io_uring_submit(ring);

    // Safe: buffer survives function return
    // Must free in completion handler!
}

void completion_handler(struct io_uring_cqe *cqe) {
    char *buffer = io_uring_cqe_get_data(cqe);
    // Use buffer...
    free(buffer);  // Free when done
}
```

The complexity of async I/O is why high-level abstractions exist. Languages with async/await (C#, JavaScript, Python, Rust) hide callback complexity. Frameworks like libuv and Tokio handle buffer management. Using these abstractions is usually preferable to raw async I/O APIs.
Asynchronous I/O represents the most complete decoupling of I/O initiation from completion. Let's consolidate the key concepts:

- In true async I/O, you submit an operation and are notified on completion; your thread never blocks or polls.
- POSIX AIO is portable, but glibc implements it with a hidden thread pool, so it is not kernel-level async I/O on Linux.
- Linux native AIO (io_submit/io_getevents) is real kernel async I/O but requires O_DIRECT and handles only files.
- io_uring (Linux 5.1+) uses shared submission and completion rings to provide true async I/O for files and sockets with minimal system calls.
- Async I/O shifts responsibility to you: buffers must stay valid until completion, and results arrive out of band.
What's next:
Now that we understand the three fundamental I/O models (blocking, non-blocking, asynchronous), we'll explore I/O multiplexing—the practical technique that makes non-blocking I/O usable. We'll see how select(), poll(), and epoll() let a single thread efficiently manage thousands of connections.
You now understand asynchronous I/O—the fire-and-forget model where operations complete independently of your thread. You've seen POSIX AIO, Linux native AIO, and the modern io_uring interface. You understand when async I/O provides benefits and the complexity it introduces. Next, we'll master I/O multiplexing with select, poll, and epoll.