Every programmer learns file I/O through the lens of streams and buffers. You open a file, read chunks into application memory, process them, and write results back. This mental model—of files as external entities that must be explicitly fetched and stored—is so deeply ingrained that questioning it seems almost absurd.
But what if we simply... didn't have to do that?
Memory-mapped files offer a radical alternative: treat the file as if it were already in memory. No read() calls. No write() calls. No explicit buffering. You receive a pointer, and from that moment on, the file is just a byte array you can index, iterate, or process with any pointer-based algorithm.
// Traditional I/O: Many system calls, explicit buffer management
char buffer[4096];
while ((n = read(fd, buffer, sizeof(buffer))) > 0) {
process(buffer, n);
}
// Memory-mapped: Zero system calls for data access
char *file_data = mmap(NULL, file_size, PROT_READ, MAP_PRIVATE, fd, 0);
process(file_data, file_size); // Just use it like any array
This isn't syntactic sugar—it's a fundamental change in how your application interacts with the storage subsystem. And understanding it deeply transforms how you architect data-intensive applications.
This page explores the 'file as memory' paradigm comprehensively—how it changes your application architecture, why it can be dramatically faster than traditional I/O, what limitations exist, and when this approach is (and isn't) appropriate. You'll gain the deep understanding needed to make informed decisions about file access patterns in your systems.
To appreciate the elegance of memory-mapped files, we must first understand the overhead inherent in traditional file I/O. When you call read() on a modern operating system, a remarkable amount of machinery activates:
The System Call Journey:
User-to-Kernel Transition: Your read() triggers a system call instruction (syscall on x86-64, svc on ARM). The CPU switches from user mode to kernel mode, saving register state and switching to a kernel stack. This transition itself costs hundreds of CPU cycles.
File Descriptor Lookup: The kernel locates the struct file associated with your file descriptor in the process's file descriptor table.
Permission Verification: The kernel verifies your process has read permission on this file.
Page Cache Check: The kernel checks if the requested file data is already in the page cache (also called buffer cache). This involves translating the file offset to a page cache index and performing hash table lookups.
Cache Miss Handling: If the data isn't cached, the kernel must issue a request to the storage device and, for blocking I/O, put your process to sleep until the data arrives:
The First Copy: Data arrives from the storage device into the kernel's page cache.
The Second Copy: The kernel copies data from the page cache to your user-space buffer. This is the infamous "copy" overhead.
Return to User Space: The kernel returns execution to your application, with another mode transition.
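The per-call cost is easy to observe. Below is a minimal timing sketch (Linux/POSIX assumed; `time_reads` is a helper invented for this demo) that compares many tiny read() calls against a few page-sized ones moving the same total amount of data from /dev/zero:

```c
#define _POSIX_C_SOURCE 199309L
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

static double elapsed_ms(struct timespec a, struct timespec b) {
    return (b.tv_sec - a.tv_sec) * 1e3 + (b.tv_nsec - a.tv_nsec) / 1e6;
}

/* Time how long `calls` read() calls of `chunk` bytes take, in ms */
double time_reads(const char *path, int calls, size_t chunk) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1.0;
    char buf[4096];
    if (chunk > sizeof(buf)) chunk = sizeof(buf);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < calls; i++)
        read(fd, buf, chunk);   /* one user/kernel round trip each */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);
    return elapsed_ms(t0, t1);
}
```

Comparing `time_reads("/dev/zero", 100000, 1)` with `time_reads("/dev/zero", 25, 4096)` transfers 100KB either way, but the 1-byte loop is typically orders of magnitude slower: the difference is almost entirely system call entry/exit overhead.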
| Operation | Approximate Cost | Frequency per I/O |
|---|---|---|
| System call entry/exit | ~200-500 cycles | Per read()/write() call |
| File descriptor lookup | ~50-100 cycles | Per call |
| Page cache lookup | ~100-300 cycles | Per page accessed |
| Memory copy (per 4KB page) | ~1000-2000 cycles | Per page of data |
| Context switch (if blocking) | ~5,000-20,000 cycles | On cache miss |
| Disk I/O (HDD) | ~10,000,000 cycles | On cache miss |
| Disk I/O (SSD) | ~100,000-500,000 cycles | On cache miss |
The Copy Problem at Scale:
Consider a program that processes a 10GB file using 4KB read() calls: that is roughly 2.6 million system calls, each paying the entry/exit and lookup costs above, plus 10GB of data copied from kernel space to user space.
Even with larger buffer sizes (reducing the number of system calls), the copy overhead remains: every byte must transit from kernel space to user space, consuming memory bandwidth and CPU cycles.
The Fundamental Inefficiency:
Notice what's happening: file data gets loaded into memory twice. It exists both in the kernel's page cache AND in your application's buffer. For large files, this means double the physical memory footprint for the same data, memory bandwidth spent on the copies, and CPU caches polluted with duplicate content.
When you read() a file, the data already exists in kernel memory (the page cache) before it's copied to your buffer. With memory-mapped files, you access that same page cache directly—eliminating the copy entirely. You're not just avoiding system calls; you're avoiding unnecessary data movement at the hardware level.
Memory mapping inverts the traditional file access model. Instead of moving data from files to your program, you project files into your program's address space. The conceptual shift is profound:
Traditional Model: File → (system call) → Kernel Buffer → (copy) → User Buffer → Process
Memory-Mapped Model: File → Kernel Page Cache ←→ Process Address Space (same physical pages!)
After mmap(), your process's page table contains entries that point to the same physical pages used by the kernel's page cache. There's no intermediate copy because there's no intermediate buffer—you're directly accessing the page cache through your virtual address space.
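This page-sharing is directly observable. The sketch below (the file path is invented for the demo) maps the same file twice with MAP_SHARED; because both virtual ranges resolve to the same physical page-cache pages, a store through one view is immediately visible through the other with no system call in between:

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map the same file twice and check that a write through one view is
 * visible through the other. Returns 1 if the views share pages. */
int views_share_pages(const char *path) {
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return -1;
    if (ftruncate(fd, 4096) == -1) { close(fd); return -1; }

    /* Two independent MAP_SHARED views of the same file */
    char *a = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *b = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (a == MAP_FAILED || b == MAP_FAILED) return -1;

    /* Different virtual addresses, same physical page-cache pages */
    strcpy(a, "hello");
    int shared = (a != b) && (strcmp(b, "hello") == 0);

    munmap(a, 4096);
    munmap(b, 4096);
    return shared;
}
```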
The Virtual Memory Magic:
This unification is possible because of virtual memory's flexibility. Virtual addresses don't care what physical pages they point to. The kernel can map anonymous pages (your heap and stack), file-backed pages (executables, shared libraries, and mmap()ed files), and shared memory segments into the same address space.
All these appear as different portions of your flat virtual address space. Your code doesn't know (or care) whether a particular address holds heap data or a memory-mapped file—it's all just memory.
What Happens When You Access a Mapped Address:
char *mapped_file = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
char first_byte = mapped_file[0]; // What actually happens here?
When mapped_file[0] executes, the MMU finds no valid page table entry for that address and raises a page fault. The kernel sees that the faulting address lies inside a valid mapping, locates the corresponding page of the file in the page cache (reading it from disk if necessary), points your page table entry at that physical page, and restarts the faulting instruction. Your load then completes as an ordinary memory read.
Subsequent accesses to the same page are purely hardware operations with zero kernel involvement. This is why memory-mapped I/O can be dramatically faster for random access patterns.
Memory-mapped files achieve genuine zero-copy I/O. The data in the page cache is the exact same data your application accesses—not a copy, but the original. This eliminates memory bandwidth waste and keeps CPU caches efficient (no duplicate data polluting cache lines).
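This fault-then-free behavior can be observed with Linux's mincore(), which reports whether each page of a mapping is resident in memory. A Linux-specific sketch (the function name is invented for this demo):

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Touch the first page of a mapped file, then ask the kernel whether
 * that page is now resident. Returns 1 if resident, -1 on error. */
int page_resident_after_touch(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    long pg = sysconf(_SC_PAGESIZE);
    char *map = mmap(NULL, (size_t)pg, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (map == MAP_FAILED) return -1;

    volatile char c = map[0];   /* first access: page fault loads the page */
    (void)c;

    unsigned char vec = 0;
    if (mincore(map, (size_t)pg, &vec) == -1) {
        munmap(map, (size_t)pg);
        return -1;
    }
    int resident = vec & 1;     /* low bit set: page is in the page cache */
    munmap(map, (size_t)pg);
    return resident;
}
```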
Once a file is mapped, all the powerful tools of memory manipulation become available for file processing:
Pointer Arithmetic:
struct Header *header = (struct Header *)mapped_file;
struct Record *records = (struct Record *)(mapped_file + header->record_offset);
for (int i = 0; i < header->num_records; i++) {
process_record(&records[i]);
}
Standard Library Functions:
// Search for a byte sequence in the file
char *found = memmem(mapped_file, file_size, pattern, pattern_len);
// Compare portions of two files
int diff = memcmp(mapped_file1, mapped_file2, compare_size);
// Copy file contents to another buffer (if needed)
memcpy(destination, mapped_file + offset, length);
Data Structure Access:
// Binary search in a sorted file
struct Entry *entries = (struct Entry *)mapped_file;
int n_entries = file_size / sizeof(struct Entry);
struct Entry *target = bsearch(&key, entries, n_entries,
sizeof(struct Entry), compare_entries);
#define _GNU_SOURCE   /* memmem() is a GNU extension */
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>

// Example 1: Treating a file as a struct
typedef struct {
    uint32_t magic;
    uint32_t version;
    uint64_t entry_count;
    uint64_t data_offset;
} FileHeader;

typedef struct {
    uint64_t id;
    char name[56];
    double value;
} DataEntry;

void process_structured_file(const char *path) {
    int fd = open(path, O_RDONLY);
    struct stat sb;
    fstat(fd, &sb);

    void *map = mmap(NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);

    // Direct structure access - no parsing needed!
    FileHeader *header = (FileHeader *)map;
    if (header->magic != 0xDEADBEEF) {
        printf("Invalid file format\n");
        munmap(map, sb.st_size);
        return;
    }

    // Navigate to data section using pointer arithmetic
    DataEntry *entries = (DataEntry *)((char *)map + header->data_offset);

    // Iterate entries - feels like iterating an array
    for (uint64_t i = 0; i < header->entry_count; i++) {
        printf("Entry %lu: %s = %f\n",
               entries[i].id, entries[i].name, entries[i].value);
    }

    munmap(map, sb.st_size);
}

// Example 2: Index-based random access
typedef struct {
    uint64_t offset;
    uint32_t length;
} IndexEntry;

char *get_record_by_index(void *data_map, IndexEntry *index, int record_num) {
    // Direct access to any record via index - O(1) regardless of file size
    return (char *)data_map + index[record_num].offset;
}

// Example 3: Memory operations on file content
int count_occurrences(void *map, size_t size, const char *pattern) {
    size_t pattern_len = strlen(pattern);
    int count = 0;
    char *pos = map;
    char *end = (char *)map + size - pattern_len;

    while (pos <= end) {
        pos = memmem(pos, end - pos + pattern_len, pattern, pattern_len);
        if (pos == NULL) break;
        count++;
        pos++;
    }
    return count;
}
The Power of Direct Access:
This unified interface eliminates entire categories of code: read loops, buffer allocation and sizing, partial-read handling, and offset bookkeeping all disappear.
A complex file format that might require hundreds of lines of parsing code with read() can often be reduced to casting pointers with mmap().
Memory-mapped I/O isn't universally faster than read()—understanding when it excels is crucial for making informed decisions.
Access Pattern Analysis:
| Access Pattern | mmap() Performance | read() Performance | Winner |
|---|---|---|---|
| Random access to large file | Excellent—direct single-page faults | Poor—each access is a system call | mmap() |
| Sequential read, process once | Good—but may fault per-page | Good—readahead helps significantly | Roughly equal |
| Re-reading same data multiple times | Excellent—pages stay warm in cache | Requires explicit caching | mmap() |
| Very large file, touch small portion | Excellent—only load needed pages | Wasteful if read beyond needs | mmap() |
| Streaming data (copy to socket) | Overhead from page faults | sendfile() bypasses user space entirely | sendfile() |
| Write-heavy, durability critical | Requires msync() management | fsync() after write is clearer | read()/write() |
Why Random Access Favors mmap():
Consider a database-style access pattern: reading record #1000, then #42, then #999,000, scattered throughout a large file.
With read(): each record costs an lseek() plus a read(), two kernel round trips and a kernel-to-user copy per record, even when the data is already in the page cache.
With mmap(): each record access is pointer arithmetic plus a memory load. The first touch of a page costs one page fault; every later access to that page is pure hardware, with no kernel involvement at all.
Quantifying the Difference:
Benchmark scenario: Random access to 4-byte integers in a 1GB file, 1 million accesses:
read() with lseek(): ~45 seconds (dominated by system calls)
mmap() with indexing: ~2 seconds (page faults only on first access)
That's a 22x speedup—not from clever optimization, but from a fundamental change in the access model.
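A sketch of that benchmark's core (sizes, counts, and function names here are illustrative): random 4-byte reads via pread() versus via a mapped pointer. Seeding rand() identically before each run makes the two access sequences, and hence the sums, identical:

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Random access through pread(): one system call per access */
long long sum_random_pread(int fd, size_t n_ints, int accesses) {
    long long sum = 0;
    int32_t v;
    for (int i = 0; i < accesses; i++) {
        off_t off = (off_t)(rand() % n_ints) * sizeof(int32_t);
        pread(fd, &v, sizeof v, off);
        sum += v;
    }
    return sum;
}

/* Random access through a mapping: plain loads, faults only once per page */
long long sum_random_mmap(const int32_t *data, size_t n_ints, int accesses) {
    long long sum = 0;
    for (int i = 0; i < accesses; i++)
        sum += data[rand() % n_ints];
    return sum;
}
```

Wrapping each function in a timer (as in the earlier read() sketch) reproduces the shape of the result above: the pread() loop's cost scales with syscall count, the mmap() loop's with the number of distinct pages touched.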
For purely sequential access, the kernel's read-ahead mechanism helps read() performance significantly. When the kernel detects sequential access, it prefetches data before you request it, hiding I/O latency. mmap() also triggers read-ahead, but the patterns may be less predictable. For truly sequential processing, both methods can achieve similar throughput—but mmap() still avoids the copy overhead.
Memory Efficiency:
With traditional I/O, processing a 100GB file requires either a buffer large enough to hold the data (impossible on most machines) or explicit chunking logic that reads, processes, and discards one window at a time.
With mmap(), you can map the entire 100GB file even if you have only 16GB of RAM: the mapping consumes virtual address space, not physical memory, and the kernel pages data in and out as your access pattern demands.
This enables algorithms that conceptually need to "see" entire datasets to work on datasets larger than RAM, without explicit chunking or streaming logic.
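A runnable sketch of this idea (the path and size are illustrative): create a sparse file with ftruncate(), map a gigabyte of it, and touch only three scattered bytes. Only the three touched pages ever become resident, and holes in a sparse file read as zero bytes:

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map a file of `logical_size` bytes (created sparse, so no disk space
 * is consumed) and read three widely scattered bytes. Only the three
 * touched pages become resident; the rest is just address space. */
int sample_huge_file(const char *path, off_t logical_size) {
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return -1;
    if (ftruncate(fd, logical_size) == -1) { close(fd); return -1; }

    char *map = mmap(NULL, (size_t)logical_size, PROT_READ,
                     MAP_PRIVATE, fd, 0);
    close(fd);
    if (map == MAP_FAILED) return -1;

    /* Three page faults total, regardless of the mapping's size */
    int sum = map[0] + map[logical_size / 2] + map[logical_size - 1];
    munmap(map, (size_t)logical_size);
    return sum;     /* sparse holes read as zeros */
}
```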
Let's examine equivalent operations in both models to highlight practical differences:
Scenario 1: Reading and Processing a Configuration File
// Traditional approach
int fd = open("config.dat", O_RDONLY);
struct stat sb;
fstat(fd, &sb);

// Allocate buffer
char *buffer = malloc(sb.st_size + 1);
if (!buffer) { /* error */ }

// Read entire file
ssize_t bytes = 0;
while (bytes < sb.st_size) {
    ssize_t r = read(fd, buffer + bytes, sb.st_size - bytes);
    if (r <= 0) break;
    bytes += r;
}
close(fd);
buffer[bytes] = '\0';

// Process
process_config(buffer, bytes);

// Cleanup
free(buffer);
// Memory-mapped approach
int fd = open("config.dat", O_RDONLY);
struct stat sb;
fstat(fd, &sb);

// Map file
char *mapped = mmap(NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
close(fd);

if (mapped == MAP_FAILED) { /* error */ }

// Process - file is already accessible
process_config(mapped, sb.st_size);

// Cleanup
munmap(mapped, sb.st_size);

// Note: No malloc, no read loop,
// no buffer boundary handling
Scenario 2: Modifying a Binary File In-Place
// Modify record at offset
void update_record_traditional(int fd, off_t offset, Record *new_data) {
    // Seek to position
    if (lseek(fd, offset, SEEK_SET) == -1)
        return;

    // Write new data
    if (write(fd, new_data, sizeof(Record)) != sizeof(Record)) {
        // Partial write handling...
    }

    // Ensure durability
    fsync(fd);
}

// For multiple updates: many seeks + writes
// Modify record via memory
void update_record_mmap(void *map, off_t offset, Record *new_data) {
    // Direct memory write
    Record *target = (Record *)((char *)map + offset);
    *target = *new_data;

    // Sync if durability needed. msync() requires a page-aligned
    // address, so round target down to its page boundary first.
    long page = sysconf(_SC_PAGESIZE);
    char *start = (char *)((uintptr_t)target & ~(uintptr_t)(page - 1));
    msync(start, (char *)(target + 1) - start, MS_SYNC);
}

// For multiple updates: just assign to
// different offsets - no system calls
// until you msync()
Scenario 3: Searching for a Pattern
// Search for pattern in file
off_t find_pattern_traditional(int fd, const char *pattern,
                               size_t pattern_len) {
    char buffer[8192];
    char overlap[256];  // For cross-boundary matches
    size_t overlap_len = 0;
    off_t position = 0;

    while (1) {
        ssize_t n = read(fd, buffer, sizeof(buffer));
        if (n <= 0) return -1;

        // Search in overlap + new data
        // (Complex boundary handling)
        // ...

        position += n;
    }
}
// Search for pattern in file
off_t find_pattern_mmap(void *map, size_t file_size,
                        const char *pattern, size_t pattern_len) {
    // Use standard memory search
    void *found = memmem(map, file_size, pattern, pattern_len);
    if (found == NULL)
        return -1;
    return (char *)found - (char *)map;
}

// That's it. No buffers, no boundaries,
// no complex state management.
Memory-mapped I/O isn't just for reading—it enables intuitive file modification through standard memory operations. When you write to a MAP_SHARED mapping, your changes eventually propagate to the underlying file.
How Write Propagation Works:
Store: Your ordinary store instruction (*ptr = value) modifies the shared page in RAM; no system call occurs.
Dirty Tracking: The CPU sets the dirty bit in the page table entry, so the kernel knows the page now differs from the on-disk copy.
Writeback: The kernel's writeback machinery later flushes dirty page-cache pages to the underlying file.
The Writeback Timing Challenge:
Unlike write() which blocks until data reaches kernel buffers (and optionally disk with fsync), mmap() writes are decoupled from filesystem operations:
// When does this actually hit the disk?
*mapped_ptr = new_value; // Only modifies RAM (page cache)
// Answer: "Eventually" — when:
// 1. Kernel writeback timer fires (usually 5-30 seconds)
// 2. System runs low on memory
// 3. You explicitly call msync()
// 4. You call munmap()
// 5. Process exits
For crash-safe applications, never assume mmap() writes reached disk. A power failure or kernel crash before writeback will lose your changes. Use msync(addr, len, MS_SYNC) to force writes to disk, similar to fsync() after write().
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdint.h>

typedef struct {
    uint64_t transaction_id;
    char data[4088];
} Record;

// Safely update a record with durability guarantee
int update_record_durably(void *map, size_t map_size,
                          int record_index, Record *new_record) {
    Record *records = (Record *)map;
    Record *target = &records[record_index];

    // Validate bounds
    if ((char *)(target + 1) > (char *)map + map_size) {
        return -1;  // Out of bounds
    }

    // Step 1: Write to memory
    *target = *new_record;

    // Step 2: Force to disk synchronously
    // (sizeof(Record) is 4096, so each record starts page-aligned,
    // satisfying msync()'s alignment requirement)
    // MS_SYNC: Wait for write to complete
    // MS_ASYNC: Schedule write but don't wait
    // MS_INVALIDATE: Invalidate other mappings (rarely needed)
    if (msync(target, sizeof(Record), MS_SYNC) == -1) {
        perror("msync failed");
        // Note: Data might still be in page cache
        // It may still reach disk eventually
        return -1;
    }

    return 0;  // Record durably stored
}

// Batch multiple updates, then sync once for efficiency
int update_records_batch(void *map, size_t map_size,
                         int *indices, Record *new_records, int count) {
    Record *records = (Record *)map;

    // Step 1: All writes to memory (fast)
    for (int i = 0; i < count; i++) {
        records[indices[i]] = new_records[i];
    }

    // Step 2: Single sync for entire mapped region (slower, but once)
    if (msync(map, map_size, MS_SYNC) == -1) {
        perror("msync failed");
        return -1;
    }

    return 0;
}
msync() Flags Explained:
MS_SYNC: Block until the data has reached the device; this is the durability guarantee.
MS_ASYNC: Schedule the writeback and return immediately.
MS_INVALIDATE: Invalidate other mappings of the same region so they see the freshly written data (rarely needed).
Extending Files Through mmap():
You cannot extend a file's size by writing beyond its current end through mmap(). Accessing pages beyond the end of the file triggers SIGBUS (writes within the final partial page succeed but are never written back to the file). To grow a file:
// Growing a memory-mapped file
int fd = open("data.bin", O_RDWR);
// Extend the file first
if (ftruncate(fd, new_size) == -1) {
perror("ftruncate");
// handle error
}
// Now remap with larger size
// Note: Must munmap() old mapping first, or use mremap() on Linux
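A portable helper combining those steps might look like this (`grow_mapping` is a name invented for this sketch; Linux's mremap() could avoid the unmap/map pair but isn't portable):

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Grow the file with ftruncate(), drop the stale mapping, and map the
 * new size. The caller must treat old_map as invalid afterwards. */
void *grow_mapping(int fd, void *old_map, size_t old_size, size_t new_size) {
    if (ftruncate(fd, (off_t)new_size) == -1)
        return MAP_FAILED;  /* the file must grow before its pages exist */
    if (old_map != NULL)
        munmap(old_map, old_size);  /* old pointer is now invalid */
    return mmap(NULL, new_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

Note that any pointers derived from the old mapping are dangling after the call; callers typically store offsets rather than raw pointers for exactly this reason.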
Memory-mapped I/O isn't universally superior. Understanding its limitations helps you make appropriate choices:
Pitfall 1: Error Handling Complexity
With read(), I/O errors return immediately via the return value:
if (read(fd, buf, size) == -1) {
// Handle error - we know immediately
}
With mmap(), I/O errors manifest as signals when you access the memory:
char *data = mmap(...);
// mmap() succeeded, but...
char c = data[0]; // This might SIGBUS if disk read fails!
Handling errors requires setting up signal handlers—substantially more complex than checking return values.
Pitfall 2: The Truncation Race
Consider this dangerous scenario: process A maps a file, then another process truncates it. When process A next touches a page beyond the new end of file, the access raises SIGBUS and, unhandled, kills the process.
Unlike read(), which would return 0 or an error for a truncated file, mmap() causes a crash. Solutions: coordinate with file locks so a mapped file is never truncated, re-fstat() and remap when the file may have changed, or install a SIGBUS handler that recovers from faulting accesses.
Pitfall 3: Sequential Streaming Performance
For pure sequential reads of massive files, mmap() may not outperform optimized read() loops:
// This is often faster for sequential reads:
while ((n = read(fd, buf, BIG_BUFFER)) > 0) {
send(socket_fd, buf, n, 0); // Stream to network
}
// Better still - zero-copy to socket:
sendfile(socket_fd, fd, NULL, file_size);
The sendfile() system call moves data directly between file and socket inside the kernel—here even mmap() can't compete, because sending mapped data still requires the process to touch it from user space, while sendfile() never leaves kernel space.
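Since Linux 2.6.33, sendfile()'s destination may be a regular file as well as a socket, which makes the kernel-side copy easy to demonstrate without networking (paths and the function name here are illustrative):

```c
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <unistd.h>

/* Copy src to dst using sendfile(). Returns bytes copied, -1 on error. */
long copy_file_kernel_side(const char *src, const char *dst) {
    int in = open(src, O_RDONLY);
    if (in < 0) return -1;
    struct stat sb;
    fstat(in, &sb);

    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out < 0) { close(in); return -1; }

    /* Data moves page cache -> page cache; it never enters user space */
    long copied = (long)sendfile(out, in, NULL, (size_t)sb.st_size);
    close(in);
    close(out);
    return copied;
}
```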
On 32-bit architectures, user-space typically has ~3GB of virtual address space. Mapping multiple large files simultaneously can exhaust this quickly. Each mapping consumes virtual address space even if you never touch most pages. On 64-bit systems, this is rarely a concern—you have 128TB or more of virtual space.
Use this decision framework to choose between memory mapping and traditional I/O:
| Use Case | Recommended Approach | Rationale |
|---|---|---|
| Database files | mmap() | Random access, re-reads, structured data |
| Log file tailing | read() | Sequential, may need nonblocking |
| Configuration files | mmap() or read() | Usually small, either works |
| Image/video editing | mmap() | Random access, sparse edits |
| HTTP file serving | sendfile() | Zero-copy to socket |
| Search/grep | mmap() | Simpler algorithms, may re-scan |
| Archive extraction | read()+write() | Sequential decompression |
We've explored the paradigm of treating files as memory—how memory-mapped files fundamentally transform file access patterns and enable elegant, efficient solutions to data processing challenges.
What's Next:
With the file-as-memory paradigm understood, we'll examine how the kernel's lazy loading mechanism makes mmap() efficient even for enormous files. The next page explores lazy loading in depth—how pages are faulted in on demand, what triggers I/O, and how to optimize access patterns for your working set.
You now understand how memory-mapped files transform file access—allowing you to treat persistent storage as ordinary memory. This unified interface simplifies code, eliminates copying overhead, and enables powerful data processing patterns. Next, we'll explore the lazy loading mechanism that makes this efficient.