What if you could access a file as simply as accessing an array in memory? No read() calls, no write() calls, no buffers to manage—just pointers and dereferencing.
Memory-mapped file access makes this possible. By mapping a file into a process's virtual address space, the operating system creates a seamless bridge between disk and memory. Your program can read file contents with data[offset] and modify them with data[offset] = value. The OS handles everything: loading pages on demand, writing changes back to disk, and managing the buffer cache.
This isn't just a convenience—it's a powerful optimization technique. Memory mapping eliminates the copy between kernel and user space that conventional read/write incurs. It enables zero-copy file sharing between processes. It allows the OS to use the same memory pages for the buffer cache and your application's view of the file.
At the same time, memory mapping has subtleties and pitfalls that can catch the unwary. Understanding when to use it—and when not to—is essential for systems programming mastery.
By the end of this page, you will master memory-mapped file access—the mmap() system call with all its parameters, the mechanics of demand paging, shared vs. private mappings, performance implications, essential use cases, and critical pitfalls that can cause data corruption or crashes.
Memory mapping creates a direct correspondence between a region of virtual memory and a file on disk:
┌────────────────────────────────────────────────────────┐
│ Process Virtual Address Space │
│ │
│ 0x00000000 ─────┐ │
│ ... │ │
│ 0x7f000000 ───► ┌─────────────────────────────────┐ │
│ │ Memory-Mapped Region │ │
│ │ (points to file data) │ │
│ ◄─────────┤ ptr = mmap(...) │ │
│ │ ptr[0], ptr[1], ...ptr[n] │ │
│ 0x7f100000 ───► └─────────────────────────────────┘ │
│ ... │ │
└────────────────────────────────────────────────────────┘
│
│ (Page table mapping)
▼
┌────────────────────────────────────────────────────────┐
│ File on Disk │
│ ┌──────────────────────────────────────────────────┐ │
│ │ Page 0 │ Page 1 │ Page 2 │ ... │ Page N │ │
│ │ 4KB │ 4KB │ 4KB │ │ 4KB │ │
│ └──────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────┘
Key insight: After mapping, accessing memory addresses in the mapped region causes the OS to:
On first access to a page: Trigger a page fault, load the corresponding file page from disk (or buffer cache), map it into the process's address space, and resume execution.
On subsequent accesses: Access the already-loaded page directly in memory—no system call, no copy.
On modification (for writable mappings): Mark the page dirty. The OS will eventually write it back to the file.
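To make the demand-paging behavior above visible, here is a minimal, Linux-specific sketch (mincore() is not portable POSIX) that counts how many pages of a mapping are resident before and after touching them. The filename some_large_file.dat is a placeholder; the file is assumed to be non-empty.

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* Count how many pages of the mapping are currently resident in memory. */
static size_t resident_pages(void *addr, size_t length, size_t page_size) {
    size_t n_pages = (length + page_size - 1) / page_size;
    unsigned char *vec = malloc(n_pages);
    size_t resident = 0;
    if (vec && mincore(addr, length, vec) == 0) {
        for (size_t i = 0; i < n_pages; i++)
            if (vec[i] & 1) resident++;
    }
    free(vec);
    return resident;
}

int main(void) {
    size_t page_size = sysconf(_SC_PAGE_SIZE);
    int fd = open("some_large_file.dat", O_RDONLY);   /* hypothetical file */
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    fstat(fd, &st);
    char *data = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    printf("Resident just after mmap: %zu pages\n",
           resident_pages(data, st.st_size, page_size));

    /* Touch every page: each first touch triggers a page fault. */
    volatile char sum = 0;
    for (off_t off = 0; off < st.st_size; off += page_size)
        sum += data[off];

    printf("Resident after touching:  %zu pages\n",
           resident_pages(data, st.st_size, page_size));

    munmap(data, st.st_size);
    return 0;
}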
Benefits of memory mapping:
- No per-access system calls: after mapping, reads and writes are ordinary memory instructions
- No extra copy between kernel buffers and user-space buffers
- Automatic integration with the OS buffer cache
- Straightforward zero-copy sharing of file pages between processes
Memory mapping leverages the same virtual memory machinery that provides process isolation and demand paging for regular program memory. The file becomes just another source of page contents, handled by the same page fault mechanism and buffer cache infrastructure.
The mmap() system call creates memory mappings. Understanding its parameters deeply is essential for using it correctly.
The signature:
#include <sys/mman.h>
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
Parameters in detail:
addr (Hint for mapping address) — Usually NULL, letting the OS choose an appropriate address.

length (Size of the mapping) — Number of bytes to map, typically the file size; the kernel rounds it up to whole pages internally.

prot (Protection flags)
- PROT_READ — Pages can be read
- PROT_WRITE — Pages can be written
- PROT_EXEC — Pages can be executed
- PROT_NONE — Pages cannot be accessed (guard pages)
The most common combination is PROT_READ | PROT_WRITE.

flags (Mapping type and behavior)
- MAP_SHARED — Changes are visible to others, written to file
- MAP_PRIVATE — Copy-on-write; changes are private
- MAP_FIXED — Use addr exactly (dangerous; can overwrite existing mappings)
- MAP_ANONYMOUS — No file backing; memory initialized to zero
- MAP_POPULATE — Prefault pages on map (avoid later faults)
- MAP_HUGETLB — Use huge pages

fd (File descriptor) — An open descriptor for the file to map (pass -1 with MAP_ANONYMOUS).

offset (Offset into file) — Where in the file the mapping begins; must be a multiple of the page size.
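Because the examples below all map from offset 0, here is a small sketch of a non-zero offset mapping, assuming a hypothetical file example.dat that is at least two pages long. It maps only the file's second page.

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void) {
    long page_size = sysconf(_SC_PAGE_SIZE);   /* typically 4096 */
    int fd = open("example.dat", O_RDONLY);    /* hypothetical file, >= 2 pages */
    if (fd < 0) { perror("open"); return 1; }

    /* offset must be a multiple of the page size; length need not be */
    char *page1 = mmap(NULL, page_size, PROT_READ, MAP_PRIVATE, fd, page_size);
    if (page1 == MAP_FAILED) { perror("mmap"); close(fd); return 1; }
    close(fd);

    /* page1[0] corresponds to byte page_size of the file */
    printf("First byte of second page: 0x%02x\n", (unsigned char)page1[0]);

    munmap(page1, page_size);
    return 0;
}

The longer listing that follows walks through the most common mapping patterns end to end.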
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <string.h>

/**
 * Example 1: Read-only mapping of entire file
 */
void read_file_via_mmap(const char *filename) {
    int fd = open(filename, O_RDONLY);
    if (fd < 0) { perror("open"); return; }

    // Get file size
    struct stat st;
    fstat(fd, &st);
    size_t file_size = st.st_size;

    // Map the file read-only
    char *data = mmap(NULL, file_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); close(fd); return; }

    // Can close fd after mapping (mapping remains valid)
    close(fd);

    // Access file contents directly via pointer
    printf("First 100 bytes: %.100s\n", data);
    printf("Byte at offset 1000: 0x%02x\n", (unsigned char)data[1000]);

    // Cleanup
    munmap(data, file_size);
}

/**
 * Example 2: Read-write mapping (modifications written to file)
 */
void modify_file_via_mmap(const char *filename) {
    int fd = open(filename, O_RDWR);
    if (fd < 0) { perror("open"); return; }

    struct stat st;
    fstat(fd, &st);
    size_t file_size = st.st_size;

    // Map read-write, shared (writes go to file)
    char *data = mmap(NULL, file_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); close(fd); return; }
    close(fd);

    // Modify file by writing to memory
    memcpy(data, "MODIFIED!", 9);   // Changes bytes 0-8 of file
    data[100] = 'X';                // Changes byte 100 of file

    // Force write to disk (optional; OS does this eventually)
    msync(data, file_size, MS_SYNC);

    munmap(data, file_size);
    printf("File modified via memory mapping\n");
}

/**
 * Example 3: Private mapping (copy-on-write)
 */
void private_copy_via_mmap(const char *filename) {
    int fd = open(filename, O_RDONLY);   // Read-only is OK for private
    if (fd < 0) { perror("open"); return; }

    struct stat st;
    fstat(fd, &st);
    size_t file_size = st.st_size;

    // Private mapping allows writes but they don't affect the file
    char *data = mmap(NULL, file_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);

    // Modify in memory - creates private copy of affected pages
    data[0] = 'X';   // Page 0 copied, modification is private

    // Original file is unchanged!
    munmap(data, file_size);
}

/**
 * Example 4: Create new file via mapping
 */
void create_file_via_mmap(const char *filename, size_t size) {
    int fd = open(filename, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return; }

    // Extend file to desired size
    ftruncate(fd, size);

    // Map the empty file
    char *data = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);

    // Fill the file by writing to memory
    memset(data, 0, size);
    strcpy(data, "File created and populated via mmap!");

    msync(data, size, MS_SYNC);
    munmap(data, size);
}

/**
 * Example 5: Anonymous mapping (no file, just memory)
 */
void anonymous_mapping_demo() {
    size_t size = 1024 * 1024;   // 1 MB

    // No file backing; memory initialized to zero
    char *data = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (data == MAP_FAILED) { perror("mmap"); return; }

    // Use like malloc'd memory
    strcpy(data, "Anonymous mapping content");
    printf("Data: %s\n", data);

    // Free with munmap, not free()
    munmap(data, size);
}

Always call munmap() when done with a mapping. Unlike malloc/free, unreleased mappings persist until process exit. For long-running processes, failing to munmap leaks address space and kernel resources. close() does NOT unmap—the mapping survives after close().
The choice between MAP_SHARED and MAP_PRIVATE is one of the most important decisions when memory mapping. Getting it wrong can cause silent data loss or corruption.
MAP_SHARED: Changes Written to File

With MAP_SHARED, the mapped pages are the file's pages in the OS buffer cache. Writes through the mapping modify the file and are visible to every other process that maps the same file.

MAP_PRIVATE: Copy-On-Write Semantics

With MAP_PRIVATE, the first write to a page triggers copy-on-write: the process gets its own copy of that page. The underlying file is never modified, and other processes never see the changes.
Behavior comparison:
| Aspect | MAP_SHARED | MAP_PRIVATE |
|---|---|---|
| Write to mapping | Modifies file | Private copy (COW) |
| Other processes see changes | Yes | No |
| File modification | File updated | File unchanged |
| Memory use on write | Same pages | Copy made per page modified |
| Requires O_RDWR file | Yes (for writes) | No |
| Use case | IPC, databases, logs | Executables, text files, sandboxes |
| msync() effect | Flushes changes to disk | No effect on the file (modified pages are private copies) |
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>

/**
 * Demonstrates the crucial difference between MAP_SHARED and MAP_PRIVATE
 */

void demo_shared_mapping() {
    printf("=== MAP_SHARED Demo ===\n");

    // Create a test file with initial content
    int fd = open("shared_test.dat", O_RDWR | O_CREAT | O_TRUNC, 0644);
    write(fd, "ORIGINAL", 8);

    // Map shared
    char *data = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);

    pid_t pid = fork();
    if (pid == 0) {
        // Child modifies the mapping
        data[0] = 'X';
        munmap(data, 4096);
        exit(0);
    }

    wait(NULL);   // Wait for child

    // Parent sees child's change!
    printf("After child: %.8s\n", data);   // Prints "XRIGINAL"

    // File also modified!
    fd = open("shared_test.dat", O_RDONLY);
    char buf[9] = {0};
    read(fd, buf, 8);
    printf("File content: %s\n", buf);   // Prints "XRIGINAL"
    close(fd);

    munmap(data, 4096);
}

void demo_private_mapping() {
    printf("\n=== MAP_PRIVATE Demo ===\n");

    // Create a test file
    int fd = open("private_test.dat", O_RDWR | O_CREAT | O_TRUNC, 0644);
    write(fd, "ORIGINAL", 8);

    // Map private
    char *data = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);

    pid_t pid = fork();
    if (pid == 0) {
        // Child modifies the mapping (creates private copy)
        data[0] = 'Y';
        printf("Child sees: %.8s\n", data);   // Prints "YRIGINAL"
        munmap(data, 4096);
        exit(0);
    }

    wait(NULL);

    // Parent does NOT see child's change!
    printf("Parent sees: %.8s\n", data);   // Prints "ORIGINAL"

    // File is completely unchanged
    fd = open("private_test.dat", O_RDONLY);
    char buf[9] = {0};
    read(fd, buf, 8);
    printf("File content: %s\n", buf);   // Prints "ORIGINAL"
    close(fd);

    munmap(data, 4096);
}

int main() {
    demo_shared_mapping();
    demo_private_mapping();
    return 0;
}

Unless you specifically need file modification or inter-process sharing, use MAP_PRIVATE. It's safer—you cannot accidentally corrupt the original file, and copy-on-write means unmodified pages don't consume extra memory.
With MAP_SHARED, modifications to the mapping are eventually written to the file—but 'eventually' means the OS decides when. For applications requiring durability guarantees (databases, transaction logs), you need explicit synchronization.
The msync() system call:
#include <sys/mman.h>
int msync(void *addr, size_t length, int flags);
Parameters:
- MS_SYNC — Synchronous; blocks until data is on disk
- MS_ASYNC — Asynchronous; schedules write but returns immediately
- MS_INVALIDATE — Invalidate other mappings of this file (force re-read)

When writes actually happen without msync(): the kernel writes dirty pages back on its own schedule, typically via periodic flusher/writeback threads or when pages are evicted under memory pressure. There can be a window of many seconds between your store instruction and the data reaching disk.

The durability problem: if the system crashes inside that window, modifications that were already visible in memory are silently lost. Applications that need durability must force the write-back themselves, as the example below shows.
#include <stdio.h>
#include <stdlib.h>   /* for malloc() */
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

/**
 * Example: Transaction log with durability requirements
 */

typedef struct {
    int transaction_id;
    char data[60];
} LogEntry;

typedef struct {
    int fd;
    char *base;
    size_t size;
    size_t offset;
} MappedLog;

MappedLog* create_mapped_log(const char *filename, size_t size) {
    MappedLog *log = malloc(sizeof(MappedLog));
    log->fd = open(filename, O_RDWR | O_CREAT, 0644);
    ftruncate(log->fd, size);
    log->base = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, log->fd, 0);
    log->size = size;
    log->offset = 0;
    return log;
}

/**
 * Append entry WITHOUT durability guarantee
 * Fast but data may be lost on crash
 */
void append_entry_fast(MappedLog *log, LogEntry *entry) {
    memcpy(log->base + log->offset, entry, sizeof(LogEntry));
    log->offset += sizeof(LogEntry);
    // No msync - relies on OS eventual write-back
}

/**
 * Append entry WITH durability guarantee
 * Slower but data is safe on disk after return
 */
void append_entry_durable(MappedLog *log, LogEntry *entry) {
    memcpy(log->base + log->offset, entry, sizeof(LogEntry));

    // Sync the specific region containing the new entry
    // Round down to page boundary for addr
    size_t page_size = 4096;
    void *sync_addr = (void*)((size_t)(log->base + log->offset) & ~(page_size - 1));
    size_t sync_len = sizeof(LogEntry) + ((log->base + log->offset) - (char*)sync_addr);

    // MS_SYNC blocks until data is durably on disk
    if (msync(sync_addr, sync_len, MS_SYNC) != 0) {
        perror("msync");
    }

    log->offset += sizeof(LogEntry);
}

/**
 * Append entry with async hint
 * Middle ground: OS will prioritize writing this soon
 */
void append_entry_async(MappedLog *log, LogEntry *entry) {
    memcpy(log->base + log->offset, entry, sizeof(LogEntry));

    void *sync_addr = log->base;   // Sync entire mapping
    size_t sync_len = log->size;

    // MS_ASYNC just schedules the write, returns immediately
    msync(sync_addr, sync_len, MS_ASYNC);

    log->offset += sizeof(LogEntry);
}

/*
 * Performance implications:
 *
 * - append_entry_fast():    ~100,000-1,000,000 entries/sec
 * - append_entry_async():   ~50,000-500,000 entries/sec
 * - append_entry_durable(): ~100-10,000 entries/sec (depends on disk)
 *
 * The durability version is dramatically slower because it waits
 * for disk I/O to complete. Use it only when crash safety is required.
 */

For true durability on some file systems/hardware, you may also need fsync(fd) even after msync(). This is because msync() may only ensure data reaches the disk controller's write cache, not permanent storage. For critical data, use both: msync(data, len, MS_SYNC) followed by fsync(fd).
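As a sketch of that combined pattern (a continuation of the example above, reusing its MappedLog and LogEntry types and relying on create_mapped_log() leaving log->fd open):

/*
 * Fully durable append: msync() pushes the dirty pages, then fsync()
 * asks the file system to commit data and metadata past device caches.
 */
void append_entry_fully_durable(MappedLog *log, LogEntry *entry) {
    memcpy(log->base + log->offset, entry, sizeof(LogEntry));

    if (msync(log->base, log->size, MS_SYNC) != 0)
        perror("msync");
    if (fsync(log->fd) != 0)        /* requires keeping the fd open */
        perror("fsync");

    log->offset += sizeof(LogEntry);
}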
Memory mapping offers significant performance benefits but isn't universally superior. Understanding the trade-offs is essential.
Advantages of mmap() over read()/write():

- Zero-copy access — File data is read directly from the page cache; there is no copy from a kernel buffer into a user-space buffer.
- No system call per access — After the initial mmap(), every read or write is an ordinary memory instruction.
- Automatic caching — The mapping is backed by the OS page cache, so frequently used pages stay in memory without application-level caching.
- Efficient random access — Jumping to any offset is just pointer arithmetic; there is no lseek() or per-request system call.
Disadvantages and when read()/write() is better:
| Scenario | Better Choice | Reason |
|---|---|---|
| Small file, read once | read() | mmap overhead (page table setup) exceeds benefit |
| Large file, random access | mmap() | Zero-copy, no syscall per access |
| Streaming sequential | read() with large buffer | Similar performance, simpler error handling |
| Database page access | mmap() | Buffer pool integrates with OS page cache |
| Strict I/O error handling | read()/write() | mmap() errors manifest as SIGBUS, hard to handle |
| Files larger than address space | read()/write() | A 32-bit process has only ~2-3 GB of usable address space; large files require windowed mappings |
| Compressed/encrypted files | read()/write() | Must transform data on read anyway |
| Network file systems (NFS) | read()/write() | mmap() semantics poorly defined on network |
Quantitative performance comparison:
| Operation | read() | mmap() | Winner |
|---|---|---|---|
| Setup 1MB file | ~10μs (open) | ~50μs (open+mmap) | read() |
| Sequential read 1MB | ~500μs | ~500μs | Tie |
| Random read 1K chunks | ~1ms (1000 syscalls) | ~10μs (memory access) | mmap() 100x |
| Single 4K read | ~2μs | ~10μs (page fault) | read() |
| Repeated 4K reads (cached) | ~1.5μs each | ~0.01μs each | mmap() 100x |
Key observations: mmap() dominates when the same data is accessed repeatedly or at random offsets, because each access avoids a system call and a copy. read() wins for small, one-shot transfers, where mmap()'s setup cost and initial page faults outweigh its benefits. For straightforward sequential streaming, the two are roughly equivalent.
Page faults during access can cause latency spikes. For latency-sensitive applications, use MAP_POPULATE to prefault all pages at mmap() time, or use mlock() to both fault and pin pages in memory. This trades upfront latency for consistent access times.
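A sketch of that approach, assuming a hypothetical file latency_critical.dat and a sufficiently high RLIMIT_MEMLOCK; note that MAP_POPULATE is Linux-specific.

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(void) {
    int fd = open("latency_critical.dat", O_RDONLY);   /* hypothetical file */
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    fstat(fd, &st);

    /* MAP_POPULATE (Linux) faults all pages in up front, so later reads
       don't stall on page faults. */
    char *data = mmap(NULL, st.st_size, PROT_READ,
                      MAP_PRIVATE | MAP_POPULATE, fd, 0);
    close(fd);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    /* mlock() pins the pages so they cannot be evicted under memory
       pressure; it may fail if RLIMIT_MEMLOCK is too low. */
    if (mlock(data, st.st_size) != 0)
        perror("mlock");

    /* ... latency-sensitive accesses here are plain memory reads ... */

    munlock(data, st.st_size);
    munmap(data, st.st_size);
    return 0;
}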
Memory mapping shines in specific scenarios. Let's examine the canonical use cases:
1. Database Buffer Pools
Databases like SQLite, LMDB, and MongoDB use (or can use) mmap() for their buffer pools: database pages are mapped directly, so the OS page cache doubles as the buffer pool and hot pages stay resident without an extra copy in user space.
Caveat: Some databases (PostgreSQL) avoid mmap() for better control over write ordering and cache policies.
2. Loading Executables and Shared Libraries
When you run a program, the OS uses mmap(): the loader maps the executable's code and data segments (typically MAP_PRIVATE) from the binary, and shared libraries are mapped the same way, so a single physical copy of a library's code pages serves every process that uses it. You can observe this directly, as in the sketch below.
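On Linux, dumping /proc/self/maps shows the mappings of the current process; the file-backed lines for the executable and for libraries such as libc are regions the loader created with mmap(). A small sketch:

#include <stdio.h>

/* Print this process's memory map (Linux-specific). */
int main(void) {
    FILE *maps = fopen("/proc/self/maps", "r");
    if (!maps) { perror("fopen"); return 1; }

    char line[512];
    while (fgets(line, sizeof(line), maps))
        fputs(line, stdout);   // each line: address range, perms, offset, file

    fclose(maps);
    return 0;
}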
3. Text Editors and IDEs
Editing large files efficiently: an editor can map a multi-gigabyte file and rely on demand paging to bring in only the regions the user actually views, rather than reading the whole file up front.
4. Inter-Process Communication (IPC)
Multiple processes sharing memory: a MAP_SHARED region, whether file-backed, anonymous across fork(), or named via shm_open(), gives cooperating processes a common block of pages they can read and write directly; see the sketch below.
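A minimal sketch of the simplest variant: a shared anonymous mapping created before fork(), so parent and child see the same physical pages. Unrelated processes would instead mmap() a named object obtained from shm_open().

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/wait.h>

int main(void) {
    /* Shared anonymous region: visible to both sides of the fork. */
    int *counter = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (counter == MAP_FAILED) { perror("mmap"); return 1; }
    *counter = 0;

    pid_t pid = fork();
    if (pid == 0) {
        *counter = 42;            /* child writes into the shared page */
        exit(0);
    }

    wait(NULL);
    printf("Parent sees: %d\n", *counter);   /* prints 42 */

    munmap(counter, sizeof(int));
    return 0;
}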
5. Fast File Copying
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <string.h>

/**
 * Fast file copy using memory mapping
 * Outperforms read/write loops for large files
 */
int copy_file_mmap(const char *src, const char *dst) {
    // Open and map source
    int src_fd = open(src, O_RDONLY);
    if (src_fd < 0) return -1;

    struct stat st;
    fstat(src_fd, &st);
    size_t size = st.st_size;

    char *src_data = mmap(NULL, size, PROT_READ, MAP_PRIVATE, src_fd, 0);
    close(src_fd);
    if (src_data == MAP_FAILED) return -1;

    // Create and map destination
    int dst_fd = open(dst, O_RDWR | O_CREAT | O_TRUNC, st.st_mode);
    if (dst_fd < 0) { munmap(src_data, size); return -1; }

    ftruncate(dst_fd, size);

    char *dst_data = mmap(NULL, size, PROT_WRITE, MAP_SHARED, dst_fd, 0);
    close(dst_fd);
    if (dst_data == MAP_FAILED) { munmap(src_data, size); return -1; }

    // Single memcpy for entire file
    memcpy(dst_data, src_data, size);

    // Ensure data is on disk
    msync(dst_data, size, MS_SYNC);

    munmap(src_data, size);
    munmap(dst_data, size);
    return 0;
}

/*
 * Performance comparison for 1GB file:
 *
 * Traditional (4K buffer loop): ~3 seconds
 * mmap + memcpy:                ~2 seconds
 * sendfile():                   ~1.5 seconds (kernel-only copy)
 *
 * mmap wins due to zero-copy and efficient page handling.
 * Even faster: use sendfile() or copy_file_range() for pure copy.
 */

6. Memory-Mapped Data Structures
Persistent data structures that survive process restarts:
// Map a file containing a hash table
HashTable *table = mmap(NULL, size, PROT_READ | PROT_WRITE,
MAP_SHARED, fd, 0);
// Use directly - operations persist automatically
table->slots[hash(key)] = value;
// On restart, just mmap again - data is there
Used by: LMDB, Redis persistence, configuration stores.
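As a concrete illustration of the idea, here is a deliberately tiny sketch with a hypothetical state.bin file and an illustrative struct; a real format would add a magic number, versioning, and checksums.

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

/* Illustrative persistent record. */
typedef struct {
    long run_count;
    long last_value;
} PersistentState;

int main(void) {
    int fd = open("state.bin", O_RDWR | O_CREAT, 0644);   /* hypothetical file */
    if (fd < 0) { perror("open"); return 1; }
    ftruncate(fd, sizeof(PersistentState));   /* zero-filled on first creation */

    PersistentState *state = mmap(NULL, sizeof(PersistentState),
                                  PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (state == MAP_FAILED) { perror("mmap"); return 1; }

    /* Ordinary struct access; changes persist across runs. */
    state->run_count++;
    state->last_value = 1234;
    printf("This program has run %ld times\n", state->run_count);

    msync(state, sizeof(PersistentState), MS_SYNC);
    munmap(state, sizeof(PersistentState));
    return 0;
}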
7. Large Binary Data Access
Scientific computing, image processing, and data analysis workloads map huge binary datasets and index them like in-memory arrays, letting the OS page data in on demand, even for files larger than RAM; a sketch follows below.
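A sketch of this pattern, assuming a hypothetical raw file samples.f64 containing packed doubles:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(void) {
    int fd = open("samples.f64", O_RDONLY);   /* hypothetical raw array of doubles */
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    fstat(fd, &st);
    size_t count = st.st_size / sizeof(double);

    const double *samples = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (samples == MAP_FAILED) { perror("mmap"); return 1; }

    /* Treat the whole file as an in-memory array; the OS pages data in
       on demand, so even files larger than RAM can be scanned. */
    double sum = 0.0;
    for (size_t i = 0; i < count; i++)
        sum += samples[i];
    printf("Mean of %zu samples: %f\n", count, count ? sum / count : 0.0);

    munmap((void *)samples, st.st_size);
    return 0;
}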
LMDB (Lightning Memory-Mapped Database) uses mmap() as its core architecture. The entire database is memory-mapped; all reads are direct pointer access. Combined with copy-on-write B-trees, LMDB achieves exceptional read performance with clean code. It's an excellent study in mmap() power.
Memory mapping has subtle pitfalls that can cause crashes, corruption, or security vulnerabilities. Understanding these is essential for safe usage.
Pitfall 1: SIGBUS on Access Past EOF
If you map more pages than the file actually contains, either because the file shrank after mapping or because you mapped beyond EOF from the start, then touching a page that lies entirely past the end of the file generates SIGBUS. (Accesses within the file's last, partially backed page succeed; bytes past EOF there simply read as zero.)
// File is 1000 bytes, so only page 0 is backed by file data
char *data = mmap(NULL, 4 * 4096, PROT_READ, MAP_SHARED, fd, 0);
char c = data[2000]; // OK: still within page 0; reads zero-fill past EOF
char d = data[5000]; // SIGBUS! Page 1 lies entirely beyond the end of the file
Pitfall 2: File Truncation Race
// Process A maps file
char *data = mmap(...);
// Process B truncates file
truncate(filename, 0);
// Process A accesses mapped region
data[1000] = 'x'; // SIGBUS!
Solution: Coordinate between processes, or handle SIGBUS.
Pitfall 3: Write Without MAP_SHARED
// Map private
char *data = mmap(NULL, size, PROT_READ | PROT_WRITE,
MAP_PRIVATE, fd, 0);
// Write (creates private copy, file unchanged!)
data[0] = 'X';
msync(data, size, MS_SYNC); // No effect on the file - modified pages are private copies
// Close - file is NOT modified!
Solution: Use MAP_SHARED for file modifications.
Pitfall 4: Missing munmap() (Resource Leak)
for (int i = 0; i < 1000000; i++) {
char *p = mmap(...);
// Use p...
// Forgot munmap()!
}
// Eventually: out of address space or memory
Pitfall 5: SIGBUS Handling
#include <stdio.h>
#include <signal.h>
#include <setjmp.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

static sigjmp_buf jump_buf;
static volatile sig_atomic_t got_sigbus = 0;

void sigbus_handler(int sig) {
    got_sigbus = 1;
    siglongjmp(jump_buf, 1);
}

/**
 * Safe mmap access with SIGBUS handling
 * Returns 0 on success, -1 on SIGBUS
 */
int safe_mmap_access(char *data, size_t offset, char *result) {
    // Install SIGBUS handler
    struct sigaction sa, old_sa;
    sa.sa_handler = sigbus_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGBUS, &sa, &old_sa);

    got_sigbus = 0;

    // Set jump point
    if (sigsetjmp(jump_buf, 1) == 0) {
        // Normal path
        *result = data[offset];             // May SIGBUS
        sigaction(SIGBUS, &old_sa, NULL);   // Restore
        return 0;                           // Success
    } else {
        // Returned from SIGBUS handler
        sigaction(SIGBUS, &old_sa, NULL);
        return -1;                          // Failed
    }
}

/*
 * WARNING: SIGBUS handling is tricky:
 * - Can't safely return to the faulting instruction
 * - Must longjmp out or exit
 * - Signal-safe functions only in handler
 * - Better to prevent SIGBUS via proper size checking
 *
 * Prevention is better than handling:
 * - Check file size before mapping
 * - Don't access beyond mapped region
 * - Use flock() or coordination for shared files
 */

Never mmap() files from untrusted sources with PROT_EXEC. Malicious code in the file could be executed. Always validate file contents before enabling execute permission.
We've conducted a comprehensive exploration of memory-mapped file access—a powerful technique that unifies file I/O with memory operations. To consolidate the critical concepts: mmap() maps file pages into the address space and demand paging loads them on first touch; MAP_SHARED writes reach the file while MAP_PRIVATE writes stay private via copy-on-write; msync() (and, for full durability, fsync()) controls when changes hit disk; mmap() excels at random and repeated access while read()/write() remains better for small one-shot and streaming I/O; and SIGBUS, truncation races, and forgotten munmap() calls are the main pitfalls to guard against.
Module Complete:
This concludes our exploration of file access methods. We've journeyed from the foundational sequential access pattern, through direct (random) access with lseek(), explored indexed access structures that enable efficient key-based lookup, systematically compared all methods to understand their trade-offs, and finally mastered memory-mapped access that bridges files and memory.
These access methods are the primitives underlying all file I/O. Whether you're building a database, designing a log system, implementing a text editor, or optimizing a data pipeline, the choice and combination of access methods fundamentally shapes your system's performance and capabilities.
Armed with this deep understanding, you're prepared to tackle the remaining file system topics: directory structures, file protection, and the various organizational patterns that file systems use to manage persistent storage.
Congratulations! You've mastered the five fundamental file access methods: sequential, direct, indexed, access method comparison, and memory-mapped access. This knowledge forms the foundation for understanding and building storage systems, databases, and any application that interacts with persistent data.