What if you could access a file as simply as accessing an array in memory? No read() calls, no write() calls, no buffers to manage—just pointers and dereferencing.
Memory-mapped file access makes this possible. By mapping a file into a process's virtual address space, the operating system creates a seamless bridge between disk and memory. Your program can read file contents with data[offset] and modify them with data[offset] = value. The OS handles everything: loading pages on demand, writing changes back to disk, and managing the buffer cache.
This isn't just a convenience—it's a powerful optimization technique. Memory mapping eliminates the copy between kernel and user space that conventional read/write incurs. It enables zero-copy file sharing between processes. It allows the OS to use the same memory pages for the buffer cache and your application's view of the file.
At the same time, memory mapping has subtleties and pitfalls that can catch the unwary. Understanding when to use it—and when not to—is essential for systems programming mastery.
By the end of this page, you will master memory-mapped file access—the mmap() system call with all its parameters, the mechanics of demand paging, shared vs. private mappings, performance implications, essential use cases, and critical pitfalls that can cause data corruption or crashes.
Memory mapping creates a direct correspondence between a region of virtual memory and a file on disk:
┌────────────────────────────────────────────────────────┐
│ Process Virtual Address Space │
│ │
│ 0x00000000 ─────┐ │
│ ... │ │
│ 0x7f000000 ───► ┌─────────────────────────────────┐ │
│ │ Memory-Mapped Region │ │
│ │ (points to file data) │ │
│ ◄─────────┤ ptr = mmap(...) │ │
│ │ ptr[0], ptr[1], ...ptr[n] │ │
│ 0x7f100000 ───► └─────────────────────────────────┘ │
│ ... │ │
└────────────────────────────────────────────────────────┘
│
│ (Page table mapping)
▼
┌────────────────────────────────────────────────────────┐
│ File on Disk │
│ ┌──────────────────────────────────────────────────┐ │
│ │ Page 0 │ Page 1 │ Page 2 │ ... │ Page N │ │
│ │ 4KB │ 4KB │ 4KB │ │ 4KB │ │
│ └──────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────┘
Key insight: After mapping, accessing memory addresses in the mapped region causes the OS to:
On first access to a page: Trigger a page fault, load the corresponding file page from disk (or buffer cache), map it into the process's address space, and resume execution.
On subsequent accesses: Access the already-loaded page directly in memory—no system call, no copy.
On modification (for writable mappings): Mark the page dirty. The OS will eventually write it back to the file.
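To make the demand-paging behavior above visible, here is a minimal, Linux-specific sketch (mincore() is not portable POSIX) that counts how many pages of a mapping are resident before and after touching them. The filename some_large_file.dat is a placeholder; the file is assumed to be non-empty.

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* Count how many pages of the mapping are currently resident in memory. */
static size_t resident_pages(void *addr, size_t length, size_t page_size) {
    size_t n_pages = (length + page_size - 1) / page_size;
    unsigned char *vec = malloc(n_pages);
    size_t resident = 0;
    if (vec && mincore(addr, length, vec) == 0) {
        for (size_t i = 0; i < n_pages; i++)
            if (vec[i] & 1) resident++;
    }
    free(vec);
    return resident;
}

int main(void) {
    size_t page_size = sysconf(_SC_PAGE_SIZE);
    int fd = open("some_large_file.dat", O_RDONLY);   /* hypothetical file */
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    fstat(fd, &st);
    char *data = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    printf("Resident just after mmap: %zu pages\n",
           resident_pages(data, st.st_size, page_size));

    /* Touch every page: each first touch triggers a page fault. */
    volatile char sum = 0;
    for (off_t off = 0; off < st.st_size; off += page_size)
        sum += data[off];

    printf("Resident after touching:  %zu pages\n",
           resident_pages(data, st.st_size, page_size));

    munmap(data, st.st_size);
    return 0;
}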
Benefits of memory mapping:
- No per-access system calls: after mapping, reads and writes are ordinary memory instructions
- No extra copy between kernel buffers and user-space buffers
- Automatic integration with the OS buffer cache
- Straightforward zero-copy sharing of file pages between processes
Memory mapping leverages the same virtual memory machinery that provides process isolation and demand paging for regular program memory. The file becomes just another source of page contents, handled by the same page fault mechanism and buffer cache infrastructure.
The mmap() system call creates memory mappings. Understanding its parameters deeply is essential for using it correctly.
The signature:
#include <sys/mman.h>
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
Parameters in detail:
addr (Hint for mapping address) — Usually NULL, letting the OS choose an appropriate address.

length (Size of the mapping) — Number of bytes to map, typically the file size; the kernel rounds it up to whole pages internally.

prot (Protection flags)
- PROT_READ — Pages can be read
- PROT_WRITE — Pages can be written
- PROT_EXEC — Pages can be executed
- PROT_NONE — Pages cannot be accessed (guard pages)
The most common combination is PROT_READ | PROT_WRITE.

flags (Mapping type and behavior)
- MAP_SHARED — Changes are visible to others, written to file
- MAP_PRIVATE — Copy-on-write; changes are private
- MAP_FIXED — Use addr exactly (dangerous; can overwrite existing mappings)
- MAP_ANONYMOUS — No file backing; memory initialized to zero
- MAP_POPULATE — Prefault pages on map (avoid later faults)
- MAP_HUGETLB — Use huge pages

fd (File descriptor) — An open descriptor for the file to map (pass -1 with MAP_ANONYMOUS).

offset (Offset into file) — Where in the file the mapping begins; must be a multiple of the page size.
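Because the examples below all map from offset 0, here is a small sketch of a non-zero offset mapping, assuming a hypothetical file example.dat that is at least two pages long. It maps only the file's second page.

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void) {
    long page_size = sysconf(_SC_PAGE_SIZE);   /* typically 4096 */
    int fd = open("example.dat", O_RDONLY);    /* hypothetical file, >= 2 pages */
    if (fd < 0) { perror("open"); return 1; }

    /* offset must be a multiple of the page size; length need not be */
    char *page1 = mmap(NULL, page_size, PROT_READ, MAP_PRIVATE, fd, page_size);
    if (page1 == MAP_FAILED) { perror("mmap"); close(fd); return 1; }
    close(fd);

    /* page1[0] corresponds to byte page_size of the file */
    printf("First byte of second page: 0x%02x\n", (unsigned char)page1[0]);

    munmap(page1, page_size);
    return 0;
}

The longer listing that follows walks through the most common mapping patterns end to end.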
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <string.h>

/**
 * Example 1: Read-only mapping of entire file
 */
void read_file_via_mmap(const char *filename) {
    int fd = open(filename, O_RDONLY);
    if (fd < 0) { perror("open"); return; }

    // Get file size
    struct stat st;
    fstat(fd, &st);
    size_t file_size = st.st_size;

    // Map the file read-only
    char *data = mmap(NULL, file_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); close(fd); return; }

    // Can close fd after mapping (mapping remains valid)
    close(fd);

    // Access file contents directly via pointer
    printf("First 100 bytes: %.100s\n", data);
    printf("Byte at offset 1000: 0x%02x\n", (unsigned char)data[1000]);

    // Cleanup
    munmap(data, file_size);
}

/**
 * Example 2: Read-write mapping (modifications written to file)
 */
void modify_file_via_mmap(const char *filename) {
    int fd = open(filename, O_RDWR);
    if (fd < 0) { perror("open"); return; }

    struct stat st;
    fstat(fd, &st);
    size_t file_size = st.st_size;

    // Map read-write, shared (writes go to file)
    char *data = mmap(NULL, file_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); close(fd); return; }
    close(fd);

    // Modify file by writing to memory
    memcpy(data, "MODIFIED!", 9);   // Changes bytes 0-8 of file
    data[100] = 'X';                // Changes byte 100 of file

    // Force write to disk (optional; OS does this eventually)
    msync(data, file_size, MS_SYNC);

    munmap(data, file_size);
    printf("File modified via memory mapping\n");
}

/**
 * Example 3: Private mapping (copy-on-write)
 */
void private_copy_via_mmap(const char *filename) {
    int fd = open(filename, O_RDONLY);   // Read-only is OK for private
    if (fd < 0) { perror("open"); return; }

    struct stat st;
    fstat(fd, &st);
    size_t file_size = st.st_size;

    // Private mapping allows writes but they don't affect the file
    char *data = mmap(NULL, file_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);

    // Modify in memory - creates private copy of affected pages
    data[0] = 'X';   // Page 0 copied, modification is private

    // Original file is unchanged!
    munmap(data, file_size);
}

/**
 * Example 4: Create new file via mapping
 */
void create_file_via_mmap(const char *filename, size_t size) {
    int fd = open(filename, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return; }

    // Extend file to desired size
    ftruncate(fd, size);

    // Map the empty file
    char *data = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);

    // Fill the file by writing to memory
    memset(data, 0, size);
    strcpy(data, "File created and populated via mmap!");

    msync(data, size, MS_SYNC);
    munmap(data, size);
}

/**
 * Example 5: Anonymous mapping (no file, just memory)
 */
void anonymous_mapping_demo() {
    size_t size = 1024 * 1024;   // 1 MB

    // No file backing; memory initialized to zero
    char *data = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (data == MAP_FAILED) { perror("mmap"); return; }

    // Use like malloc'd memory
    strcpy(data, "Anonymous mapping content");
    printf("Data: %s\n", data);

    // Free with munmap, not free()
    munmap(data, size);
}

Always call munmap() when done with a mapping. Unlike malloc/free, unreleased mappings persist until process exit. For long-running processes, failing to munmap leaks address space and kernel resources. close() does NOT unmap—the mapping survives after close().
The choice between MAP_SHARED and MAP_PRIVATE is one of the most important decisions when memory mapping. Getting it wrong can cause silent data loss or corruption.
MAP_SHARED: Changes Written to File

With MAP_SHARED, the mapped pages are the file's pages in the OS buffer cache. Writes through the mapping modify the file and are visible to every other process that maps the same file.

MAP_PRIVATE: Copy-On-Write Semantics

With MAP_PRIVATE, the first write to a page triggers copy-on-write: the process gets its own copy of that page. The underlying file is never modified, and other processes never see the changes.
Behavior comparison:
| Aspect | MAP_SHARED | MAP_PRIVATE |
|---|---|---|
| Write to mapping | Modifies file | Private copy (COW) |
| Other processes see changes | Yes | No |
| File modification | File updated | File unchanged |
| Memory use on write | Same pages | Copy made per page modified |
| Requires O_RDWR file | Yes (for writes) | No |
| Use case | IPC, databases, logs | Executables, text files, sandboxes |
| msync() effect | Flushes changes to disk | No effect on the file (modified pages are private copies) |
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>

/**
 * Demonstrates the crucial difference between MAP_SHARED and MAP_PRIVATE
 */

void demo_shared_mapping() {
    printf("=== MAP_SHARED Demo ===\n");

    // Create a test file with initial content
    int fd = open("shared_test.dat", O_RDWR | O_CREAT | O_TRUNC, 0644);
    write(fd, "ORIGINAL", 8);

    // Map shared
    char *data = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);

    pid_t pid = fork();
    if (pid == 0) {
        // Child modifies the mapping
        data[0] = 'X';
        munmap(data, 4096);
        exit(0);
    }

    wait(NULL);   // Wait for child

    // Parent sees child's change!
    printf("After child: %.8s\n", data);   // Prints "XRIGINAL"

    // File also modified!
    fd = open("shared_test.dat", O_RDONLY);
    char buf[9] = {0};
    read(fd, buf, 8);
    printf("File content: %s\n", buf);   // Prints "XRIGINAL"
    close(fd);

    munmap(data, 4096);
}

void demo_private_mapping() {
    printf("\n=== MAP_PRIVATE Demo ===\n");

    // Create a test file
    int fd = open("private_test.dat", O_RDWR | O_CREAT | O_TRUNC, 0644);
    write(fd, "ORIGINAL", 8);

    // Map private
    char *data = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);

    pid_t pid = fork();
    if (pid == 0) {
        // Child modifies the mapping (creates private copy)
        data[0] = 'Y';
        printf("Child sees: %.8s\n", data);   // Prints "YRIGINAL"
        munmap(data, 4096);
        exit(0);
    }

    wait(NULL);

    // Parent does NOT see child's change!
    printf("Parent sees: %.8s\n", data);   // Prints "ORIGINAL"

    // File is completely unchanged
    fd = open("private_test.dat", O_RDONLY);
    char buf[9] = {0};
    read(fd, buf, 8);
    printf("File content: %s\n", buf);   // Prints "ORIGINAL"
    close(fd);

    munmap(data, 4096);
}

int main() {
    demo_shared_mapping();
    demo_private_mapping();
    return 0;
}

Unless you specifically need file modification or inter-process sharing, use MAP_PRIVATE. It's safer—you cannot accidentally corrupt the original file, and copy-on-write means unmodified pages don't consume extra memory.
With MAP_SHARED, modifications to the mapping are eventually written to the file—but 'eventually' means the OS decides when. For applications requiring durability guarantees (databases, transaction logs), you need explicit synchronization.
The msync() system call:
#include <sys/mman.h>
int msync(void *addr, size_t length, int flags);
Parameters:
- MS_SYNC — Synchronous; blocks until data is on disk
- MS_ASYNC — Asynchronous; schedules write but returns immediately
- MS_INVALIDATE — Invalidate other mappings of this file (force re-read)

When writes actually happen without msync(): the kernel writes dirty pages back on its own schedule, typically via periodic flusher/writeback threads or when pages are evicted under memory pressure. There can be a window of many seconds between your store instruction and the data reaching disk.

The durability problem: if the system crashes inside that window, modifications that were already visible in memory are silently lost. Applications that need durability must force the write-back themselves, as the example below shows.
#include <stdio.h>
#include <stdlib.h>   /* for malloc() */
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

/**
 * Example: Transaction log with durability requirements
 */

typedef struct {
    int transaction_id;
    char data[60];
} LogEntry;

typedef struct {
    int fd;
    char *base;
    size_t size;
    size_t offset;
} MappedLog;

MappedLog* create_mapped_log(const char *filename, size_t size) {
    MappedLog *log = malloc(sizeof(MappedLog));
    log->fd = open(filename, O_RDWR | O_CREAT, 0644);
    ftruncate(log->fd, size);
    log->base = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, log->fd, 0);
    log->size = size;
    log->offset = 0;
    return log;
}

/**
 * Append entry WITHOUT durability guarantee
 * Fast but data may be lost on crash
 */
void append_entry_fast(MappedLog *log, LogEntry *entry) {
    memcpy(log->base + log->offset, entry, sizeof(LogEntry));
    log->offset += sizeof(LogEntry);
    // No msync - relies on OS eventual write-back
}

/**
 * Append entry WITH durability guarantee
 * Slower but data is safe on disk after return
 */
void append_entry_durable(MappedLog *log, LogEntry *entry) {
    memcpy(log->base + log->offset, entry, sizeof(LogEntry));

    // Sync the specific region containing the new entry
    // Round down to page boundary for addr
    size_t page_size = 4096;
    void *sync_addr = (void*)((size_t)(log->base + log->offset) & ~(page_size - 1));
    size_t sync_len = sizeof(LogEntry) + ((log->base + log->offset) - (char*)sync_addr);

    // MS_SYNC blocks until data is durably on disk
    if (msync(sync_addr, sync_len, MS_SYNC) != 0) {
        perror("msync");
    }

    log->offset += sizeof(LogEntry);
}

/**
 * Append entry with async hint
 * Middle ground: OS will prioritize writing this soon
 */
void append_entry_async(MappedLog *log, LogEntry *entry) {
    memcpy(log->base + log->offset, entry, sizeof(LogEntry));

    void *sync_addr = log->base;   // Sync entire mapping
    size_t sync_len = log->size;

    // MS_ASYNC just schedules the write, returns immediately
    msync(sync_addr, sync_len, MS_ASYNC);

    log->offset += sizeof(LogEntry);
}

/*
 * Performance implications:
 *
 * - append_entry_fast():    ~100,000-1,000,000 entries/sec
 * - append_entry_async():   ~50,000-500,000 entries/sec
 * - append_entry_durable(): ~100-10,000 entries/sec (depends on disk)
 *
 * The durability version is dramatically slower because it waits
 * for disk I/O to complete. Use it only when crash safety is required.
 */

For true durability on some file systems/hardware, you may also need fsync(fd) even after msync(). This is because msync() may only ensure data reaches the disk controller's write cache, not permanent storage. For critical data, use both: msync(data, len, MS_SYNC) followed by fsync(fd).
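As a sketch of that combined pattern (a continuation of the example above, reusing its MappedLog and LogEntry types and relying on create_mapped_log() leaving log->fd open):

/*
 * Fully durable append: msync() pushes the dirty pages, then fsync()
 * asks the file system to commit data and metadata past device caches.
 */
void append_entry_fully_durable(MappedLog *log, LogEntry *entry) {
    memcpy(log->base + log->offset, entry, sizeof(LogEntry));

    if (msync(log->base, log->size, MS_SYNC) != 0)
        perror("msync");
    if (fsync(log->fd) != 0)        /* requires keeping the fd open */
        perror("fsync");

    log->offset += sizeof(LogEntry);
}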
Memory mapping offers significant performance benefits but isn't universally superior. Understanding the trade-offs is essential.
Advantages of mmap() over read()/write():

- Zero-copy access — File data is read directly from the page cache; there is no copy from a kernel buffer into a user-space buffer.
- No system call per access — After the initial mmap(), every read or write is an ordinary memory instruction.
- Automatic caching — The mapping is backed by the OS page cache, so frequently used pages stay in memory without application-level caching.
- Efficient random access — Jumping to any offset is just pointer arithmetic; there is no lseek() or per-request system call.
Disadvantages and when read()/write() is better:
| Scenario | Better Choice | Reason |
|---|---|---|
| Small file, read once | read() | mmap overhead (page table setup) exceeds benefit |
| Large file, random access | mmap() | Zero-copy, no syscall per access |
| Streaming sequential | read() with large buffer | Similar performance, simpler error handling |
| Database page access | mmap() | Buffer pool integrates with OS page cache |
| Strict I/O error handling | read()/write() | mmap() errors manifest as SIGBUS, hard to handle |
| Files larger than address space | read()/write() | A 32-bit process has only ~2-3 GB of usable address space; large files require windowed mappings |
| Compressed/encrypted files | read()/write() | Must transform data on read anyway |
| Network file systems (NFS) | read()/write() | mmap() semantics poorly defined on network |
Quantitative performance comparison:
| Operation | read() | mmap() | Winner |
|---|---|---|---|
| Setup 1MB file | ~10μs (open) | ~50μs (open+mmap) | read() |
| Sequential read 1MB | ~500μs | ~500μs | Tie |
| Random read 1K chunks | ~1ms (1000 syscalls) | ~10μs (memory access) | mmap() 100x |
| Single 4K read | ~2μs | ~10μs (page fault) | read() |
| Repeated 4K reads (cached) | ~1.5μs each | ~0.01μs each | mmap() 100x |
Key observations: mmap() dominates when the same data is accessed repeatedly or at random offsets, because each access avoids a system call and a copy. read() wins for small, one-shot transfers, where mmap()'s setup cost and initial page faults outweigh its benefits. For straightforward sequential streaming, the two are roughly equivalent.
Page faults during access can cause latency spikes. For latency-sensitive applications, use MAP_POPULATE to prefault all pages at mmap() time, or use mlock() to both fault and pin pages in memory. This trades upfront latency for consistent access times.
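A sketch of that approach, assuming a hypothetical file latency_critical.dat and a sufficiently high RLIMIT_MEMLOCK; note that MAP_POPULATE is Linux-specific.

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(void) {
    int fd = open("latency_critical.dat", O_RDONLY);   /* hypothetical file */
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    fstat(fd, &st);

    /* MAP_POPULATE (Linux) faults all pages in up front, so later reads
       don't stall on page faults. */
    char *data = mmap(NULL, st.st_size, PROT_READ,
                      MAP_PRIVATE | MAP_POPULATE, fd, 0);
    close(fd);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    /* mlock() pins the pages so they cannot be evicted under memory
       pressure; it may fail if RLIMIT_MEMLOCK is too low. */
    if (mlock(data, st.st_size) != 0)
        perror("mlock");

    /* ... latency-sensitive accesses here are plain memory reads ... */

    munlock(data, st.st_size);
    munmap(data, st.st_size);
    return 0;
}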
Memory mapping shines in specific scenarios. Let's examine the canonical use cases:
1. Database Buffer Pools
Databases like SQLite, LMDB, and MongoDB use (or can use) mmap() for their buffer pools: database pages are mapped directly, so the OS page cache doubles as the buffer pool and hot pages stay resident without an extra copy in user space.
Caveat: Some databases (PostgreSQL) avoid mmap() for better control over write ordering and cache policies.
2. Loading Executables and Shared Libraries
When you run a program, the OS uses mmap(): the loader maps the executable's code and data segments (typically MAP_PRIVATE) from the binary, and shared libraries are mapped the same way, so a single physical copy of a library's code pages serves every process that uses it. You can observe this directly, as in the sketch below.
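On Linux, dumping /proc/self/maps shows the mappings of the current process; the file-backed lines for the executable and for libraries such as libc are regions the loader created with mmap(). A small sketch:

#include <stdio.h>

/* Print this process's memory map (Linux-specific). */
int main(void) {
    FILE *maps = fopen("/proc/self/maps", "r");
    if (!maps) { perror("fopen"); return 1; }

    char line[512];
    while (fgets(line, sizeof(line), maps))
        fputs(line, stdout);   // each line: address range, perms, offset, file

    fclose(maps);
    return 0;
}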
3. Text Editors and IDEs
Editing large files efficiently: an editor can map a multi-gigabyte file and rely on demand paging to bring in only the regions the user actually views, rather than reading the whole file up front.
4. Inter-Process Communication (IPC)
Multiple processes sharing memory: a MAP_SHARED region, whether file-backed, anonymous across fork(), or named via shm_open(), gives cooperating processes a common block of pages they can read and write directly; see the sketch below.
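A minimal sketch of the simplest variant: a shared anonymous mapping created before fork(), so parent and child see the same physical pages. Unrelated processes would instead mmap() a named object obtained from shm_open().

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/wait.h>

int main(void) {
    /* Shared anonymous region: visible to both sides of the fork. */
    int *counter = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (counter == MAP_FAILED) { perror("mmap"); return 1; }
    *counter = 0;

    pid_t pid = fork();
    if (pid == 0) {
        *counter = 42;            /* child writes into the shared page */
        exit(0);
    }

    wait(NULL);
    printf("Parent sees: %d\n", *counter);   /* prints 42 */

    munmap(counter, sizeof(int));
    return 0;
}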
5. Fast File Copying
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <string.h>

/**
 * Fast file copy using memory mapping
 * Outperforms read/write loops for large files
 */
int copy_file_mmap(const char *src, const char *dst) {
    // Open and map source
    int src_fd = open(src, O_RDONLY);
    if (src_fd < 0) return -1;

    struct stat st;
    fstat(src_fd, &st);
    size_t size = st.st_size;

    char *src_data = mmap(NULL, size, PROT_READ, MAP_PRIVATE, src_fd, 0);
    close(src_fd);
    if (src_data == MAP_FAILED) return -1;

    // Create and map destination
    int dst_fd = open(dst, O_RDWR | O_CREAT | O_TRUNC, st.st_mode);
    if (dst_fd < 0) { munmap(src_data, size); return -1; }

    ftruncate(dst_fd, size);

    char *dst_data = mmap(NULL, size, PROT_WRITE, MAP_SHARED, dst_fd, 0);
    close(dst_fd);
    if (dst_data == MAP_FAILED) { munmap(src_data, size); return -1; }

    // Single memcpy for entire file
    memcpy(dst_data, src_data, size);

    // Ensure data is on disk
    msync(dst_data, size, MS_SYNC);

    munmap(src_data, size);
    munmap(dst_data, size);
    return 0;
}

/*
 * Performance comparison for 1GB file:
 *
 * Traditional (4K buffer loop): ~3 seconds
 * mmap + memcpy:                ~2 seconds
 * sendfile():                   ~1.5 seconds (kernel-only copy)
 *
 * mmap wins due to zero-copy and efficient page handling.
 * Even faster: use sendfile() or copy_file_range() for pure copy.
 */

6. Memory-Mapped Data Structures
Persistent data structures that survive process restarts:
// Map a file containing a hash table
HashTable *table = mmap(NULL, size, PROT_READ | PROT_WRITE,
MAP_SHARED, fd, 0);
// Use directly - operations persist automatically
table->slots[hash(key)] = value;
// On restart, just mmap again - data is there
Used by: LMDB, Redis persistence, configuration stores.
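As a concrete illustration of the idea, here is a deliberately tiny sketch with a hypothetical state.bin file and an illustrative struct; a real format would add a magic number, versioning, and checksums.

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

/* Illustrative persistent record. */
typedef struct {
    long run_count;
    long last_value;
} PersistentState;

int main(void) {
    int fd = open("state.bin", O_RDWR | O_CREAT, 0644);   /* hypothetical file */
    if (fd < 0) { perror("open"); return 1; }
    ftruncate(fd, sizeof(PersistentState));   /* zero-filled on first creation */

    PersistentState *state = mmap(NULL, sizeof(PersistentState),
                                  PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (state == MAP_FAILED) { perror("mmap"); return 1; }

    /* Ordinary struct access; changes persist across runs. */
    state->run_count++;
    state->last_value = 1234;
    printf("This program has run %ld times\n", state->run_count);

    msync(state, sizeof(PersistentState), MS_SYNC);
    munmap(state, sizeof(PersistentState));
    return 0;
}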
7. Large Binary Data Access
Scientific computing, image processing, and data analysis workloads map huge binary datasets and index them like in-memory arrays, letting the OS page data in on demand, even for files larger than RAM; a sketch follows below.
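A sketch of this pattern, assuming a hypothetical raw file samples.f64 containing packed doubles:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(void) {
    int fd = open("samples.f64", O_RDONLY);   /* hypothetical raw array of doubles */
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    fstat(fd, &st);
    size_t count = st.st_size / sizeof(double);

    const double *samples = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (samples == MAP_FAILED) { perror("mmap"); return 1; }

    /* Treat the whole file as an in-memory array; the OS pages data in
       on demand, so even files larger than RAM can be scanned. */
    double sum = 0.0;
    for (size_t i = 0; i < count; i++)
        sum += samples[i];
    printf("Mean of %zu samples: %f\n", count, count ? sum / count : 0.0);

    munmap((void *)samples, st.st_size);
    return 0;
}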
LMDB (Lightning Memory-Mapped Database) uses mmap() as its core architecture. The entire database is memory-mapped; all reads are direct pointer access. Combined with copy-on-write B-trees, LMDB achieves exceptional read performance with clean code. It's an excellent study in mmap() power.
Memory mapping has subtle pitfalls that can cause crashes, corruption, or security vulnerabilities. Understanding these is essential for safe usage.
Pitfall 1: SIGBUS on Access Past EOF
If you map more pages than the file actually contains, either because the file shrank after mapping or because you mapped beyond EOF from the start, then touching a page that lies entirely past the end of the file generates SIGBUS. (Accesses within the file's last, partially backed page succeed; bytes past EOF there simply read as zero.)
// File is 1000 bytes, so only page 0 is backed by file data
char *data = mmap(NULL, 4 * 4096, PROT_READ, MAP_SHARED, fd, 0);
char c = data[2000]; // OK: still within page 0; reads zero-fill past EOF
char d = data[5000]; // SIGBUS! Page 1 lies entirely beyond the end of the file
Pitfall 2: File Truncation Race
// Process A maps file
char *data = mmap(...);
// Process B truncates file
truncate(filename, 0);
// Process A accesses mapped region
data[1000] = 'x'; // SIGBUS!
Solution: Coordinate between processes, or handle SIGBUS.
Pitfall 3: Write Without MAP_SHARED
// Map private
char *data = mmap(NULL, size, PROT_READ | PROT_WRITE,
MAP_PRIVATE, fd, 0);
// Write (creates private copy, file unchanged!)
data[0] = 'X';
msync(data, size, MS_SYNC); // No effect on the file - modified pages are private copies
// Close - file is NOT modified!
Solution: Use MAP_SHARED for file modifications.
Pitfall 4: Missing munmap() (Resource Leak)
for (int i = 0; i < 1000000; i++) {
char *p = mmap(...);
// Use p...
// Forgot munmap()!
}
// Eventually: out of address space or memory
Pitfall 5: SIGBUS Handling
#include <stdio.h>
#include <signal.h>
#include <setjmp.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

static sigjmp_buf jump_buf;
static volatile sig_atomic_t got_sigbus = 0;

void sigbus_handler(int sig) {
    got_sigbus = 1;
    siglongjmp(jump_buf, 1);
}

/**
 * Safe mmap access with SIGBUS handling
 * Returns 0 on success, -1 on SIGBUS
 */
int safe_mmap_access(char *data, size_t offset, char *result) {
    // Install SIGBUS handler
    struct sigaction sa, old_sa;
    sa.sa_handler = sigbus_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGBUS, &sa, &old_sa);

    got_sigbus = 0;

    // Set jump point
    if (sigsetjmp(jump_buf, 1) == 0) {
        // Normal path
        *result = data[offset];             // May SIGBUS
        sigaction(SIGBUS, &old_sa, NULL);   // Restore
        return 0;                           // Success
    } else {
        // Returned from SIGBUS handler
        sigaction(SIGBUS, &old_sa, NULL);
        return -1;                          // Failed
    }
}

/*
 * WARNING: SIGBUS handling is tricky:
 * - Can't safely return to the faulting instruction
 * - Must longjmp out or exit
 * - Signal-safe functions only in handler
 * - Better to prevent SIGBUS via proper size checking
 *
 * Prevention is better than handling:
 * - Check file size before mapping
 * - Don't access beyond mapped region
 * - Use flock() or coordination for shared files
 */

Never mmap() files from untrusted sources with PROT_EXEC. Malicious code in the file could be executed. Always validate file contents before enabling execute permission.
We've conducted a comprehensive exploration of memory-mapped file access—a powerful technique that unifies file I/O with memory operations. To consolidate the critical concepts: mmap() maps file pages into the address space and demand paging loads them on first touch; MAP_SHARED writes reach the file while MAP_PRIVATE writes stay private via copy-on-write; msync() (and, for full durability, fsync()) controls when changes hit disk; mmap() excels at random and repeated access while read()/write() remains better for small one-shot and streaming I/O; and SIGBUS, truncation races, and forgotten munmap() calls are the main pitfalls to guard against.
Module Complete:
This concludes our exploration of file access methods. We've journeyed from the foundational sequential access pattern, through direct (random) access with lseek(), explored indexed access structures that enable efficient key-based lookup, systematically compared all methods to understand their trade-offs, and finally mastered memory-mapped access that bridges files and memory.
These access methods are the primitives underlying all file I/O. Whether you're building a database, designing a log system, implementing a text editor, or optimizing a data pipeline, the choice and combination of access methods fundamentally shapes your system's performance and capabilities.
Armed with this deep understanding, you're prepared to tackle the remaining file system topics: directory structures, file protection, and the various organizational patterns that file systems use to manage persistent storage.
Congratulations! You've mastered the five fundamental file access methods: sequential, direct, indexed, access method comparison, and memory-mapped access. This knowledge forms the foundation for understanding and building storage systems, databases, and any application that interacts with persistent data.