Sometimes you want to work with a file's contents without affecting the original. You need to read the data, modify it as part of your computation, but leave the source file untouched. Traditional I/O achieves this naturally—you read into a buffer, modify the buffer, and never write back—but you pay the copy cost upfront.
Memory-mapped files with private mappings (MAP_PRIVATE) offer an elegant alternative: you get a virtual view of the file that appears modifiable but uses copy-on-write to avoid unnecessary copies. If you only read, you share pages with the original file. If you write, only the modified pages are copied—and only for your process.
// Map file privately - start by sharing, copy on demand
void *map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
// Reading: direct access to page cache pages (shared, efficient)
char first = ((char *)map)[0]; // No copy yet
// Writing: triggers copy-on-write - just this page is copied
((char *)map)[0] = 'X'; // Now this page is your private copy
// Original file is unchanged
// Other processes mapping the same file see the original content
This mechanism is fundamental to operating system design—it's how fork() achieves efficient process duplication, how programs load with private writable data sections, and how you can safely work with file data without coordination with other processes.
This page provides comprehensive coverage of private mappings—MAP_PRIVATE semantics, copy-on-write implementation details, the efficiency gains over eager copying, use cases from fork() to data analysis, and the subtleties of private mapping behavior. You'll understand when and why to choose private over shared mappings.
When you specify MAP_PRIVATE in mmap(), you're establishing a specific contract with the operating system:
The Three Guarantees of MAP_PRIVATE:
Initial sharing: Your mapping initially shares physical pages with the page cache (and any other private mappers of the same file). No copying occurs during mmap().
Copy-on-write isolation: When you write to a page for the first time, the kernel creates a private copy of that page for your process. Your writes affect only your copy.
File non-modification: Changes you make are NEVER written back to the underlying file. The file remains unchanged regardless of what you write to the mapping.
This Differs from MAP_SHARED:
| Behavior | MAP_PRIVATE | MAP_SHARED |
|---|---|---|
| Initial page source | Page cache (shared) | Page cache (shared) |
| On first write | Copy page, then write to copy | Write directly to shared page |
| Other mappers see writes? | No (isolated) | Yes (visible) |
| File modified? | Never | Eventually (or on msync()) |
| Memory overhead of writes | One page per modified page | Zero (sharing maintained) |
| File descriptor requirement | O_RDONLY sufficient | O_RDWR for PROT_WRITE |
Why O_RDONLY is Sufficient:
With MAP_SHARED + PROT_WRITE, your writes must eventually reach the file—so you need write permission on the file. With MAP_PRIVATE, writes never reach the file—they go to private copies in RAM. You only need read permission to get the initial data:
// This works! Writes go to private copies, never touch the file
int fd = open("readonly_file.dat", O_RDONLY);
void *map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
// Can now read AND write to 'map'
// File remains completely unchanged
This is useful for working with files you don't have write access to, or when you explicitly want isolation from the original.
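The read-only descriptor rule can be checked with a small experiment. The sketch below is illustrative only: the helper `make_readonly_file`, the function names, and the `/tmp/cow_demo_XXXXXX` path template are assumptions made for this example. It writes through a private mapping of a read-only descriptor, confirms the file still holds the original bytes, and then shows that the same request with MAP_SHARED is refused.

```c
#define _DEFAULT_SOURCE  /* glibc: expose mkstemp/pread under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: create a temp file holding `text` and reopen it
 * read-only. Returns the read-only descriptor, or -1 on failure. */
static int make_readonly_file(const char *text, char *path_out)
{
    strcpy(path_out, "/tmp/cow_demo_XXXXXX");
    int wfd = mkstemp(path_out);
    if (wfd < 0) return -1;
    size_t len = strlen(text);
    if (write(wfd, text, len) != (ssize_t)len) {
        close(wfd);
        unlink(path_out);
        return -1;
    }
    close(wfd);
    return open(path_out, O_RDONLY);
}

/* Returns 0 if the private mapping accepted a write and the file kept
 * its original content. */
int private_write_leaves_file_intact(void)
{
    char path[32];
    int fd = make_readonly_file("hello", path);
    if (fd < 0) return -1;

    char *map = mmap(NULL, 5, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) { close(fd); unlink(path); return -2; }

    assert(map[0] == 'h');  /* reads see the file's data */
    map[0] = 'J';           /* copy-on-write: lands in a private page */

    char on_disk[5];
    ssize_t n = pread(fd, on_disk, sizeof on_disk, 0);
    int ok = n == 5
          && memcmp(map, "Jello", 5) == 0       /* our view changed */
          && memcmp(on_disk, "hello", 5) == 0;  /* the file did not */

    munmap(map, 5);
    close(fd);
    unlink(path);
    return ok ? 0 : -3;
}

/* Returns the errno produced by requesting a writable MAP_SHARED mapping
 * of a read-only descriptor (0 if the mapping unexpectedly succeeded). */
int shared_write_errno_on_readonly_fd(void)
{
    char path[32];
    int fd = make_readonly_file("hello", path);
    if (fd < 0) return -1;

    errno = 0;
    void *map = mmap(NULL, 5, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    int err = (map == MAP_FAILED) ? errno : 0;
    if (map != MAP_FAILED) munmap(map, 5);

    close(fd);
    unlink(path);
    return err;
}
```

Calling `private_write_leaves_file_intact()` should return 0, while `shared_write_errno_on_readonly_fd()` should report `EACCES`, matching the last row of the comparison table.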
Page Table Entry Changes:
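The kernel tracks copy-on-write state in each page table entry, chiefly through the write-permission bit: a page that the mapping allows you to write, but whose entry is still read-only, is a COW candidate.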
Key Implementation Detail:
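Even though you request PROT_WRITE, the kernel initially maps pages as read-only. When you attempt to write, the CPU raises a protection fault; the kernel sees that the mapping is private and permits writes, copies the page, points your page table entry at the writable copy, and restarts the instruction.
This happens per-page, so if you modify one byte on a 4KB page, you get a 4KB private copy.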
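Because copying happens at page granularity, the private memory a set of writes costs depends on how many distinct pages they land on, not on how many bytes change. The following is a small arithmetic sketch; the function name `cow_bytes_for_writes` and the requirement that offsets arrive sorted are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Given `n` byte offsets written through a private mapping (assumed sorted
 * in ascending order), return how many bytes of private memory the
 * resulting copy-on-write faults would allocate. */
size_t cow_bytes_for_writes(const size_t *offsets, size_t n, size_t page_size)
{
    assert(page_size > 0);

    size_t pages = 0;
    size_t last_page = (size_t)-1;  /* sentinel: no page counted yet */

    for (size_t i = 0; i < n; i++) {
        size_t page = offsets[i] / page_size;
        if (page != last_page) {    /* first write to this page: one copy */
            pages++;
            last_page = page;
        }
    }
    return pages * page_size;
}
```

For example, writes at offsets 0, 1, 4095, 4096, and 10000 with 4096-byte pages touch pages 0, 1, and 2, so five modified bytes cost three pages (12288 bytes) of private memory.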
Copy-on-write (COW) is the mechanism that makes private mappings efficient. Let's examine how it works in detail:
Reference Counting:
The kernel tracks how many processes are using each physical page through reference counts. For file-backed pages in the page cache:
┌────────────────────────────────────────────────────────────────┐
│ struct page (kernel page descriptor) │
│ ┌────────────────┐ ┌───────────────┐ ┌────────────────────┐ │
│ │ _refcount = 3 │ │ _mapcount = 2 │ │ mapping = inode │ │
│ └────────────────┘ └───────────────┘ └────────────────────┘ │
│ ^ ^ │
│ | | │
│ +-- Total refs +-- # of page table mappings │
└────────────────────────────────────────────────────────────────┘
When _mapcount > 1, the page is shared among multiple processes/mappings.
The COW Fault Handling Flow:
// Simplified conceptual kernel code for COW handling
void handle_cow_fault(struct vm_area_struct *vma, unsigned long address,
                      struct page *old_page) {
    // Step 1: Allocate a new physical page
    struct page *new_page = alloc_page(GFP_KERNEL);
    if (!new_page) {
        // Out of memory - send SIGKILL or SIGBUS
        send_signal(current, SIGKILL);
        return;
    }
    // Step 2: Copy content from old page to new page
    void *old_kaddr = kmap(old_page);  // Map old page to kernel space
    void *new_kaddr = kmap(new_page);  // Map new page to kernel space
    memcpy(new_kaddr, old_kaddr, PAGE_SIZE);
    kunmap(old_page);
    kunmap(new_page);
    // Step 3: Build a page table entry for the new page
    pte_t *pte = get_pte(vma->vm_mm, address);
    pte_t entry = mk_pte(new_page, vma->vm_page_prot);
    // Step 4: Mark the entry read-write (we own the page) and install it.
    // pte_mkwrite() returns the modified entry; it does not change it in place.
    set_pte(pte, pte_mkwrite(entry));
    // Step 5: Decrement reference count on old page
    put_page(old_page);
    // Step 6: Flush TLB entry for this address
    flush_tlb_page(vma, address);
    // Fault handler returns; CPU retries the write instruction
}
Special Case: Single Reference
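An important optimization applies to anonymous pages, such as memory shared after fork(): if the mapcount is 1 (only this process still maps the page), no copy is needed and the page can simply be made writable. Page cache pages never qualify, because the cache itself still needs the original: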
if (page_mapcount(old_page) == 1) {
// We're the only mapper - just mark as writable
pte_mkwrite(*pte);
return; // No copy needed!
}
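This optimization matters once the other sharers are gone, for example after a forked child has called exec() or exited. Repeated writes to a page you have already copied never reach this path at all: after the first COW the entry is writable, so later writes do not fault.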
Anonymous vs. File-Backed COW:
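COW applies to both file-backed private mappings, where the original is a page cache page, and anonymous memory shared across fork(), where the original is an anonymous page.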
In both cases, the result is an anonymous (non-file-backed) private page owned by one process.
A COW fault involves: (1) page allocation, (2) 4KB memory copy, and (3) page table updates. This costs roughly 1-5 microseconds on modern hardware—much cheaper than a disk I/O but not free. For write-heavy workloads on private mappings, the COW overhead accumulates. Profile your application if you're modifying a significant percentage of mapped pages.
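The fork() system call creates a new process that is a copy of the calling process. Without copy-on-write, this would require copying all of the parent's memory, potentially gigabytes of data. With COW, fork() only duplicates the page tables and other bookkeeping, so its cost scales with the size of the parent's mappings rather than with the amount of data in them.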
What fork() Does Internally:
pid_t child = fork();
// Conceptually, the kernel:
// 1. Creates new process descriptor, new page tables
// 2. SHARES all parent's physical pages with child (no copy!)
// 3. Marks private writable pages as read-only in BOTH processes
// 4. Returns (twice: once in parent, once in child)
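Post-fork() State:
Parent and child have separate page tables whose entries point at the same physical pages, with every private writable page mapped read-only in both.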
When Either Process Writes:
After fork(), both parent and child proceed executing. Eventually one of them writes to memory:
// In child process:
int x = 42; // x is on stack, currently shared with parent
x = 100; // This triggers:
// 1. Write fault (page was marked read-only)
// 2. Kernel sees mapcount > 1 (shared with parent)
// 3. Kernel copies the stack page for child
// 4. Child's PTE updated to point to new page (RW)
// 5. Parent still has original page (stays RO until parent writes)
// 6. Child's write completes
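The isolation described in the steps above can be observed from user space. This is a minimal sketch; the function name `parent_value_after_child_write` is an assumption for this example, and whether the stack page is copied at exactly this write or slightly earlier is up to the kernel. The child overwrites a variable, and the parent then checks that its own copy is untouched.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks, lets the child overwrite `x`, and returns the value the parent
 * still sees afterwards. Returns -1 on error. */
int parent_value_after_child_write(void)
{
    volatile int x = 42;   /* lives on a stack page shared after fork() */

    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        /* Child: this write goes to the child's private copy of the page
         * (the kernel copies it at the latest when this write faults). */
        x = 100;
        _exit(x == 100 ? 0 : 1);
    }

    /* Parent: wait for the child so its write has definitely happened. */
    int status = 0;
    pid_t done = waitpid(pid, &status, 0);
    assert(done == pid);
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) return -1;

    return x;              /* the child's write never reached this page */
}
```

The function should return 42: the child saw 100 in its own copy, and the parent's page was never modified.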
The fork() + exec() Pattern:
The traditional Unix process creation pattern is:
pid_t pid = fork();
if (pid == 0) {
// Child immediately exec()s a new program
execve("/bin/ls", argv, envp);
}
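exec() replaces the entire address space with the new program. If fork() had copied all memory upfront, that work would be wasted, because exec() throws it all away. With COW, the child copies only the few pages it writes before exec() (typically some stack), and the remaining shared pages are simply unmapped from the child without ever being duplicated.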
This is why fork() is viable even for processes with gigabytes of memory.
For the fork+exec pattern, vfork() is even faster—it doesn't set up COW page tables at all. The child shares the parent's address space directly until exec(). The parent is suspended until the child exec()s or _exit()s. Use vfork() only when you'll immediately exec().
Private mappings are the right choice for many scenarios. Here are common patterns:
Use Case 1: Read-Only Access to Shared Data
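When you only read from a file, MAP_PRIVATE and MAP_SHARED behave identically: pages are shared with the page cache. With PROT_READ alone, a stray write raises SIGSEGV in either mode; MAP_PRIVATE adds a second line of defense, because even if the mapping is later made writable, writes can never reach the file: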
// Safe pattern for read-only access
void *config = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
// If a bug causes a write, we get SIGSEGV (no PROT_WRITE)
// instead of corrupting the file
Use Case 2: Analysis Without Modification
Processing file data where you need to modify values in memory but preserve the original:
// Analyze image file: need to modify pixels in memory
void *image = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
// Apply transformations for analysis
apply_filter((Pixel *)image, width, height);
detect_edges((Pixel *)image, width, height);
find_objects((Pixel *)image, width, height);
// Original file unchanged - can re-run with different parameters
munmap(image, size);
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
// Pattern: Speculatively modify data, commit only if successful
typedef struct { int version; char data[4092]; } Record;
int try_update_record(const char *path, int record_id, Record *new_data) {
    int fd = open(path, O_RDONLY); // Read-only is sufficient!
    // ...map setup...
    void *map = mmap(NULL, file_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd); // The mapping stays valid after the descriptor is closed
    Record *records = (Record *)map;
    // Modify in place (private copy created on write)
    records[record_id] = *new_data;
    // Perform validation on modified data
    if (!validate_all_records(records, num_records)) {
        // Validation failed - just unmap, original file untouched
        munmap(map, file_size);
        return -1;
    }
    // Validation passed - now write to file explicitly
    fd = open(path, O_WRONLY);
    pwrite(fd, new_data, sizeof(Record), (off_t)record_id * sizeof(Record));
    close(fd);
    munmap(map, file_size);
    return 0;
}
// Pattern: Working copy for parallel processing
void parallel_process(const char *path) {
    int fd = open(path, O_RDONLY);
    struct stat sb;
    fstat(fd, &sb);
    // Each thread gets MAP_PRIVATE mapping
    // Same physical pages initially, COW on write
    // Threads don't interfere with each other
    #pragma omp parallel
    {
        void *my_copy = mmap(NULL, sb.st_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        // Each thread can modify its copy freely
        process_with_modifications(my_copy, sb.st_size);
        munmap(my_copy, sb.st_size);
    }
    close(fd);
}
Use Case 3: Executable Loading
When the dynamic linker loads your program and shared libraries, it uses MAP_PRIVATE:
# View a process's mappings
cat /proc/$(pidof bash)/maps | head -10
# 55a0f5800000-55a0f58d2000 r-xp 00000000 08:01 123 /bin/bash
# 55a0f5ad1000-55a0f5ad5000 r--p 000d1000 08:01 123 /bin/bash
# 55a0f5ad5000-55a0f5ade000 rw-p 000d5000 08:01 123 /bin/bash
# ^^^^
# Note: 'p' = private (MAP_PRIVATE)
# The rw-p section is writable data - private copy per process
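The same information can be read programmatically. The sketch below is Linux-specific and assumes the `/proc/self/maps` text layout shown above, where the fourth character of the permissions field is `p` or `s`; the function name `count_mappings` is invented for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Counts the current process's mappings by sharing flag.
 * Returns 0 on success, -1 if /proc/self/maps cannot be opened. */
int count_mappings(int *n_private, int *n_shared)
{
    assert(n_private && n_shared);

    FILE *f = fopen("/proc/self/maps", "r");
    if (!f) return -1;

    *n_private = 0;
    *n_shared = 0;

    char line[512];
    while (fgets(line, sizeof line, f)) {
        char perms[5] = {0};
        /* Line layout: start-end perms offset dev inode [path] */
        if (sscanf(line, "%*[0-9a-f]-%*[0-9a-f] %4s", perms) != 1)
            continue;                 /* not the start of a mapping line */
        if (perms[3] == 'p')
            (*n_private)++;           /* MAP_PRIVATE */
        else if (perms[3] == 's')
            (*n_shared)++;            /* MAP_SHARED */
    }

    fclose(f);
    return 0;
}
```

On a typical Linux process the private count is well above zero, since the executable, its libraries, the heap, and the stack are all private mappings.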
Use Case 4: Database Snapshots
Some databases use private mappings to provide consistent read views:
// Reader opens database file with MAP_PRIVATE
// Writer modifies with MAP_SHARED
// Pages the reader has written to (COW'd) stay frozen as private copies
// Pages the reader has only read, or not yet touched, can reflect the writer's updates
Note: This provides only a weak, per-page form of isolation. POSIX leaves it unspecified whether file changes made after mmap() are visible through a private mapping. Real databases use more sophisticated mechanisms (MVCC, undo logs).
Understanding how private mappings consume memory is essential for capacity planning:
The Read-Only Case:
If you only read from a private mapping, memory consumption is minimal:
Process A (MAP_PRIVATE, read-only): → ┐
├── Physical pages in page cache
Process B (MAP_PRIVATE, read-only): → ┘
Total physical memory: 1 copy of the file data
The Write-Heavy Case:
If you write to all pages:
Process A (MAP_PRIVATE, full write): → 1 copy of all pages (private)
Process B (MAP_PRIVATE, full write): → 1 copy of all pages (private)
Page cache: → 1 copy of all pages (original)
Total: 3x the file size in physical memory
The Sparse Write Case (Most Common):
Typically, you write to only a fraction of pages:
File: 100MB, modify 10% in each process
Page cache: 100MB (original data)
Process A private pages: 10MB (modified pages only)
Process B private pages: 10MB (different modified pages)
Total: ~120MB instead of 300MB
| Write Pattern | File Size = F | N Processes | Memory Usage |
|---|---|---|---|
| Read-only | F | N | ~F (shared) |
| Write 1% | F | N | F + 0.01×N×F |
| Write 10% | F | N | F + 0.1×N×F |
| Write 100% | F | N | F + N×F = (1+N)×F |
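The table's formula is easy to turn into a capacity-planning helper. This is a rough sketch under stated assumptions: the whole file is resident in the page cache, each process modifies the same fraction of pages, and page-size rounding is ignored. The function name and the use of thousandths (`write_permille`) to avoid floating point are choices made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Estimated resident bytes for N processes privately mapping one file:
 * one shared page-cache copy (F) plus each process's copied fraction.
 * write_permille is the share of pages each process modifies, in
 * thousandths (10 = 1%, 100 = 10%, 1000 = 100%). */
uint64_t estimate_cow_memory(uint64_t file_bytes, unsigned n_procs,
                             unsigned write_permille)
{
    assert(write_permille <= 1000);

    /* file_bytes * write_permille / 1000, split to avoid overflow */
    uint64_t per_process = file_bytes / 1000 * write_permille
                         + file_bytes % 1000 * write_permille / 1000;

    return file_bytes + (uint64_t)n_procs * per_process;
}
```

With a 100MB file, two processes, and 10% of pages modified in each, `estimate_cow_memory(100ull << 20, 2, 100)` gives 120MB, matching the sparse-write example; at 100% it gives 300MB, matching the write-heavy case.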
Monitoring COW Memory:
On Linux, you can see private memory usage:
# Per-process private memory
grep Private /proc/<pid>/smaps_rollup
Private_Clean: xxx kB # Mapped only by this process, not modified
Private_Dirty: yyy kB # Private AND modified
# Or use smem for system-wide view
smem --processfilter='myapp'
Private vs. Shared Classification:
Practical Example:
// 1GB file, multiple processes do MAP_PRIVATE
// Scenario: Each process modifies first 64MB
for (size_t i = 0; i < 64 * 1024 * 1024; i += 4096) {
((char *)map)[i] = 'X'; // Touch one byte per page
}
// Memory impact per process: ~64MB private dirty
// Total for 10 processes: ~640MB private + 1GB shared = ~1.64GB
// Without COW: 10GB (each process full copy) + 1GB page cache = 11GB
COW enables memory overcommit—the sum of virtual memory across all processes can exceed physical RAM because not all pages are private yet. Under memory pressure, COW faults may fail (OOM killer). On systems where memory accounting matters (containers, VMs with memory limits), understand that private mappings can grow unpredictably as writes occur.
Choosing between MAP_PRIVATE and MAP_SHARED is a fundamental decision. Here's a decision framework:
| Scenario | Mapping Type | Key Reason |
|---|---|---|
| Configuration file read | MAP_PRIVATE | Read-only, safety |
| Image processing (non-destructive) | MAP_PRIVATE | Modify without affecting original |
| Database file access | MAP_SHARED | Changes must persist |
| IPC shared memory | MAP_SHARED | Other processes must see updates |
| Shared library code | MAP_PRIVATE | Read-only, with COW safety |
| Executable data section | MAP_PRIVATE | Per-process private data |
| Memory-mapped log append | MAP_SHARED | Persistent records |
| Search index read | MAP_PRIVATE | Read-only, multiple readers safe |
Private mappings have subtle behaviors that can surprise developers:
Edge Case 1: File Changes After Mapping
What happens if another process (or mmap with MAP_SHARED) modifies the underlying file after you've created a MAP_PRIVATE mapping?
void *private = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
// Another process writes to the same file via MAP_SHARED
// What does the MAP_PRIVATE process see?
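Answer: It depends on the state of each page. Pages you have already written to are private copies, so they keep your version. Pages you have not touched yet are read from the page cache when first accessed, so they show the file's current content. Pages you have only read still map page cache pages; POSIX leaves it unspecified whether later file changes are visible there, and on Linux they generally are.
This means MAP_PRIVATE freezes a page only once it has been copied, not at mmap() time, so it does not give you a consistent whole-file snapshot.
If a file is being modified while you read via MAP_PRIVATE, you might see some pages from before the modification and some from after. For consistent snapshots, either (1) use file locking, (2) treat MAP_POPULATE as a way to shrink the window at best, since prefaulting is not atomic and pages that are only read-faulted are not frozen, or (3) use a proper snapshot mechanism like BTRFS snapshots or a database with MVCC.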
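The per-page behavior can be probed directly. In this sketch (the struct, function name, and temp-file template are assumptions for illustration), a two-page file is mapped privately, the second page is copied by writing to it, and the file is then overwritten through the descriptor. Only the result for the copied page is dependable; what the untouched page shows is left unspecified by POSIX, so the code merely records it.

```c
#define _DEFAULT_SOURCE  /* glibc: expose mkstemp/pwrite under strict -std= modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

struct observation {
    char untouched_page;    /* first byte of a page we never wrote to */
    char copied_page;       /* the byte we wrote before the file changed */
    char copied_page_rest;  /* a neighbouring byte on that copied page */
};

/* Returns 0 on success and fills `out`; -1 on any setup failure. */
int observe_file_change(struct observation *out)
{
    assert(out != NULL);

    size_t ps = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = 2 * ps;

    char path[] = "/tmp/cow_snap_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);                        /* file lives until fd is closed */

    char *buf = malloc(len);
    if (!buf) { close(fd); return -1; }
    memset(buf, 'A', len);               /* original content: all 'A' */
    if (write(fd, buf, len) != (ssize_t)len) {
        free(buf); close(fd); return -1;
    }

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) { free(buf); close(fd); return -1; }

    map[ps] = 'P';                       /* COW the second page only */

    memset(buf, 'B', len);               /* "another writer" rewrites the file */
    ssize_t n = pwrite(fd, buf, len, 0);

    out->untouched_page   = map[0];      /* unspecified: old or new content */
    out->copied_page      = map[ps];     /* our private write */
    out->copied_page_rest = map[ps + 1]; /* content from copy time */

    munmap(map, len);
    free(buf);
    close(fd);
    return n == (ssize_t)len ? 0 : -1;
}
```

The copied page keeps `'P'` followed by the old `'A'` bytes no matter what happens to the file afterwards. On Linux the untouched page will typically read `'B'`, which is exactly the mixed view the warning above describes.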
Edge Case 2: Private Anonymous Mappings
MAP_PRIVATE | MAP_ANONYMOUS creates pages backed only by swap (not a file):
// Private anonymous mapping - like malloc for large allocations
void *mem = mmap(NULL, large_size, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
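These pages are zero-filled on first access, private to the process (shared copy-on-write with children after fork()), and never written to any file; under memory pressure they can only go to swap.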
Edge Case 3: Writing Beyond File End
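With MAP_PRIVATE, you might think you can extend the file by writing beyond its end. You cannot. Accesses are checked at page granularity: the rest of the page that contains EOF is zero-filled and usable, but pages wholly beyond EOF raise SIGBUS:
int fd = open("small.txt", O_RDONLY); // 100 bytes
void *map = mmap(NULL, 8192, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0); // 2 pages of 4KB
// This works (within file size):
((char *)map)[50] = 'X'; // COW's a page, modifies your copy
// This also works: past EOF but on the same page (zero-filled tail); the file does not grow
((char *)map)[500] = 'X';
// This causes SIGBUS:
((char *)map)[4096] = 'X'; // Page entirely beyond EOF, no file data to COW from!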
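End-of-file handling works at page granularity, which can be probed with a SIGBUS handler. The sketch below is an assumption-laden illustration (function names and the temp-file template are invented here): it maps a 100-byte file across two pages and records which writes fault. A write into the remainder of the page that holds the file's data is expected to succeed, while a write to a page lying entirely past EOF is expected to raise SIGBUS on Linux.

```c
#define _DEFAULT_SOURCE  /* glibc: expose mkstemp/sigsetjmp under strict -std= modes */
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf bus_jmp;

static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(bus_jmp, 1);   /* abandon the faulting write */
}

/* Writes one byte at map[off]; returns 1 if that raised SIGBUS, else 0. */
static int write_raises_sigbus(volatile char *map, size_t off)
{
    if (sigsetjmp(bus_jmp, 1) != 0)
        return 1;             /* we got here via the signal handler */
    map[off] = 'X';
    return 0;
}

/* Maps a 100-byte file over two pages and reports which writes fault.
 * Returns 0 on success, -1 on setup failure. */
int probe_eof_behavior(int *tail_faulted, int *past_eof_faulted)
{
    size_t ps = (size_t)sysconf(_SC_PAGESIZE);
    assert(ps > 500);         /* offset 500 must sit on the first page */

    char path[] = "/tmp/cow_eof_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);

    char data[100] = {0};
    if (write(fd, data, sizeof data) != (ssize_t)sizeof data) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, 2 * ps, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);                /* the mapping keeps the file alive */
    if (map == MAP_FAILED) return -1;

    struct sigaction sa, old;
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGBUS, &sa, &old);

    *tail_faulted     = write_raises_sigbus(map, 500); /* same page as data */
    *past_eof_faulted = write_raises_sigbus(map, ps);  /* page past EOF */

    sigaction(SIGBUS, &old, NULL);
    munmap(map, 2 * ps);
    return 0;
}
```

The write at offset 500 only changes the private copy of the first page; it does not extend the file. Recovering from SIGBUS with `siglongjmp` is acceptable for a probe like this, but production code should avoid touching pages beyond the file size in the first place.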
Edge Case 4: mprotect() on Private Mappings
You can change protection of private mappings:
void *map = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
// Later, make it writable
mprotect(map, size, PROT_READ | PROT_WRITE);
// Now writes trigger COW as expected
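A runnable version of this sequence, as a hedged sketch (the function name and temp-file template are assumptions for this example): it maps a small file read-only, upgrades the protection with mprotect(), writes, and confirms that the file on disk is untouched.

```c
#define _DEFAULT_SOURCE  /* glibc: expose mkstemp/pread under strict -std= modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 0 if a write after mprotect() changed only the private copy. */
int upgrade_then_write(void)
{
    char path[] = "/tmp/cow_prot_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);
    if (write(fd, "abc", 3) != 3) { close(fd); return -1; }

    /* Start read-only: a write here would raise SIGSEGV. */
    char *map = mmap(NULL, 3, PROT_READ, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) { close(fd); return -1; }
    assert(map[0] == 'a');

    /* Upgrade the protection; the mapping address is page-aligned. */
    if (mprotect(map, 3, PROT_READ | PROT_WRITE) != 0) {
        munmap(map, 3);
        close(fd);
        return -2;
    }

    map[0] = 'X';             /* first write: copy-on-write */

    char on_disk[3];
    int ok = pread(fd, on_disk, 3, 0) == 3
          && memcmp(on_disk, "abc", 3) == 0   /* file unchanged */
          && memcmp(map, "Xbc", 3) == 0;      /* mapping changed */

    munmap(map, 3);
    close(fd);
    return ok ? 0 : -3;
}
```

Starting with PROT_READ and upgrading only around the code that needs to write keeps stray writes elsewhere detectable as SIGSEGV.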
Edge Case 5: No Writeback Ever
Even if you call msync() on a private mapping, changes don't go to the file:
void *map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
((char *)map)[0] = 'X';
msync(map, size, MS_SYNC); // Does what? Nothing useful.
munmap(map, size);
// File is UNCHANGED. msync on MAP_PRIVATE is essentially a no-op.
We've comprehensively explored private memory mappings—the copy-on-write mechanism that enables efficient file access, process isolation, and the fundamental fork() optimization.
Module Complete:
You've now worked through memory-mapped files from every angle.
This comprehensive understanding enables you to make informed decisions about file access patterns, build efficient inter-process communication systems, understand how programs load and execute, and optimize memory-intensive applications.
You have achieved mastery of memory-mapped files—one of the most powerful and nuanced mechanisms in operating system design. From the mmap() interface to lazy loading to the intricacies of shared versus private mappings, you now possess the deep understanding required to architect efficient, correct memory-mapped systems.