Operating SystemsVirtual Memory

Memory-Mapped Files

LevelAdvanced

Duration90 mins

TopicVirtual Memory

5 / 5

Private Mappings

Your Own Private Copy of the Universe

Sometimes you want to work with a file's contents without affecting the original. You need to read the data, modify it as part of your computation, but leave the source file untouched. Traditional I/O achieves this naturally—you read into a buffer, modify the buffer, and never write back—but you pay the copy cost upfront.

Memory-mapped files with private mappings (MAP_PRIVATE) offer an elegant alternative: you get a virtual view of the file that appears modifiable but uses copy-on-write to avoid unnecessary copies. If you only read, you share pages with the original file. If you write, only the modified pages are copied—and only for your process.

// Map file privately - start by sharing, copy on demand
void *map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);

// Reading: direct access to page cache pages (shared, efficient)
char first = ((char *)map)[0];  // No copy yet

// Writing: triggers copy-on-write - just this page is copied
((char *)map)[0] = 'X';  // Now this page is your private copy

// Original file is unchanged
// Other processes mapping the same file see the original content

This mechanism is fundamental to operating system design—it's how fork() achieves efficient process duplication, how programs load with private writable data sections, and how you can safely work with file data without coordination with other processes.

What You Will Master

This page provides comprehensive coverage of private mappings—MAP_PRIVATE semantics, copy-on-write implementation details, the efficiency gains over eager copying, use cases from fork() to data analysis, and the subtleties of private mapping behavior. You'll understand when and why to choose private over shared mappings.

MAP_PRIVATE Semantics: The Copy-on-Write Contract

When you specify MAP_PRIVATE in mmap(), you're establishing a specific contract with the operating system:

The Three Guarantees of MAP_PRIVATE:

Initial sharing: Your mapping initially shares physical pages with the page cache (and any other private mappers of the same file). No copying occurs during mmap().
Copy-on-write isolation: When you write to a page for the first time, the kernel creates a private copy of that page for your process. Your writes affect only your copy.
File non-modification: Changes you make are NEVER written back to the underlying file. The file remains unchanged regardless of what you write to the mapping.

This Differs from MAP_SHARED:

MAP_PRIVATE vs MAP_SHARED Detailed Comparison
Behavior	MAP_PRIVATE	MAP_SHARED
Initial page source	Page cache (shared)	Page cache (shared)
On first write	Copy page, then write to copy	Write directly to shared page
Other mappers see writes?	No (isolated)	Yes (visible)
File modified?	Never	Eventually (or on msync())
Memory overhead of writes	One page per modified page	Zero (sharing maintained)
File descriptor requirement	O_RDONLY sufficient	O_RDWR for PROT_WRITE

Why O_RDONLY is Sufficient:

With MAP_SHARED + PROT_WRITE, your writes must eventually reach the file—so you need write permission on the file. With MAP_PRIVATE, writes never reach the file—they go to private copies in RAM. You only need read permission to get the initial data:

// This works! Writes go to private copies, never touch the file
int fd = open("readonly_file.dat", O_RDONLY);
void *map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
// Can now read AND write to 'map'
// File remains completely unchanged

This is useful for working with files you don't have write access to, or when you explicitly want isolation from the original.

Page Table Entry Changes:

The kernel tracks copy-on-write state through page table entry flags:

Converting Mermaid diagram...

Key Implementation Detail:

Even though you request PROT_WRITE, the kernel initially maps pages as read-only. When you attempt to write:

CPU raises a protection fault (write to read-only page)
Kernel's fault handler recognizes this as a COW page
Kernel allocates a new physical page
Kernel copies content from original to new page
Kernel updates your page table entry to point to new page with read-write permission
CPU retries the write instruction—now succeeds

This happens per-page, so if you modify one byte on a 4KB page, you get a 4KB private copy.

Copy-on-Write: Deep Mechanics

Copy-on-write (COW) is the mechanism that makes private mappings efficient. Let's examine how it works in detail:

Reference Counting:

The kernel tracks how many processes are using each physical page through reference counts. For file-backed pages in the page cache:

┌────────────────────────────────────────────────────────────────┐
│  struct page (kernel page descriptor)                          │
│  ┌────────────────┐  ┌───────────────┐  ┌────────────────────┐ │
│  │ _refcount = 3  │  │ _mapcount = 2 │  │ mapping = inode    │ │
│  └────────────────┘  └───────────────┘  └────────────────────┘ │
│    ^                   ^                                        │
│    |                   |                                        │
│    +-- Total refs      +-- # of page table mappings             │
└────────────────────────────────────────────────────────────────┘

_refcount: Total references (page cache + any direct users)
_mapcount: Specifically how many PTEs map this page

When _mapcount > 1, the page is shared among multiple processes/mappings.

The COW Fault Handling Flow:

cow_fault_handling_conceptual.c
C (Kernel Pseudocode)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
// Simplified conceptual kernel code for COW handling
void handle_cow_fault(struct vm_area_struct *vma, 
                      unsigned long address,
                      struct page *old_page) {
    
    // Step 1: Allocate a new physical page
    struct page *new_page = alloc_page(GFP_KERNEL);
    if (!new_page) {
        // Out of memory - send SIGKILL or SIGBUS
        send_signal(current, SIGKILL);
        return;
    }
    
    // Step 2: Copy content from old page to new page
    void *old_kaddr = kmap(old_page);   // Map old page to kernel space
    void *new_kaddr = kmap(new_page);   // Map new page to kernel space
    memcpy(new_kaddr, old_kaddr, PAGE_SIZE);
    kunmap(old_page);
    kunmap(new_page);
    
    // Step 3: Update the page table entry
    pte_t *pte = get_pte(vma->vm_mm, address);
    set_pte(pte, mk_pte(new_page, vma->vm_page_prot));
    
    // Step 4: Mark new page as read-write (since we own it)
    pte_mkwrite(*pte);
    
    // Step 5: Decrement reference count on old page
    put_page(old_page);
    
    // Step 6: Flush TLB entry for this address
    flush_tlb_page(vma, address);
    
    // Fault handler returns; CPU retries the write instruction
}

Special Case: Single Reference

An important optimization: if the mapcount is 1 (only this process maps the page), no copy is needed! The page can simply be made writable:

if (page_mapcount(old_page) == 1) {
    // We're the only mapper - just mark as writable
    pte_mkwrite(*pte);
    return;  // No copy needed!
}

This optimization is why repeatedly writing to the same private-mapped page doesn't create repeated copies—after the first COW, you own the page exclusively.

Anonymous vs. File-Backed COW:

COW applies to both:

File-backed private mappings: Copy from page cache to anonymous page
Anonymous shared mappings after fork(): Copy from parent's anonymous page to new anonymous page

In both cases, the result is an anonymous (non-file-backed) private page owned by one process.

COW Performance Characteristics

A COW fault involves: (1) page allocation, (2) 4KB memory copy, and (3) page table updates. This costs roughly 1-5 microseconds on modern hardware—much cheaper than a disk I/O but not free. For write-heavy workloads on private mappings, the COW overhead accumulates. Profile your application if you're modifying a significant percentage of mapped pages.

fork() and Copy-on-Write: The Killer Application

The fork() system call creates a new process that is a copy of the calling process. Without copy-on-write, this would require copying all of the parent's memory—potentially gigabytes of data. With COW, fork() is typically instantaneous.

What fork() Does Internally:

pid_t child = fork();

// Conceptually, the kernel:
// 1. Creates new process descriptor, new page tables
// 2. SHARES all parent's physical pages with child (no copy!)
// 3. Marks ALL pages in BOTH processes as read-only
// 4. Returns (twice: once in parent, once in child)

Post-fork() State:

Converting Mermaid diagram...

When Either Process Writes:

After fork(), both parent and child proceed executing. Eventually one of them writes to memory:

// In child process:
int x = 42;  // x is on stack, currently shared with parent
x = 100;     // This triggers:
             // 1. Write fault (page was marked read-only)
             // 2. Kernel sees mapcount > 1 (shared with parent)
             // 3. Kernel copies the stack page for child
             // 4. Child's PTE updated to point to new page (RW)
             // 5. Parent still has original page (stays RO until parent writes)
             // 6. Child's write completes

The fork() + exec() Pattern:

The traditional Unix process creation pattern is:

pid_t pid = fork();
if (pid == 0) {
    // Child immediately exec()s a new program
    execve("/bin/ls", argv, envp);
}

exec() replaces the entire address space with the new program. If fork() had copied all memory upfront, that work would be wasted—the exec() throws it all away. With COW:

fork() shares all pages (instant)
Child exec()s before writing to most pages
exec() unmaps all shared pages (no COW copies ever made!)
Result: Almost zero memory copying for fork+exec

This is why fork() is viable even for processes with gigabytes of memory.

vfork() for Maximum fork() Efficiency

For the fork+exec pattern, vfork() is even faster—it doesn't set up COW page tables at all. The child shares the parent's address space directly until exec(). The parent is suspended until the child exec()s or _exit()s. Use vfork() only when you'll immediately exec().

Practical Use Cases for MAP_PRIVATE

Private mappings are the right choice for many scenarios. Here are common patterns:

Use Case 1: Read-Only Access to Shared Data

When you only read from a file, MAP_PRIVATE and MAP_SHARED behave identically—pages are shared with the page cache. But MAP_PRIVATE is safer if you accidentally write:

// Safe pattern for read-only access
void *config = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
// If a bug causes a write, we get SIGSEGV (no PROT_WRITE)
// instead of corrupting the file

Use Case 2: Analysis Without Modification

Processing file data where you need to modify values in memory but preserve the original:

// Analyze image file: need to modify pixels in memory
void *image = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);

// Apply transformations for analysis
apply_filter((Pixel *)image, width, height);
detect_edges((Pixel *)image, width, height);
find_objects((Pixel *)image, width, height);

// Original file unchanged - can re-run with different parameters
munmap(image, size);

private_mapping_patterns.c
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
#include <sys/mman.h>
#include <fcntl.h>
#include <stdlib.h>
 
// Pattern: Speculatively modify data, commit only if successful
typedef struct { int version; char data[4092]; } Record;
 
int try_update_record(const char *path, int record_id, Record *new_data) {
    int fd = open(path, O_RDONLY);  // Read-only is sufficient!
    // ...map setup...
    
    void *map = mmap(NULL, file_size, PROT_READ | PROT_WRITE, 
                     MAP_PRIVATE, fd, 0);
    close(fd);
    
    Record *records = (Record *)map;
    
    // Modify in place (private copy created on write)
    records[record_id] = *new_data;
    
    // Perform validation on modified data
    if (!validate_all_records(records, num_records)) {
        // Validation failed - just unmap, original file untouched
        munmap(map, file_size);
        return -1;
    }
    
    // Validation passed - now write to file explicitly
    fd = open(path, O_WRONLY);
    pwrite(fd, new_data, sizeof(Record), record_id * sizeof(Record));
    close(fd);
    
    munmap(map, file_size);
    return 0;
}
 
// Pattern: Working copy for parallel processing
void parallel_process(const char *path) {
    int fd = open(path, O_RDONLY);
    struct stat sb;
    fstat(fd, &sb);
    
    // Each thread gets MAP_PRIVATE mapping
    // Same physical pages initially, COW on write
    // Threads don't interfere with each other
    
    #pragma omp parallel
    {
        void *my_copy = mmap(NULL, sb.st_size, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE, fd, 0);
        
        // Each thread can modify its copy freely
        process_with_modifications(my_copy, sb.st_size);
        
        munmap(my_copy, sb.st_size);
    }
    
    close(fd);
}

Use Case 3: Executable Loading

When the dynamic linker loads your program and shared libraries, it uses MAP_PRIVATE:

Code sections (.text): PROT_READ | PROT_EXEC, typically never written \u2192 pages shared across all processes running this program
Read-only data (.rodata): PROT_READ, shared
Writable data (.data, .bss): PROT_READ | PROT_WRITE, COW \u2192 each process gets private copy when it writes

# View a process's mappings
cat /proc/$(pidof bash)/maps | head -10

# 55a0f5800000-55a0f58d2000 r-xp 00000000 08:01 123  /bin/bash
# 55a0f5ad1000-55a0f5ad5000 r--p 000d1000 08:01 123  /bin/bash
# 55a0f5ad5000-55a0f5ade000 rw-p 000d5000 08:01 123  /bin/bash
#                           ^^^^
# Note: 'p' = private (MAP_PRIVATE)
# The rw-p section is writable data - private copy per process

Use Case 4: Database Snapshots

Some databases use private mappings to provide consistent read views:

// Reader opens database file with MAP_PRIVATE
// Writer modifies with MAP_SHARED
// Reader sees consistent snapshot at time of mmap()
// (Until reader's pages get reclaimed and re-faulted from updated file)

Note: This provides a form of snapshot isolation, but with caveats around page reclaim. Real databases use more sophisticated mechanisms (MVCC, undo logs).

Memory Consumption Patterns

Understanding how private mappings consume memory is essential for capacity planning:

The Read-Only Case:

If you only read from a private mapping, memory consumption is minimal:

Process A (MAP_PRIVATE, read-only):    →  ┐
                                          ├── Physical pages in page cache
Process B (MAP_PRIVATE, read-only):    →  ┘

Total physical memory: 1 copy of the file data

The Write-Heavy Case:

If you write to all pages:

Process A (MAP_PRIVATE, full write):   →  1 copy of all pages (private)
Process B (MAP_PRIVATE, full write):   →  1 copy of all pages (private)
Page cache:                            →  1 copy of all pages (original)

Total: 3x the file size in physical memory

The Sparse Write Case (Most Common):

Typically, you write to only a fraction of pages:

File: 100MB, modify 10% in each process

Page cache: 100MB (original data)
Process A private pages: 10MB (modified pages only)
Process B private pages: 10MB (different modified pages)

Total: ~120MB instead of 300MB

Memory Overhead by Write Pattern
Write Pattern	File Size = F	N Processes	Memory Usage
Read-only	F	N	~F (shared)
Write 1%	F	N	F + 0.01×N×F
Write 10%	F	N	F + 0.1×N×F
Write 100%	F	N	F + N×F = (1+N)×F

Monitoring COW Memory:

On Linux, you can see private memory usage:

# Per-process private memory
grep Private /proc/<pid>/smaps_rollup
Private_Clean:    xxx kB   # Private, not modified (from COW before write)
Private_Dirty:    yyy kB   # Private AND modified

# Or use smem for system-wide view
smem --processfilter='myapp'

Private vs. Shared Classification:

Private_Clean: Pages that are private (COW'd) but haven't been written yet. This can happen after fork() before any process writes.
Private_Dirty: Pages that your process exclusively owns AND has written to. This is your "true" memory impact.
Shared_Clean/Dirty: Pages shared with other processes (not from MAP_PRIVATE, but from MAP_SHARED or shared library code).

Practical Example:

// 1GB file, multiple processes do MAP_PRIVATE
// Scenario: Each process modifies first 64MB

for (size_t i = 0; i < 64 * 1024 * 1024; i += 4096) {
    ((char *)map)[i] = 'X';  // Touch one byte per page
}

// Memory impact per process: ~64MB private dirty
// Total for 10 processes: ~640MB private + 1GB shared = ~1.64GB
// Without COW: 10GB (each process full copy) + 1GB page cache = 11GB

Memory Overcommit Implications

COW enables memory overcommit—the sum of virtual memory across all processes can exceed physical RAM because not all pages are private yet. Under memory pressure, COW faults may fail (OOM killer). On systems where memory accounting matters (containers, VMs with memory limits), understand that private mappings can grow unpredictably as writes occur.

Decision Framework: When to Use Private Mappings

Choosing between MAP_PRIVATE and MAP_SHARED is a fundamental decision. Here's a decision framework:

Use MAP_PRIVATE When

•You only read the data — Both work identically, but PRIVATE is safer against accidental writes
•You need to modify data without affecting the file — Working copies, temporary modifications
•You don't have file write permission — PRIVATE only needs O_RDONLY
•Other processes shouldn't see your changes — Isolation is required
•You want parallel processing with independent modifications — Each process/thread gets isolated working copy
•You're implementing copy-on-write semantics — Snapshots, fork() optimization

Use MAP_SHARED When

•Changes must persist to the file — Databases, file updates
•Multiple processes must see each other's changes — IPC, shared state
•Memory efficiency is paramount for write-heavy workloads — Shared pages aren't duplicated
•You need durable writes — Combined with msync() for persistence

Quick Reference: Scenario to Mapping Type
Scenario	Mapping Type	Key Reason
Configuration file read	MAP_PRIVATE	Read-only, safety
Image processing (non-destructive)	MAP_PRIVATE	Modify without affecting original
Database file access	MAP_SHARED	Changes must persist
IPC shared memory	MAP_SHARED	Other processes must see updates
Shared library code	MAP_PRIVATE	Read-only, with COW safety
Executable data section	MAP_PRIVATE	Per-process private data
Memory-mapped log append	MAP_SHARED	Persistent records
Search index read	MAP_PRIVATE	Read-only, multiple readers safe

Subtleties and Edge Cases

Private mappings have subtle behaviors that can surprise developers:

Edge Case 1: File Changes After Mapping

What happens if another process (or mmap with MAP_SHARED) modifies the underlying file after you've created a MAP_PRIVATE mapping?

void *private = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
// Another process writes to the same file via MAP_SHARED
// What does the MAP_PRIVATE process see?

Answer: It depends on whether you've already faulted in the pages:

Pages not yet faulted in: When you fault them, you get the updated file content (from the page cache)
Pages already in your address space: You keep seeing the original content (your page table already points to a snapshot)
Pages you've written to (COW'd): You see your modifications (you have private copies)

This means MAP_PRIVATE provides a point-in-time snapshot per page, at fault time, not a consistent whole-file snapshot.

Inconsistent Views Across a Private Mapping

If a file is being modified while you read via MAP_PRIVATE, you might see some pages from before the modification and some from after. For consistent snapshots, either (1) use file locking, (2) use MAP_POPULATE to fault all pages immediately, or (3) use a proper snapshot mechanism like BTRFS snapshots or a database with MVCC.

Edge Case 2: Private Anonymous Mappings

MAP_PRIVATE | MAP_ANONYMOUS creates pages backed only by swap (not a file):

// Private anonymous mapping - like malloc for large allocations
void *mem = mmap(NULL, large_size, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

These pages are:

Zero-initialized on first access
Swappable if memory pressure occurs
Private to your process (not shared with anyone)

Edge Case 3: Writing Beyond File End

With MAP_PRIVATE, you might think you can extend the file by writing beyond its end. You cannot:

int fd = open("small.txt", O_RDONLY);  // 100 bytes
void *map = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);

// This works (within file size):
((char *)map)[50] = 'X';  // COW's a page, modifies your copy

// This causes SIGBUS:
((char *)map)[500] = 'X';  // Beyond EOF, no file data to COW from!

Edge Case 4: mprotect() on Private Mappings

You can change protection of private mappings:

void *map = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);

// Later, make it writable
mprotect(map, size, PROT_READ | PROT_WRITE);

// Now writes trigger COW as expected

Edge Case 5: No Writeback Ever

Even if you call msync() on a private mapping, changes don't go to the file:

void *map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
((char *)map)[0] = 'X';
msync(map, size, MS_SYNC);  // Does what? Nothing useful.
munmap(map, size);

// File is UNCHANGED. msync on MAP_PRIVATE is essentially a no-op.

Summary: Mastering Private Mappings

We've comprehensively explored private memory mappings—the copy-on-write mechanism that enables efficient file access, process isolation, and the fundamental fork() optimization.

Key Takeaways

•MAP_PRIVATE provides COW isolation — Reads share pages with the page cache; writes create private copies
•File is never modified — Changes exist only in your private pages, not the underlying file
•O_RDONLY is sufficient — You don't need file write permission since writes go to private copies
•Copy happens per-page on first write — Unmodified pages remain shared, minimizing memory overhead
•fork() efficiency depends on COW — Parent and child share all pages until one writes
•Snapshot consistency is per-page — Pages are snapshotted at fault time, not mmap() time
•Memory overhead scales with writes — More modified pages = more private memory consumption
•Use for read-only access, working copies, and isolation — The safe default when you don't need sharing

Module Complete:

You've now mastered memory-mapped files from every angle:

mmap() system call — The interface, parameters, and kernel mechanics
File as memory — The paradigm shift from read/write to direct access
Lazy loading — Demand paging, read-ahead, and working set optimization
Shared mappings — Multi-process sharing, visibility, synchronization
Private mappings — Copy-on-write, isolation, and efficient process duplication

This comprehensive understanding enables you to make informed decisions about file access patterns, build efficient inter-process communication systems, understand how programs load and execute, and optimize memory-intensive applications.

Module Complete: Memory-Mapped Files

You have achieved mastery of memory-mapped files—one of the most powerful and nuanced mechanisms in operating system design. From the mmap() interface to lazy loading to the intricacies of shared versus private mappings, you now possess the deep understanding required to architect efficient, correct memory-mapped systems.

5 / 5

Loading learning content...

Operating SystemsVirtual Memory

Memory-Mapped Files

LevelAdvanced

Duration90 mins

TopicVirtual Memory

5 / 5

Private Mappings

Your Own Private Copy of the Universe

// Map file privately - start by sharing, copy on demand
void *map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);

// Reading: direct access to page cache pages (shared, efficient)
char first = ((char *)map)[0];  // No copy yet

// Writing: triggers copy-on-write - just this page is copied
((char *)map)[0] = 'X';  // Now this page is your private copy

// Original file is unchanged
// Other processes mapping the same file see the original content

What You Will Master

MAP_PRIVATE Semantics: The Copy-on-Write Contract

When you specify MAP_PRIVATE in mmap(), you're establishing a specific contract with the operating system:

The Three Guarantees of MAP_PRIVATE:

Initial sharing: Your mapping initially shares physical pages with the page cache (and any other private mappers of the same file). No copying occurs during mmap().
Copy-on-write isolation: When you write to a page for the first time, the kernel creates a private copy of that page for your process. Your writes affect only your copy.
File non-modification: Changes you make are NEVER written back to the underlying file. The file remains unchanged regardless of what you write to the mapping.

This Differs from MAP_SHARED:

MAP_PRIVATE vs MAP_SHARED Detailed Comparison
Behavior	MAP_PRIVATE	MAP_SHARED
Initial page source	Page cache (shared)	Page cache (shared)
On first write	Copy page, then write to copy	Write directly to shared page
Other mappers see writes?	No (isolated)	Yes (visible)
File modified?	Never	Eventually (or on msync())
Memory overhead of writes	One page per modified page	Zero (sharing maintained)
File descriptor requirement	O_RDONLY sufficient	O_RDWR for PROT_WRITE

Why O_RDONLY is Sufficient:

// This works! Writes go to private copies, never touch the file
int fd = open("readonly_file.dat", O_RDONLY);
void *map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
// Can now read AND write to 'map'
// File remains completely unchanged

This is useful for working with files you don't have write access to, or when you explicitly want isolation from the original.

Page Table Entry Changes:

The kernel tracks copy-on-write state through page table entry flags:

Converting Mermaid diagram...

Key Implementation Detail:

Even though you request PROT_WRITE, the kernel initially maps pages as read-only. When you attempt to write:

CPU raises a protection fault (write to read-only page)
Kernel's fault handler recognizes this as a COW page
Kernel allocates a new physical page
Kernel copies content from original to new page
Kernel updates your page table entry to point to new page with read-write permission
CPU retries the write instruction—now succeeds

This happens per-page, so if you modify one byte on a 4KB page, you get a 4KB private copy.

Copy-on-Write: Deep Mechanics

Copy-on-write (COW) is the mechanism that makes private mappings efficient. Let's examine how it works in detail:

Reference Counting:

The kernel tracks how many processes are using each physical page through reference counts. For file-backed pages in the page cache:

┌────────────────────────────────────────────────────────────────┐
│  struct page (kernel page descriptor)                          │
│  ┌────────────────┐  ┌───────────────┐  ┌────────────────────┐ │
│  │ _refcount = 3  │  │ _mapcount = 2 │  │ mapping = inode    │ │
│  └────────────────┘  └───────────────┘  └────────────────────┘ │
│    ^                   ^                                        │
│    |                   |                                        │
│    +-- Total refs      +-- # of page table mappings             │
└────────────────────────────────────────────────────────────────┘

_refcount: Total references (page cache + any direct users)
_mapcount: Specifically how many PTEs map this page

When _mapcount > 1, the page is shared among multiple processes/mappings.

The COW Fault Handling Flow:

cow_fault_handling_conceptual.c
C (Kernel Pseudocode)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
// Simplified conceptual kernel code for COW handling
void handle_cow_fault(struct vm_area_struct *vma, 
                      unsigned long address,
                      struct page *old_page) {
    
    // Step 1: Allocate a new physical page
    struct page *new_page = alloc_page(GFP_KERNEL);
    if (!new_page) {
        // Out of memory - send SIGKILL or SIGBUS
        send_signal(current, SIGKILL);
        return;
    }
    
    // Step 2: Copy content from old page to new page
    void *old_kaddr = kmap(old_page);   // Map old page to kernel space
    void *new_kaddr = kmap(new_page);   // Map new page to kernel space
    memcpy(new_kaddr, old_kaddr, PAGE_SIZE);
    kunmap(old_page);
    kunmap(new_page);
    
    // Step 3: Update the page table entry
    pte_t *pte = get_pte(vma->vm_mm, address);
    set_pte(pte, mk_pte(new_page, vma->vm_page_prot));
    
    // Step 4: Mark new page as read-write (since we own it)
    pte_mkwrite(*pte);
    
    // Step 5: Decrement reference count on old page
    put_page(old_page);
    
    // Step 6: Flush TLB entry for this address
    flush_tlb_page(vma, address);
    
    // Fault handler returns; CPU retries the write instruction
}

Special Case: Single Reference

An important optimization: if the mapcount is 1 (only this process maps the page), no copy is needed! The page can simply be made writable:

if (page_mapcount(old_page) == 1) {
    // We're the only mapper - just mark as writable
    pte_mkwrite(*pte);
    return;  // No copy needed!
}

This optimization is why repeatedly writing to the same private-mapped page doesn't create repeated copies—after the first COW, you own the page exclusively.

Anonymous vs. File-Backed COW:

COW applies to both:

File-backed private mappings: Copy from page cache to anonymous page
Anonymous shared mappings after fork(): Copy from parent's anonymous page to new anonymous page

In both cases, the result is an anonymous (non-file-backed) private page owned by one process.

COW Performance Characteristics

fork() and Copy-on-Write: The Killer Application

What fork() Does Internally:

pid_t child = fork();

// Conceptually, the kernel:
// 1. Creates new process descriptor, new page tables
// 2. SHARES all parent's physical pages with child (no copy!)
// 3. Marks ALL pages in BOTH processes as read-only
// 4. Returns (twice: once in parent, once in child)

Post-fork() State:

Converting Mermaid diagram...

When Either Process Writes:

After fork(), both parent and child proceed executing. Eventually one of them writes to memory:

// In child process:
int x = 42;  // x is on stack, currently shared with parent
x = 100;     // This triggers:
             // 1. Write fault (page was marked read-only)
             // 2. Kernel sees mapcount > 1 (shared with parent)
             // 3. Kernel copies the stack page for child
             // 4. Child's PTE updated to point to new page (RW)
             // 5. Parent still has original page (stays RO until parent writes)
             // 6. Child's write completes

The fork() + exec() Pattern:

The traditional Unix process creation pattern is:

pid_t pid = fork();
if (pid == 0) {
    // Child immediately exec()s a new program
    execve("/bin/ls", argv, envp);
}

exec() replaces the entire address space with the new program. If fork() had copied all memory upfront, that work would be wasted—the exec() throws it all away. With COW:

fork() shares all pages (instant)
Child exec()s before writing to most pages
exec() unmaps all shared pages (no COW copies ever made!)
Result: Almost zero memory copying for fork+exec

This is why fork() is viable even for processes with gigabytes of memory.

vfork() for Maximum fork() Efficiency

Practical Use Cases for MAP_PRIVATE

Private mappings are the right choice for many scenarios. Here are common patterns:

Use Case 1: Read-Only Access to Shared Data

When you only read from a file, MAP_PRIVATE and MAP_SHARED behave identically—pages are shared with the page cache. But MAP_PRIVATE is safer if you accidentally write:

// Safe pattern for read-only access
void *config = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
// If a bug causes a write, we get SIGSEGV (no PROT_WRITE)
// instead of corrupting the file

Use Case 2: Analysis Without Modification

Processing file data where you need to modify values in memory but preserve the original:

// Analyze image file: need to modify pixels in memory
void *image = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);

// Apply transformations for analysis
apply_filter((Pixel *)image, width, height);
detect_edges((Pixel *)image, width, height);
find_objects((Pixel *)image, width, height);

// Original file unchanged - can re-run with different parameters
munmap(image, size);

private_mapping_patterns.c
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
#include <sys/mman.h>
#include <fcntl.h>
#include <stdlib.h>
 
// Pattern: Speculatively modify data, commit only if successful
typedef struct { int version; char data[4092]; } Record;
 
int try_update_record(const char *path, int record_id, Record *new_data) {
    int fd = open(path, O_RDONLY);  // Read-only is sufficient!
    // ...map setup...
    
    void *map = mmap(NULL, file_size, PROT_READ | PROT_WRITE, 
                     MAP_PRIVATE, fd, 0);
    close(fd);
    
    Record *records = (Record *)map;
    
    // Modify in place (private copy created on write)
    records[record_id] = *new_data;
    
    // Perform validation on modified data
    if (!validate_all_records(records, num_records)) {
        // Validation failed - just unmap, original file untouched
        munmap(map, file_size);
        return -1;
    }
    
    // Validation passed - now write to file explicitly
    fd = open(path, O_WRONLY);
    pwrite(fd, new_data, sizeof(Record), record_id * sizeof(Record));
    close(fd);
    
    munmap(map, file_size);
    return 0;
}
 
// Pattern: Working copy for parallel processing
void parallel_process(const char *path) {
    int fd = open(path, O_RDONLY);
    struct stat sb;
    fstat(fd, &sb);
    
    // Each thread gets MAP_PRIVATE mapping
    // Same physical pages initially, COW on write
    // Threads don't interfere with each other
    
    #pragma omp parallel
    {
        void *my_copy = mmap(NULL, sb.st_size, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE, fd, 0);
        
        // Each thread can modify its copy freely
        process_with_modifications(my_copy, sb.st_size);
        
        munmap(my_copy, sb.st_size);
    }
    
    close(fd);
}

Use Case 3: Executable Loading

When the dynamic linker loads your program and shared libraries, it uses MAP_PRIVATE:

Code sections (.text): PROT_READ | PROT_EXEC, typically never written \u2192 pages shared across all processes running this program
Read-only data (.rodata): PROT_READ, shared
Writable data (.data, .bss): PROT_READ | PROT_WRITE, COW \u2192 each process gets private copy when it writes

# View a process's mappings
cat /proc/$(pidof bash)/maps | head -10

# 55a0f5800000-55a0f58d2000 r-xp 00000000 08:01 123  /bin/bash
# 55a0f5ad1000-55a0f5ad5000 r--p 000d1000 08:01 123  /bin/bash
# 55a0f5ad5000-55a0f5ade000 rw-p 000d5000 08:01 123  /bin/bash
#                           ^^^^
# Note: 'p' = private (MAP_PRIVATE)
# The rw-p section is writable data - private copy per process

Use Case 4: Database Snapshots

Some databases use private mappings to provide consistent read views:

// Reader opens database file with MAP_PRIVATE
// Writer modifies with MAP_SHARED
// Reader sees consistent snapshot at time of mmap()
// (Until reader's pages get reclaimed and re-faulted from updated file)

Note: This provides a form of snapshot isolation, but with caveats around page reclaim. Real databases use more sophisticated mechanisms (MVCC, undo logs).

Memory Consumption Patterns

Understanding how private mappings consume memory is essential for capacity planning:

The Read-Only Case:

If you only read from a private mapping, memory consumption is minimal:

Process A (MAP_PRIVATE, read-only):    →  ┐
                                          ├── Physical pages in page cache
Process B (MAP_PRIVATE, read-only):    →  ┘

Total physical memory: 1 copy of the file data

The Write-Heavy Case:

If you write to all pages:

Process A (MAP_PRIVATE, full write):   →  1 copy of all pages (private)
Process B (MAP_PRIVATE, full write):   →  1 copy of all pages (private)
Page cache:                            →  1 copy of all pages (original)

Total: 3x the file size in physical memory

The Sparse Write Case (Most Common):

Typically, you write to only a fraction of pages:

File: 100MB, modify 10% in each process

Page cache: 100MB (original data)
Process A private pages: 10MB (modified pages only)
Process B private pages: 10MB (different modified pages)

Total: ~120MB instead of 300MB

Memory Overhead by Write Pattern
Write Pattern	File Size = F	N Processes	Memory Usage
Read-only	F	N	~F (shared)
Write 1%	F	N	F + 0.01×N×F
Write 10%	F	N	F + 0.1×N×F
Write 100%	F	N	F + N×F = (1+N)×F

Monitoring COW Memory:

On Linux, you can see private memory usage:

# Per-process private memory
grep Private /proc/<pid>/smaps_rollup
Private_Clean:    xxx kB   # Private, not modified (from COW before write)
Private_Dirty:    yyy kB   # Private AND modified

# Or use smem for system-wide view
smem --processfilter='myapp'

Private vs. Shared Classification:

Private_Clean: Pages that are private (COW'd) but haven't been written yet. This can happen after fork() before any process writes.
Private_Dirty: Pages that your process exclusively owns AND has written to. This is your "true" memory impact.
Shared_Clean/Dirty: Pages shared with other processes (not from MAP_PRIVATE, but from MAP_SHARED or shared library code).

Practical Example:

// 1GB file, multiple processes do MAP_PRIVATE
// Scenario: Each process modifies first 64MB

for (size_t i = 0; i < 64 * 1024 * 1024; i += 4096) {
    ((char *)map)[i] = 'X';  // Touch one byte per page
}

// Memory impact per process: ~64MB private dirty
// Total for 10 processes: ~640MB private + 1GB shared = ~1.64GB
// Without COW: 10GB (each process full copy) + 1GB page cache = 11GB

Memory Overcommit Implications

Decision Framework: When to Use Private Mappings

Choosing between MAP_PRIVATE and MAP_SHARED is a fundamental decision. Here's a decision framework:

Use MAP_PRIVATE When

•You only read the data — Both work identically, but PRIVATE is safer against accidental writes
•You need to modify data without affecting the file — Working copies, temporary modifications
•You don't have file write permission — PRIVATE only needs O_RDONLY
•Other processes shouldn't see your changes — Isolation is required
•You want parallel processing with independent modifications — Each process/thread gets isolated working copy
•You're implementing copy-on-write semantics — Snapshots, fork() optimization

Use MAP_SHARED When

•Changes must persist to the file — Databases, file updates
•Multiple processes must see each other's changes — IPC, shared state
•Memory efficiency is paramount for write-heavy workloads — Shared pages aren't duplicated
•You need durable writes — Combined with msync() for persistence

Quick Reference: Scenario to Mapping Type
Scenario	Mapping Type	Key Reason
Configuration file read	MAP_PRIVATE	Read-only, safety
Image processing (non-destructive)	MAP_PRIVATE	Modify without affecting original
Database file access	MAP_SHARED	Changes must persist
IPC shared memory	MAP_SHARED	Other processes must see updates
Shared library code	MAP_PRIVATE	Read-only, with COW safety
Executable data section	MAP_PRIVATE	Per-process private data
Memory-mapped log append	MAP_SHARED	Persistent records
Search index read	MAP_PRIVATE	Read-only, multiple readers safe

Subtleties and Edge Cases

Private mappings have subtle behaviors that can surprise developers:

Edge Case 1: File Changes After Mapping

What happens if another process (or mmap with MAP_SHARED) modifies the underlying file after you've created a MAP_PRIVATE mapping?

void *private = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
// Another process writes to the same file via MAP_SHARED
// What does the MAP_PRIVATE process see?

Answer: It depends on whether you've already faulted in the pages:

Pages not yet faulted in: When you fault them, you get the updated file content (from the page cache)
Pages already in your address space: You keep seeing the original content (your page table already points to a snapshot)
Pages you've written to (COW'd): You see your modifications (you have private copies)

This means MAP_PRIVATE provides a point-in-time snapshot per page, at fault time, not a consistent whole-file snapshot.

Inconsistent Views Across a Private Mapping

Edge Case 2: Private Anonymous Mappings

MAP_PRIVATE | MAP_ANONYMOUS creates pages backed only by swap (not a file):

// Private anonymous mapping - like malloc for large allocations
void *mem = mmap(NULL, large_size, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

These pages are:

Zero-initialized on first access
Swappable if memory pressure occurs
Private to your process (not shared with anyone)

Edge Case 3: Writing Beyond File End

With MAP_PRIVATE, you might think you can extend the file by writing beyond its end. You cannot:

int fd = open("small.txt", O_RDONLY);  // 100 bytes
void *map = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);

// This works (within file size):
((char *)map)[50] = 'X';  // COW's a page, modifies your copy

// This causes SIGBUS:
((char *)map)[500] = 'X';  // Beyond EOF, no file data to COW from!

Edge Case 4: mprotect() on Private Mappings

You can change protection of private mappings:

void *map = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);

// Later, make it writable
mprotect(map, size, PROT_READ | PROT_WRITE);

// Now writes trigger COW as expected

Edge Case 5: No Writeback Ever

Even if you call msync() on a private mapping, changes don't go to the file:

void *map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
((char *)map)[0] = 'X';
msync(map, size, MS_SYNC);  // Does what? Nothing useful.
munmap(map, size);

// File is UNCHANGED. msync on MAP_PRIVATE is essentially a no-op.

Summary: Mastering Private Mappings

We've comprehensively explored private memory mappings—the copy-on-write mechanism that enables efficient file access, process isolation, and the fundamental fork() optimization.

Key Takeaways

•MAP_PRIVATE provides COW isolation — Reads share pages with the page cache; writes create private copies
•File is never modified — Changes exist only in your private pages, not the underlying file
•O_RDONLY is sufficient — You don't need file write permission since writes go to private copies
•Copy happens per-page on first write — Unmodified pages remain shared, minimizing memory overhead
•fork() efficiency depends on COW — Parent and child share all pages until one writes
•Snapshot consistency is per-page — Pages are snapshotted at fault time, not mmap() time
•Memory overhead scales with writes — More modified pages = more private memory consumption
•Use for read-only access, working copies, and isolation — The safe default when you don't need sharing

Module Complete:

You've now mastered memory-mapped files from every angle:

mmap() system call — The interface, parameters, and kernel mechanics
File as memory — The paradigm shift from read/write to direct access
Lazy loading — Demand paging, read-ahead, and working set optimization
Shared mappings — Multi-process sharing, visibility, synchronization
Private mappings — Copy-on-write, isolation, and efficient process duplication

Module Complete: Memory-Mapped Files

5 / 5