In the world of processes, isolation is the default. Each process has its own address space, and memory written in one process is invisible to another. This isolation provides security, stability, and predictability—but sometimes you need to break it.
Shared mappings provide a controlled mechanism for multiple processes to access the exact same physical memory. When two processes both mmap() the same file with MAP_SHARED, they're not just looking at the same file—they're accessing the same physical pages. A write in one process is immediately visible in the other:
   Process A                  Physical Memory                Process B
┌─────────────┐             ┌────────────────┐             ┌─────────────┐
│ Virtual Addr│────────────>│  Shared Page   │<────────────│ Virtual Addr│
│  0x7f10000  │             │  (Page Cache)  │             │  0x7f20000  │
└─────────────┘             └────────────────┘             └─────────────┘
       │                             │
   *ptr = 42;  ──────────────────────┴──────>  *ptr now reads 42
This sharing mechanism is fundamental to how modern systems work. Shared libraries, database systems, real-time communication channels, and inter-process coordination all leverage shared mappings. Understanding them deeply is essential for systems programming.
This page provides comprehensive coverage of shared memory mappings—MAP_SHARED semantics, visibility guarantees, synchronization requirements, common patterns for inter-process communication, and practical considerations for building shared-memory systems. You'll understand both the power and the pitfalls of shared mappings.
When you specify MAP_SHARED in your mmap() call, you're requesting a fundamentally different relationship between your process, the memory, and the underlying file:
The Three Key Guarantees:
Physical page sharing: Multiple processes mapping the same file section share the same physical pages (not copies)
Write visibility: Writes by one process are immediately visible to other processes mapping the same region (subject to CPU cache coherency, which is typically transparent on modern hardware)
File persistence: Writes to the mapped region will eventually be written back to the underlying file (controlled by kernel writeback and msync())
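These guarantees can be demonstrated in one short program. A minimal sketch, assuming a writable file demo.bin (the name is illustrative):
#include <sys/mman.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
int main(void) {
    int fd = open("demo.bin", O_RDWR | O_CREAT, 0644);
    ftruncate(fd, 4096);                       // Size the backing file to one page
    int *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }
    if (fork() == 0) {        // Child maps the same physical page (guarantee 1)
        p[0] = 42;            // Child's write is visible to the parent (guarantee 2)
        _exit(0);
    }
    wait(NULL);
    printf("parent sees: %d\n", p[0]);         // Prints 42
    msync(p, 4096, MS_SYNC);                   // Force writeback to demo.bin (guarantee 3)
    munmap(p, 4096);
    close(fd);
    return 0;
}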
Comparison with MAP_PRIVATE:
| Aspect | MAP_SHARED | MAP_PRIVATE |
|---|---|---|
| Physical pages | Shared among all mappers | Initially shared, COW on write |
| Writes by other processes | Visible immediately | Never visible |
| Your writes | Visible to other mappers | Visible only to you |
| File modification | File is modified | File is not modified |
| Memory consumption | Constant (pages shared) | Grows with writes (private copies) |
| Use case | IPC, shared state, updating files | Working copy, temporary modifications |
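To see the table's first three rows in action, here is a minimal sketch using fork() and a hypothetical file contrast.bin: the child's write through the shared mapping reaches the parent, while its write through the private mapping lands in a child-only copy.
#include <sys/mman.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
int main(void) {
    int fd = open("contrast.bin", O_RDWR | O_CREAT, 0644);
    ftruncate(fd, 4096);
    int *shrd = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED,  fd, 0);
    int *priv = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    shrd[0] = 0;
    priv[0] = 0;       // First write decouples the private page from the file (COW)
    if (fork() == 0) {
        shrd[0] = 1;   // Modifies the shared page-cache page
        priv[0] = 1;   // COW again: modifies a copy only the child can see
        _exit(0);
    }
    wait(NULL);
    printf("shared: %d, private: %d\n", shrd[0], priv[0]);   // shared: 1, private: 0
    return 0;
}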
File Descriptor Requirements:
For MAP_SHARED with PROT_WRITE, the file must be opened with write permission:
// Correct: file opened for read-write
int fd = open("shared_data.bin", O_RDWR);
void *map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
// Incorrect: this mmap() fails with EACCES
int fd2 = open("shared_data.bin", O_RDONLY);
void *map2 = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd2, 0);
// Error: MAP_SHARED + PROT_WRITE requires the file to be opened O_RDWR
For MAP_PRIVATE with PROT_WRITE, O_RDONLY is sufficient because writes create private copies that never touch the file.
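A short sketch of that contrast: the same PROT_WRITE request that failed above succeeds with MAP_PRIVATE, even on a read-only descriptor.
// MAP_PRIVATE + PROT_WRITE on a read-only fd: allowed
int fd3 = open("shared_data.bin", O_RDONLY);
char *copy = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd3, 0);
if (copy != MAP_FAILED) {
    copy[0] = 'x';   // Succeeds: modifies a private COW copy, never the file
}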
Kernel Page Cache Integration:
MAP_SHARED mappings are directly integrated with the kernel's page cache:
              ┌─────────────────────────────────────┐
              │             Page Cache              │
              │  ┌───────────────────────────────┐  │
              │  │ File: /data/shared.bin        │  │
Process A ────┼──┤ Page 0: 0x1234000 (physical)  │──┼──── Process B
(MAP_SHARED)  │  │ Page 1: 0x1235000 (physical)  │  │     (MAP_SHARED)
              │  │ Page 2: 0x1236000 (physical)  │  │
              │  └───────────────────────────────┘  │
              │            ↓ (writeback)            │
              └────────────┼────────────────────────┘
                           ↓
                    ┌──────────────┐
                    │  Disk File   │
                    └──────────────┘
Both processes' virtual addresses point to the same physical pages in the page cache. Writes modify those pages directly, and the kernel's writeback mechanism eventually syncs them to disk.
On modern multi-core systems with cache-coherent architectures (x86, ARM with CCN/CMN), when one core writes to a shared page, other cores automatically see the updated value (their caches are invalidated or updated by hardware). You don't need memory barriers for visibility between processes—but you still need synchronization for atomicity of compound operations.
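The distinction matters in practice. In this minimal sketch, two processes increment a plain int in a shared anonymous mapping; every individual store is immediately visible, yet updates are still lost because counter++ is a read-modify-write, not a single atomic step:
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
int main(void) {
    int *counter = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    *counter = 0;
    pid_t pid = fork();
    for (int i = 0; i < 1000000; i++)
        (*counter)++;                      // Racy: load, add, store can interleave
    if (pid == 0) _exit(0);
    wait(NULL);
    printf("counter = %d\n", *counter);    // Typically far less than 2000000
    return 0;
}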
There are two primary ways to establish shared mappings between processes:
Method 1: File-Based Mapping
Both processes independently mmap() the same file. The kernel recognizes the same underlying file (via inode) and maps them to the same physical pages:
// Process A
int fd_a = open("/shared/data.bin", O_RDWR);
void *map_a = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd_a, 0);
// Process B (running separately)
int fd_b = open("/shared/data.bin", O_RDWR);
void *map_b = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd_b, 0);
// map_a and map_b now point to the same physical pages!
// Different virtual addresses, same physical memory.
This works even if processes start at different times—the file provides a persistent rendezvous point.
Method 2: Anonymous Shared Memory (fork inheritance)
Parent creates a MAP_SHARED | MAP_ANONYMOUS mapping, then forks. The child inherits the mapping, and both share the same pages:
// Before fork
void *shared = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
pid_t pid = fork();
if (pid == 0) {
// Child: 'shared' points to same physical pages as parent
((int *)shared)[0] = 42;
exit(0);
} else {
// Parent: waits, then reads child's write
wait(NULL);
printf("Child wrote: %d\n", ((int *)shared)[0]); // Prints 42
}
For anonymous shared mappings that need to be shared with unrelated processes, use POSIX shared memory (shm_open).
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

#define SHM_NAME "/my_shared_region"
#define SHM_SIZE 4096

// Process A: Create and initialize shared memory
void create_shared_memory() {
    // Create shared memory object (like a file in /dev/shm)
    int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0666);
    if (fd == -1) { perror("shm_open"); return; }
    // Set size
    if (ftruncate(fd, SHM_SIZE) == -1) { perror("ftruncate"); close(fd); return; }
    // Map it
    void *ptr = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);   // Safe to close after mmap
    if (ptr == MAP_FAILED) { perror("mmap"); return; }
    // Initialize
    strcpy((char *)ptr, "Hello from Process A!");
    printf("Process A: Initialized shared memory\n");
    // Keep mapping open; other processes can now access
}

// Process B: Attach to existing shared memory
void attach_shared_memory() {
    // Open existing shared memory object
    int fd = shm_open(SHM_NAME, O_RDWR, 0666);
    if (fd == -1) { perror("shm_open"); return; }
    void *ptr = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (ptr == MAP_FAILED) { perror("mmap"); return; }
    // Read what Process A wrote
    printf("Process B: Read from shared memory: %s\n", (char *)ptr);
    // Write response
    strcpy((char *)ptr, "Hello from Process B!");
    munmap(ptr, SHM_SIZE);
}

// Cleanup
void cleanup_shared_memory() {
    shm_unlink(SHM_NAME);   // Remove when no longer needed
}
POSIX Shared Memory Benefits:
Named rendezvous: unrelated processes can attach by name, with no common ancestor or shared file path required
RAM-backed: objects live in tmpfs (/dev/shm on Linux), so access never waits on disk I/O
File-like API: ftruncate(), fstat(), mmap(), and permission bits work just as they do for regular files
Key Difference from File Mapping:
With regular file mapping, data persists on disk. With POSIX shared memory, data is typically volatile (stored in RAM-backed tmpfs). For persistent shared state, use a regular file; for ephemeral inter-process communication, use shm_open().
When one process writes to a shared mapping, when does another process see the change? This question involves multiple layers of hardware and software:
Cache Coherency (Hardware Level):
Modern multi-processor systems implement cache coherency protocols (e.g., MESI, MOESI) that ensure:
A write by one core invalidates or updates any stale copies of that cache line in other cores' caches
A subsequent read on any core observes the most recent write to that cache line
This happens automatically and transparently—you don't need to flush caches manually for visibility between CPUs.
Memory Ordering (Compiler/CPU Level):
However, both compilers and CPUs may reorder operations for optimization:
// Process A writes:
shared_data->value = 42;
shared_data->ready = 1;
// Without barriers, CPU might reorder: ready could become visible before value!
For correct synchronization, you need:
<stdatomic.h> or compiler intrinsics for atomic access
Memory barriers (or atomics with release/acquire ordering) to control the order in which writes become visible
Process-shared locks for compound operations (shown below)
Just because writes are 'visible' doesn't mean they're safe. Reading a partially-written structure, tearing of 64-bit values on 32-bit systems, and race conditions all apply. You MUST use proper synchronization when multiple processes access shared data concurrently.
#include <pthread.h>
#include <sys/mman.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

// Shared structure with synchronization
typedef struct {
    pthread_mutex_t mutex;   // Must be initialized with PTHREAD_PROCESS_SHARED
    int counter;
    char data[1024];
} SharedData;

void initialize_shared_data(SharedData *shared) {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    // CRITICAL: Enable sharing between processes
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&shared->mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    shared->counter = 0;
    memset(shared->data, 0, sizeof(shared->data));
}

// Safe increment from any process
void increment_counter(SharedData *shared) {
    pthread_mutex_lock(&shared->mutex);
    shared->counter++;
    pthread_mutex_unlock(&shared->mutex);
}

// Alternative: Lock-free atomic counter
typedef struct {
    atomic_int counter;
    atomic_int ready;   // atomic_int, not atomic_flag: atomic_flag has no load/store
} AtomicSharedData;

void atomic_increment(AtomicSharedData *shared) {
    atomic_fetch_add(&shared->counter, 1);
}

void signal_ready(AtomicSharedData *shared) {
    atomic_store(&shared->ready, 1);
    // Or with explicit ordering:
    // atomic_store_explicit(&shared->ready, 1, memory_order_release);
}

int wait_for_ready(AtomicSharedData *shared) {
    while (!atomic_load(&shared->ready)) {
        // Spin or use futex for efficient waiting
    }
    return atomic_load(&shared->counter);
}
The PTHREAD_PROCESS_SHARED Attribute:
POSIX synchronization primitives (mutexes, condition variables, read-write locks, barriers) can be shared between processes IF:
The primitive itself is stored inside the shared mapping
It is initialized with the PTHREAD_PROCESS_SHARED attribute
Without this attribute, the primitives may use process-private data structures that don't work across process boundaries.
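The same attribute pattern extends to condition variables via pthread_condattr_setpshared(). A minimal sketch, assuming the structure lives inside a shared mapping:
#include <pthread.h>

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             ready;
} SharedSync;

void init_shared_sync(SharedSync *s) {
    pthread_mutexattr_t ma;
    pthread_condattr_t  ca;
    pthread_mutexattr_init(&ma);
    pthread_mutexattr_setpshared(&ma, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&s->mutex, &ma);      // Mutex usable across processes
    pthread_mutexattr_destroy(&ma);
    pthread_condattr_init(&ca);
    pthread_condattr_setpshared(&ca, PTHREAD_PROCESS_SHARED);
    pthread_cond_init(&s->cond, &ca);        // Condvar usable across processes
    pthread_condattr_destroy(&ca);
    s->ready = 0;
}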
Futexes: Efficient Waiting on Shared Memory:
For Linux-specific high-performance scenarios, futexes (fast userspace mutexes) enable efficient waiting:
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
// Sleep while *addr == expected; returns once woken (or if the value already changed)
int futex_wait(int *addr, int expected) {
    return syscall(SYS_futex, addr, FUTEX_WAIT, expected, NULL, NULL, 0);
}
// Wake one process waiting on addr
int futex_wake(int *addr) {
    return syscall(SYS_futex, addr, FUTEX_WAKE, 1, NULL, NULL, 0);
}
Futexes avoid syscalls in the uncontended case while providing efficient blocking when waiting is needed.
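A sketch of how those wrappers might be used, assuming flag lives in a shared anonymous mapping: the parent sleeps in the kernel until the child flips the flag and wakes it.
int *flag = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
*flag = 0;
if (fork() == 0) {
    *flag = 1;               // Publish the change first...
    futex_wake(flag);        // ...then wake any sleeping waiter
    _exit(0);
}
while (*flag == 0)
    futex_wait(flag, 0);     // Sleeps only while *flag is still 0 (no lost wakeup)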
One of the most impactful uses of shared mappings is shared libraries (dynamic libraries, .so files on Unix, .dll on Windows). Understanding how they work illustrates the power of the kernel's page-sharing mechanism.
The Problem Without Sharing:
Consider a common library like libc (~2MB of code). If 100 processes each had their own copy, that would be roughly 200MB of RAM spent on duplicate pages of identical code, plus the extra cache and TLB pressure those duplicates create.
The Solution: Shared Code Pages:
When you run a dynamically linked program:
The dynamic linker (ld.so) mmap()s each required library's code segment, read-only and executable
The kernel backs every process mapping the same library file with the same physical pages in its page cache
Only writable data segments get per-process copies, via copy-on-write
Why MAP_PRIVATE for Libraries?
Read-only pages under MAP_PRIVATE are shared just like MAP_SHARED (no copy-on-write needed since there are no writes). Using MAP_PRIVATE provides safety: if something does try to write (e.g., debugger patching instructions), only that process gets a modified copy—others are unaffected.
Examining Shared Mappings:
# See what libraries a process maps
cat /proc/<pid>/maps | grep libc
# Example output:
# 7f2c10800000-7f2c109c8000 r-xp 00000000 08:01 783 /lib/x86_64-linux-gnu/libc.so.6
# 7f2c109c8000-7f2c10bc8000 ---p 001c8000 08:01 783 /lib/x86_64-linux-gnu/libc.so.6
# 7f2c10bc8000-7f2c10bcc000 r--p 001c8000 08:01 783 /lib/x86_64-linux-gnu/libc.so.6
# 7f2c10bcc000-7f2c10bce000 rw-p 001cc000 08:01 783 /lib/x86_64-linux-gnu/libc.so.6
# Interpretation:
# r-xp: Read + Execute + Private = code sections (shared in practice)
# ---p: No access = guard pages
# r--p: Read-only + Private = read-only data (shared)
# rw-p: Read-write + Private = writable data (COW, each process gets its own copy)
Memory Savings in Practice:
On a typical desktop system with 100+ processes all mapping libc, libstdc++, and other common libraries, each library's code occupies physical memory exactly once, regardless of how many processes use it.
The savings are enormous—and they come from the same page-sharing mechanism that underlies MAP_SHARED.
Shared mappings are foundational to many high-performance systems:
Pattern 1: Memory-Mapped Database Files
Databases like SQLite (with mmap mode), LMDB, and BoltDB use mmap() for data access:
// Simplified LMDB-style pattern
typedef struct {
size_t file_size;
void *data_map; // MAP_SHARED mapping of the data file
// ... metadata, locks, etc.
} Database;
Database *db_open(const char *path) {
Database *db = malloc(sizeof(Database));
int fd = open(path, O_RDWR);
struct stat sb;
fstat(fd, &sb);
db->file_size = sb.st_size;
// Map the entire database file
db->data_map = mmap(NULL, db->file_size, PROT_READ | PROT_WRITE,
MAP_SHARED, fd, 0);
close(fd);
return db;
}
void *db_get_page(Database *db, uint64_t page_num) {
size_t offset = page_num * PAGE_SIZE;
return (char *)db->data_map + offset;
}
// Reading data: just access memory!
// Writing data: just write to memory + msync() for durability
Benefits for Databases:
No read()/write() syscalls on the hot path—data access is a plain memory access
The kernel's page cache handles caching and eviction automatically
Zero-copy reads: no buffer copying between kernel and user space
Pattern 2: Ring Buffer for Inter-Process Communication
A lock-free ring buffer in shared memory enables fast producer-consumer communication:
#include <stdatomic.h>
#include <sys/mman.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

#define BUFFER_SIZE 1024
#define ENTRY_SIZE 256

typedef struct {
    atomic_size_t head;   // Write position (producer advances)
    atomic_size_t tail;   // Read position (consumer advances)
    char data[BUFFER_SIZE * ENTRY_SIZE];
} SharedRingBuffer;

// Producer side (message must point to ENTRY_SIZE bytes)
int ring_buffer_write(SharedRingBuffer *rb, const char *message) {
    size_t head = atomic_load_explicit(&rb->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&rb->tail, memory_order_acquire);
    size_t next_head = (head + 1) % BUFFER_SIZE;
    if (next_head == tail) {
        return -1;   // Buffer full
    }
    // Write data
    memcpy(&rb->data[head * ENTRY_SIZE], message, ENTRY_SIZE);
    // Make data visible before updating head
    atomic_store_explicit(&rb->head, next_head, memory_order_release);
    return 0;
}

// Consumer side
int ring_buffer_read(SharedRingBuffer *rb, char *out_message) {
    size_t tail = atomic_load_explicit(&rb->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&rb->head, memory_order_acquire);
    if (tail == head) {
        return -1;   // Buffer empty
    }
    // Read data
    memcpy(out_message, &rb->data[tail * ENTRY_SIZE], ENTRY_SIZE);
    // Make read complete before updating tail
    atomic_store_explicit(&rb->tail, (tail + 1) % BUFFER_SIZE, memory_order_release);
    return 0;
}

// Setup: Create in shared memory accessible to producer and consumer processes
// (error handling omitted for brevity)
SharedRingBuffer *create_shared_ring_buffer() {
    int fd = shm_open("/my_ring_buffer", O_CREAT | O_RDWR, 0666);
    ftruncate(fd, sizeof(SharedRingBuffer));
    SharedRingBuffer *rb = mmap(NULL, sizeof(SharedRingBuffer),
                                PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    atomic_init(&rb->head, 0);
    atomic_init(&rb->tail, 0);
    return rb;
}
Pattern 3: Configuration Hot-Reload
Applications can share configuration data that updates live:
// Config server maintains shared mapping of config data
typedef struct {
_Atomic uint64_t version;   // Note: C11 has no atomic_uint64_t typedef
char config_data[4096];
} SharedConfig;
// Client applications map the same file
SharedConfig *config = mmap(...);
// Fast check: has config changed?
uint64_t local_version = 0;
void maybe_reload_config() {
uint64_t current = atomic_load(&config->version);
if (current != local_version) {
// Config changed! Re-read config_data
process_new_config(config->config_data);
local_version = current;
}
}
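For completeness, a sketch of the writer side (publish_config is a hypothetical helper): the payload is written first, then the version is bumped with release ordering so a reader that observes the new version also observes the new data.
#include <string.h>

void publish_config(SharedConfig *config, const char *new_data, size_t len) {
    memcpy(config->config_data, new_data, len);        // Write payload first
    atomic_fetch_add_explicit(&config->version, 1,
                              memory_order_release);   // Then publish the version
}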
Pattern 4: Shared State Caches
Multiple worker processes can share computed results, avoiding redundant work across the pool. One worker fills an entry; every other worker reads it for free.
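A minimal sketch of such a cache, with all names illustrative: a fixed array of slots in a MAP_SHARED region, where each slot's filled flag is stored last so readers never observe a half-written entry. It assumes a single filler per slot; concurrent fillers would need per-slot locking or compare-and-swap.
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SLOTS 256

typedef struct {
    atomic_int filled;          // 0 = empty, 1 = valid (published last)
    uint64_t   key;
    char       value[240];
} CacheSlot;

typedef struct {
    CacheSlot slots[CACHE_SLOTS];
} SharedCache;                  // Place this structure in a MAP_SHARED region

const char *cache_lookup(SharedCache *c, uint64_t key) {
    CacheSlot *s = &c->slots[key % CACHE_SLOTS];
    if (atomic_load(&s->filled) && s->key == key)
        return s->value;
    return NULL;                // Miss: caller computes and inserts
}

void cache_insert(SharedCache *c, uint64_t key, const char *value) {
    CacheSlot *s = &c->slots[key % CACHE_SLOTS];
    s->key = key;
    strncpy(s->value, value, sizeof(s->value) - 1);
    s->value[sizeof(s->value) - 1] = '\0';
    atomic_store(&s->filled, 1);   // Publish only after key and value are written
}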
With MAP_SHARED backed by a file, your writes eventually reach the disk file. But "eventually" isn't sufficient for data durability—you need explicit control.
The msync() System Call:
#include <sys/mman.h>
int msync(void *addr, size_t length, int flags);
Flags:
MS_SYNC: Block until the writeback to the file has completed
MS_ASYNC: Schedule the writeback and return immediately
MS_INVALIDATE: Invalidate other mappings of the same file so they observe the freshly written values
Usage Patterns:
#include <sys/mman.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <stdint.h>

// Pattern 1: Durability for critical writes
// Note: msync() requires a page-aligned start address, so round down.
void durable_write(void *map, size_t offset, void *data, size_t len) {
    memcpy((char *)map + offset, data, len);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t start = offset & ~(page - 1);        // Align down to a page boundary
    size_t sync_len = (offset - start) + len;
    // Ensure write reaches disk before returning
    if (msync((char *)map + start, sync_len, MS_SYNC) == -1) {
        perror("msync MS_SYNC");
        // Handle error: data may not be durable!
    }
}

// Pattern 2: Batched async sync for throughput
typedef struct {
    void *map;
    size_t map_size;
    int pending_writes;
} BufferedMapper;

void buffered_write(BufferedMapper *bm, size_t offset, void *data, size_t len) {
    memcpy((char *)bm->map + offset, data, len);
    bm->pending_writes++;
    // Batch: sync every N writes
    if (bm->pending_writes >= 100) {
        msync(bm->map, bm->map_size, MS_ASYNC);   // Non-blocking
        bm->pending_writes = 0;
    }
}

void flush_all(BufferedMapper *bm) {
    msync(bm->map, bm->map_size, MS_SYNC);        // Wait for all to complete
    bm->pending_writes = 0;
}

// Pattern 3: Transaction safety (simplified, seqlock-style commit)
// Assumes the record sits at the page-aligned start of the mapping.
typedef struct {
    uint32_t version;   // Odd = update in progress, even = consistent
    uint32_t data;
} TransactionalRecord;

void atomic_update_record(TransactionalRecord *rec, uint32_t new_data) {
    rec->version++;                       // Now odd: marks update in progress
    msync(rec, sizeof(*rec), MS_SYNC);
    rec->data = new_data;                 // Write the new data
    msync(rec, sizeof(*rec), MS_SYNC);
    rec->version++;                       // Now even again: acts as commit marker
    msync(rec, sizeof(*rec), MS_SYNC);
    // On recovery: if version is odd, data might be incomplete
    // If version is even, the record is consistent
}
Performance Implications:
MS_SYNC blocks until the writeback completes—typically milliseconds per call—so reserve it for genuine commit points. MS_ASYNC only schedules writeback and returns immediately; it is nearly free, but by itself guarantees nothing about when the data becomes durable.
What msync() Guarantees:
After a successful MS_SYNC:
All modified pages in the given range have been written back to the underlying file
A subsequent read() of the file, from any process, returns the updated data
The writes survive a subsequent system crash, to the extent the file system has committed them
However, on some file systems with write caching, you may also need fsync(fd) for absolute guarantees.
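A sketch of that belt-and-braces pattern—keep the descriptor open after mmap() so it is still available for fsync():
msync(map, size, MS_SYNC);   // Flush dirty mapped pages to the file
fsync(fd);                   // Ask the file system to commit data to the device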
Interaction with munmap():
Unmapping a shared mapping (munmap()) does NOT guarantee data is synced to disk. Always call msync() before munmap() if durability matters:
// Wrong: data might be lost
munmap(map, size);
// Right: ensure durability
msync(map, size, MS_SYNC);
munmap(map, size);
Opening a file with O_DIRECT and then mmap()'ing it leads to undefined behavior on most systems. O_DIRECT bypasses the page cache, but mmap() relies on the page cache. If you need unbuffered I/O, use read()/write() with O_DIRECT, not mmap().
Shared mappings are powerful but come with pitfalls that can lead to subtle bugs:
// WRONG: Pointer in shared memory
typedef struct {
    char *next;          // Virtual address - INVALID in other processes!
    data_t data;
} SharedNodeBad;

// RIGHT: Offset-based linked structure
typedef struct {
    size_t next_offset;  // Offset from mapping base - valid everywhere
    data_t data;
} SharedNodeGood;

// Helper macros for offset-based access
#define OFFSET(base, ptr)     ((size_t)((char *)(ptr) - (char *)(base)))
#define RESOLVE(base, offset) ((void *)((char *)(base) + (offset)))

// Usage:
void *map_base;   // Set during mmap()

SharedNodeGood *get_next(SharedNodeGood *node) {
    if (node->next_offset == 0) return NULL;
    return (SharedNodeGood *)RESOLVE(map_base, node->next_offset);
}

void set_next(SharedNodeGood *node, SharedNodeGood *next) {
    node->next_offset = next ? OFFSET(map_base, next) : 0;
}
The Pointer Problem Illustrated:
Process A maps at 0x7f100000:
struct { char *ptr; } shared;
shared.ptr = &some_data; // 0x7f100500
Process B maps same region at 0x7f200000:
// shared.ptr is still 0x7f100500
// But in Process B, that address is:
// - Unmapped (SIGSEGV)
// - Or worse: mapped to something completely different!
*shared.ptr = 'x'; // Crash or corruption
Solution: Always use offsets relative to the mapping base, or indices into fixed arrays. Never store raw pointers in shared memory.
We've comprehensively explored shared memory mappings—the powerful mechanism that enables efficient multi-process data sharing, shared libraries, and high-performance inter-process communication.
What's Next:
Having explored shared mappings, we'll now examine the complementary technique: private mappings. Private mappings provide copy-on-write isolation, enabling each process to have its own modifiable view of file data without affecting others. This is essential for understanding process memory isolation and fork() semantics.
You now have deep understanding of shared memory mappings—from MAP_SHARED semantics to synchronization requirements to real-world applications. You can design shared-memory IPC systems, understand how shared libraries work, and avoid the common pitfalls that plague multi-process systems.