When a database server like PostgreSQL needs to share buffer pool data with 100 backend processes, or when a Chrome browser coordinates between dozens of renderer and plugin processes, what IPC mechanism do they choose?
Shared memory — because it's the only IPC mechanism with zero copy overhead.
Compare the alternatives:
| IPC Mechanism | Copies per Transfer | Kernel Crossings | Latency |
|---|---|---|---|
| Pipes/sockets | 2 (write buffer → kernel → read buffer) | 2 | ~1-2 µs |
| Message queues | 2 (similar to pipes) | 2 | ~1-2 µs |
| Shared memory | 0 (direct memory access) | 0 (after setup) | ~50 ns |
For small transfers, shared memory's ~50 ns access latency is 20-40x lower than the ~1-2 µs round trip of pipes or message queues; for bulk transfers the advantage grows further, since no bytes are copied at all. The data lives in memory that both processes can access directly — no copying, no system calls, no kernel involvement.
In this page, we'll explore how operating systems implement explicit shared memory for IPC: the APIs, the underlying virtual memory mechanics, and the critical synchronization requirements that make it safe to use.
By the end of this page, you will understand: the two major shared memory APIs (POSIX and System V), how shared memory maps to virtual and physical address spaces, the synchronization primitives required for safe shared memory use, memory-mapped files as an IPC mechanism, and practical patterns for building shared memory systems.
Shared memory for IPC is conceptually simple: create a region of physical memory that multiple processes can map into their virtual address spaces. But the implementation involves careful coordination between the virtual memory system, the kernel, and user-space programs.
How Shared Memory Works:
The Key Properties:
Same physical memory: Both processes' page tables point to identical physical frames.
Different virtual addresses: Each process can map the shared region at whatever virtual address is convenient. The addresses don't need to match.
Immediate visibility: A write by Process A is immediately visible to Process B (subject to CPU cache coherency, which is handled by hardware).
Persistence options: Shared memory can persist beyond the lifetime of creating processes (useful for handoff scenarios).
Kernel involvement only for setup: Once mapped, data access is direct memory access — no system calls needed.
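One practical consequence of the different virtual addresses property: a raw pointer stored inside the shared region points somewhere meaningful only in the process that stored it. A common convention, sketched below with illustrative names, is to store offsets from the region base instead:

```c
#include <stddef.h>

typedef struct {
    size_t head_offset;   // Offset from the start of the region, NOT a pointer
    char data[4096];
} Region;

// Convert an offset to a pointer valid in *this* process's mapping.
static inline void *region_ptr(Region *base, size_t offset) {
    return (char *)base + offset;
}

// Convert a pointer in this mapping back to a portable offset.
static inline size_t region_offset(Region *base, void *p) {
    return (size_t)((char *)p - (char *)base);
}
```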
Types of Shared Memory:
| Type | Backing | Persistence | Primary Use |
|---|---|---|---|
| POSIX Shared Memory | tmpfs (/dev/shm) | Until unlink or reboot | Modern IPC, portable |
| System V Shared Memory | Kernel shmem | Until explicit removal | Legacy systems, some databases |
| Anonymous mmap + fork | Anonymous pages | Until all mappers exit | Parent-child sharing |
| Memory-mapped files | File on disk | Persistent (file lifetime) | Database buffer pools, config |
| memfd_create | Anonymous tmpfs | FD-based lifetime | Sandboxed sharing, sealing |
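The anonymous mmap + fork row is the easiest to demonstrate. A minimal sketch: the parent creates a MAP_SHARED | MAP_ANONYMOUS mapping before forking, and the child's write through it is immediately visible to the parent:

```c
#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    // MAP_ANONYMOUS + MAP_SHARED: no backing file; shared with children
    int *counter = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (counter == MAP_FAILED) { perror("mmap"); return 1; }
    *counter = 0;

    if (fork() == 0) {           // Child writes through the shared mapping
        *counter = 42;
        _exit(0);
    }
    wait(NULL);                  // Parent sees the child's write directly
    printf("parent sees: %d\n", *counter);   // Prints 42
    munmap(counter, sizeof(int));
    return 0;
}
```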
POSIX shared memory is the modern, portable API for creating shared memory regions. It uses a simple paradigm: shared memory objects are named entities in a special filesystem (/dev/shm on Linux), accessed via file descriptors.
```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <semaphore.h>

#define SHM_NAME "/my_shared_memory"
#define SHM_SIZE 4096

// Shared data structure
typedef struct {
    sem_t mutex;          // Synchronization primitive (placed IN shared memory)
    int counter;
    char message[256];
} SharedData;

// ============================================
// PRODUCER PROCESS
// ============================================
int producer_main() {
    // 1. Create shared memory object
    int fd = shm_open(SHM_NAME,
                      O_CREAT | O_RDWR,   // Create if not exists, read-write
                      0666);              // Permissions
    if (fd == -1) { perror("shm_open"); return 1; }

    // 2. Set the size (only needed on creation)
    if (ftruncate(fd, SHM_SIZE) == -1) { perror("ftruncate"); return 1; }

    // 3. Map into address space
    SharedData *data = mmap(NULL,                    // Let kernel choose address
                            SHM_SIZE,                // Size
                            PROT_READ | PROT_WRITE,  // Permissions
                            MAP_SHARED,              // Modifications visible to others
                            fd,                      // File descriptor
                            0);                      // Offset
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    // 4. Close fd - mapping persists!
    close(fd);

    // 5. Initialize synchronization (pshared flag = 1: shared between processes)
    sem_init(&data->mutex, 1, 1);

    // 6. Use the shared memory
    for (int i = 0; i < 100; i++) {
        sem_wait(&data->mutex);
        data->counter++;
        snprintf(data->message, sizeof(data->message),
                 "Message %d from producer", data->counter);
        sem_post(&data->mutex);
        usleep(10000);  // 10ms
    }

    // 7. Cleanup (last process should do this)
    munmap(data, SHM_SIZE);
    // shm_unlink(SHM_NAME);  // Uncomment to remove shared memory object
    return 0;
}

// ============================================
// CONSUMER PROCESS
// ============================================
int consumer_main() {
    // 1. Open existing shared memory
    int fd = shm_open(SHM_NAME, O_RDWR, 0666);  // No O_CREAT
    if (fd == -1) { perror("shm_open (consumer)"); return 1; }

    // 2. Map it
    SharedData *data = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
    close(fd);
    if (data == MAP_FAILED) { perror("mmap (consumer)"); return 1; }

    // 3. Read from shared memory
    int last_value = -1;
    while (1) {
        sem_wait(&data->mutex);
        if (data->counter != last_value) {
            printf("Consumer sees: counter=%d, message='%s'\n",
                   data->counter, data->message);
            last_value = data->counter;
        }
        sem_post(&data->mutex);
        if (last_value >= 100) break;
        usleep(5000);
    }

    munmap(data, SHM_SIZE);
    return 0;
}
```

Notes: the object appears as a file in /dev/shm, and MAP_SHARED is required for changes to be visible across processes. A newly created shared memory object has size 0: attempting to map it without first calling ftruncate() will cause mmap to fail or return a zero-length mapping. Always set the size immediately after shm_open with O_CREAT.
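To try the example, wire producer_main() and consumer_main() into two separate executables and link with -pthread (older glibc also needs -lrt for shm_open). Note one simplification: the sketch assumes the producer's sem_init runs before the consumer's first sem_wait; production code typically serializes creation, for example with O_CREAT | O_EXCL.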
System V shared memory predates POSIX and uses a key-based identification system rather than file paths. While POSIX is generally preferred for new code, System V IPC remains important for interfacing with legacy systems and certain high-performance applications (like PostgreSQL's buffer management).
```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>

#define SHM_SIZE 4096

typedef struct {
    int counter;
    char data[256];
} SharedData;

int main() {
    // 1. Generate a unique key (or use IPC_PRIVATE for related processes)
    key_t key = ftok("/tmp/my_app", 'A');  // Path must exist, char is project ID
    if (key == -1) { perror("ftok"); return 1; }

    // 2. Create/get shared memory segment
    int shmid = shmget(key, SHM_SIZE, IPC_CREAT | 0666);  // Create if not exists
    if (shmid == -1) { perror("shmget"); return 1; }

    // 3. Attach segment to process address space
    //    (shmid; NULL = let kernel choose address; 0 = read/write flags)
    SharedData *data = (SharedData *)shmat(shmid, NULL, 0);
    if (data == (void *)-1) { perror("shmat"); return 1; }

    // 4. Use the shared memory
    printf("Attached at address: %p\n", (void *)data);
    data->counter = 42;
    strcpy(data->data, "Hello from System V shared memory!");

    // 5. Detach from address space (memory persists!)
    if (shmdt(data) == -1) { perror("shmdt"); return 1; }

    // 6. To remove (when no longer needed):
    // shmctl(shmid, IPC_RMID, NULL);
    return 0;
}

// ============================================
// Inspecting System V shared memory
// ============================================
/*
$ ipcs -m                 # List all System V shared memory segments
------ Shared Memory Segments --------
key        shmid    owner  perms  bytes  nattch
0x41010203 1234567  user   666    4096   2

$ ipcrm -m 1234567        # Remove segment by ID
$ ipcrm -M 0x41010203     # Remove segment by key
*/
```

| Aspect | POSIX | System V |
|---|---|---|
| Identification | Path-like names (/my_shm) | Numeric keys (ftok() or literal) |
| Namespace | /dev/shm filesystem | Kernel IPC namespace |
| API style | File descriptor based | ID-based (non-FD) |
| Persistence | Until unlink + unmap | Until IPC_RMID + detach |
| Portability | More portable (POSIX) | Available on older Unix |
| Size changes | ftruncate() anytime | Fixed at creation |
| Permissions | File-like (chmod analogy) | IPC permissions (shmctl) |
| Introspection | ls /dev/shm | ipcs -m |
| Limits | Filesystem limits | /proc/sys/kernel/shm* |
Prefer POSIX for new projects. Use System V when: (1) Interfacing with existing System V applications, (2) Need guaranteed huge page support (Linux shmget with SHM_HUGETLB), (3) Require precise control over memory placement (shmctl with SHM_LOCK), (4) Working on systems where /dev/shm has restrictive size limits.
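A minimal sketch of point (2), Linux-specific and assuming huge pages have been reserved beforehand (e.g. via vm.nr_hugepages):

```c
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#define HUGE_SIZE (2 * 1024 * 1024)   // One 2 MB huge page (x86-64 default)

int main(void) {
    // SHM_HUGETLB asks the kernel to back the segment with huge pages;
    // this fails with ENOMEM unless huge pages are reserved on the system.
    int shmid = shmget(IPC_PRIVATE, HUGE_SIZE,
                       IPC_CREAT | SHM_HUGETLB | 0600);
    if (shmid == -1) { perror("shmget(SHM_HUGETLB)"); return 1; }

    void *p = shmat(shmid, NULL, 0);
    if (p == (void *)-1) { perror("shmat"); return 1; }

    // ... use p: large working sets incur far fewer TLB misses ...

    shmdt(p);
    shmctl(shmid, IPC_RMID, NULL);    // Mark segment for removal
    return 0;
}
```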
Memory-mapped files blur the line between file I/O and shared memory. When multiple processes map the same file with MAP_SHARED, they share the same physical pages — the pages from the kernel's page cache.
Why Use mmap for IPC?
Persistence: the data outlives both processes because it is backed by a real file.
Debuggability: the shared state can be inspected with ordinary file tools.
Demand paging: only the pages actually touched are faulted in, so very large files map cheaply.
No double caching: mmap reuses the kernel's page cache pages rather than keeping a second copy.
```c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <time.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define IPC_FILE "/tmp/my_ipc_data"
#define DATA_SIZE 4096

typedef struct {
    uint32_t version;
    uint32_t sequence;
    char payload[DATA_SIZE - 8];
} IPCData;

// ============================================
// Writer process
// ============================================
int writer() {
    // 1. Open/create file
    int fd = open(IPC_FILE, O_RDWR | O_CREAT, 0666);
    if (fd == -1) { perror("open"); return 1; }

    // 2. Ensure file is right size
    if (ftruncate(fd, DATA_SIZE) == -1) { perror("ftruncate"); return 1; }

    // 3. Map with MAP_SHARED - changes go to the file AND are visible to other mappers
    IPCData *data = mmap(NULL, DATA_SIZE,
                         PROT_READ | PROT_WRITE,
                         MAP_SHARED,  // Key flag!
                         fd, 0);
    close(fd);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    // 4. Write data
    data->version = 1;
    for (int i = 0; i < 100; i++) {
        data->sequence = i;
        snprintf(data->payload, sizeof(data->payload),
                 "Update #%d at time %ld", i, (long)time(NULL));

        // Other mappers already see these writes through the shared page
        // cache pages; msync forces them out to the file on disk (durability).
        msync(data, DATA_SIZE, MS_SYNC);
        // For ordering between mappers, a memory barrier is what matters:
        // __sync_synchronize();

        sleep(1);
    }

    munmap(data, DATA_SIZE);
    return 0;
}

// ============================================
// Reader process
// ============================================
int reader() {
    int fd = open(IPC_FILE, O_RDONLY);
    if (fd == -1) { perror("open"); return 1; }

    struct stat st;
    fstat(fd, &st);

    IPCData *data = mmap(NULL, st.st_size,
                         PROT_READ,   // Read-only mapping
                         MAP_SHARED,
                         fd, 0);
    close(fd);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    // Read data
    uint32_t last_seq = (uint32_t)-1;
    while (1) {
        if (data->sequence != last_seq) {
            printf("Reader sees: seq=%u, payload='%s'\n",
                   data->sequence, data->payload);
            last_seq = data->sequence;
        }
        if (last_seq >= 99) break;   // Writer sends sequences 0..99
        usleep(100000);              // Poll every 100ms
    }

    munmap(data, st.st_size);
    return 0;
}
```

When you mmap a file with MAP_SHARED, you're mapping the page cache pages — the same pages used by read()/write(). This means: (1) No extra memory copies, (2) Traditional I/O and mmap share the same cached pages, (3) Multiple mmap processes share the same page cache pages. The file is truly a shared memory backing store.
Shared memory provides fast data transfer, but unsynchronized concurrent access leads to data corruption. Unlike message-passing IPC (pipes, sockets) where the kernel serializes operations, shared memory requires explicit synchronization.
The Race Condition Example:
```
// Process A                      // Process B
read counter  (value: 10)         read counter  (value: 10)
calculate: 10 + 1                 calculate: 10 + 1
write counter (11)                write counter (11)

// Expected: 12, Actual: 11 — lost update!
```
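For a simple shared counter, the lightest fix is an atomic read-modify-write. A minimal sketch using C11 atomics (the struct name is illustrative); because fetch-and-add is a single indivisible operation, no increment can be lost:

```c
#include <stdatomic.h>

typedef struct {
    atomic_int counter;   // Must live inside the MAP_SHARED region
} AtomicShared;

// Safe to call concurrently from any number of processes: the
// read-modify-write happens as one indivisible operation.
void increment(AtomicShared *sh) {
    atomic_fetch_add_explicit(&sh->counter, 1, memory_order_relaxed);
}
```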
Synchronization Options:
The main choices are POSIX semaphores placed in the shared region (as in the producer/consumer example above), process-shared pthread mutexes and condition variables, and C11 atomics for lock-free structures (sketched above).
POSIX mutexes can be configured to work across processes. The mutex must be placed in shared memory and initialized with PTHREAD_PROCESS_SHARED.
```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int data;
} SharedData;

// In creator process:
void init_shared(SharedData *sh) {
    pthread_mutexattr_t mutex_attr;
    pthread_mutexattr_init(&mutex_attr);
    pthread_mutexattr_setpshared(&mutex_attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&sh->mutex, &mutex_attr);
    pthread_mutexattr_destroy(&mutex_attr);

    pthread_condattr_t cond_attr;
    pthread_condattr_init(&cond_attr);
    pthread_condattr_setpshared(&cond_attr, PTHREAD_PROCESS_SHARED);
    pthread_cond_init(&sh->cond, &cond_attr);
    pthread_condattr_destroy(&cond_attr);
}

// In any process:
void update(SharedData *sh) {
    pthread_mutex_lock(&sh->mutex);
    sh->data++;
    pthread_cond_signal(&sh->cond);
    pthread_mutex_unlock(&sh->mutex);
}
```
Caveats:
The mutex must live inside the shared mapping itself and be initialized exactly once, before any process tries to lock it.
If a process dies while holding the lock, the other processes block forever unless the mutex was made robust (PTHREAD_MUTEX_ROBUST), as sketched below.
PTHREAD_PROCESS_SHARED support is an optional POSIX feature; check _POSIX_THREAD_PROCESS_SHARED on your platform.
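A minimal sketch of the robust variant mentioned above (assuming the mutex lives in a shared mapping):

```c
#include <errno.h>
#include <pthread.h>

// Robust variant: if the owner dies while holding the lock, the next
// locker gets EOWNERDEAD instead of deadlocking forever.
void init_robust(pthread_mutex_t *m) {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
}

void lock_robust(pthread_mutex_t *m) {
    int rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        // The previous owner died mid-critical-section: repair the
        // shared state here, then mark the mutex usable again.
        pthread_mutex_consistent(m);
    }
}
```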
CPU cache coherency (MESI protocol) ensures that writes by one CPU are eventually visible to others. But 'eventually' isn't enough for correct concurrent programs! You need explicit synchronization (mutexes, memory barriers) to guarantee ordering. Cache coherency prevents seeing stale data; synchronization prevents data races.
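To make the ordering point concrete, here is a minimal release/acquire publication sketch (names illustrative): the release store cannot be reordered before the payload write, and a reader that observes the flag is guaranteed to observe the payload too:

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    int payload;
    atomic_bool ready;
} Mailbox;

// Writer: fill the payload, then publish with release semantics so the
// payload write cannot be reordered after the flag write.
void publish(Mailbox *m, int value) {
    m->payload = value;
    atomic_store_explicit(&m->ready, true, memory_order_release);
}

// Reader: the acquire load pairs with the release store. If we see
// ready == true, we are guaranteed to see the payload as well.
int consume(Mailbox *m) {
    while (!atomic_load_explicit(&m->ready, memory_order_acquire))
        ;  // Spin (a real system would sleep or use a futex)
    return m->payload;
}
```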
Real-world shared memory systems employ sophisticated patterns to balance performance, reliability, and complexity. Let's examine one production-proven approach: double buffering with seqlock-style version checks.
```c
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define DATA_SIZE 4096

typedef struct {
    atomic_int current;        // 0 or 1 - which buffer is current
    atomic_uint version[2];    // Seqlock-style version counter per buffer
    char data[2][DATA_SIZE];   // The actual data buffers
} DoubleBuffer;

// Writer: update the non-current buffer, then swap
void db_update(DoubleBuffer *db, const char *new_data) {
    int write_idx = 1 - atomic_load(&db->current);   // Write to non-current
    atomic_fetch_add(&db->version[write_idx], 1);    // Odd: write in progress
    memcpy(db->data[write_idx], new_data, DATA_SIZE);
    atomic_fetch_add(&db->version[write_idx], 1);    // Even: write complete
    atomic_store(&db->current, write_idx);           // Atomic swap
}

// Reader: read from current buffer, verify consistency
bool db_read(DoubleBuffer *db, char *out) {
    while (1) {
        int read_idx = atomic_load(&db->current);
        unsigned v1 = atomic_load(&db->version[read_idx]);
        memcpy(out, db->data[read_idx], DATA_SIZE);
        unsigned v2 = atomic_load(&db->version[read_idx]);
        // If the version changed during the copy, or was odd (write in
        // progress), the data might be torn - retry.
        if (v1 == v2 && (v1 & 1) == 0) {
            return true;
        }
        // Else: retry
    }
}
```

False sharing occurs when unrelated data shares a cache line (typically 64 bytes). Process A writes field X, Process B reads field Y, but both are in the same cache line → cache invalidation ping-pong. Align frequently-accessed separate fields to different cache lines: alignas(64) atomic_int head; alignas(64) atomic_int tail;
memfd_create() (Linux 3.17+) creates an anonymous file in memory that can be shared via file descriptor passing. It combines the simplicity of anonymous memory with the security benefits of file descriptor-based access control.
Why memfd?
| Feature | POSIX shm | System V | memfd |
|---|---|---|---|
| Namespace visible | Yes (/dev/shm) | Yes (ipcs) | No (truly anonymous) |
| Access control | Filesystem | IPC perms | FD passing only |
| Sealing support | No | No | Yes |
| Sandboxing compatible | Partially | No | Yes |
Sealing is the killer feature: you can permanently prevent modifications to the shared memory, guaranteeing immutability.
```c
#define _GNU_SOURCE
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>    // memfd_create: glibc 2.27+; older systems need syscall()
#include <sys/socket.h>
#include <sys/stat.h>

// ============================================
// Creating sealed shared memory
// ============================================
int create_immutable_shared_mem(size_t size, const void *data) {
    // Create anonymous memory file
    int fd = memfd_create("immutable_config", MFD_ALLOW_SEALING);
    if (fd == -1) return -1;

    // Set size and write data
    ftruncate(fd, size);
    void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (ptr == MAP_FAILED) { close(fd); return -1; }
    memcpy(ptr, data, size);
    munmap(ptr, size);   // No writable mappings may remain before F_SEAL_WRITE

    // SEAL: no more modifications allowed, ever
    fcntl(fd, F_ADD_SEALS,
          F_SEAL_SHRINK |   // Cannot shrink
          F_SEAL_GROW   |   // Cannot grow
          F_SEAL_WRITE  |   // Cannot write (new mappings must be read-only)
          F_SEAL_SEAL);     // Cannot add more seals

    return fd;  // Share this FD with other processes
}

// ============================================
// Passing FD to another process
// ============================================
void send_fd(int socket, int fd_to_send) {
    struct msghdr msg = {0};
    char buf[CMSG_SPACE(sizeof(int))];
    msg.msg_control = buf;
    msg.msg_controllen = sizeof(buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;  // Pass file descriptors
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    *((int *)CMSG_DATA(cmsg)) = fd_to_send;

    // Send minimal data (the FD travels in the ancillary message)
    char dummy = 'x';
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    sendmsg(socket, &msg, 0);
}

// Receive the FD (the reverse of send_fd)
static int recv_fd(int socket) {
    struct msghdr msg = {0};
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    char buf[CMSG_SPACE(sizeof(int))];
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = buf;
    msg.msg_controllen = sizeof(buf);

    if (recvmsg(socket, &msg, 0) <= 0) return -1;
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_type != SCM_RIGHTS) return -1;
    int fd;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
    return fd;
}

// ============================================
// Receiving process maps it
// ============================================
void *receive_and_map(int socket, size_t *size_out) {
    int received_fd = recv_fd(socket);
    if (received_fd == -1) return NULL;

    // Verify seals before trusting the data
    int seals = fcntl(received_fd, F_GET_SEALS);
    if (!(seals & F_SEAL_WRITE)) {
        // WARNING: data could still be modified!
        close(received_fd);
        return NULL;
    }

    struct stat st;
    fstat(received_fd, &st);
    *size_out = st.st_size;

    // Map read-only (the WRITE seal enforces this anyway)
    void *ptr = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, received_fd, 0);
    close(received_fd);  // Mapping persists
    return ptr;  // Guaranteed immutable!
}
```

memfd excels in: (1) Sandboxed environments (no /dev/shm access needed), (2) Immutable shared config/data (sealing guarantees), (3) GPU buffer sharing (graphics drivers use it), (4) Container-to-host sharing (pass FD through Unix socket), (5) Zero-copy deserialization (map serialized data, verify seals, use directly).
We've explored the landscape of shared memory for inter-process communication.
What's Next:
Now that we understand how to share memory between processes, we must address a critical concern: protection. How does the operating system prevent one process from corrupting another's shared memory regions? How do we ensure that read-only mappings are truly read-only? The next page explores the protection mechanisms that make shared memory safe.
You now understand inter-process shared memory: the APIs (POSIX, System V, mmap, memfd_create), the synchronization requirements (process-shared mutexes, semaphores, and atomics), and patterns such as double buffering and sealed immutable memory. This knowledge enables you to build high-performance IPC systems and debug shared memory issues.