Shared memory is a double-edged sword. The same mechanism that enables zero-copy, high-performance IPC also creates potential security vulnerabilities. When two processes share memory, questions follow immediately: who is allowed to read it, who is allowed to write it, and what happens if one of the processes is malicious or compromised?
These questions aren't academic: they're the difference between a secure system and one vulnerable to exploitation. Shared memory vulnerabilities have been at the heart of major security incidents, from privilege escalation attacks to data leakage between containers.
In this page, we'll systematically examine the protection mechanisms that make shared memory safe: the hardware foundations, the kernel policies, security vulnerabilities and mitigations, and best practices for production systems.
By the end of this page, you will understand: how page table protection bits enforce access permissions, kernel-level access control for shared memory objects, the principle of least privilege in shared memory design, common vulnerabilities and their mitigations, and secure patterns for building shared memory systems.
The foundation of memory protection is hardware-enforced. The CPU's Memory Management Unit (MMU) checks every memory access against protection bits in the page table entry (PTE). The check is part of address translation itself, so it adds no measurable cost to an ordinary memory access.
Page Table Entry Protection Bits (x86-64):
| Bit | Name | Meaning When Set | Protection Effect |
|---|---|---|---|
| Bit 0 | Present | Page is in physical memory | Access to non-present page triggers page fault |
| Bit 1 | Read/Write | Page is writable | Write to read-only page triggers protection fault |
| Bit 2 | User/Supervisor | Page accessible from user mode | User access to supervisor page triggers protection fault |
| Bit 5 | Accessed | Page has been accessed (read, write, or instruction fetch) | No protection; used for page replacement algorithms |
| Bit 6 | Dirty | Page has been written | No protection; used for write-back decisions |
| Bit 63 | NX (No Execute) | Execution disabled (when EFER.NXE=1) | Execute from non-executable page triggers protection fault |
How Hardware Protection Works:
CPU executes: mov rax, [0x7f00001000] ; Read from user virtual address

1. TLB lookup:
   - If TLB hit with matching PCID (the x86 address-space tag), check protection
   - If TLB miss, walk page table
2. Page table walk (if needed):
   - Traverse PML4 → PDPT → PD → PT
   - Each level must have Present bit set
3. Protection check:
   - If CPL (Current Privilege Level) = 3 (user mode):
     - Check User/Supervisor bit (must be 1)
   - If instruction is write:
     - Check Read/Write bit (must be 1)
   - If fetching instruction:
     - Check NX bit (must be 0)
4. Outcome:
   - All checks pass: Translate and access memory
   - Any check fails: Raise exception (#PF page fault)
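The protection check in step 3 can be sketched in software. The following is a simplified model, not how any kernel or CPU implements it: the helper name `user_access_allowed` and the `access_t` enum are made up for illustration, and the model looks only at a single leaf PTE. Real hardware also combines the permissions from every level of the walk, and features such as SMEP, SMAP, and protection keys are ignored here.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86-64 PTE flag bits, matching the table above. */
#define PTE_PRESENT (1ULL << 0)
#define PTE_WRITE   (1ULL << 1)
#define PTE_USER    (1ULL << 2)
#define PTE_NX      (1ULL << 63)

typedef enum { ACCESS_READ, ACCESS_WRITE, ACCESS_EXEC } access_t;

/* Hypothetical helper: would a user-mode (CPL 3) access of the given kind
 * be allowed by this leaf PTE? Returning false corresponds to #PF. */
bool user_access_allowed(uint64_t pte, access_t kind)
{
    if (!(pte & PTE_PRESENT))
        return false;                 /* not present: page fault */
    if (!(pte & PTE_USER))
        return false;                 /* supervisor-only page */
    if (kind == ACCESS_WRITE && !(pte & PTE_WRITE))
        return false;                 /* write to a read-only page */
    if (kind == ACCESS_EXEC && (pte & PTE_NX))
        return false;                 /* instruction fetch from an NX page */
    return true;
}
```

For example, a PTE with Present, User, and NX set but Read/Write clear describes a read-only data page: under this model a read is allowed, while a write or an instruction fetch is rejected.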
Key insight: This enforcement is non-bypassable for user-space code. There's no syscall, no API, no privilege level that allows user code to skip page table checks.
Intel's Memory Protection Keys for Userspace (PKU, introduced on Skylake server processors) add another protection layer. Each page can be tagged with a 4-bit key (16 domains), and a user-accessible register (PKRU) holds an access-disable bit and a write-disable bit for each key. This enables fast, fine-grained protection changes without modifying page tables. Use case: temporarily disable write access to a region during security-sensitive operations, then re-enable it with a single register write.
```c
#include <signal.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

// Note: printf is not async-signal-safe; it is used here only to keep the demo short.
void segfault_handler(int sig, siginfo_t *info, void *context) {
    (void)sig;
    (void)context;
    printf("Protection fault at address: %p\n", info->si_addr);
    printf("Fault type: %s\n",
           (info->si_code == SEGV_MAPERR) ? "No mapping" :
           (info->si_code == SEGV_ACCERR) ? "Permission denied" : "Unknown");
    fflush(stdout);  // _exit() does not flush stdio buffers
    _exit(1);
}

int main(void) {
    // Set up signal handler for protection faults
    struct sigaction sa = {
        .sa_sigaction = segfault_handler,
        .sa_flags = SA_SIGINFO
    };
    sigaction(SIGSEGV, &sa, NULL);

    // Allocate read-only memory
    char *readonly = mmap(NULL, 4096,
                          PROT_READ,  // Read-only permission
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (readonly == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    printf("Read succeeds: %d\n", readonly[0]);  // OK (anonymous pages are zero-filled)

    printf("Attempting write to read-only page...\n");
    readonly[0] = 'X';  // Will trigger SIGSEGV (SEGV_ACCERR)

    return 0;  // Never reached
}
```

A critical feature of virtual memory is that the same physical page can have different permissions in different mappings. This enables powerful protection patterns for shared memory.
Asymmetric Permission Example:
Process A (Producer): Maps shared region as Read/Write
Process B (Consumer): Maps same physical pages as Read-Only
Producer can modify the data; consumer can only observe. If consumer tries to write, hardware triggers a fault.
```text
               Physical Frame 0x1234
                ┌──────────────────┐
                │   Shared Data    │
                │                  │
                └──────────────────┘
                         ▲
                         │
         ┌───────────────┴───────────────┐
         │                               │
   Process A PTE                   Process B PTE
   Frame: 0x1234                   Frame: 0x1234
   R/W:   1 (writable)             R/W:   0 (read-only)
   User:  1                        User:  1
   NX:    1 (no exec)              NX:    1 (no exec)
```
```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SHM_NAME "/asymmetric_demo"
#define SHM_SIZE 4096

// Process A: Creates shared memory with read-write access
void producer(void) {
    int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0644);  // Note: 0644 permissions
    if (fd == -1) {
        perror("shm_open");
        return;
    }
    if (ftruncate(fd, SHM_SIZE) == -1) {
        perror("ftruncate");
        close(fd);
        return;
    }

    char *data = mmap(NULL, SHM_SIZE,
                      PROT_READ | PROT_WRITE,  // Producer has write access
                      MAP_SHARED, fd, 0);
    close(fd);
    if (data == MAP_FAILED) {
        perror("mmap");
        return;
    }

    // Producer can write
    strcpy(data, "Hello from producer");
    printf("Producer wrote data\n");

    munmap(data, SHM_SIZE);
}

// Process B: Opens shared memory read-only
void consumer(void) {
    int fd = shm_open(SHM_NAME, O_RDONLY, 0);  // Open read-only
    if (fd == -1) {
        perror("shm_open");
        return;
    }

    char *data = mmap(NULL, SHM_SIZE,
                      PROT_READ,  // Consumer has read-only access
                      MAP_SHARED, fd, 0);
    close(fd);
    if (data == MAP_FAILED) {
        perror("mmap");
        return;
    }

    // Consumer can read
    printf("Consumer reads: %s\n", data);

    // Consumer cannot write - this would cause SIGSEGV!
    // data[0] = 'X';  // CRASH: Hardware protection fault

    munmap(data, SHM_SIZE);
}

// Key point: The kernel does not let Process B obtain a writable shared
// mapping from an O_RDONLY file descriptor. A MAP_SHARED mmap that requests
// PROT_WRITE on such a descriptor fails with EACCES.
```

Modern security practice mandates that no memory region should be both writable AND executable simultaneously. This prevents attackers from injecting code (requires write) and then executing it (requires execute). Enforce this by never using PROT_WRITE | PROT_EXEC together. For JIT compilers, use mprotect() to switch between write-mode (for compilation) and execute-mode (for running).
Beyond hardware protection, the kernel enforces access control policies that determine which processes can create, open, and map shared memory objects. This is the first line of defense—before a process can even attempt to access shared memory, the kernel verifies authorization.
POSIX Shared Memory Access Control
POSIX shared memory objects use filesystem-like permissions. On Linux, they exist in /dev/shm and have standard Unix permission bits.
```c
// Create shared memory with permissions 0640:
//   Owner:  read + write (6)
//   Group:  read only (4)
//   Others: no access (0)
int fd = shm_open("/my_data", O_CREAT | O_RDWR, 0640);

// In /dev/shm:
// -rw-r----- 1 user group 4096 Jan 15 10:00 my_data
```
Permission Checks:
shm_open(): Kernel checks if calling process's UID/GID allows requested access (read/write) according to permission bits.
mmap(): Kernel verifies that the file descriptor's open mode (O_RDONLY, O_RDWR) is compatible with requested protection (PROT_* flags).
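The mmap() check can be observed directly. The sketch below is an illustration under stated assumptions: the function name `try_map_readonly_fd` is made up, and it uses an ordinary temporary file under /tmp as a stand-in for a shared memory object, since the open-mode check in mmap() is the same for any file descriptor. It opens the file O_RDONLY and reports what happens when a shared mapping with the given protection is requested.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: request a MAP_SHARED mapping with protection `prot`
 * through a descriptor opened O_RDONLY.
 * Returns 0 if mmap succeeds, the errno value if it fails,
 * or -1 if the scratch file could not be set up. */
int try_map_readonly_fd(int prot)
{
    char path[] = "/tmp/shm_prot_demo_XXXXXX";  /* scratch file, stand-in for a shm object */
    int rw = mkstemp(path);
    if (rw == -1)
        return -1;
    if (ftruncate(rw, 4096) != 0) {
        close(rw);
        unlink(path);
        return -1;
    }

    int ro = open(path, O_RDONLY);  /* second descriptor, read-only */
    close(rw);
    unlink(path);                   /* the name is no longer needed */
    if (ro == -1)
        return -1;

    int result = 0;
    void *p = mmap(NULL, 4096, prot, MAP_SHARED, ro, 0);
    if (p == MAP_FAILED)
        result = errno;
    else
        munmap(p, 4096);

    close(ro);
    return result;
}
```

Calling `try_map_readonly_fd(PROT_READ)` returns 0, while `try_map_readonly_fd(PROT_READ | PROT_WRITE)` returns `EACCES`. Note that the restriction applies to shared mappings: a MAP_PRIVATE mapping may request PROT_WRITE on a read-only descriptor, because writes go to private copy-on-write pages and never reach the underlying object.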
Common patterns:
| Scenario | Permissions | Rationale |
|---|---|---|
| Single-user app | 0600 | Only owner can access |
| System daemon + clients | 0660 | Daemon and group members |
| World-readable config | 0644 | Anyone can read, owner writes |
| Exclusive IPC | 0600 + flock | Owner only, with locking |
When designing shared memory systems: (1) Grant the minimum permissions needed—if a process only reads, use 0444 or PROT_READ. (2) Prefer capability passing (FD) over named objects. (3) Avoid world-readable shared memory for sensitive data. (4) Consider separate shared regions for different trust levels.
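Capability passing, point (2) above, usually means sending an open file descriptor over a Unix domain socket with `SCM_RIGHTS`. The helpers below are a minimal sketch with made-up names (`send_fd`, `recv_fd`); they assume a connected `AF_UNIX` socket and transfer exactly one descriptor per message.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: send one descriptor over a connected Unix socket.
 * Returns 0 on success, -1 on failure. */
int send_fd(int sock, int fd)
{
    char byte = 'F';  /* one byte of ordinary data accompanies the control message */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;               /* forces correct alignment */
        char buf[CMSG_SPACE(sizeof(int))];
    } ctrl;
    memset(&ctrl, 0, sizeof ctrl);

    struct msghdr msg = { 0 };
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor.
 * Returns the new descriptor, or -1 on failure. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } ctrl;
    memset(&ctrl, 0, sizeof ctrl);

    struct msghdr msg = { 0 };
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    if (cm == NULL || cm->cmsg_level != SOL_SOCKET ||
        cm->cmsg_type != SCM_RIGHTS || cm->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    int fd;
    memcpy(&fd, CMSG_DATA(cm), sizeof(int));
    return fd;
}
```

With this pattern, a server can create the object with shm_open(), call shm_unlink() straight away, and hand the descriptor only to the peers it has authenticated. No name remains in /dev/shm for other processes to open, and a descriptor opened O_RDONLY gives the receiver read-only access regardless of the object's permission bits.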
Shared memory introduces unique security challenges that have been exploited in real-world attacks. Understanding these vulnerabilities is essential for building secure systems.
```c
#include <stddef.h>
#include <string.h>

typedef struct {
    size_t length;
    char data[1024];
} SharedBuffer;

// VULNERABLE CODE
void process_message(SharedBuffer *shared, char *output) {
    // CHECK: Validate length
    if (shared->length > 1024) {
        return;  // Invalid
    }

    // VULNERABILITY: Between check and use, attacker modifies shared->length

    // USE: Copy data based on (now modified) length
    memcpy(output, shared->data, shared->length);  // Buffer overflow!
}

// SECURE CODE
void process_message_safe(SharedBuffer *shared, char *output) {
    // Read the length exactly once into a LOCAL variable. The volatile
    // access stops the compiler from re-reading shared->length later.
    size_t len = *(volatile size_t *)&shared->length;

    // CHECK: Validate local copy
    if (len > 1024) {
        return;
    }

    // USE: Use local variable (attacker cannot modify)
    memcpy(output, shared->data, len);  // Safe!

    // Even better: copy entire structure to private memory first
    // SharedBuffer local_copy = *shared;
    // Then validate and use local_copy
}
```

Dirty COW (CVE-2016-5195) exploited a race condition in the Linux kernel's handling of copy-on-write mappings. An attacker could write to files they only had read access to (like /etc/passwd) by racing the COW page fault handler. The fix required careful synchronization in the page fault path. This attack demonstrates that even kernel-enforced protections can have vulnerabilities.
Containers use Linux namespaces to isolate shared memory between groups of processes. This is crucial for multi-tenant systems where different customers' containers run on the same host.
| Namespace | Effect on Shared Memory | Isolation Level |
|---|---|---|
| IPC namespace | Separate System V IPC ID space (and POSIX message queues); does not by itself isolate /dev/shm | Strong for System V shm: containers cannot see each other's segments |
| Mount namespace | Separate /dev/shm tmpfs mount per container | Strong: POSIX shm objects isolated |
| PID namespace | Different PID views; affects the PIDs that IPC tools report | Indirect: does not hide segments by itself |
| User namespace | UID/GID mapping; affects permission checks | Variable: Depends on mapping |
```shell
# Demonstrate IPC namespace isolation

# Create shared memory in host namespace
$ ipcmk -M 4096
Shared memory id: 12345

# Verify it exists
$ ipcs -m | grep 12345
0x... 12345 user 644 4096 0

# Run a new shell in a NEW IPC namespace
$ sudo unshare --ipc bash

# In the new namespace, the segment is NOT visible!
$ ipcs -m
------ Shared Memory Segments --------
key shmid owner perms bytes nattch
# (empty!)

# Create a segment in the new namespace
$ ipcmk -M 2048
Shared memory id: 0   # IDs start fresh in new namespace

# Exit and check host namespace - new segment not visible there
$ exit
$ ipcs -m | grep "2048"
# (nothing - the segment exists only in the isolated namespace)

# Docker containers use IPC namespace isolation by default:
$ docker run --ipc=host ...       # Share host IPC namespace (less secure)
$ docker run --ipc=private ...    # Isolated IPC namespace (default, secure)
$ docker run --ipc=container:X    # Share with container X's namespace

# Set the size of the container's /dev/shm
$ docker run --shm-size=256m ...
```

GPU workloads often require large shared memory regions for inter-process GPU buffer sharing. This creates tension with container isolation: some GPU use cases require --ipc=host or carefully crafted shared IPC namespaces. Evaluate the security trade-offs: is GPU sharing more important than container isolation?
Building secure shared memory systems requires disciplined patterns. Here are battle-tested approaches used in production systems.
```c
#include <openssl/crypto.h>  // CRYPTO_memcmp
#include <openssl/evp.h>
#include <openssl/hmac.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Critical shared data includes integrity protection
typedef struct {
    uint32_t sequence;
    uint32_t data_length;
    uint8_t data[1024];
    uint8_t hmac[32];  // SHA-256 HMAC
} SecureMessage;

static const uint8_t SECRET_KEY[32] = { /* ... */ };

// Writer: Compute and store HMAC
bool write_secure(SecureMessage *msg, const uint8_t *data, size_t len) {
    if (len > sizeof(msg->data)) {
        return false;  // Would overflow the shared buffer
    }
    msg->sequence++;
    msg->data_length = (uint32_t)len;
    memcpy(msg->data, data, len);

    // Compute HMAC over sequence + length + data
    unsigned int hmac_len;
    HMAC(EVP_sha256(), SECRET_KEY, sizeof(SECRET_KEY),
         (uint8_t *)msg, offsetof(SecureMessage, hmac),
         msg->hmac, &hmac_len);
    return true;
}

// Reader: Verify HMAC before trusting data
bool read_secure(SecureMessage *msg, uint8_t *out, size_t *len) {
    // FIRST: Copy to local storage (prevent TOCTOU)
    SecureMessage local = *msg;

    // SECOND: Validate length
    if (local.data_length > sizeof(local.data)) {
        return false;  // Invalid length
    }

    // THIRD: Verify integrity
    uint8_t expected_hmac[32];
    unsigned int hmac_len;
    HMAC(EVP_sha256(), SECRET_KEY, sizeof(SECRET_KEY),
         (uint8_t *)&local, offsetof(SecureMessage, hmac),
         expected_hmac, &hmac_len);

    // Constant-time comparison avoids leaking how many bytes matched
    if (CRYPTO_memcmp(local.hmac, expected_hmac, 32) != 0) {
        return false;  // Integrity check failed!
    }

    // FOURTH: Data is trusted, copy out
    memcpy(out, local.data, local.data_length);
    *len = local.data_length;
    return true;
}
```
```c
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

void *create_guarded_shared_memory(size_t data_size) {
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);

    // Round the data area up to whole pages: mprotect() needs a page-aligned
    // address, so the trailing guard must start on a page boundary.
    size_t data_pages = (data_size + page_size - 1) / page_size * page_size;

    // Allocate: guard + data + guard
    size_t total_size = page_size + data_pages + page_size;

    void *region = mmap(NULL, total_size,
                        PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) return NULL;

    // First page: No access (guard)
    if (mprotect(region, page_size, PROT_NONE) != 0) {
        munmap(region, total_size);
        return NULL;
    }

    // Last page: No access (guard)
    void *last_page = (char *)region + page_size + data_pages;
    if (mprotect(last_page, page_size, PROT_NONE) != 0) {
        munmap(region, total_size);
        return NULL;
    }

    // Return pointer to the usable region (between guards)
    return (char *)region + page_size;
}

// Any access that reaches a guard page triggers SIGSEGV. An overrun that
// stays within the rounding slack of the last data page is not caught.
```

In production systems, visibility into shared memory usage is essential for security monitoring, debugging, and capacity planning.
```bash
#!/bin/bash
# Comprehensive shared memory audit script

echo "=== POSIX Shared Memory (/dev/shm) ==="
ls -la /dev/shm/
du -sh /dev/shm/

echo ""
echo "=== System V Shared Memory ==="
ipcs -m -t  # With timestamps
echo ""
echo "Orphaned segments (nattch = 0):"
ipcs -m | awk 'NR>3 && $6==0 {print $0}'

echo ""
echo "=== Per-Process Shared Mappings ==="
for pid in $(pgrep -f "my_application"); do
    echo "PID $pid:"
    grep "shm\|shared" "/proc/$pid/maps" 2>/dev/null || echo "  (no shared mappings)"
    echo "  Shared memory totals:"
    grep -E 'Shared_(Clean|Dirty)' "/proc/$pid/smaps" 2>/dev/null |
        awk '{sum += $2} END {printf "    Shared: %d KB\n", sum}'
done

echo ""
echo "=== Large Shared Memory Consumers ==="
# Find processes with most shared memory
for smaps in /proc/[0-9]*/smaps; do
    pid=$(basename "$(dirname "$smaps")")
    grep "Shared" "$smaps" 2>/dev/null |
        awk -v pid="$pid" '{sum += $2} END {if (sum > 0) print sum " KB " pid}'
done | sort -rn | head -10

echo ""
echo "=== Security Concerns ==="
# Shared memory with any world read or write bit set
echo "World-accessible in /dev/shm:"
find /dev/shm -perm /006 -ls 2>/dev/null

# Suspiciously named shared memory
echo "Hidden files in /dev/shm:"
find /dev/shm -mindepth 1 -name '.*' ! -type d -ls 2>/dev/null
```

| Metric | Source | Alert Threshold |
|---|---|---|
| /dev/shm usage | df /dev/shm | 80% capacity |
| Orphaned System V segments | ipcs -m (nattch=0) | 10 segments or growing |
| Shared memory per process | /proc/PID/smaps | Unusual growth pattern |
| World-writable shm objects | find /dev/shm -perm | Any occurrence |
| shm_open/shmget calls | auditd, strace, eBPF | From unexpected processes |
Modern Linux systems can use eBPF to trace shared memory operations with minimal overhead. Tools like bpftrace can attach to the shmget, shmat, and mmap system calls, and to opens of files under /dev/shm (which is how glibc implements shm_open), and report in real time which processes access which shared memory. This is invaluable for security monitoring and debugging complex multi-process systems.
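The /dev/shm capacity threshold from the metrics table could also be checked from inside a monitoring agent rather than by parsing `df` output. The sketch below uses `statvfs()`; the helper names `usage_percent` and `fs_usage_exceeds` are hypothetical, and the percentage is a simple used-blocks ratio, which may differ slightly from what `df` prints.

```c
#include <sys/statvfs.h>

/* Hypothetical helper: percentage of blocks in use (0-100),
 * or -1 if the inputs are not meaningful. */
int usage_percent(unsigned long long total_blocks, unsigned long long free_blocks)
{
    if (total_blocks == 0 || free_blocks > total_blocks)
        return -1;
    return (int)(((total_blocks - free_blocks) * 100ULL) / total_blocks);
}

/* Hypothetical helper: returns 1 if the filesystem mounted at `path` is at
 * or above `threshold_pct` percent full, 0 if below, -1 on error. */
int fs_usage_exceeds(const char *path, int threshold_pct)
{
    struct statvfs st;
    if (statvfs(path, &st) != 0)
        return -1;

    int pct = usage_percent(st.f_blocks, st.f_bfree);
    if (pct < 0)
        return -1;
    return pct >= threshold_pct;
}
```

A monitoring loop would call `fs_usage_exceeds("/dev/shm", 80)` periodically and raise an alert when it returns 1.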
We've explored the comprehensive landscape of protection mechanisms for shared memory. Let's consolidate the key takeaways:
What's Next:
Now that we've covered the theoretical foundations and security aspects of shared memory, the final page will bring everything together with implementation details — how real operating systems like Linux implement shared memory, the kernel data structures involved, and how all the pieces we've studied fit together in practice.
You now understand shared memory protection comprehensively: hardware mechanisms (page table protection bits), kernel access control (permissions), common vulnerabilities (TOCTOU, disclosure, races), container isolation (namespaces), and secure coding patterns. This knowledge enables you to build secure shared memory systems and audit existing ones.