We've established that memory protection isolates processes from each other. But consider this scenario: you have 100 processes all running the same library—say, the C standard library (libc). Without sharing, each process would need its own copy of the library's code in physical memory. With libc being roughly 2MB, that's 200MB of RAM consumed by 100 identical copies of the same instructions.
This is absurd waste. The library code is read-only—it's identical in every process. Why not have all 100 processes share a single physical copy?
Memory sharing is the third fundamental goal of memory management: enabling controlled, safe access to common memory regions. Sharing seems to contradict protection, but the operating system achieves both simultaneously. This page explores how.
By the end of this page, you will understand:

- Why memory sharing is essential for efficiency
- The different forms of sharing (code, data, IPC)
- How sharing works with virtual memory mechanisms
- Copy-on-write as a sharing optimization
- Shared memory for inter-process communication
- How sharing and protection coexist
Memory sharing serves three primary purposes in operating systems: efficiency, communication, and functionality. Each represents a different use case with different requirements.
| Component | Size | Without Sharing (100 processes) | With Sharing |
|---|---|---|---|
| C Library (libc) | 2 MB | 200 MB | 2 MB |
| GUI Toolkit (Qt/GTK) | 20 MB | 2 GB | 20 MB |
| Language Runtime (Java/Python) | 50 MB | 5 GB | 50 MB |
| Kernel Code (mapped read-only) | 10 MB | 1 GB | 10 MB |
| Total | — | 8.2 GB | 82 MB |
The table above illustrates the dramatic impact of sharing. A server running 100 identical processes could consume 8+ GB of RAM for libraries alone—or just 82 MB with sharing enabled. This isn't optimization; it's the difference between a system that works and one that doesn't.
The Sharing-Protection Balance:

Sharing and protection might seem contradictory: protection exists precisely to keep processes out of each other's memory, while sharing deliberately gives multiple processes access to the same physical frames. The resolution lies in controlled sharing:
By default, processes are fully isolated. Sharing only occurs when explicitly configured: shared libraries loaded by the dynamic linker, memory regions explicitly shared via shmat() or mmap(), or pages subject to copy-on-write after fork(). This default-isolated approach maintains security while enabling efficiency.
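To make "explicitly configured" concrete, the sketch below shows the mmap() route. It is a minimal example with assumed Linux-style flags (MAP_ANONYMOUS) and error handling omitted: the parent creates a shared anonymous region, and the forked child's write is visible to the parent.

```c
// Minimal sketch: an explicitly shared anonymous mapping surviving fork()
#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    // MAP_SHARED: writes are visible to every process mapping this region
    int *shared = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    *shared = 0;

    if (fork() == 0) {      // child inherits the shared mapping
        *shared = 123;      // write goes to the shared physical frame
        return 0;
    }
    wait(NULL);
    printf("parent sees %d\n", *shared);   // prints 123: genuinely shared

    munmap(shared, sizeof(int));
    return 0;
}
```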
Memory sharing is implemented through the same virtual memory mechanisms used for protection. The key insight: multiple page table entries can point to the same physical frame.
Process A Page Table           Physical Memory           Process B Page Table
┌──────────────────┐           ┌─────────────┐           ┌──────────────────┐
│ VA 0x1000        │           │             │           │ VA 0x5000        │
│ → Frame 500 ─────┼──────────►│  Frame 500  │◄──────────┼───── Frame 500   │
│ (R-X, User)      │           │ (libc code) │           │ (R-X, User)      │
└──────────────────┘           └─────────────┘           └──────────────────┘
In this example, Process A maps virtual address 0x1000 and Process B maps virtual address 0x5000, yet both page table entries resolve to the same physical frame 500, which holds the libc code. Each mapping carries read-execute (R-X) user permissions.

Important Observations:

- Different virtual addresses, same physical frame: each process chooses its own virtual address for the mapping; only the physical destination is shared.
- Reference counting: the OS tracks how many page tables point to each frame and frees a shared frame only when the last mapping is removed.
- Consistent permissions: shared code is mapped read-execute in every process, so sharing never weakens protection; no process can modify the frame that others depend on.
```
// Simplified: Creating a shared mapping
function create_shared_mapping(process, key, virtual_addr, size, permissions):
    // Find or create the shared memory object
    shmem = find_shared_memory_object(key)
    if shmem is NULL:
        shmem = create_shared_memory_object(size)
        allocate_physical_frames(shmem, size)

    // Map into this process's address space
    for page in range(0, size, PAGE_SIZE):
        vpage = virtual_addr + page
        pframe = shmem.frames[page / PAGE_SIZE]

        // Create page table entry pointing to shared frame
        create_pte(process.page_table, vpage, pframe, permissions)

        // Increment reference count on physical frame
        pframe.ref_count += 1

    return virtual_addr

// When unmapping shared memory
function unmap_shared_memory(process, virtual_addr, size):
    for page in range(0, size, PAGE_SIZE):
        pte = get_pte(process.page_table, virtual_addr + page)
        pframe = pte.frame

        // Remove mapping
        invalidate_pte(pte)

        // Decrement reference count
        pframe.ref_count -= 1

        // Only free frame if no one else is using it
        if pframe.ref_count == 0:
            free_frame(pframe)
```

The TLB (Translation Lookaside Buffer) caches virtual-to-physical translations. Different processes have different page tables, so their TLB entries are tagged with an Address Space ID (ASID). When sharing, each process still looks up its own virtual address—it just happens to resolve to the same physical frame. The TLB entries are separate but point to the same destination.
The most impactful application of memory sharing is shared libraries (also called dynamic libraries or DLLs on Windows). Rather than including library code in every executable, programs link against shared libraries that are loaded once and shared among all processes that use them.
Static vs. Dynamic Linking:

With static linking, the library's code is copied into each executable at build time, so every process carries its own private copy and nothing can be shared. With dynamic linking, the executable stores only a reference to the library; the dynamic linker resolves that reference at load time, allowing every process to map the same in-memory copy.
How Shared Library Loading Works:

When the first process that uses a library starts, the dynamic linker maps the library file into the process's address space, and its pages are read from disk into the page cache on demand. When subsequent processes load the same library, the linker maps the already-resident frames into their page tables, consuming no additional physical memory for the code.
Position-Independent Code (PIC):
For sharing to work, library code cannot contain absolute addresses (which would only work at one specific virtual address). Instead, shared libraries are compiled as Position-Independent Code:
```
// Non-PIC (problematic for sharing):
mov eax, [0x12345678]     // Absolute address - only works at one location

// PIC (works anywhere):
lea rbx, [rip + got]      // Get address of GOT relative to current instruction
mov eax, [rbx + offset]   // Access through GOT
```
Position-independent code has slight overhead due to GOT/PLT indirection. On x86, this was significant (~5%); on x86-64 with PC-relative addressing, it's minimal (~1%). The memory savings from sharing far outweigh this penalty in virtually all scenarios.
Copy-on-Write (COW) is one of the most elegant optimizations in operating systems. It allows memory to be shared initially, with copying deferred until actually necessary—which may be never.
The Fork Problem:
The fork() system call creates a new process as an exact copy of the parent. Naively, this would require allocating new physical frames for the child's entire address space and copying every page of the parent's memory into them.
For a 1GB process, this means allocating and copying 1GB of memory—even if the child immediately exec()s a different program, discarding all that copied data.
The COW Solution:
Instead of copying, share everything:
```
// Fork with copy-on-write
function fork():
    child = create_process()

    // Copy page table structure (but not frame data)
    child.page_table = clone_page_table(parent.page_table)

    for each (pte, child_pte) in zip(parent.page_table, child.page_table):
        if pte.present and pte.writable:
            // Mark both mappings read-only and copy-on-write
            pte.writable = false
            pte.cow_flag = true        // Custom flag to track COW pages
            child_pte.writable = false
            child_pte.cow_flag = true

            // Increment reference count on the shared frame
            pte.frame.ref_count += 1

    // Flush TLB (protection bits changed)
    flush_tlb()

    return child

// Handle write fault on COW page
function handle_cow_fault(process, virtual_addr):
    pte = get_pte(process.page_table, virtual_addr)

    if not pte.cow_flag:
        // Not a COW page - genuine protection violation
        send_signal(process, SIGSEGV)
        return

    old_frame = pte.frame
    if old_frame.ref_count == 1:
        // We're the only user - just make it writable again
        pte.writable = true
        pte.cow_flag = false
    else:
        // Others are sharing - need to actually copy
        new_frame = allocate_frame()
        copy_frame_contents(old_frame, new_frame)
        pte.frame = new_frame
        pte.writable = true
        pte.cow_flag = false
        old_frame.ref_count -= 1

    // Retry the write instruction
```

COW in Action: The Timeline
Time T0: Before Fork
┌─────────────────────────────────────────┐
│ Parent Process │
│ Page 1 [RW] → Frame 100 (ref=1) │
│ Page 2 [RW] → Frame 101 (ref=1) │
└─────────────────────────────────────────┘
Time T1: After Fork (COW setup)
┌─────────────────────────────────────────┐
│ Parent Process │
│ Page 1 [RO,COW] → Frame 100 (ref=2) │
│ Page 2 [RO,COW] → Frame 101 (ref=2) │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ Child Process │
│ Page 1 [RO,COW] → Frame 100 (ref=2) │
│ Page 2 [RO,COW] → Frame 101 (ref=2) │
└─────────────────────────────────────────┘
Time T2: Child writes to Page 2 (COW triggered)
┌─────────────────────────────────────────┐
│ Parent Process │
│ Page 1 [RO,COW] → Frame 100 (ref=2) │
│ Page 2 [RO,COW] → Frame 101 (ref=1) │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ Child Process │
│ Page 1 [RO,COW] → Frame 100 (ref=2) │
│ Page 2 [RW] → Frame 102 (ref=1) │ ← New frame!
└─────────────────────────────────────────┘
Note: Only the page that was written gets copied. Pages that are never written remain shared forever.
Without COW, fork() would be prohibitively expensive for large processes. A 4GB browser forking would require allocating and copying 4GB of memory. With COW, fork() completes in microseconds regardless of process size. Pages are copied only when actually modified, and read-only pages (like code) are never copied at all.
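COW semantics are easy to observe from user space. In this minimal sketch (POSIX assumed, error handling omitted), the child's write triggers a private copy, so the parent's view stays untouched; contrast this with the MAP_SHARED example earlier, where the write was visible to both processes.

```c
// Minimal sketch: fork() shares pages COW; the first write splits them
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int value = 42;          // lives in a page shared COW after fork

    if (fork() == 0) {       // fork completes without copying pages
        value = 99;          // first write faults; kernel copies the page
        printf("child sees  %d\n", value);   // 99, the child's private copy
        return 0;
    }
    wait(NULL);
    printf("parent sees %d\n", value);       // still 42, original untouched
    return 0;
}
```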
While pipes, sockets, and message queues are common IPC mechanisms, they all involve copying data—from the sender's address space into kernel buffers, then from kernel buffers into the receiver's address space. For high-performance communication, this copying is unacceptable.
Shared memory IPC eliminates all copying. Both processes map the same physical frames, so data written by one is immediately visible to the other.
Performance Comparison:
| IPC Method | Copies per Message | Syscalls per Message | Latency | Throughput |
|---|---|---|---|---|
| Pipe/Socket | 2 (sender→kernel→receiver) | 2 (write/read) | Medium | Medium |
| Message Queue | 2 (sender→kernel→receiver) | 2 (msgsnd/msgrcv) | Medium | Medium |
| Shared Memory | 0 (direct access) | 0 (after setup) | Lowest | Highest |
```c
// POSIX Shared Memory API (Linux/macOS/BSD)
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

// ===== CREATE/OPEN SHARED MEMORY =====
// Open or create named shared memory object
int shm_fd = shm_open("/my_shared_mem",
                      O_CREAT | O_RDWR,  // Create if not exists
                      0666);             // Permissions

// Set the size
ftruncate(shm_fd, 4096);  // 4KB region

// Map into address space
void *ptr = mmap(NULL,        // Let OS choose address
                 4096,        // Size
                 PROT_READ | PROT_WRITE,
                 MAP_SHARED,  // Changes visible to others
                 shm_fd,
                 0);          // Offset

// Now both processes can use ptr to read/write shared data!

// ===== IMPORTANT: Synchronization required! =====
// Shared memory provides no synchronization
// You MUST use semaphores, mutexes, or atomics to prevent races

// ===== CLEANUP =====
munmap(ptr, 4096);             // Unmap from this process
close(shm_fd);                 // Close file descriptor
shm_unlink("/my_shared_mem");  // Remove shared memory object
```

Shared memory provides no synchronization whatsoever. Without external synchronization (semaphores, mutexes, or atomic operations), simultaneous reads and writes will cause race conditions, data corruption, and subtle bugs. Always pair shared memory with appropriate synchronization primitives.
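A common pattern is to keep the synchronization primitive inside the shared region itself. The following sketch is a minimal, Linux-oriented illustration; the name /demo_region, the struct layout, and the absence of error handling are all assumptions for brevity. It stores a process-shared POSIX semaphore next to the data it guards:

```c
// Minimal sketch: a process-shared semaphore living inside the region
// it protects. Names and layout are illustrative; error handling omitted.
#include <fcntl.h>
#include <semaphore.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

typedef struct {
    sem_t lock;          // must itself live in shared memory
    char  message[256];
} shared_region;

int main(void) {
    int fd = shm_open("/demo_region", O_CREAT | O_RDWR, 0666);
    ftruncate(fd, sizeof(shared_region));

    shared_region *r = mmap(NULL, sizeof(shared_region),
                            PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    // Only the creating process should do this, exactly once:
    sem_init(&r->lock, 1, 1);   // pshared=1: usable across processes

    sem_wait(&r->lock);                           // enter critical section
    strcpy(r->message, "hello from process A");   // safe while lock is held
    sem_post(&r->lock);                           // leave critical section

    munmap(r, sizeof(shared_region));
    close(fd);
    return 0;
}
```

The crucial detail is the pshared argument to sem_init(): passing 1 makes the semaphore usable across processes, but only if the sem_t itself lives in memory that every participant has mapped.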
Memory-mapped files extend the sharing concept to include disk files. Instead of using read() and write() system calls, a file is mapped into the process's address space. The file's contents appear as memory, and modifications are (eventually) written back to disk.
How It Works:

When a file is mapped, the OS creates page table entries that point into the page cache rather than allocating anonymous memory. Pages are loaded from disk on first access via page faults, and dirty pages are eventually written back to disk by the kernel (or immediately, with msync()).
Sharing Memory-Mapped Files:
When multiple processes map the same file with MAP_SHARED, they share the same physical frames:
   Process A             Physical Memory              Process B
┌──────────────┐         ┌────────────────┐        ┌──────────────┐
│ VA: 0x10000  │────────►│   Page Cache   │◄───────│ VA: 0x20000  │
│ file offset 0│         │   Frame 500    │        │ file offset 0│
└──────────────┘         │ (file page 0)  │        └──────────────┘
                         └────────────────┘
                                 │
                                 ▼
                          ┌──────────────┐
                          │  Disk File   │
                          │   data.bin   │
                          └──────────────┘
This enables:

- A single cached copy of the file, no matter how many processes map it
- Zero-copy sharing of file data between processes
- Coherent views: a write by one process is immediately visible to every other process mapping the file
```c
// Memory-mapped file example
#include <stdio.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

int main() {
    // Open the file
    int fd = open("data.bin", O_RDWR);

    // Get file size
    off_t size = lseek(fd, 0, SEEK_END);

    // Map the file into memory
    void *mapped = mmap(NULL,        // Let OS choose address
                        size,        // Map entire file
                        PROT_READ | PROT_WRITE,
                        MAP_SHARED,  // Changes visible to others
                        fd,          // File descriptor
                        0);          // Start at beginning of file

    if (mapped == MAP_FAILED) {
        perror("mmap failed");
        return 1;
    }

    // Now we can access file contents as memory!
    int *data = (int *)mapped;
    data[0] = 42;   // This writes to the file!

    // Changes written to disk automatically, but we can force it:
    msync(mapped, size, MS_SYNC);

    // Cleanup
    munmap(mapped, size);
    close(fd);
    return 0;
}
```

Memory-mapped files excel for random access to large files, when multiple processes need to share file data, for read-only access to configuration/data files, and when simplifying file I/O code. They're less suitable for sequential-only access (read() is fine), small files (setup overhead not worthwhile), or when you need precise control over when writes occur.
Memory sharing introduces complexities that don't exist with isolated processes. Understanding these challenges is essential for correctly using shared memory.
False Sharing in Detail:
False sharing is a particularly subtle performance problem. Consider this code:
```c
struct { int counter_a; int counter_b; } shared;

// Thread A                    // Thread B
while (1) {                    while (1) {
    shared.counter_a++;            shared.counter_b++;
}                              }
```
Logically, threads A and B access different variables—no data race exists. But if counter_a and counter_b are on the same cache line (64 bytes on most systems), every write by thread A invalidates thread B's cache line and vice versa. Performance may be 10-100x worse than expected.
Solution: Pad structures to ensure separate cache lines:
```c
struct {
    int counter_a;
    char padding[60];   // Ensure counter_b is on a different cache line
    int counter_b;
} shared;
```
Modern languages offer explicit alignment controls, such as the alignas specifier (C11/C++11) and C++17's std::hardware_destructive_interference_size constant, to handle this correctly.
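As a minimal C11 sketch (the struct name is illustrative), alignas expresses the same padding intent directly, without counting bytes by hand:

```c
#include <stdalign.h>   // C11 alignas

struct padded_counters {
    alignas(64) int counter_a;   // starts on its own 64-byte cache line
    alignas(64) int counter_b;   // forced onto the next cache line
};
```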
Shared memory is the fastest IPC mechanism, but also the most dangerous. Without rigorous synchronization and careful design, bugs are almost guaranteed. For most applications, higher-level mechanisms (message passing, RPC, channels) are safer. Reserve shared memory for performance-critical paths where the complexity is justified.
This page explored memory sharing as the third fundamental goal of memory management. Let's consolidate the key concepts:

- Sharing is implemented by pointing multiple page table entries at the same physical frame, with reference counting to track usage.
- Shared libraries, compiled as position-independent code, let all processes use a single copy of common code.
- Copy-on-write defers copying until a write actually occurs, making fork() fast regardless of process size.
- Shared memory and memory-mapped files provide zero-copy data sharing between processes.
- Writable sharing always requires explicit synchronization; sharing and protection coexist through default isolation plus controlled, explicit exceptions.
Beyond Physical Organization:
We've now covered three of the five memory management goals: allocation, protection, and sharing. The next page explores Logical Organization—how the OS organizes memory to match programmer expectations and enable modular program design.
You now understand how memory sharing works, why it's essential, and the challenges it introduces. Sharing complements protection—together they enable efficient, safe multiprogramming. Next, we'll examine how memory is organized from the programmer's logical perspective.