Loading learning content...
When an operating system creates a child process, one of the most consequential design decisions is how resources are shared between parent and child. This choice sits at the intersection of performance, isolation, security, and programming model—and different operating systems make fundamentally different tradeoffs.
At one extreme lies complete sharing: parent and child access the same memory, file descriptors, and identity. At the other extreme lies complete isolation: child receives entirely independent copies of everything. Most practical systems occupy a nuanced middle ground, sharing some resources while copying or isolating others.
Understanding these options is essential for writing correct concurrent programs, optimizing performance, debugging memory corruption, and designing secure systems.
By the end of this page, you will understand the complete taxonomy of resource sharing options: when to share, when to copy, and when to isolate. You'll learn the specifics of memory sharing (including copy-on-write), file descriptor semantics, and the clone() system call that enables fine-grained control over resource inheritance.
Operating systems offer three fundamental approaches to resource sharing between parent and child processes:
Parent and child access the same resources. Changes made by one are immediately visible to the other.
Characteristics:
Example: Threads share the same address space and file descriptor table.
Child receives an independent copy of all resources. Parent and child are completely isolated after creation.
Characteristics:
Example: Traditional fork() semantics—child gets copy of memory.
Some resources are shared, others are copied. Provides fine-grained control over isolation boundaries.
Characteristics:
Example: clone() with specific flags controls sharing.
| Aspect | Share Everything | Copy Everything | Selective Sharing |
|---|---|---|---|
| Creation Cost | Very Low | High (mitigated by COW) | Variable |
| Memory Isolation | None | Complete | Configurable |
| Communication Ease | Trivial (shared memory) | Requires IPC | Depends on choices |
| Synchronization Need | Critical | None needed | For shared resources |
| Fault Isolation | None (crash affects all) | Complete | Configurable |
| Security Boundary | Weak | Strong | Configurable |
| Typical Use Case | Threads | Processes (fork) | Containers, specialized apps |
The difference between threads and processes is primarily one of resource sharing. Threads share address space, file descriptors, and signal handlers within a process. Separate processes have independent address spaces. The clone() system call blurs this line by allowing any combination.
Memory is the most significant resource to consider when creating processes. The choice between sharing and copying has profound implications for performance, correctness, and isolation.
Naively copying all memory at fork() time would be prohibitively expensive—a process with 1GB of memory would require copying 1GB of data. Modern operating systems use Copy-on-Write (COW) to defer this cost.
How Copy-on-Write Works:
Cache-Efficient Fork:
When COW Isn't Enough:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172
#include <stdio.h>#include <stdlib.h>#include <unistd.h>#include <sys/wait.h>#include <string.h> #define PAGE_SIZE 4096#define NUM_PAGES 1000#define ARRAY_SIZE (PAGE_SIZE * NUM_PAGES) /** * Demonstrates Copy-on-Write behavior * * Key observations: * 1. Fork is fast even with large memory allocation * 2. Reading shared memory doesn't trigger copies * 3. Writing triggers per-page copies */ int main() { // Allocate large memory region char* large_array = malloc(ARRAY_SIZE); if (!large_array) { perror("malloc"); return 1; } // Initialize with pattern printf("Parent: Initializing %d bytes...\n", ARRAY_SIZE); memset(large_array, 'P', ARRAY_SIZE); printf("Parent: Memory initialized. Forking...\n"); // Time the fork pid_t pid = fork(); if (pid == 0) { // Child process // Reading doesn't copy - pages remain shared printf("Child: Reading from shared memory (no copy)...\n"); int first_char = large_array[0]; int last_char = large_array[ARRAY_SIZE - 1]; printf("Child: First='%c', Last='%c'\n", first_char, last_char); // Writing triggers COW for specific pages printf("Child: Writing to first page (triggers COW copy)...\n"); large_array[0] = 'C'; // Triggers COW for page 0 printf("Child: Writing to page 500 (triggers COW copy)...\n"); large_array[500 * PAGE_SIZE] = 'C'; // Triggers COW for page 500 // Verify our writes printf("Child: After write, first='%c' (expected 'C')\n", large_array[0]); // Check that unmodified pages still have parent's data printf("Child: Unmodified page 100, char='%c' (expected 'P')\n", large_array[100 * PAGE_SIZE]); _exit(0); } // Parent continues wait(NULL); // Verify parent's data unchanged despite child's writes printf("Parent: After child exit, first='%c' (expected 'P')\n", large_array[0]); printf("Parent: Isolation confirmed - child's writes didn't affect us\n"); free(large_array); return 0;}COW means the kernel can 'promise' more memory than physically exists, betting that not all COW pages will be written. If this bet fails (many processes write heavily), the system runs out of memory (OOM). Linux's OOM killer then terminates processes to reclaim memory. This is the 'overcommit' policy—configurable via /proc/sys/vm/overcommit_memory.
Different memory regions have different sharing behaviors:
| Region | Typical Sharing Behavior | Notes |
|---|---|---|
| Text (Code) Segment | Shared (read-only) | Never needs COW—identical, immutable code |
| Initialized Data | COW | Variables with initial values |
| BSS (Uninitialized) | COW | Zero-initialized global/static variables |
| Heap | COW | Dynamically allocated memory |
| Stack | COW (per-thread) | Each thread/process gets own stack |
| Shared Memory (mmap SHARED) | Truly Shared | MAP_SHARED regions bypass COW |
| Mapped Files (PRIVATE) | COW | MAP_PRIVATE copies on write |
File descriptors present a unique sharing scenario: the descriptor table is copied, but the underlying file table entries are shared. This creates both opportunities and hazards.
To understand file descriptor sharing, you must understand the three levels of abstraction:
File Descriptor Table (per-process)
File Table (system-wide)
Inode Table (system-wide)
Shared File Offset:
The most important consequence: parent and child share the file offset.
int fd = open("data.txt", O_RDWR);
lseek(fd, 0, SEEK_SET); // Offset = 0
pid_t pid = fork();
if (pid == 0) {
// Child
char buf[100];
read(fd, buf, 100); // Reads bytes 0-99, advances offset to 100
_exit(0);
}
wait(NULL);
// Parent's offset is now 100! Child's read advanced it.
char buf[100];
read(fd, buf, 100); // Reads bytes 100-199
This behavior is intentional—it enables concurrent log writing without explicit coordination:
// Both parent and child can write to log
// Writes are atomic (up to PIPE_BUF for pipes, larger for regular files)
// Shared offset ensures sequential log entries without gaps
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364
#include <stdio.h>#include <unistd.h>#include <fcntl.h>#include <sys/wait.h>#include <string.h> /** * Demonstrates file descriptor sharing semantics * * Key points: * 1. FD table is copied (parent and child have own tables) * 2. Both tables point to SAME file table entry * 3. File offset is shared and advances for both * 4. Closing FD in child doesn't affect parent's FD */ int main() { // Create test file int fd = open("shared_test.txt", O_RDWR | O_CREAT | O_TRUNC, 0644); const char* initial = "0123456789ABCDEF"; // 16 bytes write(fd, initial, 16); lseek(fd, 0, SEEK_SET); // Reset to start printf("Parent: File position before fork: %ld\n", (long)lseek(fd, 0, SEEK_CUR)); pid_t pid = fork(); if (pid == 0) { // Child printf("Child: Initial file position: %ld\n", (long)lseek(fd, 0, SEEK_CUR)); // Read 4 bytes - advances SHARED offset char buf[5] = {0}; read(fd, buf, 4); printf("Child: Read '%s', position now: %ld\n", buf, (long)lseek(fd, 0, SEEK_CUR)); // Close FD in child - parent's FD still works close(fd); printf("Child: Closed fd, exiting\n"); _exit(0); } wait(NULL); // Parent continues - offset was advanced by child printf("Parent: File position after child: %ld\n", (long)lseek(fd, 0, SEEK_CUR)); char buf[5] = {0}; read(fd, buf, 4); // Reads "4567" not "0123" printf("Parent: Read '%s' (expected '4567')\n", buf); // Our FD still works despite child closing theirs printf("Parent: FD still valid, position: %ld\n", (long)lseek(fd, 0, SEEK_CUR)); close(fd); unlink("shared_test.txt"); return 0;}Sometimes you DON'T want file descriptors to be inherited by children, especially when exec() is called. The close-on-exec flag causes a file descriptor to be automatically closed when the process exec()s a new program.
// Method 1: Set flag after open
int fd = open("secret.key", O_RDONLY);
fcntl(fd, F_SETFD, FD_CLOEXEC);
// Method 2: Use O_CLOEXEC at open time (preferred, atomic)
int fd = open("secret.key", O_RDONLY | O_CLOEXEC);
// Method 3: Set non-inheritable (same effect)
// After exec(), this FD will not exist in the new program
Security Implication: Sensitive file descriptors (passwords, crypto keys, privileged sockets) should always use close-on-exec to prevent leaking to exec'd programs.
Modern security-conscious code sets O_CLOEXEC by default on all file descriptors. Only explicitly clear it when you specifically want inheritance. This prevents accidentally leaking sensitive resources to exec'd child processes.
While fork() provides all-or-nothing semantics (copy everything, except what COW optimizes), Linux's clone() system call enables precise control over what is shared and what is copied.
Conceptually:
fork() = clone() with no sharing flagsclone() with maximum sharing flagsThe clone() system call accepts flags that individually control each resource:
| Flag | Effect When SET | Effect When UNSET |
|---|---|---|
| CLONE_VM | Share virtual memory (address space) | Copy address space (COW) |
| CLONE_FS | Share filesystem info (cwd, root, umask) | Copy filesystem info |
| CLONE_FILES | Share file descriptor table | Copy file descriptor table |
| CLONE_SIGHAND | Share signal handlers | Copy signal handlers |
| CLONE_THREAD | Place in same thread group | Create new process (new TGID) |
| CLONE_PARENT | New process is sibling, not child | Normal parent-child |
| CLONE_NEWPID | Create new PID namespace | Inherit PID namespace |
| CLONE_NEWNS | Create new mount namespace | Inherit mount namespace |
| CLONE_NEWNET | Create new network namespace | Inherit network namespace |
| CLONE_NEWUSER | Create new user namespace | Inherit user namespace |
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475
#define _GNU_SOURCE#include <stdio.h>#include <stdlib.h>#include <unistd.h>#include <sched.h>#include <sys/wait.h>#include <sys/mman.h> #define STACK_SIZE (1024 * 1024) // Shared variable (only meaningful with CLONE_VM)int shared_variable = 0; int child_function(void* arg) { printf("Child: Entered child function\n"); printf("Child: PID=%d, PPID=%d\n", getpid(), getppid()); // Modify shared variable shared_variable = 42; printf("Child: Set shared_variable to %d\n", shared_variable); return 0;} /** * Demonstrate clone() with different sharing combinations */int main() { // Allocate stack for child (required for clone) void* child_stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0); if (child_stack == MAP_FAILED) { perror("mmap"); return 1; } // Stack grows downward - pass top of stack void* stack_top = child_stack + STACK_SIZE; printf("Parent: Before clone, shared_variable = %d\n", shared_variable); // Example 1: Share virtual memory (like threads) printf("\n=== Clone with CLONE_VM (shared memory) ===\n"); pid_t pid = clone(child_function, stack_top, CLONE_VM | SIGCHLD, // Share VM, send SIGCHLD on exit NULL); if (pid == -1) { perror("clone"); return 1; } printf("Parent: Created child with PID %d\n", pid); // Wait for child waitpid(pid, NULL, 0); // With CLONE_VM, child's modification is visible to parent printf("Parent: After child exit, shared_variable = %d\n", shared_variable); if (shared_variable == 42) { printf("Parent: Memory was SHARED (expected with CLONE_VM)\n"); } else { printf("Parent: Memory was COPIED\n"); } // Cleanup munmap(child_stack, STACK_SIZE); return 0;}POSIX threads (pthreads) are implemented using clone() with these typical flags:
clone(thread_function,
child_stack,
CLONE_VM | // Share address space
CLONE_FS | // Share FS info
CLONE_FILES | // Share file descriptors
CLONE_SIGHAND | // Share signal handlers
CLONE_THREAD | // Same thread group (TGID)
CLONE_SETTLS | // Set thread-local storage
CLONE_PARENT_SETTID | // Write TID to parent memory
CLONE_CHILD_CLEARTID, // Clear TID on exit (for futex)
arg);
This creates a new thread that shares almost everything with its creator—the defining characteristic of threads versus processes.
The CLONE_NEW* flags create new namespaces, which is the foundation of Linux containers:
clone(init_function,
stack_top,
CLONE_NEWPID | // New PID namespace (child sees itself as PID 1)
CLONE_NEWNS | // New mount namespace (isolated filesystem view)
CLONE_NEWNET | // New network namespace (private network stack)
CLONE_NEWUTS | // New UTS namespace (private hostname)
CLONE_NEWIPC | // New IPC namespace (private message queues)
CLONE_NEWUSER | // New user namespace (private UID mappings)
SIGCHLD,
NULL);
Docker, Kubernetes, and LXC all use these mechanisms.
Linux 5.3 introduced clone3(), which uses a struct for arguments instead of positional parameters. This allows adding new features without API breakage. New code should prefer clone3() when available, though glibc may still use clone() internally.
Let's examine how different applications and patterns use resource sharing strategically.
Scenario: Web server wants multiple worker processes to serve requests.
Sharing Strategy:
// Master opens socket, then forks
int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
bind(listen_fd, ...);
listen(listen_fd, 128);
for (int i = 0; i < num_workers; i++) {
if (fork() == 0) {
// Worker process
while (1) {
int conn = accept(listen_fd, ...); // Inherited FD
handle_request(conn);
close(conn);
}
}
}
Examples: Apache prefork MPM, Nginx workers, Gunicorn
Scenario: Run untrusted code with minimal access to parent's resources.
Sharing Strategy:
// Highly isolated child
clone(sandbox_function,
stack_top,
CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWNET | CLONE_NEWUSER | SIGCHLD,
NULL);
// In sandbox_function:
// - Close all FDs
// - Drop capabilities
// - Apply seccomp filter
// - exec untrusted binary
Examples: Chrome sandbox, Firejail, systemd-nspawn
Scenario: Parallel algorithm with shared data structures.
Sharing Strategy:
pthread_t threads[NUM_THREADS];
// Shared data
int results[NUM_THREADS];
int shared_counter = 0;
pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
for (int i = 0; i < NUM_THREADS; i++) {
pthread_create(&threads[i], NULL, worker, &i);
}
// Workers share address space, can access shared_counter with synchronization
Examples: OpenMP parallel regions, Java parallel streams, Rayon in Rust
Many real systems use hybrid approaches: fork() for process isolation, then mmap(MAP_SHARED) for specific shared regions. This gives per-process protection for most memory while enabling efficient communication through designated shared areas.
Different operating systems offer different resource sharing mechanisms, reflecting different design philosophies.
Windows does not have fork()—it uses CreateProcess() which always loads a new executable. Resource sharing is controlled differently:
| Resource | Linux (fork/clone) | Windows (CreateProcess) |
|---|---|---|
| Memory | COW copy (fork) or shared (clone+VM) | Always separate; use shared memory explicitly |
| File Handles | All inherited by default | Only handles marked inheritable are inherited |
| Threads | clone() with CLONE_THREAD | CreateThread() within process |
| Environment | Inherited by default | Explicitly passed or inherited |
| Directory | Inherited | Explicitly specified or inherited |
| Security Context | Inherited UID/GID | Inherited token or specified |
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667
#include <windows.h>#include <stdio.h> /** * Windows handle inheritance example * * Handles must be explicitly marked inheritable */int main() { // Create security attributes for inheritable handle SECURITY_ATTRIBUTES sa; sa.nLength = sizeof(sa); sa.bInheritHandle = TRUE; // Key: mark as inheritable sa.lpSecurityDescriptor = NULL; // Create pipe with inheritable handles HANDLE read_pipe, write_pipe; CreatePipe(&read_pipe, &write_pipe, &sa, 0); // Prepare child process STARTUPINFOA si = {0}; PROCESS_INFORMATION pi = {0}; si.cb = sizeof(si); // Pass handle value to child via environment or command line char cmd[256]; sprintf(cmd, "child.exe %p", write_pipe); // Pass handle as argument // Create child with handle inheritance ENABLED BOOL success = CreateProcessA( NULL, // App name cmd, // Command line with handle NULL, // Process security NULL, // Thread security TRUE, // INHERIT HANDLES (crucial) 0, // Flags NULL, // Environment NULL, // Directory &si, &pi ); if (!success) { printf("CreateProcess failed: %d\n", GetLastError()); return 1; } printf("Parent: Created child, PID=%d\n", pi.dwProcessId); printf("Parent: Child inherited write_pipe handle\n"); // Close our copy of write end CloseHandle(write_pipe); // Read from child char buf[100]; DWORD bytesRead; ReadFile(read_pipe, buf, sizeof(buf), &bytesRead, NULL); buf[bytesRead] = '\0'; printf("Parent read from child: %s\n", buf); WaitForSingleObject(pi.hProcess, INFINITE); CloseHandle(pi.hProcess); CloseHandle(pi.hThread); CloseHandle(read_pipe); return 0;}macOS and BSD support fork() but encourage posix_spawn() for new code:
posix_spawn() atomically creates process and execs new program#include <spawn.h>
extern char **environ;
posix_spawn_file_actions_t actions;
posix_spawn_file_actions_init(&actions);
// Close fd 3 in child
posix_spawn_file_actions_addclose(&actions, 3);
// Redirect stdin from file
posix_spawn_file_actions_addopen(&actions, 0, "input.txt", O_RDONLY, 0);
pid_t pid;
posix_spawn(&pid, "/bin/ls", &actions, NULL,
(char*[]){"ls", "-la", NULL}, environ);
Resource sharing is one of the most consequential decisions in process creation, affecting performance, isolation, security, and programming model. Modern operating systems provide flexible mechanisms to make these tradeoffs explicit.
What's Next:
With resource sharing options understood, the next page examines execution options—the choices a parent has for how it and its child execute after creation: concurrently, sequentially, or in some combined pattern. These decisions affect program structure, responsiveness, and resource utilization.
You now understand the complete spectrum of resource sharing options available during process creation. From COW memory to shared file descriptors to namespace isolation, these mechanisms enable everything from high-performance threading to secure containerization.