The address space—the virtual memory layout containing code, data, heap, and stack—is perhaps the most fundamental resource of any process. When creating a new process, the operating system faces a critical architectural decision: should the child have its own independent address space, share the parent's address space, or something in between?
This choice defines the boundary between processes and threads, affects performance characteristics, determines isolation guarantees, and shapes the programming model. Understanding address space options is essential for making informed decisions about concurrency, security, and system architecture.
By the end of this page, you will master the complete spectrum of address space configurations: separate address spaces (fork), shared address spaces (threads/CLONE_VM), vfork for immediate exec optimization, posix_spawn as a modern alternative, and the memory management mechanisms that make each option efficient.
Before examining address space options, let's review the structure of a process's virtual address space and how the operating system manages it.
A typical process address space contains several distinct regions:
| Region | Contents | Permissions | Growth |
|---|---|---|---|
| Text | Compiled code (instructions) | Read + Execute | Fixed at load time |
| Data | Initialized global/static variables | Read + Write | Fixed at load time |
| BSS | Uninitialized global/static variables | Read + Write | Fixed (zero-filled) |
| Heap | malloc(), new allocations | Read + Write | Grows upward with brk/mmap |
| mmap | Shared libraries, mapped files | Variable | Created on demand |
| Stack | Local variables, return addresses | Read + Write | Grows downward with calls |
| Kernel | Kernel code and data | None (user mode) | Shared across processes |
Each process has its own virtual address space, which is mapped to physical memory through page tables.
Key Insight: Address space isolation is achieved through separate page tables. When the CPU switches between processes, it loads different page tables, making the same virtual address point to different physical memory.
```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

// Global variables - in data/bss sections
int initialized_global = 42;     // Data section
int uninitialized_global;        // BSS section
const int const_global = 100;    // Often in read-only data

void sample_function() {
    // This function's code is in the text segment
}

/**
 * Displays the memory layout of the current process
 */
int main() {
    // Stack variable
    int stack_var = 1;

    // Heap variable
    int *heap_var = malloc(sizeof(int));
    *heap_var = 2;

    printf("=== Process Memory Layout ===\n");
    printf("PID: %d\n\n", getpid());

    printf("Text (Code) Section:\n");
    printf("  main():              %p\n", (void*)main);
    printf("  sample_function():   %p\n", (void*)sample_function);

    printf("\nData/BSS Sections:\n");
    printf("  initialized_global:  %p (value: %d)\n",
           (void*)&initialized_global, initialized_global);
    printf("  uninitialized_global:%p (value: %d)\n",
           (void*)&uninitialized_global, uninitialized_global);
    printf("  const_global:        %p (value: %d)\n",
           (void*)&const_global, const_global);

    printf("\nHeap:\n");
    printf("  heap_var:            %p (value: %d)\n",
           (void*)heap_var, *heap_var);

    printf("\nStack:\n");
    printf("  stack_var:           %p (value: %d)\n",
           (void*)&stack_var, stack_var);

    // Show relative positions
    printf("\n=== Address Comparison ===\n");
    printf("Stack is %s heap\n",
           (void*)&stack_var > (void*)heap_var ? "ABOVE" : "BELOW");
    printf("Heap is %s globals\n",
           (void*)heap_var > (void*)&initialized_global ? "ABOVE" : "BELOW");
    printf("Globals are %s code\n",
           (void*)&initialized_global > (void*)main ? "ABOVE" : "BELOW");

    free(heap_var);
    return 0;
}
```

The classic Unix fork() system call creates a child with its own separate address space. This provides maximum isolation: parent and child cannot directly affect each other's memory.
At the moment of fork():
| Element | fork() Behavior | Notes |
|---|---|---|
| Page Tables | Copied (new set for child) | Maps same virtual addresses initially |
| Virtual Address Layout | Identical copy | All addresses look the same |
| Physical Pages (initially) | Shared via COW | Marked read-only for both |
| Physical Pages (after write) | Copied on demand | Writer gets private copy |
| Text Segment | Shared permanently | Read-only code never needs copying |
| Stack | COW copy | Each process gets own copy when modified |
| Heap | COW copy | malloc'd data is private after write |
| Memory Maps (PRIVATE) | COW copy | MAP_PRIVATE regions are copied |
| Memory Maps (SHARED) | Truly shared | MAP_SHARED regions stay shared |
```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

/**
 * Demonstrates separate address spaces after fork()
 *
 * Key insight: Same virtual addresses, different physical memory
 */

int global_var = 100;

int main() {
    int stack_var = 200;
    int *heap_var = malloc(sizeof(int));
    *heap_var = 300;

    printf("=== Before fork() ===\n");
    printf("global_var: value=%d, address=%p\n", global_var, (void*)&global_var);
    printf("stack_var:  value=%d, address=%p\n", stack_var, (void*)&stack_var);
    printf("heap_var:   value=%d, address=%p\n", *heap_var, (void*)heap_var);

    // Flush buffered output so the child does not inherit a copy of it
    fflush(stdout);

    pid_t pid = fork();

    if (pid == 0) {
        // Child modifies all variables
        global_var = 111;
        stack_var = 222;
        *heap_var = 333;

        printf("\n=== Child after modification ===\n");
        printf("global_var: value=%d, address=%p\n", global_var, (void*)&global_var);
        printf("stack_var:  value=%d, address=%p\n", stack_var, (void*)&stack_var);
        printf("heap_var:   value=%d, address=%p\n", *heap_var, (void*)heap_var);

        // Notice: SAME ADDRESSES as parent, DIFFERENT VALUES
        // This proves separate address spaces (different page mappings)

        // _exit() does not flush stdio buffers, so flush explicitly
        fflush(stdout);
        _exit(0);
    }

    // Parent waits, then shows its unchanged values
    wait(NULL);

    printf("\n=== Parent after child exits ===\n");
    printf("global_var: value=%d, address=%p\n", global_var, (void*)&global_var);
    printf("stack_var:  value=%d, address=%p\n", stack_var, (void*)&stack_var);
    printf("heap_var:   value=%d, address=%p\n", *heap_var, (void*)heap_var);

    printf("\n=== Conclusion ===\n");
    printf("Child's modifications did NOT affect parent!\n");
    printf("Same virtual addresses, separate physical memory.\n");

    free(heap_var);
    return 0;
}

/*
 * Output:
 * === Before fork() ===
 * global_var: value=100, address=0x404028
 * stack_var:  value=200, address=0x7ffd12345678
 * heap_var:   value=300, address=0x1234567890ab
 *
 * === Child after modification ===
 * global_var: value=111, address=0x404028        <- Same address!
 * stack_var:  value=222, address=0x7ffd12345678  <- Same address!
 * heap_var:   value=333, address=0x1234567890ab  <- Same address!
 *
 * === Parent after child exits ===
 * global_var: value=100, address=0x404028        <- Original value
 * stack_var:  value=200, address=0x7ffd12345678  <- Original value
 * heap_var:   value=300, address=0x1234567890ab  <- Original value
 */
```

The demonstration above shows the power of virtual memory: parent and child see the SAME virtual addresses, but the MMU transparently directs them to DIFFERENT physical memory (after COW copies). Neither process needs to know anything about physical addresses or page tables; the hardware and kernel handle the translation invisibly.
The opposite extreme of fork() is creating entities that share the same address space. These are traditionally called threads. In Linux, threads are created using clone() with the CLONE_VM flag.
With CLONE_VM:
| Resource | Shared/Private | Implications |
|---|---|---|
| Code (Text) | Shared | All threads execute same program |
| Global Variables | Shared | Requires synchronization |
| Heap | Shared | malloc'd memory accessible by all; needs sync |
| Stack | Private | Each thread has own stack for local vars |
| Registers | Private | Each thread has own register set |
| Instruction Pointer | Private | Threads at different code locations |
| Thread-Local Storage | Private | __thread variables are per-thread |
| File Descriptors | Shared* | *Only when CLONE_FILES is set (pthread_create() sets it) |
```c
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>

/**
 * Demonstrates shared address space in threads
 *
 * All threads see the same memory - no isolation!
 */

int global_counter = 0;

// Thread-local storage - each thread has its own copy
__thread int thread_local_var = 0;

void* thread_function(void* arg) {
    int thread_id = *(int*)arg;

    // Modify shared global - visible to all threads
    global_counter += 100;

    // Modify thread-local - only this thread sees it
    thread_local_var = thread_id * 10;

    printf("Thread %d:\n", thread_id);
    printf("  global_counter = %d (address: %p)\n",
           global_counter, (void*)&global_counter);
    printf("  thread_local_var = %d (address: %p)\n",
           thread_local_var, (void*)&thread_local_var);

    // Small delay to interleave output
    usleep(100000);

    return NULL;
}

int main() {
    pthread_t threads[3];
    int thread_ids[3] = {1, 2, 3};

    printf("=== Before creating threads ===\n");
    printf("global_counter = %d\n\n", global_counter);

    // Create threads
    for (int i = 0; i < 3; i++) {
        pthread_create(&threads[i], NULL, thread_function, &thread_ids[i]);
    }

    // Wait for all threads
    for (int i = 0; i < 3; i++) {
        pthread_join(threads[i], NULL);
    }

    printf("\n=== After all threads complete ===\n");
    printf("global_counter = %d\n", global_counter);
    printf("(Each thread added 100, so 3 threads = 300)\n");

    printf("\n=== Key Observations ===\n");
    printf("1. global_counter has SAME address in all threads\n");
    printf("2. thread_local_var has DIFFERENT addresses per thread\n");
    printf("3. Modifications to global are immediately visible to all\n");

    return 0;
}
```

The code above has a data race: multiple threads modify global_counter without synchronization. While it works by luck in this demo, in real code this causes undefined behavior. Always use mutexes, atomics, or other synchronization when multiple threads access shared mutable data.
Under the hood, pthread_create() uses clone() with sharing flags:
```c
// Typical flags for pthread-like thread creation
clone(thread_function,
      stack_top,
      CLONE_VM |        // Share address space
      CLONE_FS |        // Share filesystem info
      CLONE_FILES |     // Share file descriptors
      CLONE_SIGHAND |   // Share signal handlers
      CLONE_THREAD |    // Same thread group
      CLONE_SYSVSEM,    // Share System V semaphores
      arg);
```
CLONE_VM without CLONE_THREAD:
Interestingly, you can use CLONE_VM without CLONE_THREAD to create entities that share memory but have different PIDs and signal handling. This is unusual but possible for specialized applications.
A very common pattern is fork() immediately followed by exec(). In this case, the elaborate COW setup of fork() is wasteful—the address space is about to be replaced anyway. The vfork() system call optimizes this case.
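vfork() creates a child that runs in the parent's address space, with no page tables copied, while the parent stays suspended until the child calls exec() or _exit().
Critical Constraints: the child must not modify memory, must not return from the function that called vfork(), and must call nothing other than an exec() function or _exit().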
vfork() is notoriously easy to misuse. If the child modifies memory before exec(), the parent sees corrupted state when it resumes. If the child returns from the calling function, both parent and child will try to use the same stack frame. Modern systems have made fork() fast enough that vfork() is rarely needed.
```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

/**
 * Demonstrates vfork() - USE WITH EXTREME CAUTION
 *
 * This example shows CORRECT usage: vfork + immediate exec
 * The WRONG_USAGE section shows what NOT to do
 */

int global = 42;

void demonstrate_correct_vfork() {
    printf("=== Correct vfork() Usage ===\n");
    printf("Parent: Before vfork(), global = %d\n", global);

    pid_t pid = vfork();

    if (pid == 0) {
        // Child: ONLY call exec or _exit
        // Do NOT modify any variables
        // Do NOT call other functions
        // Do NOT return from this block

        // Correct: immediately exec
        execlp("echo", "echo", "Child exec'd successfully", (char *)NULL);

        // If exec fails, ONLY _exit (not exit!)
        _exit(127);
    }

    // Parent resumes only after child calls exec() or _exit()
    printf("Parent: Resumed after child exec'd\n");
    wait(NULL);
    printf("Parent: global still = %d\n", global);
}

/*
 * WRONG_USAGE - Never do this!
 * This function is deliberately never called: running it would cause
 * undefined behavior
 */
void WRONG_do_not_do_this() {
    pid_t pid = vfork();

    if (pid == 0) {
        // WRONG: modifying shared memory
        global = 999;  // Parent will see this corruption!

        // WRONG: calling functions other than exec/_exit
        printf("Child running\n");  // May corrupt parent's state

        // WRONG: using exit() instead of _exit()
        exit(0);  // exit() calls atexit handlers in parent's state!

        // WRONG: returning from function
        return;  // Both try to use same stack frame!
    }
}

int main() {
    demonstrate_correct_vfork();

    printf("\n=== vfork() Summary ===\n");
    printf("vfork() is faster than fork() for immediate exec()\n");
    printf("BUT: It is dangerous and easy to misuse\n");
    printf("Modern fork() with COW is usually fast enough\n");
    printf("Consider posix_spawn() instead\n");

    return 0;
}
```

In early Unix systems (before COW), fork() had to physically copy all memory pages, which was extremely expensive for large processes. vfork() was created as an optimization for the common fork-then-exec pattern.
Timeline:
Modern Status:
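In memory-constrained systems (embedded Linux, small containers), a large parent process may not have enough memory for even the COW overhead of fork(), which still has to duplicate the page tables. vfork() duplicates neither pages nor page tables, so its memory cost is close to zero. Android's Zygote addresses fast app spawning differently: it calls fork() on a preloaded process so that new apps share its already-initialized pages via COW.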
posix_spawn() provides a modern, safe API for the common fork-exec pattern. It atomically creates a new process and executes a program, avoiding the complexity and risks of fork()+exec() or vfork().
```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   // strerror()
#include <spawn.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <unistd.h>

extern char **environ;

/**
 * Demonstrates posix_spawn() - the modern process creation API
 *
 * This is the recommended way to spawn new programs
 */

int run_with_posix_spawn(const char *program, char *const argv[]) {
    pid_t pid;
    int status;

    // Simple spawn - no special file actions or attributes
    int ret = posix_spawnp(&pid, program, NULL, NULL, argv, environ);

    if (ret != 0) {
        fprintf(stderr, "posix_spawn failed: %s\n", strerror(ret));
        return -1;
    }

    printf("Spawned process with PID %d\n", pid);

    // Wait for completion
    waitpid(pid, &status, 0);

    if (WIFEXITED(status)) {
        return WEXITSTATUS(status);
    }
    return -1;
}

int run_with_redirection(const char *program, char *const argv[],
                         const char *output_file) {
    pid_t pid;
    int status;

    // Initialize file actions
    posix_spawn_file_actions_t actions;
    posix_spawn_file_actions_init(&actions);

    // Redirect stdout to file
    posix_spawn_file_actions_addopen(&actions, STDOUT_FILENO, output_file,
                                     O_WRONLY | O_CREAT | O_TRUNC, 0644);

    // Could also do:
    // posix_spawn_file_actions_addclose(&actions, 3);     // Close fd 3
    // posix_spawn_file_actions_adddup2(&actions, 4, 1);   // dup2(4, 1)

    int ret = posix_spawnp(&pid, program, &actions, NULL, argv, environ);

    // Cleanup file actions (even if spawn failed)
    posix_spawn_file_actions_destroy(&actions);

    if (ret != 0) {
        fprintf(stderr, "posix_spawn failed: %s\n", strerror(ret));
        return -1;
    }

    printf("Spawned process PID %d with stdout -> %s\n", pid, output_file);

    waitpid(pid, &status, 0);

    if (WIFEXITED(status)) {
        return WEXITSTATUS(status);
    }
    return -1;
}

int main() {
    printf("=== posix_spawn() Examples ===\n\n");

    // Example 1: Simple command
    printf("Example 1: Simple spawn\n");
    char *args1[] = {"echo", "Hello from posix_spawn!", NULL};
    run_with_posix_spawn("echo", args1);

    printf("\n");

    // Example 2: With output redirection
    printf("Example 2: Spawn with I/O redirection\n");
    char *args2[] = {"ls", "-la", NULL};
    run_with_redirection("ls", args2, "ls_output.txt");

    // Show redirected output
    printf("\nContents of ls_output.txt:\n");
    char *args3[] = {"head", "-5", "ls_output.txt", NULL};
    run_with_posix_spawn("head", args3);

    printf("\n=== Advantages of posix_spawn() ===\n");
    printf("1. Single atomic call (no fork + exec race)\n");
    printf("2. Explicit file actions (clear redirection)\n");
    printf("3. Can use vfork internally (efficient)\n");
    printf("4. POSIX portable\n");

    return 0;
}
```

The posix_spawn_file_actions_t structure allows specifying file descriptor manipulations:
| Function | Effect |
|---|---|
| addclose(actions, fd) | Close fd in child |
| adddup2(actions, oldfd, newfd) | dup2(oldfd, newfd) in child |
| addopen(actions, fd, path, flags, mode) | Open file at specified fd |
| addchdir_np(actions, path) | Change directory (non-POSIX extension) |
| addfchdir_np(actions, fd) | fchdir to fd (non-POSIX extension) |
posix_spawnattr_t controls process attributes:
| Function | Controls |
|---|---|
| setflags() | Flags like POSIX_SPAWN_RESETIDS, POSIX_SPAWN_SETSIGDEF |
| setpgroup() | Set process group of child |
| setsigmask() | Set signal mask of child |
| setsigdefault() | Reset certain signals to default |
| setschedparam() | Set scheduling parameters |
| setschedpolicy() | Set scheduling policy |
Let's compare all address space options to help you choose the right mechanism for your needs.
| Option | Address Space | Creation Cost | Isolation | Best Use Case |
|---|---|---|---|---|
| fork() | Separate (COW) | Medium (page tables + COW setup) | Complete | General process creation |
| clone(CLONE_VM) | Shared | Low (no memory copying) | None | Threads, parallel computation |
| vfork() | Parent's (temporarily) | Very Low | None (dangerous!) | Immediate exec only (legacy) |
| posix_spawn() | Separate (new) | Low-Medium | Complete | Running external programs |
| pthread_create() | Shared | Low | None | In-process parallelism |
```c
/**
 * Decision guide for choosing address space options
 */
#include <pthread.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NUM_THREADS 4            // Illustrative value

extern char **environ;
void *worker(void *arg);         // Placeholder: supplied by the application

// SCENARIO 1: Running an external program
// Use: posix_spawn() or fork() + exec()
void run_external_command(const char *cmd) {
    pid_t pid;
    char *args[] = {(char*)cmd, NULL};
    posix_spawnp(&pid, cmd, NULL, NULL, args, environ);
    waitpid(pid, NULL, 0);
}

// SCENARIO 2: Parallel computation on shared data
// Use: pthread_create() (shared address space)
void process_data_in_parallel(int *data, size_t n) {
    pthread_t threads[NUM_THREADS];
    for (int i = 0; i < NUM_THREADS; i++) {
        // Each worker receives a pointer to the start of its slice
        pthread_create(&threads[i], NULL, worker, &data[i * n/NUM_THREADS]);
    }
    for (int i = 0; i < NUM_THREADS; i++) {
        pthread_join(threads[i], NULL);
    }
}

// SCENARIO 3: Subprocesses that must be isolated
// Use: fork() (separate address space with COW)
void run_isolated_task(void (*task)(void)) {
    pid_t pid = fork();
    if (pid == 0) {
        task();  // Even if this crashes, parent is unaffected
        _exit(0);
    }
    waitpid(pid, NULL, 0);
}

// SCENARIO 4: Memory-constrained immediate exec
// Use: vfork() (only if you really know what you're doing)
// Better: posix_spawn() which may use vfork internally
void spawn_minimal_memory(const char *cmd) {
    pid_t pid = vfork();
    if (pid == 0) {
        execlp(cmd, cmd, (char *)NULL);
        _exit(127);
    }
    waitpid(pid, NULL, 0);
}
```

Address space configuration is one of the most fundamental decisions in process creation: it defines the boundary between processes and threads and affects performance, isolation, and the programming model.
Module Complete:
You have now completed Module 1: Process Creation. You understand the complete mechanics of how operating systems create new processes: parent-child relationships, process tree structure, resource sharing options, execution patterns, and address space configurations.
This knowledge is foundational for understanding process termination (Module 2), the fork() system call (Module 3), the exec() family (Module 4), and the management of zombie and orphan processes (Modules 5-6).
Congratulations! You now have a comprehensive understanding of process creation mechanisms. From parent-child relationships to address space options, you've mastered the foundational concepts that enable all process management in modern operating systems. These concepts apply across Unix, Linux, and (with variations) Windows and embedded systems.