Operating Systems: System Calls & API

System Call Types

Level: Intermediate

Duration: 90 mins

Topic: System Calls & API

Process Control System Calls

The Imperative of Process Control

Every running program—every web browser, database server, compiler, and shell command—exists as a process under the jurisdiction of the operating system. Yet programs are not born spontaneously; they must be created, managed, synchronized, and eventually terminated through explicit requests to the kernel. These requests constitute process control system calls: the programmatic interface through which user-space applications orchestrate the lifecycle of processes.

Process control represents perhaps the most foundational category of system calls. Without the ability to create new processes, no multi-tasking would exist. Without termination mechanisms, resources would leak indefinitely. Without synchronization primitives, parent-child relationships would devolve into chaos. Understanding process control system calls is therefore prerequisite to understanding everything that follows in operating systems—scheduling, inter-process communication, security, and resource management all depend on the semantics we explore here.

What You Will Learn

By the end of this page, you will understand the complete spectrum of process control system calls: how processes are created through fork() and its variants, how program images are replaced via exec(), how processes terminate normally or abnormally, how parents wait for and collect child exit status, and how the operating system maintains process lifecycle integrity. You'll grasp not merely the syntax but the profound design decisions underlying these interfaces.

Process Creation — The fork() System Call

The UNIX process model, which forms the foundation for Linux, macOS, and many other operating systems, centers on a remarkably elegant primitive: fork(). This system call creates a new process by duplicating the calling process, producing a parent-child relationship that forms the basis of the entire UNIX process hierarchy.

The Semantics of fork()

When a process invokes fork(), the kernel performs the following operations:

Allocate new process control block (PCB): A new entry is created in the process table with a unique Process ID (PID)
Copy parent's address space: The child receives a complete copy of the parent's memory—code segment, data segment, heap, and stack
Copy file descriptor table: The child inherits references to all open files, with shared file position offsets
Copy signal dispositions: Signal handlers and masks are duplicated
Initialize child-specific state: The child's PID differs from the parent, parent PID is set appropriately, and resource utilization counters reset
Return control to both processes: Both parent and child resume execution after the fork() call

The brilliance lies in the return value semantics: fork() returns twice—once in the parent (returning the child's PID) and once in the child (returning 0). This asymmetric return enables the two processes to diverge behaviorally despite having identical code.

fork_example.c
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
 
int main() {
    pid_t pid;
    int shared_var = 42;
    
    printf("Before fork: shared_var = %d, process = %d\n", 
           shared_var, getpid());
    
    pid = fork();
    
    if (pid < 0) {
        // Fork failed - typically due to resource exhaustion
        perror("fork failed");
        return 1;
    } 
    else if (pid == 0) {
        // Child process executes this branch
        // Child has its own copy of shared_var
        shared_var = 100;
        printf("Child: shared_var = %d, my PID = %d, parent PID = %d\n",
               shared_var, getpid(), getppid());
    } 
    else {
        // Parent process executes this branch
        // pid contains the child's PID
        shared_var = 200;
        printf("Parent: shared_var = %d, my PID = %d, child PID = %d\n",
               shared_var, getpid(), pid);
    }
    
    // Both processes execute this
    printf("Process %d ending with shared_var = %d\n", 
           getpid(), shared_var);
    
    return 0;
}
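The example above shows that memory is private after fork(). Open files behave differently: as the kernel steps listed earlier note, the child's descriptors refer to the same open files and share the file offset. The following is a minimal sketch of that effect; the function name and the scratch-file template are illustrative assumptions, not part of any standard API.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the parent's file offset after the CHILD writes `len` bytes
 * through the inherited descriptor, or -1 on error.
 * The scratch-file path template is an illustrative assumption. */
off_t offset_after_child_write(const char *text, size_t len) {
    char path[] = "/tmp/fork_offset_XXXXXX";
    int fd = mkstemp(path);             /* scratch file, opened read/write */
    if (fd == -1) return -1;
    unlink(path);                       /* file vanishes once fd is closed */

    pid_t pid = fork();
    if (pid == -1) { close(fd); return -1; }
    if (pid == 0) {
        /* Child: write through its copy of the descriptor. */
        ssize_t n = write(fd, text, len);
        _exit(n == (ssize_t)len ? 0 : 1);
    }

    int wstatus;
    if (waitpid(pid, &wstatus, 0) == -1 ||
        !WIFEXITED(wstatus) || WEXITSTATUS(wstatus) != 0) {
        close(fd);
        return -1;
    }

    /* The parent never wrote anything, yet its offset has advanced. */
    off_t pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return pos;
}
```

Calling offset_after_child_write("hello", 5) should report an offset of 5 in the parent, which is why a parent and child appending to the same inherited log file do not overwrite each other's output.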

Copy-on-Write Optimization

Modern operating systems do not literally copy the entire address space during fork(). Instead, they employ copy-on-write (COW)—pages are initially shared between parent and child as read-only. Only when either process attempts to modify a page is it physically copied. This optimization makes fork() efficient even for processes with large memory footprints, as the copy is deferred and often avoided entirely when followed by exec().

Why Fork + Exec Rather Than Direct Spawn?

Developers new to UNIX often wonder why process creation requires two separate system calls—fork() to duplicate and exec() to replace—rather than a single call to spawn a new program (as Windows' CreateProcess does). The answer reveals deep philosophical commitments:

Separation of Concerns: Fork handles process creation; exec handles program loading. Each can be used independently.
Pre-Execution Configuration: Between fork and exec, the child can modify its own environment—redirect file descriptors, change working directory, modify signal handling, adjust resource limits—without affecting the parent.
Shell Implementation: The fork-exec model elegantly supports shell I/O redirection. When you type ls > output.txt, the shell forks, the child redirects stdout to the file, then execs ls. The redirection happens naturally.
Process Relationships: The fork model inherently creates a process tree with clear parent-child relationships, enabling job control, process groups, and session management.
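The shell redirection case described above can be sketched in a few lines. This is a simplified illustration of what a shell might do for a command such as ls > output.txt, not the code of any real shell; the function name run_with_stdout_to and the exit codes 126 and 127 for setup and exec failures are assumptions borrowed from shell convention.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] (looked up via PATH) with stdout redirected to `outfile`.
 * Returns the child's exit status, or -1 if it could not be started or
 * was killed by a signal. The function name is illustrative. */
int run_with_stdout_to(const char *outfile, char *const argv[]) {
    pid_t pid = fork();
    if (pid == -1) return -1;

    if (pid == 0) {
        /* Child: rearrange descriptors, then replace the program image. */
        int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd == -1) _exit(126);
        if (fd != STDOUT_FILENO) {
            if (dup2(fd, STDOUT_FILENO) == -1) _exit(126);
            close(fd);                  /* descriptor 1 now refers to the file */
        }
        execvp(argv[0], argv);
        _exit(127);                     /* reached only if exec failed */
    }

    /* Parent: its own stdout was never touched. */
    int wstatus;
    if (waitpid(pid, &wstatus, 0) == -1) return -1;
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

The redirection code runs only in the child, between fork() and exec(), so the program being launched needs no knowledge of it and the parent's descriptors stay as they were.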

fork() Return Values and Their Meanings
Return Value	Context	Meaning	Typical Action
Positive integer	Parent process	Child's PID	Store for later wait() or signaling
0	Child process	Successful fork	Execute child-specific logic or exec()
-1	Calling process	Fork failed	Check errno: EAGAIN (resource limit), ENOMEM (memory)

fork() Variants — vfork() and clone()

While fork() is the canonical process creation primitive, performance considerations and the need for finer control have driven the development of specialized variants.

vfork() — Optimized for Immediate exec()

Historically, before copy-on-write became ubiquitous, fork() was expensive because it literally copied the entire address space. The vfork() system call was introduced as an optimization for the common case where fork() is immediately followed by exec().

vfork() creates a child process that shares the parent's address space rather than copying it. The parent is suspended until the child either calls exec() (replacing the shared address space) or _exit(). This sharing is dangerous—any modification the child makes to variables affects the parent—but extremely efficient when used correctly.

vfork() Hazards

After vfork(), the child must do nothing except call one of the exec() functions or _exit(). It must not modify any variables (other than the pid_t that holds vfork()'s return value), call other functions, or return from the function that called vfork(). Violating these constraints causes undefined behavior because the child runs in the parent's memory, on the parent's stack. Modern systems with efficient COW make vfork() largely unnecessary: POSIX.1-2001 marked it obsolescent and POSIX.1-2008 removed it, though Linux and the BSDs still provide it.

clone() — The Swiss Army Knife of Process Creation

Linux's clone() system call represents the generalization of fork(). Rather than using fixed semantics, clone() accepts flags specifying exactly which resources to share between parent and child:

CLONE_VM: Share address space (enables threads)
CLONE_FS: Share filesystem information (root, cwd, umask)
CLONE_FILES: Share file descriptor table
CLONE_SIGHAND: Share signal handlers
CLONE_THREAD: Place child in same thread group
CLONE_PARENT: Child's parent is caller's parent
CLONE_NEWPID: Create new PID namespace (containers)
CLONE_NEWNET: Create new network namespace

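This flexibility lets clone() implement both a traditional fork (no sharing flags, so the child receives copy-on-write copies) and POSIX threads (shared address space, file descriptors, and signal handlers, with a separate stack for each thread), as well as modern container isolation through namespace flags.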

clone_example.c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>
 
#define STACK_SIZE (1024 * 1024)  // 1MB stack for child
 
int shared_counter = 0;
 
static int child_function(void *arg) {
    // This function executes in the child
    const char *name = (const char *)arg;
    
    printf("Child: name = %s, PID = %d\n", name, getpid());
    shared_counter = 42;  // If CLONE_VM, parent sees this!
    printf("Child set shared_counter to %d\n", shared_counter);
    
    return 0;  // Child exit status
}
 
int main() {
    char *child_stack;
    char *child_stack_top;
    pid_t child_pid;
    
    // Allocate stack for child (stacks grow downward on most architectures)
    child_stack = malloc(STACK_SIZE);
    if (!child_stack) {
        perror("malloc");
        return 1;
    }
    child_stack_top = child_stack + STACK_SIZE;
    
    printf("Parent: shared_counter initially = %d\n", shared_counter);
    
    // Clone with shared virtual memory (like a thread)
    // CLONE_VM: share address space
    // SIGCHLD: send SIGCHLD on termination
    child_pid = clone(child_function, child_stack_top,
                      CLONE_VM | SIGCHLD, "CloneChild");
    
    if (child_pid == -1) {
        perror("clone");
        free(child_stack);
        return 1;
    }
    
    // Wait for child to complete
    waitpid(child_pid, NULL, 0);
    
    // Because of CLONE_VM, we see the child's modification!
    printf("Parent after wait: shared_counter = %d\n", shared_counter);
    
    free(child_stack);
    return 0;
}

Comparing Process Creation Mechanisms

Understanding when to use each mechanism is crucial for systems programming:

Process Creation System Calls Comparison
System Call	Address Space	Use Case	Performance	Portability
fork()	Copied (COW)	General process creation	Good (COW optimized)	POSIX standard
vfork()	Shared temporarily	fork() immediately before exec()	Best for exec pattern	Removed from POSIX.1-2008; still on Linux/BSD
clone()	Configurable via flags	Threads, containers, custom sharing	Depends on flags	Linux-specific
posix_spawn()	N/A (new process)	Portable process+exec combination	Implementation-dependent	POSIX standard
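The table lists posix_spawn() without showing it in use. The sketch below uses posix_spawnp(), the PATH-searching variant, with no file actions or attributes; the wrapper name spawn_and_wait is an illustrative assumption.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;  /* the caller's environment, passed through unchanged */

/* Spawn argv[0] (searched for in PATH) and wait for it to finish.
 * Returns the exit status, or -1 on spawn failure or death by signal.
 * The function name is illustrative. */
int spawn_and_wait(char *const argv[]) {
    pid_t pid;

    /* NULL file actions and NULL attributes: the child inherits open
     * descriptors and signal settings much as after fork() + exec(). */
    int err = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (err != 0) return -1;    /* err is an errno-style code; errno is not set */

    int wstatus;
    if (waitpid(pid, &wstatus, 0) == -1) return -1;
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

The pre-execution configuration that fork-exec performs with ordinary code is expressed here through the posix_spawn_file_actions_t and posix_spawnattr_t arguments, which is less flexible but lets the implementation avoid duplicating the caller.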

Program Execution — The exec() Family

While fork() creates processes, the exec() family of system calls gives them new purpose. An exec() call replaces the current process image with a new program—same PID, same file descriptors (unless marked close-on-exec), same parent, but entirely different code, data, and stack.

The exec() family is not a single function but a collection of six variants, each providing different conveniences for specifying the program and its arguments:

Naming Convention Decoded

The suffix characters indicate behavior:

l (list): Arguments passed as variadic list (arg0, arg1, ..., NULL)
v (vector): Arguments passed as NULL-terminated array
e (environment): Explicit environment array provided
p (path): Search PATH for executable if no slash in filename

The exec() Family of Functions
Function	Arguments	PATH Search	Environment
execl()	Variadic list	No	Inherited
execv()	Array	No	Inherited
execle()	Variadic list	No	Explicit
execve()	Array	No	Explicit
execlp()	Variadic list	Yes	Inherited
execvp()	Array	Yes	Inherited

execve() — The True System Call

On Linux, execve() is the underlying system call (a newer execveat() variant also exists); the other functions are library wrappers that ultimately invoke execve(). The kernel interface requires the array form with explicit environment, so all convenience variants are translated to this canonical form.

exec_examples.c
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
 
void demonstrate_exec_variants() {
    pid_t pid;
    
    // ========================================
    // Example 1: execl() - list form, full path
    // ========================================
    pid = fork();
    if (pid == 0) {
        // Child: execute /bin/ls with arguments
        // First argument (argv[0]) is conventionally the program name
        execl("/bin/ls", "ls", "-l", "-a", "/tmp", NULL);
        
        // If we reach here, exec failed!
        perror("execl failed");
        _exit(127);
    }
    waitpid(pid, NULL, 0);
    
    // ========================================
    // Example 2: execlp() - list form, PATH search
    // ========================================
    pid = fork();
    if (pid == 0) {
        // Don't need full path - searches PATH
        execlp("grep", "grep", "-r", "pattern", ".", NULL);
        perror("execlp failed");
        _exit(127);
    }
    waitpid(pid, NULL, 0);
    
    // ========================================
    // Example 3: execv() - array form
    // ========================================
    pid = fork();
    if (pid == 0) {
        char *args[] = {"sort", "-n", "-r", "numbers.txt", NULL};
        execv("/usr/bin/sort", args);
        perror("execv failed");
        _exit(127);
    }
    waitpid(pid, NULL, 0);
    
    // ========================================
    // Example 4: execve() - explicit environment
    // ========================================
    pid = fork();
    if (pid == 0) {
        char *args[] = {"env", NULL};
        char *envp[] = {
            "PATH=/usr/bin:/bin",
            "HOME=/home/user",
            "CUSTOM_VAR=hello",
            NULL
        };
        // Child will have ONLY these environment variables
        execve("/usr/bin/env", args, envp);
        perror("execve failed");
        _exit(127);
    }
    waitpid(pid, NULL, 0);
}
 
int main() {
    demonstrate_exec_variants();
    return 0;
}

What exec() Preserves and What It Replaces

Understanding exactly what survives an exec() call is essential for correct program design:

Preserved across exec():

Process ID (PID) and parent process ID (PPID)
Real user ID and real group ID
Session ID and controlling terminal
Current working directory and root directory
File mode creation mask (umask)
File locks (those belonging to the process)
Process signal mask
Pending signals
Resource limits
CPU time counters
File descriptors (unless FD_CLOEXEC flag is set)

Replaced by exec():

Process image (code, data, BSS segments)
Stack and heap
Memory mappings
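Signal dispositions (caught signals are reset to default; ignored signals stay ignored)
Thread state (all threads except caller are terminated)
POSIX timers created with timer_create() (a pending alarm() is preserved)
Memory locks
Effective user and group IDs, if the new program file is set-user-ID or set-group-ID (on Linux, capabilities are also recomputed)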

Close-on-Exec Flag

Security best practice dictates that file descriptors should be marked FD_CLOEXEC (close-on-exec) unless the child specifically needs them. This prevents sensitive file handles from leaking to executed programs. Modern system calls like open(..., O_CLOEXEC), pipe2(..., O_CLOEXEC), and accept4(..., SOCK_CLOEXEC) enable setting this atomically.

Process Termination — exit(), abort(), and Signals

Process termination is the inevitable conclusion of every process's lifecycle. The operating system must reclaim resources, notify interested parties, and maintain the integrity of process relationships. Multiple mechanisms exist for termination, each with distinct semantics and use cases.

Normal Termination — exit() and _exit()

Normal termination occurs when a process voluntarily chooses to end its execution. Two primary functions serve this purpose:

_exit(status): The raw system call, which terminates the process immediately:

Closes all file descriptors
Releases memory mappings
Sends SIGCHLD to parent
Process becomes a zombie until the parent calls wait()

exit(status): The C library function, which performs user-space cleanup and then calls _exit():

Calls functions registered with atexit() and on_exit(), in reverse order of registration (on_exit() is a nonstandard glibc extension)
Flushes and closes all open stdio streams
Then calls _exit()

termination_examples.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
 
void cleanup_function(void) {
    printf("Cleanup: atexit handler called\n");
    // Flush logs, close database connections, release locks, etc.
}
 
void on_exit_handler(int status, void *arg) {
    printf("on_exit: status=%d, arg=%s\n", status, (char *)arg);
}
 
void demonstrate_exit_difference() {
    // Register cleanup handlers
    atexit(cleanup_function);
    on_exit(on_exit_handler, "context data");
    
    printf("About to exit...\n");
    
    // Uncommenting exit() will call cleanup handlers
    // exit(0);
    
    // Uncommenting _exit() will NOT call cleanup handlers,
    // and any text still sitting in the stdout buffer is lost
    // (on a terminal stdout is line-buffered, so complete lines survive)
    // _exit(0);
}
 
// What happens with return from main()?
int main() {
    atexit(cleanup_function);
    
    printf("Main function executing...\n");
    
    // Returning from main() is equivalent to exit(return_value)
    // All cleanup handlers will be called
    return 0;
    
    // Note: calling exit(0) here would be redundant but equivalent
}

Abnormal Termination — abort() and Fatal Signals

Abnormal termination occurs when a process terminates unexpectedly, either by its own volition or due to external factors:

abort()

Raises SIGABRT signal
If signal handler returns, process is still terminated
Typically produces core dump for debugging
Used by assertion failures (assert macro)
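Blocking or ignoring SIGABRT does not stop it: abort() unblocks the signal and, if necessary, restores the default disposition before raising it again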

Fatal Signals

Certain signals cause termination when their default disposition is not overridden:

SIGTERM: Polite termination request (default for kill command)
SIGKILL: Unconditional termination (cannot be caught or ignored)
SIGSEGV: Segmentation fault (invalid memory access)
SIGBUS: Bus error (alignment or nonexistent memory)
SIGFPE: Arithmetic error (despite the name, this includes integer division by zero)
SIGILL: Illegal instruction

SIGKILL Cannot Be Caught

The SIGKILL signal (signal number 9) is handled entirely by the kernel and cannot be caught, blocked, or ignored by the process. This ensures the system always has a mechanism to terminate any process, regardless of bugs or malicious behavior. However, this also means SIGKILL prevents cleanup—prefer SIGTERM when possible.

Termination Mechanisms Compared
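Mechanism	User-Space Cleanup	Status Seen by Parent	Core Dump?	Parent Notification
return from main()	Full (exit handlers, stdio flush)	WIFEXITED, return value & 0xFF	No	SIGCHLD
exit(status)	Full (exit handlers, stdio flush)	WIFEXITED, status & 0xFF	No	SIGCHLD
_exit(status)	None (kernel cleanup only)	WIFEXITED, status & 0xFF	No	SIGCHLD
abort()	None	WIFSIGNALED, WTERMSIG = SIGABRT	Yes (typically)	SIGCHLD
SIGTERM default	None	WIFSIGNALED, WTERMSIG = SIGTERM	No	SIGCHLD
SIGKILL	None	WIFSIGNALED, WTERMSIG = SIGKILL	No	SIGCHLD
Segfault (SIGSEGV)	None	WIFSIGNALED, WTERMSIG = SIGSEGV	Yes (typically)	SIGCHLD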

Exit Status Conventions

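The exit status is an 8-bit value (0-255) that communicates information to the parent process. The kernel only transports the number; the meanings below are conventions, several of them specific to shells such as Bash:

0: Conventional success
1: General errors
2: Misuse of shell command (per Bash convention)
126: Command found but not executable
127: Command not found
128+N: Reported by the shell in $? when a command was terminated by signal N (the wait status itself records the signal separately from any exit code)
130: Control-C (128 + SIGINT(2))
137: SIGKILL (128 + 9)
143: SIGTERM (128 + 15)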

The parent retrieves this status via wait() family calls, enabling sophisticated inter-process coordination and error propagation.
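As a rough illustration of how these numbers arise, the sketch below maps a raw wait status to the value a POSIX-style shell would typically report. Both function names are illustrative assumptions; only the W* macros come from the standard interface.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map a wait status to the number a POSIX-style shell would show in $?. */
int shell_style_status(int wstatus) {
    if (WIFEXITED(wstatus))   return WEXITSTATUS(wstatus);     /* 0..255 */
    if (WIFSIGNALED(wstatus)) return 128 + WTERMSIG(wstatus);  /* shell convention */
    return -1;                                                 /* stopped or continued */
}

/* Fork a child that either exits with `code` (sig == 0) or kills itself
 * with signal `sig`, then return its shell-style status. */
int child_outcome(int code, int sig) {
    pid_t pid = fork();
    if (pid == -1) return -1;
    if (pid == 0) {
        if (sig != 0) kill(getpid(), sig);
        _exit(code);
    }
    int wstatus;
    if (waitpid(pid, &wstatus, 0) == -1) return -1;
    return shell_style_status(wstatus);
}
```

Because only the low 8 bits of the exit code survive, child_outcome(300, 0) yields 44 rather than 300, and a child killed by SIGKILL is reported as 128 plus the signal number.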

Waiting for Children — wait(), waitpid(), and waitid()

When a process terminates, it enters the zombie state—its resources are freed, but its entry in the process table persists to hold the exit status until the parent retrieves it. The wait() family of system calls serves this essential purpose: collecting child exit status and reaping zombie processes.

wait() — Simple Child Reaping

The basic wait() call blocks until any child process terminates:

pid_t wait(int *wstatus);

Returns the PID of the terminated child
Stores termination status in wstatus (if non-NULL)
Returns -1 if no children exist (errno = ECHILD)
Cannot specify which child to wait for

waitpid() — Targeted and Flexible

For finer control, waitpid() allows specifying which child(ren) to wait for:

pid_t waitpid(pid_t pid, int *wstatus, int options);

waitpid() pid Argument Interpretations
pid Value	Which Children to Wait For
< -1	Any child in process group |pid|
-1	Any child (equivalent to wait())
0	Any child in same process group as caller
> 0	Specific child with PID = pid
wait_examples.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
#include <errno.h>
 
void decode_wait_status(int wstatus) {
    if (WIFEXITED(wstatus)) {
        // Child exited normally via exit() or return from main
        int exit_code = WEXITSTATUS(wstatus);
        printf("  Child exited normally with status %d\n", exit_code);
    }
    else if (WIFSIGNALED(wstatus)) {
        // Child was terminated by a signal
        int signal = WTERMSIG(wstatus);
        printf("  Child killed by signal %d", signal);
        #ifdef WCOREDUMP
        if (WCOREDUMP(wstatus)) {
            printf(" (core dumped)");
        }
        #endif
        printf("\n");
    }
    else if (WIFSTOPPED(wstatus)) {
        // Child was stopped by a signal (requires WUNTRACED)
        int signal = WSTOPSIG(wstatus);
        printf("  Child stopped by signal %d\n", signal);
    }
    else if (WIFCONTINUED(wstatus)) {
        // Child resumed by SIGCONT (requires WCONTINUED)
        printf("  Child continued\n");
    }
}
 
int main() {
    pid_t child1, child2, child3;
    int wstatus;
    
    // Create three children with different behaviors
    
    // Child 1: Normal exit
    child1 = fork();
    if (child1 == 0) {
        sleep(1);
        exit(42);  // Exit with status 42
    }
    
    // Child 2: Abnormal termination
    child2 = fork();
    if (child2 == 0) {
        sleep(2);
        abort();  // Terminate abnormally
    }
    
    // Child 3: Long-running task
    child3 = fork();
    if (child3 == 0) {
        sleep(10);
        exit(0);
    }
    
    printf("Parent created children: %d, %d, %d\n", child1, child2, child3);
    
    // Wait for specific child (child1) - BLOCKING
    printf("\nWaiting for child1 (%d)...\n", child1);
    waitpid(child1, &wstatus, 0);
    decode_wait_status(wstatus);
    
    // Non-blocking check for child3
    printf("\nNon-blocking check for child3 (%d)...\n", child3);
    pid_t result = waitpid(child3, &wstatus, WNOHANG);
    if (result == 0) {
        printf("  Child3 is still running\n");
    } else if (result > 0) {
        printf("  Child3 has terminated\n");
        decode_wait_status(wstatus);
    }
    
    // Wait for any remaining children
    printf("\nWaiting for remaining children...\n");
    while ((result = waitpid(-1, &wstatus, 0)) > 0) {
        printf("Reaped child %d:\n", result);
        decode_wait_status(wstatus);
    }
    
    printf("\nAll children reaped.\n");
    return 0;
}

waitpid() Options

The options argument provides additional control:

WNOHANG: Return immediately if no child has exited (non-blocking)
WUNTRACED: Also return if a child has stopped (not just terminated)
WCONTINUED: Also return if a stopped child has been resumed by SIGCONT

These options enable sophisticated job control implementations, as used by shells to manage foreground and background processes.

waitid() — Modern and Most Flexible

The waitid() system call provides the most control and information:

int waitid(idtype_t idtype, id_t id, siginfo_t *infop, int options);

waitid() fills a siginfo_t structure with detailed information including:

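PID and real UID of the child (si_pid, si_uid)
Exit status or signal number (si_status)
Precise reason for state change (si_code, such as CLD_EXITED, CLD_KILLED, CLD_DUMPED, CLD_STOPPED, or CLD_CONTINUED)

CPU time consumed is not among the fields waitid() is documented to fill in; use getrusage(RUSAGE_CHILDREN), or wait4() on Linux and the BSDs, for that.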

Avoiding Zombie Accumulation

Long-running servers that spawn children must reap them promptly to prevent zombie accumulation. Common strategies: (1) Call waitpid() with WNOHANG periodically, (2) Install a SIGCHLD handler that calls waitpid(-1, ..., WNOHANG) in a loop, because several child exits can be delivered as a single signal, (3) Use double-fork: parent forks, child forks again then exits immediately, grandchild is orphaned and adopted by init which handles reaping. The third approach is also part of how daemons detach from the terminal.

Process Attribute Control

Beyond creation, execution, and termination, process control encompasses manipulating various process attributes. These system calls enable processes to inspect and modify their own state or, with appropriate privileges, the state of other processes.

Process Identification

Process Identification System Calls
System Call	Purpose	Returns
getpid()	Get process ID	Calling process's PID
getppid()	Get parent process ID	Parent's PID
getpgrp()	Get process group ID	Calling process's PGID
getsid(pid)	Get session ID	Session ID of pid
getuid()/geteuid()	Get real/effective user ID	UID/EUID
getgid()/getegid()	Get real/effective group ID	GID/EGID

Process Priority and Resource Control

The kernel scheduler uses priority values to determine process execution order. User-space processes can influence (within limits) their scheduling priority:

priority_examples.c
#include <stdio.h>
#include <unistd.h>
#include <sys/resource.h>
#include <errno.h>
 
void demonstrate_priority_control() {
    int current_nice;
    int new_nice;
    
    // Get current nice value
    // Note: getpriority can return -1 legitimately, so clear errno first
    errno = 0;
    current_nice = getpriority(PRIO_PROCESS, 0);  // 0 = current process
    if (current_nice == -1 && errno != 0) {
        perror("getpriority failed");
        return;
    }
    printf("Current nice value: %d\n", current_nice);
    
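    // Nice values range from -20 (highest priority) to 19 (lowest)
    // Lowering the nice value requires privilege (root, or CAP_SYS_NICE on Linux)
    
    // Increase nice value (lower priority) - any process can do this
    // nice() can also legitimately return -1, and the printf() above may
    // have changed errno, so clear it again first
    errno = 0;
    new_nice = nice(5);  // Increment nice by 5
    if (new_nice == -1 && errno != 0) {
        perror("nice failed");
    } else {
        printf("New nice value: %d\n", new_nice);
    }
    
    // Alternative: setpriority() sets an absolute nice value
    // PRIO_PROCESS with id 0 targets the calling process;
    // PRIO_PGRP and PRIO_USER target a process group or a user's processes
    if (setpriority(PRIO_PROCESS, 0, 10) == -1) {
        perror("setpriority failed");
    }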
    
    // Resource limits also affect process behavior
    struct rlimit limits;
    
    // Get CPU time limit
    if (getrlimit(RLIMIT_CPU, &limits) == 0) {
        printf("CPU time limit: soft=%llu, hard=%llu seconds\n",
               (unsigned long long)limits.rlim_cur,
               (unsigned long long)limits.rlim_max);
    }
    
    // Get and display memory limits
    if (getrlimit(RLIMIT_AS, &limits) == 0) {
        if (limits.rlim_cur == RLIM_INFINITY) {
            printf("Address space limit: unlimited\n");
        } else {
            printf("Address space limit: %llu bytes\n",
                   (unsigned long long)limits.rlim_cur);
        }
    }
}
 
int main(void) {
    demonstrate_priority_control();
    return 0;
}

Session and Process Group Management

Process groups and sessions enable job control, the ability to manage multiple processes as units:

Process Group: Collection of processes for signal delivery and job control
Session: Collection of process groups sharing a controlling terminal
Session Leader: Process that created the session (typically the shell)
Foreground Process Group: Receives keyboard signals (SIGINT, SIGQUIT)
Background Process Group: Does not receive keyboard signals

Session and Process Group System Calls

• setpgid(pid, pgid): Move a process to a different process group
• setsid(): Create a new session with the calling process as leader
• tcsetpgrp(fd, pgrp): Set the foreground process group for a terminal
• tcgetpgrp(fd): Get the foreground process group for a terminal

Daemon Creation Pattern

Daemons (background services) use setsid() to detach from the controlling terminal: (1) Fork and parent exits, (2) Child calls setsid() to create new session, (3) Fork again and parent (now session leader) exits, (4) Grandchild can never acquire a controlling terminal. This is the traditional 'double-fork' daemon pattern.
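The four steps can be traced in code. The sketch below is not a full daemonizer: the original caller deliberately stays alive so that it can read a one-byte report from the grandchild through a pipe. The function name double_fork_check and the reporting pipe are illustrative assumptions.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the double-fork sequence and reports what the grandchild sees.
 * Returns 1 if the grandchild is in a new session without being its
 * leader, 0 if not, -1 on error. The function name is illustrative. */
int double_fork_check(void) {
    int fds[2];
    if (pipe(fds) == -1) return -1;
    pid_t caller_sid = getsid(0);

    pid_t child = fork();                       /* step 1 */
    if (child == -1) { close(fds[0]); close(fds[1]); return -1; }

    if (child == 0) {
        close(fds[0]);
        if (setsid() == -1) _exit(1);           /* step 2: new session, no terminal */
        pid_t grandchild = fork();              /* step 3 */
        if (grandchild == -1) _exit(1);
        if (grandchild > 0) _exit(0);           /* session leader exits */

        /* Step 4, in the grandchild: a member of the new session, but
         * not its leader, so opening a terminal cannot make that
         * terminal its controlling terminal. */
        pid_t sid = getsid(0);
        char ok = (sid != getpid() && sid != caller_sid) ? 1 : 0;
        ssize_t w = write(fds[1], &ok, 1);
        _exit(w == 1 ? 0 : 1);
    }

    close(fds[1]);
    waitpid(child, NULL, 0);                    /* reap the intermediate child */
    char ok = 0;
    ssize_t n = read(fds[0], &ok, 1);           /* blocks until the report arrives */
    close(fds[0]);
    return n == 1 ? ok : -1;
}
```

A production daemon would typically also change its working directory, reset its umask, and redirect the standard descriptors to /dev/null; those steps are omitted here to keep the focus on the session mechanics.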

Summary: Process Control System Calls

Process control system calls form the foundation upon which all multiprocessing rests. We've explored the complete lifecycle:

Key Takeaways

• fork() duplicates the calling process, creating a parent-child relationship with copy-on-write semantics. Variants like vfork() and clone() offer specialized behaviors for performance or fine-grained resource sharing.
• exec() family replaces the current process image with a new program while preserving PID and open file descriptors. The fork-exec pattern enables powerful pre-execution configuration.
• Process termination occurs via exit() for clean shutdown or abort()/signals for abnormal termination. Exit status follows conventions enabling parent processes to understand child outcomes.
• wait() family reaps zombie processes and collects exit status. waitpid() enables selective waiting and non-blocking checks essential for job control.
• Process attributes including PID, priority, and session relationships can be queried and modified through dedicated system calls, enabling sophisticated process management.

Looking Ahead

Process control gives us the mechanisms to create and manage processes, but processes must also manipulate files, devices, and communicate with each other. The following pages explore file management, device management, information maintenance, and communication system calls—completing our survey of the system call taxonomy.

Page Complete

You now understand the comprehensive landscape of process control system calls—from creation through fork() and exec(), to termination and cleanup via exit() and wait(). These primitives form the foundation for everything from simple shell commands to complex server architectures. Next, we'll explore how processes interact with the file system through file management system calls.

1 / 5

Loading learning content...

Operating SystemsSystem Calls & API

System Call Types

LevelIntermediate

Duration90 mins

TopicSystem Calls & API

1 / 5

Process Control System Calls

The Imperative of Process Control

What You Will Learn

Process Creation — The fork() System Call

The Semantics of fork()

When a process invokes fork(), the kernel performs the following operations:

Allocate new process control block (PCB): A new entry is created in the process table with a unique Process ID (PID)
Copy parent's address space: The child receives a complete copy of the parent's memory—code segment, data segment, heap, and stack
Copy file descriptor table: The child inherits references to all open files, with shared file position offsets
Copy signal dispositions: Signal handlers and masks are duplicated
Initialize child-specific state: The child's PID differs from the parent, parent PID is set appropriately, and resource utilization counters reset
Return control to both processes: Both parent and child resume execution after the fork() call

fork_example.c
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
 
int main() {
    pid_t pid;
    int shared_var = 42;
    
    printf("Before fork: shared_var = %d, process = %d\n", 
           shared_var, getpid());
    
    pid = fork();
    
    if (pid < 0) {
        // Fork failed - typically due to resource exhaustion
        perror("fork failed");
        return 1;
    } 
    else if (pid == 0) {
        // Child process executes this branch
        // Child has its own copy of shared_var
        shared_var = 100;
        printf("Child: shared_var = %d, my PID = %d, parent PID = %d\n",
               shared_var, getpid(), getppid());
    } 
    else {
        // Parent process executes this branch
        // pid contains the child's PID
        shared_var = 200;
        printf("Parent: shared_var = %d, my PID = %d, child PID = %d\n",
               shared_var, getpid(), pid);
    }
    
    // Both processes execute this
    printf("Process %d ending with shared_var = %d\n", 
           getpid(), shared_var);
    
    return 0;
}
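
The inherited file descriptors are the one place where parent and child keep affecting each other after fork(). The following is a minimal sketch of that shared file offset, under the assumption that a scratch file path is supplied by the caller; the helper name offset_after_child_write is hypothetical and not part of any library.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: returns the file offset the parent observes after
// the child has written child_bytes bytes through the inherited
// descriptor, or -1 on error.
long offset_after_child_write(const char *path, size_t child_bytes) {
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
    if (fd < 0) return -1;

    pid_t pid = fork();
    if (pid < 0) { close(fd); return -1; }

    if (pid == 0) {
        // Child: write through the inherited descriptor, one byte at a time
        for (size_t i = 0; i < child_bytes; i++) {
            if (write(fd, "c", 1) != 1) _exit(1);
        }
        _exit(0);
    }

    int wstatus;
    if (waitpid(pid, &wstatus, 0) < 0 ||
        !WIFEXITED(wstatus) || WEXITSTATUS(wstatus) != 0) {
        close(fd);
        return -1;
    }

    // The parent never wrote or seeked, yet its offset moved: both
    // descriptors refer to the same open file description.
    off_t pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return (long)pos;
}
```

Contrast this with shared_var in fork_example.c: memory is copied, so each process sees only its own change, while the file offset lives in the kernel and is shared.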

Copy-on-Write Optimization

Why Fork + Exec Rather Than Direct Spawn?

Separation of Concerns: Fork handles process creation; exec handles program loading. Each can be used independently.
Pre-Execution Configuration: Between fork and exec, the child can modify its own environment—redirect file descriptors, change working directory, modify signal handling, adjust resource limits—without affecting the parent.
Shell Implementation: The fork-exec model elegantly supports shell I/O redirection. When you type ls > output.txt, the shell forks, the child redirects stdout to the file, then execs ls. The redirection happens naturally.
Process Relationships: The fork model inherently creates a process tree with clear parent-child relationships, enabling job control, process groups, and session management.
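
The shell redirection case can be sketched in a few lines. This is an illustration of the pattern rather than how any particular shell is implemented; the helper name run_with_stdout_to and its exit codes 126 and 127 for setup and exec failures are assumptions chosen to mirror shell conventions.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run argv[0] (found via PATH) with stdout redirected
// to outfile, roughly what a shell does for `cmd > outfile`.
// Returns the child's exit status, or -1 on failure.
int run_with_stdout_to(const char *outfile, char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        // Child only: the parent's stdout is untouched
        int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) _exit(126);
        if (fd != STDOUT_FILENO) {
            if (dup2(fd, STDOUT_FILENO) < 0) _exit(126);
            close(fd);             // stdout now refers to the file
        }
        execvp(argv[0], argv);
        _exit(127);                // reached only if exec failed
    }

    int wstatus;
    if (waitpid(pid, &wstatus, 0) < 0) return -1;
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

The program being run needs no knowledge of the redirection: it simply writes to descriptor 1, which the child rearranged between fork() and exec().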

fork() Return Values and Their Meanings
Return Value	Context	Meaning	Typical Action
Positive integer	Parent process	Child's PID	Store for later wait() or signaling
0	Child process	Successful fork	Execute child-specific logic or exec()
-1	Calling process	Fork failed	Check errno: EAGAIN (resource limit), ENOMEM (memory)

fork() Variants — vfork() and clone()

While fork() is the canonical process creation primitive, performance considerations and the need for finer control have driven the development of specialized variants.

vfork() — Optimized for Immediate exec()

vfork() Hazards

clone() — The Swiss Army Knife of Process Creation

Linux's clone() system call represents the generalization of fork(). Rather than using fixed semantics, clone() accepts flags specifying exactly which resources to share between parent and child:

CLONE_VM: Share address space (enables threads)
CLONE_FS: Share filesystem information (root, cwd, umask)
CLONE_FILES: Share file descriptor table
CLONE_SIGHAND: Share signal handlers
CLONE_THREAD: Place child in same thread group
CLONE_PARENT: Child's parent is caller's parent
CLONE_NEWPID: Create new PID namespace (containers)
CLONE_NEWNET: Create new network namespace

This flexibility enables clone() to implement both traditional fork (share nothing) and POSIX threads (share everything except stack), as well as modern container isolation through namespace flags.

clone_example.c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>   // SIGCHLD
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>
 
#define STACK_SIZE (1024 * 1024)  // 1MB stack for child
 
int shared_counter = 0;
 
static int child_function(void *arg) {
    // This function executes in the child
    const char *name = (const char *)arg;
    
    printf("Child: name = %s, PID = %d\n", name, getpid());
    shared_counter = 42;  // If CLONE_VM, parent sees this!
    printf("Child set shared_counter to %d\n", shared_counter);
    
    return 0;  // Child exit status
}
 
int main() {
    char *child_stack;
    char *child_stack_top;
    pid_t child_pid;
    
    // Allocate stack for child (stacks grow downward on most architectures)
    child_stack = malloc(STACK_SIZE);
    if (!child_stack) {
        perror("malloc");
        return 1;
    }
    child_stack_top = child_stack + STACK_SIZE;
    
    printf("Parent: shared_counter initially = %d\n", shared_counter);
    
    // Clone with shared virtual memory (like a thread)
    // CLONE_VM: share address space
    // SIGCHLD: send SIGCHLD on termination
    child_pid = clone(child_function, child_stack_top,
                      CLONE_VM | SIGCHLD, "CloneChild");
    
    if (child_pid == -1) {
        perror("clone");
        free(child_stack);
        return 1;
    }
    
    // Wait for child to complete
    waitpid(child_pid, NULL, 0);
    
    // Because of CLONE_VM, we see the child's modification!
    printf("Parent after wait: shared_counter = %d\n", shared_counter);
    
    free(child_stack);
    return 0;
}

Comparing Process Creation Mechanisms

Understanding when to use each mechanism is crucial for systems programming:

Process Creation System Calls Comparison
System Call	Address Space	Use Case	Performance	Portability
fork()	Copied (COW)	General process creation	Good (COW optimized)	POSIX standard
vfork()	Shared temporarily	fork() immediately before exec()	Best for exec pattern	Obsolescent in POSIX.1-2001, removed in POSIX.1-2008
clone()	Configurable via flags	Threads, containers, custom sharing	Depends on flags	Linux-specific
posix_spawn()	N/A (new process)	Portable process+exec combination	Implementation-dependent	POSIX standard
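
Of the four mechanisms in the table, posix_spawn() is the only one not shown elsewhere on this page. A minimal sketch follows, using posix_spawnp() (the PATH-searching form) with default attributes; the wrapper name spawn_and_wait is hypothetical.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;  // the caller's environment, passed to the child

// Hypothetical helper: start argv[0] (searched on PATH) and wait for it.
// Returns the exit status, or -1 if spawning or waiting failed.
int spawn_and_wait(char *const argv[]) {
    pid_t pid;
    // NULL file actions and attributes: inherit descriptors and defaults
    int err = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (err != 0) return -1;   // posix_spawnp returns an errno value

    int wstatus;
    if (waitpid(pid, &wstatus, 0) < 0) return -1;
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

The pre-execution configuration that fork-exec does with ordinary code is expressed here through the file-actions and attributes arguments, which this sketch leaves as NULL.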

Program Execution — The exec() Family

The exec() family is not a single function but a collection of six variants, each providing different conveniences for specifying the program and its arguments:

Naming Convention Decoded

The suffix characters indicate behavior:

l (list): Arguments passed as variadic list (arg0, arg1, ..., NULL)
v (vector): Arguments passed as NULL-terminated array
e (environment): Explicit environment array provided
p (path): Search PATH for executable if no slash in filename

The exec() Family of Functions
Function	Arguments	PATH Search	Environment
execl()	Variadic list	No	Inherited
execv()	Array	No	Inherited
execle()	Variadic list	No	Explicit
execve()	Array	No	Explicit
execlp()	Variadic list	Yes	Inherited
execvp()	Array	Yes	Inherited

execve() — The True System Call

exec_examples.c
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
 
void demonstrate_exec_variants() {
    pid_t pid;
    
    // ========================================
    // Example 1: execl() - list form, full path
    // ========================================
    pid = fork();
    if (pid == 0) {
        // Child: execute /bin/ls with arguments
        // First argument (argv[0]) is conventionally the program name
        execl("/bin/ls", "ls", "-l", "-a", "/tmp", NULL);
        
        // If we reach here, exec failed!
        perror("execl failed");
        _exit(127);
    }
    waitpid(pid, NULL, 0);
    
    // ========================================
    // Example 2: execlp() - list form, PATH search
    // ========================================
    pid = fork();
    if (pid == 0) {
        // Don't need full path - searches PATH
        execlp("grep", "grep", "-r", "pattern", ".", NULL);
        perror("execlp failed");
        _exit(127);
    }
    waitpid(pid, NULL, 0);
    
    // ========================================
    // Example 3: execv() - array form
    // ========================================
    pid = fork();
    if (pid == 0) {
        char *args[] = {"sort", "-n", "-r", "numbers.txt", NULL};
        execv("/usr/bin/sort", args);
        perror("execv failed");
        _exit(127);
    }
    waitpid(pid, NULL, 0);
    
    // ========================================
    // Example 4: execve() - explicit environment
    // ========================================
    pid = fork();
    if (pid == 0) {
        char *args[] = {"env", NULL};
        char *envp[] = {
            "PATH=/usr/bin:/bin",
            "HOME=/home/user",
            "CUSTOM_VAR=hello",
            NULL
        };
        // Child will have ONLY these environment variables
        execve("/usr/bin/env", args, envp);
        perror("execve failed");
        _exit(127);
    }
    waitpid(pid, NULL, 0);
}
 
int main() {
    demonstrate_exec_variants();
    return 0;
}

What exec() Preserves and What It Replaces

Understanding exactly what survives an exec() call is essential for correct program design:

Preserved across exec():

Process ID (PID) and parent process ID (PPID)
Real user ID and real group ID
Session ID and controlling terminal
Current working directory and root directory
File mode creation mask (umask)
File locks (those belonging to the process)
Process signal mask
Pending signals
Resource limits
CPU time counters
File descriptors (unless FD_CLOEXEC flag is set)

Replaced by exec():

Process image (code, data, BSS segments)
Stack and heap
Memory mappings
Signal dispositions (caught signals reset to default; ignored signals stay ignored)
Thread state (all threads except caller are terminated)
POSIX timers created with timer_create() (a pending alarm() is preserved)
Memory locks
Capabilities (if setuid/setgid)

Close-on-Exec Flag
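
File descriptors survive exec() unless FD_CLOEXEC is set on them. A minimal sketch of setting and checking that flag with fcntl() is shown here; the helper names set_cloexec and is_cloexec are hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

// Mark fd so the kernel closes it automatically on a successful exec().
// Returns 0 on success, -1 on error.
int set_cloexec(int fd) {
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1) return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}

// Returns 1 if FD_CLOEXEC is set on fd, 0 if not, -1 on error.
int is_cloexec(int fd) {
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1) return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

Where available, passing O_CLOEXEC to open() sets the flag atomically at creation, which avoids a window in multithreaded programs where another thread could fork and exec between open() and fcntl().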

Process Termination — exit(), abort(), and Signals

Normal Termination — exit() and _exit()

Normal termination occurs when a process voluntarily chooses to end its execution. Two primary functions serve this purpose:

_exit(status) — The raw system call that immediately terminates the process:

Closes all file descriptors
Releases memory mappings
Sends SIGCHLD to parent
Process becomes zombie until parent calls wait()

exit(status) — The C library wrapper that performs cleanup before _exit():

Calls functions registered with atexit() and on_exit() (a nonstandard glibc extension), in reverse order of registration
Flushes and closes all open stdio streams
Removes files created with tmpfile()
Then calls _exit()

termination_examples.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
 
void cleanup_function(void) {
    printf("Cleanup: atexit handler called\n");
    // Flush logs, close database connections, release locks, etc.
}
 
void on_exit_handler(int status, void *arg) {
    printf("on_exit: status=%d, arg=%s\n", status, (char *)arg);
}
 
void demonstrate_exit_difference() {
    // Register cleanup handlers
    atexit(cleanup_function);
    on_exit(on_exit_handler, "context data");
    
    printf("About to exit...\n");
    
    // Uncommenting exit() will call cleanup handlers
    // exit(0);
    
    // Uncommenting _exit() will NOT call cleanup handlers, and any
    // output still sitting in stdio buffers is lost (likely when
    // stdout is a file or pipe rather than a terminal)
    // _exit(0);
}
 
// What happens with return from main()?
int main() {
    atexit(cleanup_function);
    
    printf("Main function executing...\n");
    
    // Returning from main() is equivalent to exit(return_value)
    // All cleanup handlers will be called
    return 0;
    
    // Note: calling exit(0) here would be redundant but equivalent
}
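
The difference between exit() and _exit() can also be observed from outside the process. The sketch below, with hypothetical names report_handler and child_handler_output, forks a child whose atexit() handler writes to a pipe, so the parent can see whether the handler ran.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;  // pipe write end used by the handler

static void report_handler(void) {
    // write() is unbuffered, so the message does not depend on stdio
    const char msg[] = "handler ran";
    if (write(report_fd, msg, sizeof msg - 1) < 0) {
        // nothing useful to do while exiting
    }
}

// Hypothetical helper: fork a child that registers report_handler with
// atexit() and then ends via exit() or _exit(). Copies whatever the
// handler wrote into buf; returns the byte count or -1 on error.
long child_handler_output(int use_raw_exit, char *buf, size_t buflen) {
    int fds[2];
    if (buflen == 0 || pipe(fds) < 0) return -1;

    pid_t pid = fork();
    if (pid < 0) { close(fds[0]); close(fds[1]); return -1; }

    if (pid == 0) {
        close(fds[0]);
        report_fd = fds[1];
        if (atexit(report_handler) != 0) _exit(2);
        if (use_raw_exit) _exit(0);  // handlers are skipped
        exit(0);                     // handlers run before termination
    }

    close(fds[1]);
    ssize_t n = read(fds[0], buf, buflen - 1);  // 0 means EOF, no output
    close(fds[0]);
    waitpid(pid, NULL, 0);
    if (n < 0) return -1;
    buf[n] = '\0';
    return (long)n;
}
```

This is also why the exec examples on this page call _exit(127) after a failed exec: a forked child that calls exit() would run handlers and flush stdio buffers it inherited from the parent.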

Abnormal Termination — abort() and Fatal Signals

Abnormal termination occurs when a process terminates unexpectedly, either by its own volition or due to external factors:

abort()

Raises SIGABRT signal
If signal handler returns, process is still terminated
Typically produces core dump for debugging
Used by assertion failures (assert macro)
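Overrides blocking or ignoring of SIGABRT: abort() unblocks the signal and, if the process survives the first delivery, restores the default action and raises it again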

Fatal Signals

Certain signals cause termination when their default disposition is not overridden:

SIGTERM: Polite termination request (default for kill command)
SIGKILL: Unconditional termination (cannot be caught or ignored)
SIGSEGV: Segmentation fault (invalid memory access)
SIGBUS: Bus error (alignment or nonexistent memory)
SIGFPE: Arithmetic error (despite the name, this includes integer division by zero)
SIGILL: Illegal instruction

SIGKILL Cannot Be Caught

Termination Mechanisms Compared
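Mechanism	Cleanly?	Status Seen by Parent	Core Dump?	Parent Notification
return from main()	Yes	Exit code = return value (low 8 bits)	No	SIGCHLD
exit(status)	Yes	Exit code = status & 0xFF	No	SIGCHLD
_exit(status)	Partial (no atexit handlers, no stdio flush)	Exit code = status & 0xFF	No	SIGCHLD
abort()	No	Killed by SIGABRT (shells show 128+6 = 134)	Yes (typically)	SIGCHLD
SIGTERM default	No	Killed by SIGTERM (shells show 128+15 = 143)	No	SIGCHLD
SIGKILL	No	Killed by SIGKILL (shells show 128+9 = 137)	No	SIGCHLD
Segfault (SIGSEGV)	No	Killed by SIGSEGV (shells show 128+11 = 139)	Yes (typically)	SIGCHLD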

Exit Status Conventions

The exit status is an 8-bit value (0-255) that communicates information to the parent process:

0: Conventional success
1: General errors
2: Misuse of shell command (per Bash convention)
126: Command found but not executable
127: Command not found
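128+N: Terminated by signal N (this is how shells report it; the wait status seen by the parent records the signal separately from the exit code)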
130: Control-C (128 + SIGINT(2))
137: SIGKILL (128 + 9)
143: SIGTERM (128 + 15)

The parent retrieves this status via wait() family calls, enabling sophisticated inter-process coordination and error propagation.
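
The 128+N values are something the shell computes from the wait status, not something the kernel stores. A small sketch of that mapping follows; the function names shell_style_status and run_child_and_report are hypothetical, and the second exists only to make experimenting easy.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Map a wait status to the number a shell would show in $?.
int shell_style_status(int wstatus) {
    if (WIFEXITED(wstatus))   return WEXITSTATUS(wstatus);    // 0..255
    if (WIFSIGNALED(wstatus)) return 128 + WTERMSIG(wstatus);
    return -1;  // stopped or continued: not a termination
}

// Hypothetical helper for experimenting: fork a child that either exits
// with code (when sig == 0) or kills itself with signal sig.
int run_child_and_report(int code, int sig) {
    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        if (sig != 0) {
            signal(sig, SIG_DFL);  // make sure the default action applies
            raise(sig);
        }
        _exit(code);
    }

    int wstatus;
    if (waitpid(pid, &wstatus, 0) < 0) return -1;
    return shell_style_status(wstatus);
}
```

Passing a code above 255 shows the 8-bit truncation: only the low byte of the value given to _exit() reaches the parent.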

Waiting for Children — wait(), waitpid(), and waitid()

wait() — Simple Child Reaping

The basic wait() call blocks until any child process terminates:

pid_t wait(int *wstatus);

Returns the PID of the terminated child
Stores termination status in wstatus (if non-NULL)
Returns -1 if no children exist (errno = ECHILD)
Cannot specify which child to wait for

waitpid() — Targeted and Flexible

For finer control, waitpid() allows specifying which child(ren) to wait for:

pid_t waitpid(pid_t pid, int *wstatus, int options);

waitpid() pid Argument Interpretations
pid Value	Which Children to Wait For
< -1	Any child in process group |pid|
-1	Any child (equivalent to wait())
0	Any child in same process group as caller
> 0	Specific child with PID = pid

wait_examples.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
#include <errno.h>
 
void decode_wait_status(int wstatus) {
    if (WIFEXITED(wstatus)) {
        // Child exited normally via exit() or return from main
        int exit_code = WEXITSTATUS(wstatus);
        printf("  Child exited normally with status %d\n", exit_code);
    }
    else if (WIFSIGNALED(wstatus)) {
        // Child was terminated by a signal
        int sig = WTERMSIG(wstatus);  // named sig to avoid shadowing signal()
        printf("  Child killed by signal %d", sig);
        #ifdef WCOREDUMP
        if (WCOREDUMP(wstatus)) {
            printf(" (core dumped)");
        }
        #endif
        printf("\n");
    }
    else if (WIFSTOPPED(wstatus)) {
        // Child was stopped by a signal (requires WUNTRACED)
        int sig = WSTOPSIG(wstatus);
        printf("  Child stopped by signal %d\n", sig);
    }
    else if (WIFCONTINUED(wstatus)) {
        // Child resumed by SIGCONT (requires WCONTINUED)
        printf("  Child continued\n");
    }
}
 
int main() {
    pid_t child1, child2, child3;
    int wstatus;
    
    // Create three children with different behaviors
    
    // Child 1: Normal exit
    child1 = fork();
    if (child1 == 0) {
        sleep(1);
        exit(42);  // Exit with status 42
    }
    
    // Child 2: Abnormal termination
    child2 = fork();
    if (child2 == 0) {
        sleep(2);
        abort();  // Terminate abnormally
    }
    
    // Child 3: Long-running task
    child3 = fork();
    if (child3 == 0) {
        sleep(10);
        exit(0);
    }
    
    printf("Parent created children: %d, %d, %d\n", child1, child2, child3);
    
    // Wait for specific child (child1) - BLOCKING
    printf("\nWaiting for child1 (%d)...\n", child1);
    waitpid(child1, &wstatus, 0);
    decode_wait_status(wstatus);
    
    // Non-blocking check for child3
    printf("\nNon-blocking check for child3 (%d)...\n", child3);
    pid_t result = waitpid(child3, &wstatus, WNOHANG);
    if (result == 0) {
        printf("  Child3 is still running\n");
    } else if (result > 0) {
        printf("  Child3 has terminated\n");
        decode_wait_status(wstatus);
    }
    
    // Wait for any remaining children
    printf("\nWaiting for remaining children...\n");
    while ((result = waitpid(-1, &wstatus, 0)) > 0) {
        printf("Reaped child %d:\n", result);
        decode_wait_status(wstatus);
    }
    
    printf("\nAll children reaped.\n");
    return 0;
}

waitpid() Options

The options argument provides additional control:

WNOHANG: Return immediately if no child has exited (non-blocking)
WUNTRACED: Also return if a child has stopped (not just terminated)
WCONTINUED: Also return if a stopped child has been resumed by SIGCONT

These options enable sophisticated job control implementations, as used by shells to manage foreground and background processes.

waitid() — Modern and Most Flexible

The waitid() system call provides the most control and information:

int waitid(idtype_t idtype, id_t id, siginfo_t *infop, int options);

waitid() fills a siginfo_t structure with detailed information including:

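PID and real UID of the child (si_pid, si_uid)
Exit status or signal number (si_status)
Precise reason for state change (si_code: CLD_EXITED, CLD_KILLED, CLD_DUMPED, CLD_STOPPED, CLD_CONTINUED)
CPU time consumed is not among the fields POSIX specifies for waitid(); getrusage(RUSAGE_CHILDREN) reports it for children that have been waited for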

Avoiding Zombie Accumulation

Process Attribute Control

Process Identification

Process Identification System Calls
System Call	Purpose	Returns
getpid()	Get process ID	Calling process's PID
getppid()	Get parent process ID	Parent's PID
getpgrp()	Get process group ID	Calling process's PGID
getsid(pid)	Get session ID	Session ID of pid
getuid()/geteuid()	Get real/effective user ID	UID/EUID
getgid()/getegid()	Get real/effective group ID	GID/EGID

Process Priority and Resource Control

The kernel scheduler uses priority values to determine process execution order. User-space processes can influence (within limits) their scheduling priority:

priority_examples.c
#include <stdio.h>
#include <unistd.h>
#include <sys/resource.h>
#include <errno.h>
 
void demonstrate_priority_control() {
    int current_nice;
    int new_nice;
    
    // Get current nice value
    // Note: getpriority can return -1 legitimately, so clear errno first
    errno = 0;
    current_nice = getpriority(PRIO_PROCESS, 0);  // 0 = current process
    if (current_nice == -1 && errno != 0) {
        perror("getpriority failed");
        return;
    }
    printf("Current nice value: %d\n", current_nice);
    
    // Nice values range from -20 (highest priority) to 19 (lowest)
    // Lowering the nice value normally requires privilege
    // (root, or CAP_SYS_NICE on Linux)
    
    // Increase nice value (lower priority) - any process can do this
    // nice() can also return -1 legitimately, so clear errno again
    errno = 0;
    new_nice = nice(5);  // Increment nice by 5
    if (new_nice == -1 && errno != 0) {
        perror("nice failed");
    } else {
        printf("New nice value: %d\n", new_nice);
    }
    
    // Alternative: setpriority sets an absolute nice value; PRIO_PGRP or
    // PRIO_USER would target a process group or a user's processes instead
    if (setpriority(PRIO_PROCESS, 0, 10) == -1) {
        perror("setpriority failed");
    }
    
    // Resource limits also affect process behavior
    struct rlimit limits;
    
    // Get CPU time limit
    if (getrlimit(RLIMIT_CPU, &limits) == 0) {
        printf("CPU time limit: soft=%llu, hard=%llu seconds\n",
               (unsigned long long)limits.rlim_cur,
               (unsigned long long)limits.rlim_max);
    }
    
    // Get and display memory limits
    if (getrlimit(RLIMIT_AS, &limits) == 0) {
        if (limits.rlim_cur == RLIM_INFINITY) {
            printf("Address space limit: unlimited\n");
        } else {
            printf("Address space limit: %llu bytes\n",
                   (unsigned long long)limits.rlim_cur);
        }
    }
}
 
int main(void) {
    demonstrate_priority_control();
    return 0;
}

Session and Process Group Management

Process groups and sessions enable job control, the ability to manage multiple processes as units:

Process Group: Collection of processes for signal delivery and job control
Session: Collection of process groups sharing a controlling terminal
Session Leader: Process that created the session (typically the shell)
Foreground Process Group: Receives keyboard signals (SIGINT, SIGQUIT)
Background Process Group: Does not receive keyboard signals

Session and Process Group System Calls

• setpgid(pid, pgid): Move process to a different process group
• setsid(): Create a new session with calling process as leader
• tcsetpgrp(fd, pgrp): Set foreground process group for terminal
• tcgetpgrp(fd): Get foreground process group for terminal
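
The effect of setsid() can be checked directly with the identification calls from earlier on this page. A minimal sketch, using a hypothetical helper name child_can_lead_new_session, in which the child reports its findings through its exit status:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: returns 1 if a forked child became leader of a new
// session and a new process group, 0 if not, -1 on error.
int child_can_lead_new_session(void) {
    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        // setsid() fails for a process group leader; a freshly forked
        // child is never one, so the call is permitted here
        if (setsid() == (pid_t)-1) _exit(2);
        int ok = getsid(0) == getpid() && getpgrp() == getpid();
        _exit(ok ? 0 : 1);
    }

    int wstatus;
    if (waitpid(pid, &wstatus, 0) < 0 || !WIFEXITED(wstatus)) return -1;
    int code = WEXITSTATUS(wstatus);
    return code == 0 ? 1 : (code == 1 ? 0 : -1);
}
```

After setsid() the child's session ID and process group ID both equal its own PID, and it has no controlling terminal.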

Daemon Creation Pattern
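
The text for this pattern is not present here, so what follows is a sketch of the traditional double-fork sequence as commonly described, not necessarily the exact steps the original page listed. The function name become_daemon is hypothetical.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Sketch of the classic double-fork sequence. Returns 0 in the daemon
// process and -1 on error; the original caller and the intermediate
// child never return because they call _exit().
int become_daemon(void) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid > 0) _exit(0);            // original process leaves

    // New session: the process is detached from the controlling terminal
    if (setsid() == (pid_t)-1) return -1;

    pid = fork();                     // second fork gives up session leadership,
    if (pid < 0) return -1;           // so a terminal cannot be reacquired
    if (pid > 0) _exit(0);

    umask(0);                         // do not inherit the caller's file mode mask
    if (chdir("/") < 0) return -1;    // do not keep any directory busy

    // Point stdin, stdout, and stderr at /dev/null
    int fd = open("/dev/null", O_RDWR);
    if (fd < 0) return -1;
    dup2(fd, STDIN_FILENO);
    dup2(fd, STDOUT_FILENO);
    dup2(fd, STDERR_FILENO);
    if (fd > STDERR_FILENO) close(fd);
    return 0;
}
```

Each step uses a call covered above: fork() to shed the parent, setsid() to leave the terminal's session, and descriptor redirection as in the shell example. Services started by a modern service manager often skip this sequence and stay in the foreground.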

Summary: Process Control System Calls

Process control system calls form the foundation upon which all multiprocessing rests. We've explored the complete lifecycle:

Key Takeaways

• fork() duplicates the calling process, creating a parent-child relationship with copy-on-write semantics. Variants like vfork() and clone() offer specialized behaviors for performance or fine-grained resource sharing.
• The exec() family replaces the current process image with a new program while preserving the PID and any open file descriptors not marked close-on-exec. The fork-exec pattern enables powerful pre-execution configuration.
• Process termination occurs via exit() for clean shutdown or abort()/signals for abnormal termination. Exit status follows conventions enabling parent processes to understand child outcomes.
• The wait() family reaps zombie processes and collects exit status. waitpid() enables selective waiting and non-blocking checks essential for job control.
• Process attributes including PID, priority, and session relationships can be queried and modified through dedicated system calls, enabling sophisticated process management.

Looking Ahead
