Exec Family - Learning Module

exec() Variants

The Other Half of Process Creation

In the previous module, we explored fork()—the system call that creates a new process by duplicating the calling process. But here's a critical question: if fork() just makes a copy of the parent, how do we ever run a different program?

The answer lies in the exec() family of system calls. While fork() creates a new process, exec() transforms it. Together, they form the complete Unix process creation mechanism—a design so elegant and flexible that it has survived virtually unchanged for over 50 years.

But exec() isn't a single system call. It's a family of related functions, each with subtle differences that serve specific use cases. Understanding these variants—and knowing which to use when—separates confident systems programmers from those who guess and check.

What You Will Learn

By the end of this page, you will understand every major exec() variant, decode their naming conventions, recognize their semantic differences, and know exactly which variant to use for any given situation. You'll see why this apparent complexity actually provides elegant flexibility.

The exec() Concept

Before diving into variants, let's establish what exec() fundamentally does. The exec() system call performs a complete process image replacement. When a process calls exec():

The current process's text (code) segment is replaced with the new program's code
The data segment is replaced with the new program's initialized data
The BSS segment is replaced with the new program's uninitialized data
The heap is discarded and replaced with the new program's heap
The stack is discarded and replaced with a fresh stack for the new program

What remains unchanged:

The Process ID (PID)
The Parent Process ID (PPID)
Open file descriptors (unless marked close-on-exec)
Current working directory
Root directory
Process group ID and session ID
Real user ID and real group ID
Pending signals
Resource limits
Controlling terminal

The Key Insight

exec() does not create a new process—it transforms an existing one. The process retains its identity (PID) but completely changes its behavior. Think of it as a caterpillar becoming a butterfly: same organism, entirely different form.

exec_basic_concept.c
#include <stdio.h>
#include <unistd.h>
 
int main(void) {
    printf("Before exec: PID = %d\n", getpid());
    fflush(stdout);  // exec discards any output still sitting in stdio buffers
    
    // This replaces the entire process image with /bin/ls
    // If exec succeeds, the following line NEVER executes
    execl("/bin/ls", "ls", "-l", (char *)NULL);
    
    // Only reached if exec fails
    perror("exec failed");
    return 1;
}

Critical observation: If exec() succeeds, it never returns. The calling code is completely replaced by the new program. The only time exec() returns is when it fails—and in that case, it returns -1 and sets errno.

This "no return on success" semantic is unlike almost any other function in C. It's a one-way door: once you exec(), there's no coming back to your original code.

The Naming Convention

The exec() family follows a systematic naming convention that encodes the function's behavior in its name. Once you understand this convention, you can decode any exec variant instantly.

Every variant starts with exec followed by one or more suffix letters:

Suffix	Meaning	Effect
`l`	list	Arguments passed as a list (varargs)
`v`	vector	Arguments passed as an array (char *argv[])
`e`	environment	Environment passed explicitly as parameter
`p`	path	Uses PATH environment variable to find executable

These suffixes combine to form the complete set of exec variants:

The Complete exec() Family
Function	Arguments	Environment	Path Search	Signature
`execl`	list	inherited	no	`execl(path, arg0, arg1, ..., NULL)`
`execv`	vector	inherited	no	`execv(path, argv[])`
`execle`	list	explicit	no	`execle(path, arg0, ..., NULL, envp[])`
`execve`	vector	explicit	no	`execve(path, argv[], envp[])`
`execlp`	list	inherited	yes	`execlp(file, arg0, arg1, ..., NULL)`
`execvp`	vector	inherited	yes	`execvp(file, argv[])`
`execvpe`	vector	explicit	yes	`execvpe(file, argv[], envp[])`

The Fundamental System Call

Among all these variants, only execve() is an actual system call. All others are library wrappers that ultimately call execve(). This is why execve() has both 'v' and 'e' in its name—it requires both explicit argument vector and explicit environment.

List vs Vector: The 'l' and 'v' Variants

The most fundamental distinction in the exec() family is how arguments are passed. This comes down to compile-time knowledge: do you know the number of arguments when writing the code, or is it determined at runtime?

List Variants (execl, execlp, execle)

•Arguments passed as separate parameters
•Use C's variadic function mechanism
•Must end with NULL sentinel
•Number of arguments known at compile time
•Convenient for fixed-argument commands
•Easier to read for simple cases

Vector Variants (execv, execvp, execvpe)

•Arguments passed as array of strings
•Array must be NULL-terminated
•Number of arguments determined at runtime
•Essential for dynamic argument construction
•Used when forwarding command-line args
•More flexible for complex scenarios

list_vs_vector.c
#include <unistd.h>
#include <stdio.h>
 
void using_list_variant() {
    // Known at compile time: exactly 3 arguments
    // execl(path, arg0, arg1, arg2, NULL)
    execl("/bin/ls", "ls", "-l", "-a", NULL);
    // arg0 is conventionally the program name
}
 
void using_vector_variant(int argc, char *argv[]) {
    // Runtime-determined arguments
    // Useful when forwarding arguments from another program
    
    // Build argument vector dynamically
    char *my_args[argc + 1];  // +1 for NULL terminator
    
    my_args[0] = "myprogram";  // arg0 = program name
    for (int i = 1; i < argc; i++) {
        my_args[i] = argv[i];  // Copy arguments
    }
    my_args[argc] = NULL;  // NULL terminator required
    
    execv("/path/to/myprogram", my_args);
}
 
void shell_command_example() {
    // Real-world example: building a grep command dynamically
    char *pattern = "error";  // Could come from user input
    char *file = "/var/log/syslog";  // Could be dynamic
    
    // With list variant - works, but the argument count is fixed in the source:
    execl("/bin/grep", "grep", pattern, file, (char *)NULL);
    
    // With vector variant - natural for dynamic arguments
    // (in this demo, reached only if the execl above failed):
    char *grep_args[] = {"grep", pattern, file, NULL};
    execv("/bin/grep", grep_args);
}

The NULL Terminator is Critical

Both list and vector variants MUST have a NULL terminator. For list variants, the last argument must be (char *)NULL. For vector variants, the array must end with a NULL pointer. Omitting this causes undefined behavior—potentially reading garbage memory as arguments.

Practical guidance:

Use list variants when executing a fixed command with known arguments (like running ls -l)
Use vector variants when:
- Arguments are constructed dynamically
- You're forwarding arguments from main(argc, argv)
- The argument count is determined at runtime
- You're building a shell or command executor

Path Search: The 'p' Variants

The 'p' suffix variants (execlp, execvp, and execvpe, the last being a GNU extension rather than part of POSIX) add a crucial capability: searching the PATH environment variable. This is how your shell finds commands without requiring full paths.

How path search works:

If the filename contains a slash (/), it's treated as a path (no search performed)
Otherwise, the directories in PATH are searched in order
The first executable match is used
If no match is found, exec fails with ENOENT

path_search_example.c
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
 
int main() {
    // WITHOUT path search - must specify full path
    execl("/usr/bin/python3", "python3", "--version", NULL);
    
    // If python3 isn't at exactly /usr/bin/python3, execl fails and returns -1,
    // so execution falls through to the next call
    
    // WITH path search - searches PATH automatically
    execlp("python3", "python3", "--version", NULL);
    
    // This checks each directory in PATH:
    // 1. /usr/local/bin/python3?  No
    // 2. /usr/bin/python3?        Yes! Execute it.
    
    perror("exec failed");
    return 1;
}
 
// Demonstrating the search behavior explicitly
void show_path_search() {
    // Assume PATH=/usr/local/bin:/usr/bin:/bin
    
    // execlp("ls", ...) searches:
    //   /usr/local/bin/ls - not found
    //   /usr/bin/ls       - not found  
    //   /bin/ls           - FOUND! Execute.
    
    // execlp("/bin/ls", ...) does NOT search
    //   because the filename contains '/'
    //   It tries /bin/ls directly (same as execl)
}

Use 'p' Variants When

•Executing standard system commands
•Building shell-like interfaces
•User specifies command by name
•Portability across different systems
•Command location may vary

Avoid 'p' Variants When

•Security-sensitive operations
•You know the exact executable path
•PATH manipulation is a concern
•Running your own application binaries
•Reproducibility is critical

Security Warning: PATH Attacks

The 'p' variants introduce security risks. An attacker who can modify PATH could redirect your exec to a malicious binary. For security-critical code (setuid programs, daemons, etc.), always use full paths with non-'p' variants. Never trust PATH in security contexts.

The file vs. path distinction:

Notice that 'p' variants take a file parameter while others take a path parameter. This isn't just terminology:

path = a pathname (absolute or relative), used as-is
file = a filename that may be searched for in PATH

If file contains no slashes, PATH is searched. If it contains a slash (like ./myprogram or /usr/bin/python), it's used directly as a path.

Environment Control: The 'e' Variants

The 'e' suffix variants (execle, execve, execvpe) allow you to specify the environment for the new process explicitly. Without the 'e', the new program inherits the current process's environment unchanged.

Why control the environment?

Environment variables configure program behavior in foundational ways:

PATH – where to find executables
HOME – user's home directory
USER – current username
LANG/LC_* – localization settings
LD_LIBRARY_PATH – dynamic library search path
Application-specific settings (DATABASE_URL, API_KEY, etc.)

Sometimes you need to:

Sanitize the environment for security (remove dangerous variables)
Augment the environment (add variables for child process)
Replace the environment entirely (controlled sandbox)
Pass configuration or credentials to a child process without exposing them on its command line

environment_control.c
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
 
// The global environ variable holds current environment
extern char **environ;
 
int main() {
    // Non-'e' variant: child inherits parent's environment
    execl("/usr/bin/env", "env", NULL);
    // Child sees all of parent's environment variables
    
    // 'e' variant: specify environment explicitly
    char *custom_env[] = {
        "PATH=/usr/bin:/bin",
        "HOME=/tmp",
        "MY_APP_MODE=production",
        "DATABASE_URL=postgres://localhost/mydb",
        NULL  // NULL terminator required
    };
    
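    // Reached only if the execl above failed. The (char *)NULL cast matters
    // in variadic calls, where a bare NULL may be passed as an int.
    execle("/usr/bin/env", "env", (char *)NULL, custom_env);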
    // Child sees ONLY the four variables we specified
    // Parent's other variables are NOT inherited
    
    perror("exec failed");
    return 1;
}
 
// Security: creating a clean environment
void secure_exec(char *const argv[]) {
    // For setuid programs or security-sensitive operations,
    // never inherit the untrusted environment
    
    char *safe_env[] = {
        "PATH=/usr/bin:/bin",  // Known-safe PATH
        "IFS= \t\n",           // Safe IFS
        "HOME=/tmp",           // Neutral home
        NULL
    };
    
    // Execute with controlled environment only
    execve("/path/to/secure/program", argv, safe_env);
}
 
// Augmenting the environment
void add_to_environment(char *const argv[]) {
    // Sometimes you want parent's environment PLUS some additions
    // This requires constructing a new array
    
    int env_count = 0;
    for (char **e = environ; *e != NULL; e++) {
        env_count++;
    }
    
    // Create new array: original + 2 new + NULL terminator
    char *new_env[env_count + 3];
    
    // Copy existing environment
    for (int i = 0; i < env_count; i++) {
        new_env[i] = environ[i];
    }
    
    // Add new variables
    new_env[env_count] = "MY_NEW_VAR=value";
    new_env[env_count + 1] = "ANOTHER_VAR=other";
    new_env[env_count + 2] = NULL;
    
    execve("/path/to/program", argv, new_env);
}

Environment Array Format

The environment array (envp) follows the same format as argv: a NULL-terminated array of strings. Each string has the form 'NAME=value', with no spaces around the equals sign.

Environment Inheritance in exec() Variants
Variant	Environment Behavior	Use Case
`execl`, `execv`, `execlp`, `execvp`	Inherits parent's `environ` automatically	Normal program execution
`execle`, `execve`, `execvpe`	Uses explicitly provided `envp` array	Security, sandboxing, configuration

The Core System Call: execve()

execve() is the only actual system call in the exec() family. All other variants are C library functions that eventually call execve(). Understanding execve() means understanding the fundamental interface to the kernel.

execve_signature.c
#include <unistd.h>
 
// The actual system call signature
int execve(const char *pathname,  // Path to executable (absolute or relative; no PATH search)
           char *const argv[],     // Argument vector (NULL-terminated)
           char *const envp[]);    // Environment vector (NULL-terminated)
 
// Returns: -1 on error (only return case), sets errno
// On success: does not return (process image replaced)

What happens inside the kernel when execve() is called:

Pathname resolution: The kernel resolves the pathname to an inode
Permission checks:
- Is the file executable?
- Does the calling process have execute permission?
- Is the filesystem mounted with noexec?
File format recognition:
- Is it an ELF binary? → Load using ELF loader
- Is it a script with #! (shebang)? → Invoke the interpreter
- Otherwise, try any other registered binary formats (a.out historically, binfmt_misc handlers on Linux)
Memory space preparation:
- Allocate new memory regions
- Release old text, data, heap, stack segments
- Load new program's segments
Security transitions:
- Apply setuid/setgid if applicable
- Clear capabilities as appropriate
- Reset caught signals to their default disposition (ignored signals stay ignored)
Stack setup:
- Copy argv strings to new stack
- Copy envp strings to new stack
- Set up initial stack frame
Jump to entry point:
- Transfer control to the new program's _start symbol
- _start initializes runtime, then calls main(argc, argv, environ)

The Point of No Return

Once execve() begins modifying the process memory, there's no undo. The old program is gone. If something fails during the loading process, the kernel must terminate the process rather than restore it, because there's nothing left to restore. From the caller's point of view, exec() therefore looks all-or-nothing: either it returns -1 with the original program still intact (the failure was detected before the point of no return), or it never returns at all.

Complete Variant Reference

Let's examine execve(), the variant that all the others reduce to, with its complete signature, behavior, and typical use case.

int execve(const char *pathname, char *const argv[],
           char *const envp[]);
 
// Example:
char *args[] = {"ls", "-l", NULL};
char *env[] = {"PATH=/usr/bin", "HOME=/tmp", NULL};
execve("/bin/ls", args, env);
 
// - pathname: Full path to executable
// - argv: NULL-terminated argument array
// - envp: NULL-terminated environment array
 
// THE ONLY ACTUAL SYSTEM CALL
// All other exec functions ultimately call this

execve() is the fundamental system call that all others reduce to. If you need maximum control or are writing systems code, this is the variant to use. It requires both explicit argument vector and explicit environment.

Common Errors and Pitfalls

The exec() functions have several subtle error conditions and common mistakes that can derail even experienced programmers.

Common exec() Errors (errno values)
errno	Meaning	Common Cause
`ENOENT`	No such file or directory	Wrong path, file doesn't exist, or command not found in PATH
`EACCES`	Permission denied	File not executable, or directory in path not searchable
`ENOEXEC`	Exec format error	File is not a valid executable (wrong architecture, corrupt binary)
`E2BIG`	Argument list too long	argv + envp exceeds system limit (typically about 2 MB on Linux, a quarter of the default 8 MB stack limit)
`ENOMEM`	Out of memory	Insufficient memory to create new process image
`ETXTBSY`	Text file busy	Executable is open for writing by another process
`EFAULT`	Bad address	argv or envp points to invalid memory
`ELOOP`	Too many symbolic links	Symlink chain is too deep or circular

common_mistakes.c
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>   // exit()
#include <errno.h>
#include <string.h>
 
// MISTAKE 1: Forgetting the NULL terminator
void mistake_no_null() {
    // WRONG: No NULL terminator - undefined behavior!
    execl("/bin/ls", "ls", "-l");  // Missing NULL
    
    // CORRECT:
    execl("/bin/ls", "ls", "-l", NULL);
}
 
// MISTAKE 2: Using wrong pathname vs filename
void mistake_path_type() {
    // WRONG with non-'p' variant:
    execl("ls", "ls", "-l", NULL);  // "ls" is taken as a relative path!
    // No PATH search: this runs ./ls if it exists, otherwise fails with ENOENT
    
    // CORRECT: Use 'p' variant for name-only
    execlp("ls", "ls", "-l", NULL);
    
    // Or specify full path for non-'p':
    execl("/bin/ls", "ls", "-l", NULL);
}
 
// MISTAKE 3: Ignoring exec failure
void mistake_no_error_check() {
    // WRONG: No error handling
    execl("/bin/nonexistent", "nonexistent", NULL);
    // If we reach here, exec failed, but we don't know why
    printf("continuing normally...\n");  // Bad!
    
    // CORRECT: Always check for failure
    execl("/bin/nonexistent", "nonexistent", NULL);
    // If we reach here, exec definitely failed
    fprintf(stderr, "exec failed: %s\n", strerror(errno));
    exit(1);  // Or handle error appropriately
}
 
// MISTAKE 4: Expecting exec to return on success
void mistake_expect_return() {
    execl("/bin/ls", "ls", NULL);
    
    // WRONG EXPECTATION: This code never runs on success!
    printf("ls finished\n");  // Never printed if exec succeeds
}
 
// MISTAKE 5: Wrong arg0 convention
void mistake_arg0() {
    // UNUSUAL (but valid): arg0 doesn't match program name
    execl("/bin/busybox", "ls", "-l", NULL);
    // This actually runs busybox, which uses arg0 to decide behavior
    // Works for busybox, but confusing for regular programs
    
    // CONVENTION: arg0 should be the program name
    execl("/bin/ls", "ls", "-l", NULL);
}
 
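// MISTAKE 6: Modifying strings in argv (treat them as read-only)
void mistake_modify_argv() {
    char *args[] = {"prog", "arg1", NULL};
    
    // These entries point at string literals, so writing through them
    // (for example args[1][0] = 'X') is undefined behavior.
    // exec copies the strings into the new process image, so the new
    // program never sees later changes to the caller's copies anyway.
    (void)args;  // unused in this illustration
}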

The "Reached After Exec" Pattern

Any code that appears after an exec() call will only run if exec() failed. This means you should always have error handling after exec(). A common defensive pattern is to call _exit(127) after exec() fails, mimicking shell behavior for command-not-found (127 is the conventional exit code).

Choosing the Right Variant

With seven variants to choose from, how do you pick the right one? Here's a decision framework:

Quick Selection Guide

•Most flexible (default choice): execvp() — vector args + PATH search, covers most use cases
•Maximum control: execve() — the system call itself, explicit everything
•Simple known command: execl() — clean syntax for fixed commands
•Shell implementation: execvp() — dynamic args from parsed input + PATH search
•Security-sensitive: execve() — full path, explicit controlled environment
•Quick system utility: execlp() — easy PATH search with fixed args

When in Doubt

If you're unsure, start with execvp(). It handles the most common case (runtime args + PATH search) and you can always switch to a more specialized variant if needed. Only use execve() when you specifically need environment control or are writing security-sensitive code.

Summary: The exec() Family

We've thoroughly explored the exec() family of functions. Let's consolidate the key concepts:

Key Takeaways

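•exec() replaces the process image: the calling program's code and data are completely replaced, while the PID and other kernel-held attributes (open file descriptors, working directory, and so on) survive.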
•exec() never returns on success — If you see code after exec(), it only runs on failure.
•execve() is the only system call — All six other variants are library wrappers that ultimately call execve().
•'l' vs 'v': List (compile-time args) vs Vector (runtime args array).
•'p' enables PATH search — Finds executables like a shell does; introduces security considerations.
•'e' provides explicit environment — Control what environment the new program sees.
•Always include NULL terminators — Both argument lists and arrays must be NULL-terminated.
•Always check for failure — Handle errors after every exec() call.

What's next:

Now that you understand what exec() does and which variant to use, the next page dives deep into how exec() works internally. We'll explore the complete process of replacing a process image—from releasing old memory regions to loading new program segments to setting up the initial stack. Understanding this mechanism is essential for debugging exec() issues and appreciating the elegance of Unix process creation.

Page Complete

You now understand the complete exec() family: seven variants, their naming conventions, semantic differences, and appropriate use cases. You can decode any exec variant instantly and choose the right one for any situation. Next, we'll explore how process image replacement actually works at the system level.

exec() Variants

The Other Half of Process Creation

What You Will Learn

The exec() Concept

Before diving into variants, let's establish what exec() fundamentally does. The exec() system call performs a complete process image replacement. When a process calls exec():

The current process's text (code) segment is replaced with the new program's code
The data segment is replaced with the new program's initialized data
The BSS segment is replaced with the new program's uninitialized data
The heap is discarded and replaced with the new program's heap
The stack is discarded and replaced with a fresh stack for the new program

What remains unchanged:

The Process ID (PID)
The Parent Process ID (PPID)
Open file descriptors (unless marked close-on-exec)
Current working directory
Root directory
Process group ID and session ID
Real user ID and real group ID
Pending signals
Resource limits
Controlling terminal

The Key Insight

exec_basic_concept.c
1
2
3
4
5
6
7
8
9
10
11
12
13
14
#include <stdio.h>
#include <unistd.h>
 
int main() {
    printf("Before exec: PID = %d\n", getpid());
    
    // This replaces the entire process image with /bin/ls
    // If exec succeeds, the following line NEVER executes
    execl("/bin/ls", "ls", "-l", NULL);
    
    // Only reached if exec fails
    perror("exec failed");
    return 1;
}

This "no return on success" semantic is unlike almost any other function in C. It's a one-way door: once you exec(), there's no coming back to your original code.

The Naming Convention

The exec() family follows a systematic naming convention that encodes the function's behavior in its name. Once you understand this convention, you can decode any exec variant instantly.

Every variant starts with exec followed by one or more suffix letters:

Suffix	Meaning	Effect
`l`	list	Arguments passed as a list (varargs)
`v`	vector	Arguments passed as an array (char *argv[])
`e`	environment	Environment passed explicitly as parameter
`p`	path	Uses PATH environment variable to find executable

These suffixes combine to form the complete set of exec variants:

The Complete exec() Family
Function	Arguments	Environment	Path Search	Signature
`execl`	list	inherited	no	`execl(path, arg0, arg1, ..., NULL)`
`execv`	vector	inherited	no	`execv(path, argv[])`
`execle`	list	explicit	no	`execle(path, arg0, ..., NULL, envp[])`
`execve`	vector	explicit	no	`execve(path, argv[], envp[])`
`execlp`	list	inherited	yes	`execlp(file, arg0, arg1, ..., NULL)`
`execvp`	vector	inherited	yes	`execvp(file, argv[])`
`execvpe`	vector	explicit	yes	`execvpe(file, argv[], envp[])`

The Fundamental System Call

Converting Mermaid diagram...

List vs Vector: The 'l' and 'v' Variants

List Variants (execl, execlp, execle)

•Arguments passed as separate parameters
•Use C's variadic function mechanism
•Must end with NULL sentinel
•Number of arguments known at compile time
•Convenient for fixed-argument commands
•Easier to read for simple cases

Vector Variants (execv, execvp, execvpe)

•Arguments passed as array of strings
•Array must be NULL-terminated
•Number of arguments determined at runtime
•Essential for dynamic argument construction
•Used when forwarding command-line args
•More flexible for complex scenarios

list_vs_vector.c
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
#include <unistd.h>
#include <stdio.h>
 
void using_list_variant() {
    // Known at compile time: exactly 3 arguments
    // execl(path, arg0, arg1, arg2, NULL)
    execl("/bin/ls", "ls", "-l", "-a", NULL);
    // arg0 is conventionally the program name
}
 
void using_vector_variant(int argc, char *argv[]) {
    // Runtime-determined arguments
    // Useful when forwarding arguments from another program
    
    // Build argument vector dynamically
    char *my_args[argc + 1];  // +1 for NULL terminator
    
    my_args[0] = "myprogram";  // arg0 = program name
    for (int i = 1; i < argc; i++) {
        my_args[i] = argv[i];  // Copy arguments
    }
    my_args[argc] = NULL;  // NULL terminator required
    
    execv("/path/to/myprogram", my_args);
}
 
void shell_command_example() {
    // Real-world example: building a grep command dynamically
    char *pattern = "error";  // Could come from user input
    char *file = "/var/log/syslog";  // Could be dynamic
    
    // With list variant - awkward for dynamic arguments:
    execl("/bin/grep", "grep", pattern, file, NULL);
    
    // With vector variant - natural for dynamic arguments:
    char *grep_args[] = {"grep", pattern, file, NULL};
    execv("/bin/grep", grep_args);
}

The NULL Terminator is Critical

Practical guidance:

Use list variants when executing a fixed command with known arguments (like running ls -l)
Use vector variants when:
- Arguments are constructed dynamically
- You're forwarding arguments from main(argc, argv)
- The argument count is determined at runtime
- You're building a shell or command executor

Path Search: The 'p' Variants

The 'p' suffix variants (execlp, execvp, execvpe) add a crucial capability: PATH environment variable searching. This is how your shell finds commands without requiring full paths.

How path search works:

If the filename contains a slash (/), it's treated as a path (no search performed)
Otherwise, the directories in PATH are searched in order
The first executable match is used
If no match is found, exec fails with ENOENT

path_search_example.c
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
 
int main() {
    // WITHOUT path search - must specify full path
    execl("/usr/bin/python3", "python3", "--version", NULL);
    
    // Error: if python3 isn't at exactly /usr/bin/python3, this fails
    
    // WITH path search - searches PATH automatically
    execlp("python3", "python3", "--version", NULL);
    
    // This checks each directory in PATH:
    // 1. /usr/local/bin/python3?  No
    // 2. /usr/bin/python3?        Yes! Execute it.
    
    perror("exec failed");
    return 1;
}
 
// Demonstrating the search behavior explicitly
void show_path_search() {
    // Assume PATH=/usr/local/bin:/usr/bin:/bin
    
    // execlp("ls", ...) searches:
    //   /usr/local/bin/ls - not found
    //   /usr/bin/ls       - not found  
    //   /bin/ls           - FOUND! Execute.
    
    // execlp("/bin/ls", ...) does NOT search
    //   because the filename contains '/'
    //   It tries /bin/ls directly (same as execl)
}

Use 'p' Variants When

•Executing standard system commands
•Building shell-like interfaces
•User specifies command by name
•Portability across different systems
•Command location may vary

Avoid 'p' Variants When

•Security-sensitive operations
•You know the exact executable path
•PATH manipulation is a concern
•Running your own application binaries
•Reproducibility is critical

Security Warning: PATH Attacks

The conffile vs. PATH distinction:

Notice that 'p' variants take a file parameter while others take a path parameter. This isn't just terminology:

path = a pathname (absolute or relative), used as-is
file = a filename that may be searched for in PATH

If file contains no slashes, PATH is searched. If it contains a slash (like ./myprogram or /usr/bin/python), it's used directly as a path.

Environment Control: The 'e' Variants

Why control the environment?

Environment variables configure program behavior in foundational ways:

PATH – where to find executables
HOME – user's home directory
USER – current username
LANG/LC_* – localization settings
LD_LIBRARY_PATH – dynamic library search path
Application-specific settings (DATABASE_URL, API_KEY, etc.)

Sometimes you need to:

Sanitize the environment for security (remove dangerous variables)
Augment the environment (add variables for child process)
Replace the environment entirely (controlled sandbox)
Pass secrets securely to child processes

environment_control.c
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
 
// The global environ variable holds current environment
extern char **environ;
 
int main() {
    // Non-'e' variant: child inherits parent's environment
    execl("/usr/bin/env", "env", NULL);
    // Child sees all of parent's environment variables
    
    // 'e' variant: specify environment explicitly
    char *custom_env[] = {
        "PATH=/usr/bin:/bin",
        "HOME=/tmp",
        "MY_APP_MODE=production",
        "DATABASE_URL=postgres://localhost/mydb",
        NULL  // NULL terminator required
    };
    
    execle("/usr/bin/env", "env", NULL, custom_env);
    // Child sees ONLY the four variables we specified
    // Parent's other variables are NOT inherited
    
    perror("exec failed");
    return 1;
}
 
// Security: creating a clean environment
void secure_exec() {
    // For setuid programs or security-sensitive operations,
    // never inherit the untrusted environment
    
    char *safe_env[] = {
        "PATH=/usr/bin:/bin",  // Known-safe PATH
        "IFS= \t\n",           // Safe IFS
        "HOME=/tmp",           // Neutral home
        NULL
    };
    
    // Execute with controlled environment only
    execve("/path/to/secure/program", argv, safe_env);
}
 
// Augmenting the environment
void add_to_environment() {
    // Sometimes you want parent's environment PLUS some additions
    // This requires constructing a new array
    
    int env_count = 0;
    for (char **e = environ; *e != NULL; e++) {
        env_count++;
    }
    
    // Create new array: original + 2 new + NULL terminator
    char *new_env[env_count + 3];
    
    // Copy existing environment
    for (int i = 0; i < env_count; i++) {
        new_env[i] = environ[i];
    }
    
    // Add new variables
    new_env[env_count] = "MY_NEW_VAR=value";
    new_env[env_count + 1] = "ANOTHER_VAR=other";
    new_env[env_count + 2] = NULL;
    
    execve("/path/to/program", argv, new_env);
}
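
The listing above covers replacing and augmenting the environment; the remaining case from the list, sanitizing, can be sketched as a filter over an existing array. The helper below is hypothetical (the names `filter_env` and `env_entry_has_name` are not part of any library) and uses a denylist purely for illustration. For security-sensitive code, building the environment from an allowlist, as `secure_exec()` does, is generally the more conservative choice.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if `entry` ("NAME=value") defines the variable `name`. */
static int env_entry_has_name(const char *entry, const char *name) {
    size_t len = strlen(name);
    return strncmp(entry, name, len) == 0 && entry[len] == '=';
}

/*
 * Hypothetical helper: copies into `out` every pointer from `src` whose
 * variable name is not listed in `deny`, then NULL-terminates `out`.
 * `src` and `deny` are NULL-terminated arrays. `out_cap` is the capacity
 * of `out`, counting the slot needed for the terminator.
 * Returns the number of entries kept, or (size_t)-1 if `out` is too small.
 */
size_t filter_env(char *const src[], const char *const deny[],
                  char *out[], size_t out_cap) {
    size_t kept = 0;

    if (out_cap == 0)
        return (size_t)-1;

    for (size_t i = 0; src[i] != NULL; i++) {
        int denied = 0;
        for (size_t j = 0; deny[j] != NULL; j++) {
            if (env_entry_has_name(src[i], deny[j])) {
                denied = 1;
                break;
            }
        }
        if (denied)
            continue;
        if (kept + 1 >= out_cap)   /* need room for entry + terminator */
            return (size_t)-1;
        out[kept++] = src[i];      /* share the string, do not copy it */
    }
    out[kept] = NULL;
    return kept;
}
```

The resulting array can be passed as the `envp` argument of `execve()` or `execle()`. Only pointers are copied, so the strings must stay alive until the exec call.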

Environment Array Format

Environment Inheritance in exec() Variants
Variant	Environment Behavior	Use Case
`execl`, `execv`, `execlp`, `execvp`	Inherits the calling process's `environ` automatically	Normal program execution
`execle`, `execve`, `execvpe` (GNU extension)	Uses explicitly provided `envp` array	Security, sandboxing, configuration
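
Because the non-'e' variants pass along whatever `environ` holds at the moment of the call, one common way to add a variable is to call `setenv()` in the child between `fork()` and `exec`. The sketch below assumes a single-threaded caller (`setenv()` is not async-signal-safe, which matters after `fork()` in threaded programs); `run_with_extra_var` is a hypothetical name.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: runs `file` (searched in PATH) with `argv`,
 * adding NAME=value to the child's environment only.
 * Returns the child's exit status, or -1 on fork/wait failure or if
 * the child did not exit normally.
 */
int run_with_extra_var(const char *name, const char *value,
                       const char *file, char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: modify our own copy of the environment, then exec. */
        if (setenv(name, value, 1) != 0)
            _exit(126);
        execvp(file, argv);   /* inherits the modified environ */
        _exit(127);           /* reached only if execvp failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The parent's environment is untouched, since the change happens after the fork in the child's copy of the process.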

The Core System Call: execve()

execve_signature.c
#include <unistd.h>
 
// The actual system call signature
int execve(const char *pathname,  // Path to executable (PATH is not searched)
           char *const argv[],     // Argument vector (NULL-terminated)
           char *const envp[]);    // Environment vector (NULL-terminated)
 
// Returns: -1 on error (only return case), sets errno
// On success: does not return (process image replaced)
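
To make the return convention concrete, here is a minimal sketch of calling `execve()` from a forked child and collecting the result in the parent. The helper name `run_with_env` and the exit code 127 for an exec failure are illustrative choices, not part of the `execve()` interface.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: runs `pathname` with the given argv and envp in
 * a child process. Returns the child's exit status, or -1 on fork/wait
 * failure or if the child did not exit normally.
 */
int run_with_env(const char *pathname, char *const argv[],
                 char *const envp[]) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execve(pathname, argv, envp);
        /* execve returned, so it failed: errno says why. */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For instance, calling it with `"/bin/sh"`, an argv of `{"sh", "-c", "exit 7", NULL}`, and an envp holding only `"MY_APP_MODE=production"` should yield 7, while a nonexistent pathname yields the sketch's failure code 127.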

What happens inside the kernel when execve() is called:

1. Pathname resolution: The kernel resolves the pathname to an inode.
2. Permission checks:
- Is the file executable?
- Does the calling process have execute permission?
- Is the filesystem mounted with noexec?
3. File format recognition:
- Is it an ELF binary? → Load using the ELF loader
- Is it a script with #! (shebang)? → Invoke the interpreter named on the first line
- Otherwise, try the other supported formats (a.out, etc.)
4. Memory space preparation:
- Release the old text, data, heap, and stack segments
- Allocate new memory regions
- Load the new program's segments
5. Security transitions:
- Apply setuid/setgid if applicable
- Clear capabilities as appropriate
- Reset signal dispositions to default for caught signals (ignored signals stay ignored)
6. Stack setup:
- Copy argv strings to the new stack
- Copy envp strings to the new stack
- Set up the initial stack frame
7. Jump to entry point:
- Transfer control to the new program's entry point, conventionally the _start symbol (for a dynamically linked ELF binary, the dynamic linker runs first)
- _start initializes the runtime, then calls main(argc, argv, environ)
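
The file format recognition step can be imitated from user space by inspecting a file's leading bytes. This is only an illustration of the idea, not the kernel's code: the enum and the function `classify_executable` are hypothetical, and the sketch checks just the two formats named above.

```c
#include <stdio.h>
#include <string.h>

enum exe_kind { EXE_UNKNOWN, EXE_ELF, EXE_SCRIPT, EXE_UNREADABLE };

/*
 * Hypothetical helper: guesses how execve() would treat `path` by
 * reading its first bytes. ELF files start with 0x7f 'E' 'L' 'F';
 * interpreter scripts start with "#!".
 */
enum exe_kind classify_executable(const char *path) {
    unsigned char magic[4];
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return EXE_UNREADABLE;

    size_t n = fread(magic, 1, sizeof magic, f);
    fclose(f);

    if (n >= 4 && memcmp(magic, "\x7f" "ELF", 4) == 0)
        return EXE_ELF;
    if (n >= 2 && magic[0] == '#' && magic[1] == '!')
        return EXE_SCRIPT;
    return EXE_UNKNOWN;
}
```

A file that falls into `EXE_UNKNOWN` here is the kind of input for which `execve()` would typically fail with `ENOEXEC`.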


The Point of No Return

Complete Variant Reference

Let's examine each exec() variant with its complete signature, behavior, and typical use case.

int execve(const char *pathname, char *const argv[],
           char *const envp[]);
 
// Example:
char *args[] = {"ls", "-l", NULL};
char *env[] = {"PATH=/usr/bin", "HOME=/tmp", NULL};
execve("/bin/ls", args, env);
 
// - pathname: Path to the executable (PATH is not searched)
// - argv: NULL-terminated argument array
// - envp: NULL-terminated environment array
 
// THE ONLY ACTUAL SYSTEM CALL
// All other exec functions ultimately call this
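
To illustrate what "ultimately call this" means, the following sketch shows how an `execl()`-style wrapper could be layered on `execve()`. It is a simplified, hypothetical re-implementation (`my_execl` and `demo_my_execl` are made-up names), not the actual libc source: it gathers the variadic arguments into an array and forwards the caller's `environ`.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical execl()-like wrapper built on execve(). */
int my_execl(const char *pathname, const char *arg0, ...) {
    va_list ap, count_ap;
    size_t argc = 0;

    va_start(ap, arg0);
    if (arg0 != NULL) {
        /* First pass: count arguments up to the NULL terminator. */
        argc = 1;
        va_copy(count_ap, ap);
        while (va_arg(count_ap, const char *) != NULL)
            argc++;
        va_end(count_ap);
    }

    char **argv = malloc((argc + 1) * sizeof *argv);
    if (argv == NULL) {
        va_end(ap);
        return -1;
    }

    /* Second pass: copy the pointers into a NULL-terminated vector. */
    if (argc > 0)
        argv[0] = (char *)arg0;
    for (size_t i = 1; i < argc; i++)
        argv[i] = va_arg(ap, char *);
    argv[argc] = NULL;
    va_end(ap);

    execve(pathname, argv, environ);

    /* Only reached on failure; keep errno intact across free(). */
    int saved = errno;
    free(argv);
    errno = saved;
    return -1;
}

/* Runs "sh -c 'exit 3'" through my_execl in a child; returns its status. */
int demo_my_execl(void) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        my_execl("/bin/sh", "sh", "-c", "exit 3", (char *)NULL);
        _exit(127);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The two-pass walk over the argument list is why the terminating null pointer is mandatory: without it, the counting loop has no way to know where the list ends.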

Common Errors and Pitfalls

The exec() functions have several subtle error conditions and common mistakes that can derail even experienced programmers.

Common exec() Errors (errno values)
errno	Meaning	Common Cause
`ENOENT`	No such file or directory	Wrong path, file doesn't exist, or command not found in PATH
`EACCES`	Permission denied	File not executable, or directory in path not searchable
`ENOEXEC`	Exec format error	File is not a valid executable (wrong architecture, corrupt binary)
`E2BIG`	Argument list too long	Combined size of argv and envp exceeds the system limit (`ARG_MAX`; typically 2 MB on Linux)
`ENOMEM`	Out of memory	Insufficient memory to create new process image
`ETXTBSY`	Text file busy	Executable is open for writing by another process
`EFAULT`	Bad address	argv or envp points to invalid memory
`ELOOP`	Too many symbolic links	Symlink chain is too deep or circular
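
One way to act on these errno values is to translate them into an exit status the parent can inspect. The sketch below borrows the POSIX shell convention (127 for "command not found", 126 for "found but could not be executed"); the function names are hypothetical, and a real program may prefer a different mapping.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Maps an exec errno to a shell-style exit status (an assumed policy). */
int exec_failure_status(int err) {
    return (err == ENOENT || err == ENOTDIR) ? 127 : 126;
}

/*
 * Hypothetical helper: runs argv[0] (searched in PATH) in a child and
 * returns its exit status, or -1 on fork/wait failure or abnormal exit.
 */
int run_command(char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);
        /* Still here, so exec failed. Save errno before any other call. */
        int err = errno;
        fprintf(stderr, "%s: %s\n", argv[0], strerror(err));
        _exit(exec_failure_status(err));  /* _exit: skip parent's atexit/stdio flush */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Using `_exit()` rather than `exit()` in the failed child avoids flushing stdio buffers that were duplicated from the parent by `fork()`.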

common_mistakes.c
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>   // exit()
#include <errno.h>
#include <string.h>
 
// MISTAKE 1: Forgetting the NULL terminator
void mistake_no_null() {
    // WRONG: No NULL terminator - undefined behavior!
    execl("/bin/ls", "ls", "-l");  // Missing NULL
    
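    // CORRECT: end the list with a null pointer. The cast matters in a
    // variadic call, where a bare NULL may be passed as a plain int 0.
    execl("/bin/ls", "ls", "-l", (char *)NULL);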
}
 
// MISTAKE 2: Using wrong pathname vs filename
void mistake_path_type() {
    // WRONG with non-'p' variant:
    execl("ls", "ls", "-l", NULL);  // PATH is not searched!
    // "ls" is resolved relative to the current directory (like "./ls"),
    // so this fails with ENOENT unless ./ls happens to exist
    
    // CORRECT: Use 'p' variant for name-only
    execlp("ls", "ls", "-l", NULL);
    
    // Or specify full path for non-'p':
    execl("/bin/ls", "ls", "-l", NULL);
}
 
// MISTAKE 3: Ignoring exec failure
void mistake_no_error_check() {
    // WRONG: No error handling
    execl("/bin/nonexistent", "nonexistent", NULL);
    // If we reach here, exec failed, but we don't know why
    printf("continuing normally...\n");  // Bad!
    
    // CORRECT: Always check for failure
    execl("/bin/nonexistent", "nonexistent", NULL);
    // If we reach here, exec definitely failed
    fprintf(stderr, "exec failed: %s\n", strerror(errno));
    exit(1);  // Or handle error appropriately
}
 
// MISTAKE 4: Expecting exec to return on success
void mistake_expect_return() {
    execl("/bin/ls", "ls", NULL);
    
    // WRONG EXPECTATION: This code never runs on success!
    printf("ls finished\n");  // Never printed if exec succeeds
}
 
// MISTAKE 5: Wrong arg0 convention
void mistake_arg0() {
    // UNUSUAL (but valid): arg0 doesn't match program name
    execl("/bin/busybox", "ls", "-l", NULL);
    // This actually runs busybox, which uses arg0 to decide behavior
    // Works for busybox, but confusing for regular programs
    
    // CONVENTION: arg0 should be the program name
    execl("/bin/ls", "ls", "-l", NULL);
}
 
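// MISTAKE 6: Treating argv strings as writable, shared storage
void mistake_modify_argv() {
    char *args[] = {"prog", "arg1", NULL};
    (void)args;  // Declared for illustration only
    
    // These elements point at string literals, so writing through them
    // (e.g. args[1][0] = 'X') is undefined behavior.
    // exec also copies the strings into the new process image, so the new
    // program gets its own argv. Some programs overwrite that copy to
    // change what ps displays; this never affects the caller's strings.
}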

The "Reached After Exec" Pattern

Choosing the Right Variant

With seven variants to choose from, how do you pick the right one? Here's a decision framework:


Quick Selection Guide

•Most flexible (default choice): execvp() — vector args + PATH search, covers most use cases
•Maximum control: execve() — the system call itself, explicit everything
•Simple known command: execl() — clean syntax for fixed commands
•Shell implementation: execvp() — dynamic args from parsed input + PATH search
•Security-sensitive: execve() — full path, explicit controlled environment
•Quick system utility: execlp() — easy PATH search with fixed args
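
The "shell implementation" case is where the vector form pays off: the argument count is only known after parsing input. Below is a minimal sketch of that parsing step, assuming whitespace-separated words with no quoting or escaping; `split_args` is a hypothetical helper, and input that does not fit the array is silently truncated.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: splits `line` in place on spaces, tabs, and
 * newlines, storing pointers to the words in `argv` and terminating
 * the vector with NULL. `cap` is the capacity of `argv`, counting the
 * terminator slot. Returns the number of words stored.
 */
size_t split_args(char *line, char *argv[], size_t cap) {
    size_t argc = 0;

    if (cap == 0)
        return 0;

    for (char *tok = strtok(line, " \t\n");
         tok != NULL && argc + 1 < cap;
         tok = strtok(NULL, " \t\n")) {
        argv[argc++] = tok;
    }
    argv[argc] = NULL;   /* exec requires a NULL-terminated vector */
    return argc;
}
```

In a forked child, a non-empty result can go straight to `execvp(argv[0], argv)`, which supplies the PATH search a shell user expects.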

When in Doubt

Summary: The exec() Family

We've thoroughly explored the exec() family of functions. Let's consolidate the key concepts:

Key Takeaways

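•exec() replaces the process image: the calling program's code, data, heap, and stack are replaced, while the PID, open file descriptors (unless marked close-on-exec), working directory, and other kernel-held attributes survive.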
•exec() never returns on success — If you see code after exec(), it only runs on failure.
•execve() is the only system call — All six other variants are library wrappers that ultimately call execve().
•'l' vs 'v': List (compile-time args) vs Vector (runtime args array).
•'p' enables PATH search — Finds executables like a shell does; introduces security considerations.
•'e' provides explicit environment — Control what environment the new program sees.
•Always include NULL terminators — Both argument lists and arrays must be NULL-terminated.
•Always check for failure — Handle errors after every exec() call.

What's next:
