When you type ls -la /home in a terminal, three pieces of information flow from the shell to the ls program: the command name itself (ls), and two arguments (-la and /home). This seemingly simple data transfer involves a careful handoff between the calling process, the kernel, and the new program.
The mechanism is argv—the argument vector—and it has been the universal method for command-line argument passing in Unix systems since the 1970s. Understanding exactly how argv works is essential for writing robust programs that correctly handle command-line input, building tools that invoke other programs, and debugging mysterious argument-related bugs.
By the end of this page, you will understand the complete argv lifecycle: how arguments are passed to exec(), how the kernel stores them, how the new program receives them, system limits on argument size, and best practices for both constructing and parsing arguments.
In C, the standard function signature for main() reveals the argv structure:
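```c
int main(int argc, char *argv[]);
```
Let's break this down:
- argc (argument count) is the number of entries in argv, including the program name.
- argv (argument vector) is an array of pointers to null-terminated strings, one per argument.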
The array layout follows strict conventions:
| Index | Contents | Description |
|---|---|---|
| argv[0] | Program name/path | By convention, the name used to invoke the program |
| argv[1] | First argument | First user-supplied argument |
| argv[2] | Second argument | Second user-supplied argument |
| ... | ... | Additional arguments |
| argv[argc-1] | Last argument | Final user-supplied argument |
| argv[argc] | NULL | Mandatory NULL terminator (equal to (char *)0) |
```c
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[]) {
    printf("====== Argument Analysis ======\n");
    printf("argc = %d\n\n", argc);

    // Iterate through all arguments
    printf("Arguments (via index):\n");
    for (int i = 0; i < argc; i++) {
        printf("  argv[%d] = \"%s\" (addr: %p)\n", i, argv[i], (void*)argv[i]);
    }

    // Verify NULL terminator
    printf("\nNULL check:\n");
    printf("  argv[%d] = %p (should be NULL)\n", argc, (void*)argv[argc]);

    // Alternative iteration using NULL terminator
    printf("\nArguments (via pointer iteration):\n");
    for (char **p = argv; *p != NULL; p++) {
        printf("  %s\n", *p);
    }

    // Memory layout observation
    printf("\nMemory layout of argument strings:\n");
    for (int i = 0; i < argc; i++) {
        printf("  \"%s\" starts at %p, length %zu\n",
               argv[i], (void*)argv[i], strlen(argv[i]));
    }

    return 0;
}

/* Example run: ./program hello world
====== Argument Analysis ======
argc = 3

Arguments (via index):
  argv[0] = "./program" (addr: 0x7ffd1234abcd)
  argv[1] = "hello" (addr: 0x7ffd1234abd7)
  argv[2] = "world" (addr: 0x7ffd1234abdd)

NULL check:
  argv[3] = (nil) (should be NULL)
*/
```
While argv[0] conventionally holds the program name, nothing enforces this. The calling process can set argv[0] to anything. Programs like busybox exploit this: they change behavior based on argv[0], so 'ls', 'cat', and 'grep' can all be links to the same binary. Never assume argv[0] is trustworthy or that it contains a valid path.
When you call exec() with arguments, those arguments flow through a multi-stage pipeline from your process to the kernel to the new program's stack. Let's trace this journey.
Stage 1: Caller prepares arguments
```c
// List variant: arguments as function parameters
// (the terminator is cast because execl() is variadic)
execl("/bin/ls", "ls", "-la", "/home", (char *)NULL);

// Vector variant: arguments as array
char *argv[] = {"ls", "-la", "/home", NULL};
execv("/bin/ls", argv);
```
All exec variants eventually package arguments as a NULL-terminated array of string pointers for the kernel.
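To make the packaging step concrete, here is a minimal sketch of how a list-style call could be turned into the NULL-terminated vector that the vector-style call expects. The function name `pack_args` is hypothetical and is not part of any library; real C libraries do this internally in their own way.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: collect a NULL-terminated variadic list of strings
 * into a heap-allocated, NULL-terminated vector suitable for execv().
 * Returns NULL on allocation failure. The caller frees the vector;
 * the strings themselves are not copied.
 */
char **pack_args(const char *first, ...)
{
    va_list ap;
    size_t count = 0;

    /* First pass: count the strings up to the terminating NULL. */
    va_start(ap, first);
    for (const char *s = first; s != NULL; s = va_arg(ap, const char *))
        count++;
    va_end(ap);

    char **vec = malloc((count + 1) * sizeof *vec);
    if (vec == NULL)
        return NULL;

    /* Second pass: store the pointers in call order. */
    va_start(ap, first);
    size_t i = 0;
    for (const char *s = first; s != NULL; s = va_arg(ap, const char *))
        vec[i++] = (char *)s;
    va_end(ap);

    assert(i == count);
    vec[count] = NULL;   /* the terminator execv() relies on */
    return vec;
}
```

A caller would write something like `char **v = pack_args("ls", "-la", "/home", (char *)NULL);` and then pass `v` to `execv("/bin/ls", v)`. The cast on the final NULL matters in variadic calls, where the compiler cannot convert the argument to a pointer type for you.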
Stage 2: Kernel validates and measures
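The kernel doesn't just accept any arguments. It copies each string out of the caller's memory, measures it, and fails the call with E2BIG if the total exceeds the ARG_MAX limit.
Stage 3: Kernel builds the new stack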
During process image replacement, the kernel constructs a new stack containing:
```c
/*
 * Simplified view of what the kernel builds on the new process's stack
 * (x86-64 Linux; addresses decrease going down the diagram)
 */

// After exec(), the stack pointer (%rsp) points to:
//
// Higher addresses
// ┌──────────────────────────────────────────┐
// │ "LAST_ENV=value\0"                       │ ← environment strings
// │ "PATH=/usr/bin\0"                        │
// │ "HOME=/home/user\0"                      │
// │ ...                                      │
// │ "/home\0"                                │ ← argument strings
// │ "-la\0"                                  │
// │ "ls\0"                                   │
// ├──────────────────────────────────────────┤
// │ auxv entries...                          │ ← auxiliary vector
// ├──────────────────────────────────────────┤
// │ NULL                                     │ ← end of envp
// │ pointer to "LAST_ENV=value"              │
// │ pointer to "PATH=/usr/bin"               │
// │ pointer to "HOME=/home/user"             │ ← envp[]
// ├──────────────────────────────────────────┤
// │ NULL                                     │ ← end of argv
// │ pointer to "/home"                       │
// │ pointer to "-la"                         │
// │ pointer to "ls"                          │ ← argv[]
// ├──────────────────────────────────────────┤
// │ 3 (argc)                                 │ ← %rsp points here
// └──────────────────────────────────────────┘
// Lower addresses (stack grows down)

// The C runtime (_start → __libc_start_main) then calls:
//   main(argc, argv, envp)
// where:
//   argc = *(long*)(%rsp)
//   argv = (%rsp + 8)
//   envp = &argv[argc + 1]
```
The kernel copies all argument data from the calling process to the new process's stack. After exec(), the new program has its own copy. Modifications to argv strings in the new process don't affect anything in the parent. These copy semantics keep the two processes' memory isolated from each other.
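The relationship `envp = &argv[argc + 1]` from the layout above can be written as a small function. This is a sketch under the assumption that the environment pointers directly follow argv's NULL terminator, as in the Linux layout shown; the C standard does not guarantee this, and the name `envp_after_argv` is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: given argc and argv exactly as received by main(),
 * return the start of the environment pointer array, assuming it sits
 * immediately after argv's NULL terminator (Linux initial stack layout).
 */
char **envp_after_argv(int argc, char **argv)
{
    assert(argc >= 0 && argv != NULL);
    assert(argv[argc] == NULL);      /* argv must be NULL-terminated */
    return &argv[argc + 1];
}
```

Portable programs should keep using `environ` or `getenv()`; the point here is only to show that argc, argv, and envp are adjacent pieces of one block that the kernel lays out.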
The kernel imposes limits on argument size to prevent denial-of-service attacks and ensure system stability. Understanding these limits is essential for building robust tools.
| Limit | Meaning | Typical Value (Linux) | How to Query |
|---|---|---|---|
| ARG_MAX | Maximum bytes for argv[] + envp[] combined | 2097152 bytes (2 MB) | getconf ARG_MAX |
| MAX_ARG_STRLEN | Maximum length of a single argument string | 131072 bytes (128 KB) | Kernel constant |
| MAX_ARG_STRINGS | Maximum number of strings in argv + envp | 0x7FFFFFFF | Kernel constant |
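What counts toward ARG_MAX: the bytes of every argument and environment string, including each null terminator, plus the pointers in argv[] and envp[].
On a 64-bit system with a 2 MB ARG_MAX, each pointer costs 8 bytes on top of its string, so many short arguments reach the limit sooner than their text length alone suggests.
But beware: the environment also counts! A large environment with many variables reduces the space available for arguments.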
```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>

// Query ARG_MAX programmatically
void check_limits(void) {
    long arg_max = sysconf(_SC_ARG_MAX);
    printf("ARG_MAX = %ld bytes (%.2f MB)\n", arg_max, arg_max / (1024.0 * 1024.0));

    // Calculate current environment size
    extern char **environ;
    size_t env_size = 0;
    for (char **e = environ; *e != NULL; e++) {
        env_size += strlen(*e) + 1;   // string + null terminator
        env_size += sizeof(char*);    // pointer
    }
    printf("Current environment uses ~%zu bytes\n", env_size);
    printf("Available for argv: ~%ld bytes\n", arg_max - (long)env_size);
}

// Demonstrate hitting the limit
void test_arg_limit(void) {
    // Create a very long argument
    size_t len = 150000;  // 150 KB: above MAX_ARG_STRLEN
    char *huge_arg = malloc(len);
    if (huge_arg == NULL) {
        perror("malloc");
        return;
    }
    memset(huge_arg, 'x', len - 1);
    huge_arg[len - 1] = '\0';

    char *argv[] = {"echo", huge_arg, NULL};

    execv("/bin/echo", argv);

    // If we get here, exec failed
    perror("exec failed");
    printf("errno = %d\n", errno);  // Likely E2BIG (Argument list too long)

    free(huge_arg);
}

int main(void) {
    check_limits();
    printf("\n");
    test_arg_limit();
    return 0;
}
```
You've likely seen 'Argument list too long' when running commands like rm * in a directory with many files. The shell expands the glob before calling exec(), creating too many arguments. Solutions include: using find -exec, xargs (which batches arguments), or for loops.
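The batching that xargs performs can be illustrated with a small sketch. It assumes a simple cost model (string bytes, null terminator, and one pointer per item) and a caller-chosen byte budget; the function name `next_batch` is hypothetical, and real xargs also reserves room for the command itself and the environment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: starting at items[start], return how many
 * consecutive items fit into one command line with `budget` bytes of
 * argument space. Always returns at least 1 while items remain, so an
 * oversized item is still attempted alone (and left to fail with E2BIG).
 */
size_t next_batch(char *const items[], size_t start, size_t total, size_t budget)
{
    assert(items != NULL && start <= total);
    size_t used = 0;
    size_t n = 0;

    while (start + n < total) {
        /* Cost of one argument: its bytes, the terminator, its pointer. */
        size_t cost = strlen(items[start + n]) + 1 + sizeof(char *);
        if (n > 0 && used + cost > budget)
            break;          /* this item starts the next batch */
        used += cost;
        n++;
    }
    return n;
}
```

A caller would loop, taking `next_batch(...)` items at a time and running one child process per batch until `start` reaches `total`.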
```bash
# Problem: Too many files for single command
$ rm /path/to/dir/*
-bash: /bin/rm: Argument list too long

# Solution 1: Use find -exec (one rm per file - slow but works)
$ find /path/to/dir -type f -exec rm {} \;

# Solution 2: Use find with xargs (batches efficiently)
$ find /path/to/dir -type f -print0 | xargs -0 rm

# Solution 3: Use find -delete (fastest for deletion)
$ find /path/to/dir -type f -delete

# Solution 4: Loop (portable, flexible)
$ for f in /path/to/dir/*; do rm "$f"; done

# Why xargs works: it automatically batches arguments
# to stay under ARG_MAX, calling rm multiple times if needed
$ echo {1..1000000} | xargs echo | wc -c
# Runs multiple echo commands, each within limits
```
When building argument arrays for exec(), several subtle issues can cause bugs. Let's examine correct patterns and common mistakes.
```c
#define _GNU_SOURCE   // for asprintf()
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// CORRECT: Static array with known arguments
void correct_static() {
    char *argv[] = {
        "grep",      // argv[0]: program name
        "-i",        // case insensitive
        "-n",        // show line numbers
        "pattern",   // search pattern
        "file.txt",  // file to search
        NULL         // REQUIRED terminator
    };
    execv("/bin/grep", argv);
}

// CORRECT: Dynamic array construction
void correct_dynamic(int file_count, char **files) {
    // Allocate: prog name + flags + files + NULL
    int argc = 1 + 2 + file_count + 1;
    char **argv = malloc(argc * sizeof(char *));
    if (argv == NULL) {
        return;
    }

    int i = 0;
    argv[i++] = "grep";
    argv[i++] = "-i";
    argv[i++] = "-n";

    for (int j = 0; j < file_count; j++) {
        argv[i++] = files[j];  // Careful: these must persist until exec
    }
    argv[i] = NULL;  // Terminate

    execv("/bin/grep", argv);

    // Only reached on failure; remember to free
    free(argv);
}

// CORRECT: Building argument strings dynamically
void correct_dynamic_strings() {
    char *argv[10];
    int i = 0;

    argv[i++] = "myprogram";

    // Dynamically build a flag argument
    int verbose_level = 3;
    char verbose_flag[20];
    snprintf(verbose_flag, sizeof(verbose_flag), "--verbose=%d", verbose_level);
    argv[i++] = verbose_flag;  // Must remain in scope until exec!

    // Use a heap allocation for dynamic lifetime
    char *port_flag = NULL;
    if (asprintf(&port_flag, "--port=%d", 8080) < 0) {  // GNU extension
        return;
    }
    argv[i++] = port_flag;  // Dynamically allocated, persists

    argv[i] = NULL;

    execv("/path/to/myprogram", argv);
    free(port_flag);  // Only on failure
}
```
Common mistakes to avoid:
```c
#include <unistd.h>
#include <stdio.h>

// MISTAKE 1: Forgetting NULL terminator
void mistake_no_null() {
    char *argv[] = {"ls", "-l"};  // Missing NULL!
    execv("/bin/ls", argv);
    // UNDEFINED BEHAVIOR: kernel reads past array bounds
}

// MISTAKE 2: Using string literal as argv[0] and modifying
void mistake_modify_literal() {
    char *argv[] = {"ls", "-l", NULL};
    argv[0][0] = 'L';  // CRASH: modifying read-only string literal
    execv("/bin/ls", argv);
}

// MISTAKE 3: Dangling pointer from inner scope
void mistake_dangling() {
    char *argv[10];
    int i = 0;

    {
        char buffer[100];
        snprintf(buffer, sizeof(buffer), "--port=%d", 8080);
        argv[i++] = buffer;  // DANGER: buffer goes out of scope!
    }  // buffer destroyed here

    argv[i++] = "other_arg";
    argv[i] = NULL;

    execv("/path/to/prog", argv);  // argv[0] points to garbage!
}

// MISTAKE 4: Not handling exec failure properly
void mistake_no_error() {
    char *argv[] = {"ls", "-l", NULL};
    execv("/bin/ls", argv);
    // If exec fails, we continue running!
    // Could lead to security issues or logic bugs
    printf("This runs if exec failed!\n");
}

// MISTAKE 5: Passing mutable static to multiple threads (if multi-threaded)
void mistake_shared_argv() {
    static char *argv[] = {"prog", NULL, NULL};
    argv[1] = "something";  // NOT thread-safe!
    execv("/path/prog", argv);
}
```
When you know all arguments at compile time and just want to run a command, use execlp() or execl() instead of building an array. They're less error-prone: execlp("grep", "grep", "-i", "pattern", "file.txt", (char *)NULL); just remember the NULL at the end!
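Mistake 4 above has a well-known remedy: run exec in a child and make the failure path exit immediately with a distinctive status. The following is a minimal sketch of that pattern; the name `run_and_wait` and the choice of 127 as the "exec failed" status (borrowed from shell convention) are illustrative assumptions, and the sketch does not retry waitpid() on EINTR.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run `path` with `argv` in a child process.
 * Returns the child's exit status, 127 if exec failed in the child,
 * or -1 if fork/wait failed or the child was killed by a signal.
 */
int run_and_wait(const char *path, char *const argv[])
{
    assert(path != NULL && argv != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execv(path, argv);
        /* Only reached if execv() failed: report and leave at once. */
        perror(path);
        _exit(127);   /* _exit() skips stdio buffers copied from the parent */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the child never falls through into the parent's code after a failed exec, the "This runs if exec failed!" situation cannot occur.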
On the receiving end, programs need to parse argv correctly. While simple for basic cases, robust argument parsing requires handling many edge cases.
```c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

// Basic manual parsing
int main(int argc, char *argv[]) {
    // Default values
    int verbose = 0;
    const char *output = "stdout";
    const char *input = NULL;

    // Parse arguments
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-v") == 0 || strcmp(argv[i], "--verbose") == 0) {
            verbose = 1;
        }
        else if (strcmp(argv[i], "-o") == 0) {
            if (i + 1 < argc) {
                output = argv[++i];  // Consume next argument as value
            } else {
                fprintf(stderr, "Error: -o requires a filename\n");
                return 1;
            }
        }
        else if (strncmp(argv[i], "--output=", 9) == 0) {
            output = argv[i] + 9;  // Skip prefix
        }
        else if (argv[i][0] == '-') {
            fprintf(stderr, "Unknown option: %s\n", argv[i]);
            return 1;
        }
        else {
            // Positional argument
            input = argv[i];
        }
    }

    // Use parsed values
    printf("verbose: %d\n", verbose);
    printf("output: %s\n", output);
    printf("input: %s\n", input ? input : "(none)");

    return 0;
}
```
Using getopt() for standard parsing:
For POSIX-compliant option parsing, use getopt() or GNU's getopt_long():
```c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <getopt.h>

void print_usage(const char *progname) {
    fprintf(stderr, "Usage: %s [options] <input_file>\n", progname);
    fprintf(stderr, "Options:\n");
    fprintf(stderr, "  -h, --help         Show this help\n");
    fprintf(stderr, "  -v, --verbose      Enable verbose output\n");
    fprintf(stderr, "  -o, --output=FILE  Write output to FILE\n");
    fprintf(stderr, "  -n, --count=NUM    Process NUM items\n");
}

int main(int argc, char *argv[]) {
    // Default values
    int verbose = 0;
    const char *output = NULL;
    int count = 10;

    // Define long options
    static struct option long_options[] = {
        {"help",    no_argument,       NULL, 'h'},
        {"verbose", no_argument,       NULL, 'v'},
        {"output",  required_argument, NULL, 'o'},
        {"count",   required_argument, NULL, 'n'},
        {NULL,      0,                 NULL, 0  }
    };

    // Parse options
    int opt;
    while ((opt = getopt_long(argc, argv, "hvo:n:", long_options, NULL)) != -1) {
        switch (opt) {
        case 'h':
            print_usage(argv[0]);
            return 0;
        case 'v':
            verbose = 1;
            break;
        case 'o':
            output = optarg;  // optarg set by getopt
            break;
        case 'n':
            count = atoi(optarg);
            if (count <= 0) {
                fprintf(stderr, "Error: count must be positive\n");
                return 1;
            }
            break;
        default:  // '?' for unknown option
            print_usage(argv[0]);
            return 1;
        }
    }

    // Remaining arguments (optind points to first non-option)
    if (optind >= argc) {
        fprintf(stderr, "Error: input file required\n");
        print_usage(argv[0]);
        return 1;
    }

    const char *input = argv[optind];

    printf("Configuration:\n");
    printf("  verbose: %d\n", verbose);
    printf("  output: %s\n", output ? output : "(stdout)");
    printf("  count: %d\n", count);
    printf("  input: %s\n", input);

    // Handle additional positional arguments
    for (int i = optind + 1; i < argc; i++) {
        printf("  extra file: %s\n", argv[i]);
    }

    return 0;
}
```
A standalone '--' argument conventionally signals the end of options. After '--', everything is treated as positional arguments even if it starts with '-'. This allows: rm -- -f (removes a file named '-f').
The getopt functions handle this automatically.
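One detail in the getopt example deserves a closer look: atoi() cannot report errors, so an input such as "12abc" is silently read as 12. A stricter alternative, sketched here with the standard strtol() function, rejects trailing characters and out-of-range values. The helper name `parse_positive_int` is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse `text` as a strictly positive decimal int.
 * Returns 0 and stores the value in *out on success; returns -1 on empty
 * input, trailing junk, overflow, or a value <= 0 (leaving *out untouched).
 */
int parse_positive_int(const char *text, int *out)
{
    assert(text != NULL && out != NULL);

    char *end;
    errno = 0;
    long value = strtol(text, &end, 10);

    if (end == text || *end != '\0')
        return -1;                      /* no digits, or trailing characters */
    if (errno == ERANGE || value > INT_MAX || value <= 0)
        return -1;                      /* out of range for a positive int */

    *out = (int)value;
    return 0;
}
```

In the option loop, the 'n' case could then call `parse_positive_int(optarg, &count)` and print the usage message when it returns -1.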
Arguments can contain any bytes except null (\0). However, shells interpret special characters, which leads to frequent confusion about quoting. Understanding the difference between shell processing and exec() is crucial.
Key insight: The shell parses your command line before calling exec(). By the time exec() runs, all quoting and expansion has already happened. exec() passes exactly the strings it receives—no interpretation.
Shell input: echo "hello world" $HOME *.txt
↓
Shell processing:
- Remove quotes → "hello world" becomes hello world
- Expand variable → $HOME becomes /home/user
- Glob expansion → *.txt becomes file1.txt file2.txt
↓
exec() sees: argv = ["echo", "hello world", "/home/user",
"file1.txt", "file2.txt", NULL]
```c
#include <unistd.h>
#include <stdio.h>

/*
 * When the SHELL runs: grep "hello world" *.txt
 * The shell:
 * 1. Removes quotes → "hello world" becomes the string: hello world (with space)
 * 2. Expands *.txt → file1.txt, file2.txt, ...
 * 3. Calls exec with the processed arguments
 *
 * When YOU call exec() directly, no shell processing happens.
 * You pass exactly what you want, including special characters.
 */

void exec_with_spaces() {
    // Passing argument with spaces - no quotes needed
    char *argv[] = {
        "grep",
        "hello world",  // String with space - works fine
        "file.txt",
        NULL
    };
    execv("/bin/grep", argv);
    // grep receives "hello world" as pattern - searches for that phrase
}

void exec_with_special_chars() {
    // All these work directly - no shell escaping needed
    char *argv[] = {
        "echo",
        "price: $100",  // $ has no special meaning to exec
        "file*.txt",    // * is literal, not glob
        "(parens)",     // Literal parentheses
        "|pipe|",       // Literal pipe character
        "a\tb\nc",      // Actual tab and newline characters
        NULL
    };
    execv("/bin/echo", argv);
    // Prints literally: price: $100 file*.txt (parens) |pipe| a<tab>b<newline>c
}

void shell_processing_needed() {
    // If you WANT shell processing, you must invoke the shell
    execlp("sh", "sh", "-c", "echo *.txt", (char *)NULL);
    // Now the shell will expand *.txt before running echo
}
```
| Character | Shell Meaning | In Direct exec() |
|---|---|---|
| *, ?, [] | Glob patterns (expanded) | Literal characters |
| $VAR | Variable expansion | Literal string "$VAR" |
| $(cmd), `cmd` | Command substitution | Literal string |
| \|, &, ; | Pipeline, background, separator | Literal characters |
| >, <, >> | Redirection | Literal characters |
| '...' | Single quotes (no expansion) | Quotes are literal! |
| "..." | Double quotes (some expansion) | Quotes are literal! |
| \x | Escape character | Literal backslash |
A frequent mistake is including shell quotes in exec() arguments. execlp("grep", "grep", "\"hello world\"", (char *)NULL) passes the literal string "hello world" (with quote characters!) to grep, not hello world. Unlike shell commands, you don't need quotes to preserve spaces in direct exec() calls.
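The reverse situation arises when a command really does have to go through `sh -c`: then every piece of data placed into the command string must be quoted for the shell. Below is a minimal sketch of POSIX single-quote escaping; the function name `shell_quote` is hypothetical, and passing data as separate exec() arguments remains the safer choice whenever it is possible.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a newly allocated copy of `s` wrapped in
 * single quotes so a POSIX shell reads it back as one literal word.
 * An embedded single quote is written as '\'' (close quote, escaped
 * quote, reopen quote). Returns NULL on allocation failure; caller frees.
 */
char *shell_quote(const char *s)
{
    assert(s != NULL);

    /* Worst case: every byte becomes 4 bytes, plus 2 quotes and the NUL. */
    size_t len = strlen(s);
    char *out = malloc(len * 4 + 3);
    if (out == NULL)
        return NULL;

    char *p = out;
    *p++ = '\'';
    for (; *s != '\0'; s++) {
        if (*s == '\'') {
            memcpy(p, "'\\''", 4);
            p += 4;
        } else {
            *p++ = *s;
        }
    }
    *p++ = '\'';
    *p = '\0';
    return out;
}
```

For example, the input `it's $HOME` comes back as `'it'\''s $HOME'`, which the shell turns into the original text without expanding the variable.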
Many programs need to pass their arguments (or a subset) to child processes. This is common in wrapper scripts, command dispatchers, and system utilities.
```c
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Pattern 1: Forward all arguments to another program
void forward_all(int argc, char *argv[]) {
    // Simply replace program name, keep rest of argv
    argv[0] = "real_program";
    execv("/path/to/real_program", argv);
    perror("exec failed");
}

// Pattern 2: Forward arguments after processing some
void forward_remaining(int argc, char *argv[], int first_to_forward) {
    // Build new argv starting from specified index
    char **new_argv = malloc((argc - first_to_forward + 2) * sizeof(char*));
    if (new_argv == NULL) {
        return;
    }

    new_argv[0] = "target_program";
    for (int i = first_to_forward; i < argc; i++) {
        new_argv[i - first_to_forward + 1] = argv[i];
    }
    new_argv[argc - first_to_forward + 1] = NULL;

    execv("/path/to/target_program", new_argv);
    free(new_argv);
}

// Pattern 3: Insert additional arguments
void add_arguments(int argc, char *argv[]) {
    // Original command: wrapper file1 file2
    // We want: real_program --verbose --log=debug file1 file2
    int extra_args = 2;
    char **new_argv = malloc((argc + extra_args + 1) * sizeof(char*));
    if (new_argv == NULL) {
        return;
    }

    new_argv[0] = "real_program";
    new_argv[1] = "--verbose";
    new_argv[2] = "--log=debug";

    // Copy original arguments (skipping argv[0])
    for (int i = 1; i < argc; i++) {
        new_argv[i + extra_args] = argv[i];
    }
    new_argv[argc + extra_args] = NULL;

    execv("/path/to/real_program", new_argv);
    free(new_argv);
}

// Pattern 4: Wrapper that processes flags then forwards
int main(int argc, char *argv[]) {
    // Process our own flags, then forward the rest
    int verbose = 0;
    int forward_start = 1;

    while (forward_start < argc && argv[forward_start][0] == '-') {
        if (strcmp(argv[forward_start], "--wrapper-verbose") == 0) {
            verbose = 1;
            forward_start++;
        } else if (strcmp(argv[forward_start], "--") == 0) {
            forward_start++;  // Skip the '--'
            break;            // Everything after is for child
        } else {
            break;            // Unknown flag - belongs to child
        }
    }

    if (verbose) {
        fprintf(stderr, "Wrapper running in verbose mode\n");
        fprintf(stderr, "Forwarding %d arguments to child\n", argc - forward_start);
    }

    // Forward remaining arguments
    forward_remaining(argc, argv, forward_start);
    return 1;  // Only reached on exec failure
}
```
For simple forwarding, exploit the fact that argv is already an array. If you want to forward argv[2] onwards: execv(program, &argv[2]); The child then sees the old argv[2] as its argv[0], so either overwrite that slot with the program name first or accept that the child's argv[0] will be a regular argument.
The argv[0] convention—storing the program name—enables a clever technique: multi-call binaries. A single executable can provide multiple commands by examining how it was invoked.
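How multi-call binaries work: the executable is installed once, and each command name is a symlink (or hard link) to it. At startup the program inspects argv[0], strips any directory part, and dispatches to the implementation registered under that name.
This technique is widely used, most famously by busybox.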
```c
#include <stdio.h>
#include <string.h>
#include <libgen.h>  // for basename()

// Individual tool implementations
int cat_main(int argc, char *argv[]) {
    printf("[cat implementation would run here]\n");
    return 0;
}

int echo_main(int argc, char *argv[]) {
    for (int i = 1; i < argc; i++) {
        printf("%s%s", argv[i], (i < argc - 1) ? " " : "\n");
    }
    return 0;
}

int ls_main(int argc, char *argv[]) {
    printf("[ls implementation would run here]\n");
    return 0;
}

// Dispatch table
struct applet {
    const char *name;
    int (*main)(int, char**);
};

static const struct applet applets[] = {
    {"cat",  cat_main},
    {"echo", echo_main},
    {"ls",   ls_main},
    {NULL,   NULL}
};

int main(int argc, char *argv[]) {
    // Get just the filename, not the path
    // "/usr/bin/cat" → "cat"
    // "./echo" → "echo"
    char *name = basename(argv[0]);

    // Handle "multicall <applet> <args>" invocation
    if (strcmp(name, "multicall") == 0 && argc > 1) {
        name = argv[1];
        argc--;
        argv++;
    }

    // Find and run the applet
    for (const struct applet *a = applets; a->name; a++) {
        if (strcmp(name, a->name) == 0) {
            return a->main(argc, argv);
        }
    }

    // Unknown applet
    fprintf(stderr, "Unknown applet: %s\n", name);
    fprintf(stderr, "Available: cat, echo, ls\n");
    return 1;
}

/*
Setup:
$ gcc -o multicall multicall.c
$ ln -s multicall cat
$ ln -s multicall echo
$ ln -s multicall ls

Usage:
$ ./cat file.txt       # Runs cat_main via symlink
$ ./echo hello world   # Runs echo_main via symlink
$ ./multicall ls -la   # Runs ls_main via direct call

All three symlinks point to the same binary, saving disk space
while providing multiple commands.
*/
```
While useful for functionality, never use argv[0] for security decisions. An attacker can set argv[0] to anything when calling exec(). To find your true executable path on Linux, read /proc/self/exe. On other systems, this requires platform-specific approaches.
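Reading /proc/self/exe, as suggested above, takes only a readlink() call. The following Linux-specific sketch shows one way to do it; the function name `self_exe_path` is hypothetical, and treating a completely filled buffer as an error is a conservative choice made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (Linux-specific): store the path of the running
 * executable in buf, as resolved by the kernel through /proc/self/exe
 * rather than taken from argv[0]. Returns 0 on success, -1 on error or
 * if the path may have been truncated.
 */
int self_exe_path(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);

    ssize_t n = readlink("/proc/self/exe", buf, size - 1);
    if (n < 0 || (size_t)n >= size - 1)
        return -1;      /* error, or the buffer was filled completely */

    buf[n] = '\0';      /* readlink() does not null-terminate */
    return 0;
}
```

A caller would typically pass a buffer of PATH_MAX bytes and fall back to a platform-specific method when the function returns -1.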
There's one character that arguments absolutely cannot contain: the null byte (\0). This is because C strings are null-terminated—the null byte marks the end of the string.
```c
#include <unistd.h>
#include <stdio.h>
#include <string.h>

void demonstrate_null_problem() {
    // This argument is truncated at the null byte
    char arg[] = "hello\0world";

    // Despite containing "hello\0world" in memory,
    // strlen(arg) returns 5, and exec sees only "hello"
    char *argv[] = {"echo", arg, NULL};
    execv("/bin/echo", argv);
    // echo receives "hello", not "hello\0world"
}

void workaround_with_encoding() {
    // If you need to pass binary data with null bytes,
    // you must encode it (base64, hex, etc.)

    // Original data: "hello\0world" (12 bytes including both nulls)
    char *encoded = "aGVsbG8Ad29ybGQA";  // base64 encoded

    char *argv[] = {"decoder_program", encoded, NULL};
    execv("/path/to/decoder_program", argv);
    // decoder_program can base64-decode to get original bytes
}

// For arbitrary binary data, file-based communication works
void pass_binary_via_file() {
    // Write binary data to temp file
    FILE *f = fopen("/tmp/binary_data", "wb");
    char data[] = {'h', 'e', 'l', 'l', 'o', '\0', 'w', 'o', 'r', 'l', 'd'};
    fwrite(data, 1, sizeof(data), f);
    fclose(f);

    // Tell the child where to find it
    char *argv[] = {"processor", "--input=/tmp/binary_data", NULL};
    execv("/path/to/processor", argv);
}

// Or use pipes/file descriptors
void pass_binary_via_pipe() {
    int pipefd[2];
    pipe(pipefd);  // Create pipe

    if (fork() == 0) {
        // Child: reads from pipe
        close(pipefd[1]);               // Close write end
        dup2(pipefd[0], STDIN_FILENO);  // Pipe → stdin
        close(pipefd[0]);

        execlp("some_program", "some_program", (char *)NULL);
        _exit(1);
    }

    // Parent: writes binary data to pipe
    close(pipefd[0]);  // Close read end
    char data[] = {'h', 'e', 'l', 'l', 'o', '\0', 'w', 'o', 'r', 'l', 'd'};
    write(pipefd[1], data, sizeof(data));
    close(pipefd[1]);  // Signal EOF

    // Wait for child... (omitted)
}
```
The find -print0 | xargs -0 pattern uses null bytes as delimiters. This works because null isn't inside the arguments (filenames), it's between them. Each filename can contain any character except null, and the null safely separates them. This is why it's the most robust way to handle filenames with special characters.
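The receiving side of that pattern can be sketched as a function that turns a null-delimited buffer into an argv-style vector. This is an illustration of the idea, not how xargs is actually implemented; the name `split_nul` is hypothetical, and a final record without a trailing null byte is simply ignored here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: split a buffer of null-delimited records (the
 * format produced by `find -print0`) into a NULL-terminated vector of
 * pointers into that buffer. Returns NULL on allocation failure; the
 * caller frees the vector but not the strings.
 */
char **split_nul(char *buf, size_t len, size_t *count)
{
    assert(buf != NULL && count != NULL);

    /* Each null byte ends exactly one record. */
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (buf[i] == '\0')
            n++;

    char **vec = malloc((n + 1) * sizeof *vec);
    if (vec == NULL)
        return NULL;

    size_t k = 0;
    char *start = buf;
    char *end;
    while (k < n &&
           (end = memchr(start, '\0', len - (size_t)(start - buf))) != NULL) {
        vec[k++] = start;       /* record begins here */
        start = end + 1;        /* next record follows the delimiter */
    }
    vec[k] = NULL;
    *count = k;
    return vec;
}
```

Because the records are never re-parsed for spaces, quotes, or newlines, a name such as `new\nline` or `-f` arrives as exactly one vector entry.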
We've explored the complete lifecycle of argument passing through exec().
What's next:
Arguments come from the command line, but programs also receive configuration through environment variables. The next page explores the environment: how it's inherited, how to pass custom environments via exec(), and the security considerations that come with environment-based configuration.
You now understand how arguments flow from exec() through the kernel to main(argc, argv), including memory layout, size limits, construction patterns, parsing approaches, and special cases. Next, we'll explore environment variables—the other key channel for configuring program behavior.