By default, wait() and waitpid() are blocking calls—the parent process is suspended until a child terminates. This simple behavior is correct for many use cases: a shell waiting for a foreground command, a build system waiting for a compiler, or a test harness waiting for test processes.
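But what happens when the parent has other work to do while its children run, such as accepting new connections or keeping a user interface responsive?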
In these scenarios, blocking is catastrophic. A web server that calls wait() stops accepting new connections. A GUI that blocks becomes frozen. This page explores the non-blocking wait mechanism and the architectural patterns that maintain responsiveness.
By the end of this page, you will understand the difference between blocking and non-blocking waits, master the WNOHANG flag and its semantics, learn polling patterns and their tradeoffs, understand signal-driven child reaping, and know how to integrate process waiting with event loops.
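When a parent calls wait() or waitpid() without the WNOHANG flag, and no children have terminated, the kernel performs a blocking operation: it puts the parent to sleep until a child terminates or a signal interrupts the call.
What "Blocked" Really Means:
A blocked process consumes no CPU time while it sleeps, but it also executes none of its own code until the kernel wakes it.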
This is both a strength and a weakness. Efficiency is high, but responsiveness is zero.
When Blocking is Appropriate:
Blocking waits are the right choice when the parent has nothing else to do until the child completes. Examples:
```c
// Shell: wait for foreground command
pid_t pid = fork();
if (pid == 0) {
    execvp(argv[0], argv);  // Child runs command (returns only on failure)
    _exit(127);             // exec failed: never fall through into shell code
}
int status;
wait(&status);    // Shell waits: nothing else to do
show_prompt();    // Only after child completes
```

```c
// Build step: sequential compilation
for (int i = 0; i < num_files; i++) {
    compile_file(files[i]);  // forks and waits internally
}
// All files compiled in order
```
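The shell fragment above can be turned into a small self-contained helper. The following is a minimal sketch, not a definitive implementation: the name `run_and_wait` and the `128 + signal` return convention are assumptions made for illustration, borrowed from common shell behavior.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its arguments and block until it ends.
 * Returns the exit code (0-255), 128 + signal number if the child was killed
 * by a signal (a shell-like convention), or -1 on fork/wait failure. */
int run_and_wait(char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        execvp(argv[0], argv);  /* returns only if the exec failed */
        _exit(127);             /* conventional "command not found" code */
    }

    int status;
    /* Blocking wait: the caller sleeps here until the child terminates.
     * Retry if a signal handler interrupted the call. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            perror("waitpid");
            return -1;
        }
    }

    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}
```

A caller would pass a NULL-terminated argument vector, for example `char *cmd[] = {"/bin/sh", "-c", "exit 7", NULL};`, and receive 7 back once the command finishes.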
When Blocking is Problematic:
Blocking fails when the parent must remain responsive:
```c
// WRONG: Server becomes unresponsive
while (running) {
    connection = accept(socket);  // Accept new connection
    pid = fork();                 // Fork handler
    if (pid > 0) {
        wait(&status);  // BUG: Blocks until child done!
        // Server cannot accept new connections while handling this one
    }
}
```
This server handles only one connection at a time—completely defeating the purpose of forking.
A blocking wait() inside an event loop or request handler is almost always a bug. If you need to wait for children while also handling other events, you must use non-blocking waits, signal handlers, or event loop integration.
The WNOHANG flag ("Wait, No Hang") transforms waitpid() from a blocking to a non-blocking call:
```c
pid_t waitpid(pid_t pid, int *status, int options);

// Blocking (default):
waitpid(pid, &status, 0);        // Blocks until child terminates

// Non-blocking:
waitpid(pid, &status, WNOHANG);  // Returns immediately
```
Return Value Semantics with WNOHANG:
| Return Value | Meaning | Action to Take |
|---|---|---|
| > 0 (PID) | A child with this PID has terminated | Process the status, child is reaped |
| 0 | No children have terminated (yet) | Do other work, try again later |
| -1 with ECHILD | No children exist | All children already reaped |
| -1 with other errno | An error occurred | Handle the error |
```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
#include <errno.h>

/**
 * Demonstrates WNOHANG for non-blocking wait
 */
int main() {
    pid_t child = fork();

    if (child == 0) {
        // Child: simulate some work
        printf("Child: Working for 3 seconds...\n");
        sleep(3);
        printf("Child: Done, exiting\n");
        exit(42);
    }

    // Parent: poll for child completion without blocking
    printf("Parent: Child started (PID %d)\n", child);

    int status;
    pid_t result;
    int polls = 0;

    while (1) {
        result = waitpid(child, &status, WNOHANG);

        if (result > 0) {
            // Child has terminated
            printf("Parent: Child terminated!\n");
            if (WIFEXITED(status)) {
                printf("Parent: Exit code = %d\n", WEXITSTATUS(status));
            }
            break;
        } else if (result == 0) {
            // Child still running
            polls++;
            printf("Parent: Child still running (poll #%d)\n", polls);
            printf("Parent: Doing other work...\n");
            usleep(500000);  // 0.5 second - could do real work here
        } else {
            // Error (-1)
            if (errno == ECHILD) {
                printf("Parent: No child to wait for\n");
            } else {
                perror("waitpid");
            }
            break;
        }
    }

    printf("Parent: Exiting after %d polls\n", polls);
    return 0;
}

/*
 * Typical output (the exact interleaving and poll count depend on timing):
 * Parent: Child started (PID 12345)
 * Child: Working for 3 seconds...
 * Parent: Child still running (poll #1)
 * Parent: Doing other work...
 * Parent: Child still running (poll #2)
 * Parent: Doing other work...
 * Parent: Child still running (poll #3)
 * Parent: Doing other work...
 * Parent: Child still running (poll #4)
 * Parent: Doing other work...
 * Parent: Child still running (poll #5)
 * Parent: Doing other work...
 * Parent: Child still running (poll #6)
 * Parent: Doing other work...
 * Child: Done, exiting
 * Parent: Child terminated!
 * Parent: Exit code = 42
 * Parent: Exiting after 6 polls
 */
```

The Critical Insight:
With WNOHANG, the parent can check for child termination without suspending execution. If no child has terminated, the call returns 0 immediately, and the parent can continue doing useful work.
This enables patterns such as polling loops, reaping children from a SIGCHLD handler, and folding child reaping into an event loop.
Combining WNOHANG with Other Flags:
You can OR multiple flags together:
```c
// Check for termination OR stopping
waitpid(pid, &status, WNOHANG | WUNTRACED);

// Check for termination, stopping, OR continuing
waitpid(pid, &status, WNOHANG | WUNTRACED | WCONTINUED);
```
Remember: with WNOHANG, a return value of 0 is NOT an error—it means 'no child has terminated yet.' Without WNOHANG, waitpid() never returns 0 (it either returns a PID or -1). Don't confuse these two behaviors.
Pattern 1: Simple Polling Loop
Check periodically with a sleep between checks:
```c
/**
 * Simple polling: check child status at regular intervals
 *
 * Tradeoffs:
 * - Simple to implement
 * - Wastes CPU if interval too short
 * - High latency if interval too long
 * - Cannot respond immediately to termination
 */
void monitor_children_polling(pid_t *children, int count) {
    int remaining = count;

    while (remaining > 0) {
        for (int i = 0; i < count; i++) {
            if (children[i] == 0) continue;  // Already reaped

            int status;
            pid_t result = waitpid(children[i], &status, WNOHANG);

            if (result > 0) {
                printf("Child %d terminated\n", children[i]);
                children[i] = 0;  // Mark as reaped
                remaining--;
            } else if (result < 0) {
                // Not our child (or another error): stop tracking it,
                // otherwise this loop would never finish
                perror("waitpid");
                children[i] = 0;
                remaining--;
            }
        }

        if (remaining > 0) {
            // Do other work here, or just sleep
            usleep(100000);  // 100ms polling interval
        }
    }
}
```

Pattern 2: Work-Interleaved Polling
Check for child completion between units of work:
12345678910111213141516171819202122232425262728293031323334353637
/** * Work-interleaved polling: check between work units * * Better than pure sleep-based polling because: * - Actually does useful work * - Check frequency tied to work pace * - No explicit sleep/timer management */void server_with_child_monitoring(int listen_fd) { while (running) { // Check for terminated children before handling request reap_terminated_children(); // Uses WNOHANG // Handle one client request (may block briefly on accept) int client_fd = accept(listen_fd, NULL, NULL); if (client_fd < 0) continue; pid_t handler = fork(); if (handler == 0) { handle_client(client_fd); // Child handles request _exit(0); } close(client_fd); // Parent closes client socket // Could also check here for fairness }} void reap_terminated_children() { int status; pid_t pid; // Reap ALL terminated children (not just one) while ((pid = waitpid(-1, &status, WNOHANG)) > 0) { log_child_completion(pid, status); }}The polling interval is a tradeoff between latency and CPU usage. For short-lived children (< 1 second), 10-50ms is reasonable. For longer tasks, 100-500ms is often sufficient. Very long-running children (minutes+) might only need checks every few seconds.
Polling is simple but imprecise. The kernel already knows exactly when children terminate and can tell us immediately via the SIGCHLD signal. Using signals for child reaping combines the responsiveness of blocking waits with the non-blocking nature we need.
How SIGCHLD Works:
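1. A child terminates (calls exit() or is killed)
2. The kernel sends SIGCHLD to the parent
3. The parent's signal handler runs
4. The handler calls waitpid() with WNOHANG to reap the child

Why WNOHANG in the Signal Handler?
Even in a signal handler, we use WNOHANG because the handler loops to reap every terminated child, and without WNOHANG the final waitpid() call would block inside the handler whenever other children are still running.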
```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <sys/wait.h>
#include <errno.h>

// Global counter for statistics (in practice, use atomic or proper sync)
volatile sig_atomic_t children_reaped = 0;

/**
 * SIGCHLD handler: reap all terminated children
 *
 * Key requirements:
 * 1. Use WNOHANG - never block in a signal handler
 * 2. Reap ALL children - signals can coalesce
 * 3. Preserve errno - system calls might set it
 * 4. Be async-signal-safe - only call safe functions
 */
void sigchld_handler(int sig) {
    // Preserve errno (signal handlers can interrupt system calls)
    int saved_errno = errno;

    int status;
    pid_t pid;

    // Loop to reap ALL terminated children
    // Critical: if 3 children exit before handler runs,
    // we only get ONE SIGCHLD but must reap all 3
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        children_reaped++;

        // Note: printf is NOT async-signal-safe!
        // This is for demonstration only.
        // In production, use write() or set a flag.
        if (WIFEXITED(status)) {
            // Would log: Child pid exited with WEXITSTATUS(status)
        } else if (WIFSIGNALED(status)) {
            // Would log: Child pid killed by signal WTERMSIG(status)
        }
    }

    // Restore errno
    errno = saved_errno;
}

int main() {
    // Install SIGCHLD handler
    struct sigaction sa;
    sa.sa_handler = sigchld_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    // SA_RESTART: restart interrupted system calls
    // SA_NOCLDSTOP: don't signal for stopped children, only terminated

    if (sigaction(SIGCHLD, &sa, NULL) < 0) {
        perror("sigaction");
        exit(1);
    }

    printf("Parent: Starting (PID %d)\n", getpid());

    // Create several children with different lifespans
    for (int i = 0; i < 5; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            sleep(i + 1);  // Children exit at different times
            _exit(i * 10);
        }
        printf("Parent: Created child %d (PID %d)\n", i, pid);
    }

    // Main loop - does work while children run
    // SIGCHLD handler reaps children in the background
    printf("\nParent: Doing main work...\n");
    for (int i = 0; i < 10; i++) {
        printf("Parent: Main loop iteration %d (reaped: %d)\n",
               i, children_reaped);
        sleep(1);
    }

    printf("\nParent: Finished. Total children reaped: %d\n", children_reaped);
    return 0;
}
```

Critical Signal Handler Requirements:
Use WNOHANG: Never block in a signal handler
Reap in a loop: Multiple children may have terminated before the handler runs. Standard UNIX signals don't queue: if three children exit while SIGCHLD is blocked or already pending, the handler may be invoked only once.
Preserve errno: The signal handler might interrupt code that was about to check errno. Save and restore it.
Async-signal-safety: Only call functions that are async-signal-safe. printf() is NOT safe, which is why the example handler leaves its logging as comments. Use write() for logging or set a flag for later processing.
SA_RESTART flag: When a signal interrupts a blocking call (like read()), SA_RESTART causes the call to be automatically restarted instead of failing with EINTR.
SA_NOCLDSTOP flag: Only receive SIGCHLD for termination, not for stopping (SIGSTOP/SIGTSTP). This avoids unnecessary handler invocations.
If 10 children terminate simultaneously, you might receive only 1 SIGCHLD. This is why the loop 'while ((pid = waitpid(-1, ..., WNOHANG)) > 0)' is essential—it reaps ALL available zombies, not just one. Never call waitpid() just once in a signal handler.
Modern servers use event loops (select, poll, epoll, kqueue) to wait on multiple I/O sources simultaneously. Integrating child process management into these loops requires careful design.
The Challenge:
select() and poll() wait on file descriptors, not process IDs. You can't directly add a "wait for this child" item to your event loop.
Solution 1: Self-Pipe Trick
Create a pipe that the SIGCHLD handler writes to. The event loop includes the pipe's read end. When a child terminates, the handler writes a byte, waking the event loop.
```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <sys/wait.h>
#include <sys/select.h>
#include <fcntl.h>
#include <errno.h>

// Self-pipe for waking event loop on SIGCHLD
int sigchld_pipe[2];

/**
 * Signal handler: write to self-pipe to wake event loop
 * Writing a single byte is async-signal-safe
 */
void sigchld_handler(int sig) {
    int saved_errno = errno;
    write(sigchld_pipe[1], "C", 1);  // 'C' for child, any byte works
    errno = saved_errno;
}

/**
 * Make a file descriptor non-blocking
 */
void make_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/**
 * Process all terminated children
 */
void reap_children() {
    int status;
    pid_t pid;

    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        printf("Event loop: Reaped child %d\n", pid);
        if (WIFEXITED(status)) {
            printf("  Exit code: %d\n", WEXITSTATUS(status));
        }
    }
}

/**
 * Drain the self-pipe (consume notification bytes)
 */
void drain_pipe() {
    char buf[16];
    while (read(sigchld_pipe[0], buf, sizeof(buf)) > 0) {
        // Discard bytes - we just needed the wake-up
    }
}

int main() {
    // Create self-pipe
    if (pipe(sigchld_pipe) < 0) {
        perror("pipe");
        exit(1);
    }
    make_nonblocking(sigchld_pipe[0]);
    make_nonblocking(sigchld_pipe[1]);

    // Install signal handler
    struct sigaction sa = {0};
    sa.sa_handler = sigchld_handler;
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    sigaction(SIGCHLD, &sa, NULL);

    // Spawn some children
    for (int i = 0; i < 3; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            sleep(i + 1);
            printf("Child %d: exiting\n", getpid());
            _exit(i);
        }
        printf("Spawned child %d\n", pid);
    }

    // Event loop
    printf("\nEntering event loop...\n");
    int iterations = 0;

    while (iterations < 10) {
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(sigchld_pipe[0], &readfds);  // Watch for child signals
        // Would also add: FD_SET(socket_fd, &readfds);

        struct timeval timeout = {1, 0};  // 1 second timeout

        int ready = select(sigchld_pipe[0] + 1, &readfds, NULL, NULL, &timeout);

        if (ready > 0 && FD_ISSET(sigchld_pipe[0], &readfds)) {
            printf("Event loop: SIGCHLD notification received\n");
            drain_pipe();
            reap_children();
        } else if (ready == 0) {
            printf("Event loop: Timeout, doing other work...\n");
        }

        iterations++;
    }

    printf("Event loop completed\n");
    return 0;
}
```

Solution 2: signalfd() (Linux-specific)
Linux provides signalfd(), which creates a file descriptor that becomes readable when a signal arrives. This integrates directly with epoll/select without the self-pipe complexity:
```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/epoll.h>
#include <sys/wait.h>

int main() {
    // Block SIGCHLD (signalfd requires the signal to be blocked)
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);
    sigprocmask(SIG_BLOCK, &mask, NULL);

    // Create signalfd for SIGCHLD
    int sfd = signalfd(-1, &mask, SFD_NONBLOCK);
    if (sfd < 0) {
        perror("signalfd");
        exit(1);
    }

    // Create epoll instance
    int epfd = epoll_create1(0);
    struct epoll_event ev = {.events = EPOLLIN, .data.fd = sfd};
    epoll_ctl(epfd, EPOLL_CTL_ADD, sfd, &ev);

    // Spawn children
    for (int i = 0; i < 3; i++) {
        if (fork() == 0) {
            sleep(i + 1);
            _exit(i);
        }
    }

    // Event loop with epoll
    struct epoll_event events[10];
    for (int i = 0; i < 10; i++) {
        int n = epoll_wait(epfd, events, 10, 1000);

        for (int j = 0; j < n; j++) {
            if (events[j].data.fd == sfd) {
                // SIGCHLD received
                struct signalfd_siginfo si;
                read(sfd, &si, sizeof(si));

                // ssi_pid is an unsigned 32-bit field, hence %u
                printf("Child %u terminated\n", si.ssi_pid);

                // Still need to reap all children
                int status;
                pid_t pid;
                while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
                    printf("Reaped %d (status %d)\n", pid, WEXITSTATUS(status));
                }
            }
        }
    }

    close(sfd);
    close(epfd);
    return 0;
}
```

Let's consolidate the key patterns and best practices for blocking vs. non-blocking waits:
| Scenario | Recommended Approach | Rationale |
|---|---|---|
| Shell waiting for foreground command | Blocking wait() | Nothing else to do until command completes |
| Build system running sequential tasks | Blocking waitpid() for specific child | Tasks must complete in order |
| Parallel build with N workers | SIGCHLD + WNOHANG | Start new tasks as workers finish |
| Pre-forking server | SIGCHLD handler or self-pipe | Must accept connections while handlers run |
| Event loop (epoll/select) | signalfd or self-pipe | Unified event handling mechanism |
| Simple daemon with few children | Periodic polling | Simple, sufficient for few children |
If you want to spawn a truly independent process (daemon), use double-fork: the parent forks, the intermediate child forks again and exits immediately, and the grandchild is orphaned to init. The parent waits only for the intermediate child (which exits quickly). This avoids both blocking and zombies.
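The double-fork sequence described above can be sketched as follows. This is an illustrative outline only: `spawn_detached` is a hypothetical name, and a real daemon would typically also call setsid(), change its working directory, and redirect its standard streams, all of which are omitted here.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start argv[0] as a grandchild that the caller never
 * has to reap. Returns 0 if the intermediate child was created and reaped,
 * -1 on error. The caller does not learn the grandchild's PID or status. */
int spawn_detached(char *const argv[]) {
    pid_t mid = fork();
    if (mid < 0)
        return -1;

    if (mid == 0) {
        /* Intermediate child: fork the real worker, then exit at once. */
        pid_t worker = fork();
        if (worker < 0)
            _exit(1);
        if (worker == 0) {
            execvp(argv[0], argv);  /* grandchild becomes the command */
            _exit(127);             /* exec failed */
        }
        _exit(0);                   /* orphans the grandchild */
    }

    /* Parent: a blocking wait, but the intermediate child exits right away. */
    int status;
    while (waitpid(mid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

After the call returns, the caller has no children left from this spawn, so there is nothing to poll and no zombie to collect. The cost is that the grandchild's exit status is no longer observable through waitpid().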
This page has explored the fundamental distinction between blocking and non-blocking waits, providing you with the tools to build responsive systems that properly manage child processes.
What's Next:
We've covered waiting for a single child, but real programs often spawn many children. The next page explores handling multiple children: tracking PIDs, waiting for specific children, and managing parallel worker pools.
You now understand the crucial difference between blocking and non-blocking waits, and have multiple strategies for keeping your applications responsive while managing child processes. Next, we'll tackle the complexities of handling multiple children.