You've mastered signal concepts, learned the common signals, written proper handlers, and can send signals to any process. Yet experienced developers speak of signals with wariness—treating them as a mechanism of last resort rather than a primary communication channel. Why?
The answer lies in signal reliability. Despite decades of improvement, signals have fundamental characteristics that make them tricky in edge cases. Standard signals don't queue, handlers can miss events, and race conditions lurk in even well-designed code. Understanding these limitations—and the techniques to work around them—is what separates robust systems from ones that mostly work until they don't.
This page examines signal reliability from historical context to modern best practices, ensuring you can build systems that handle signals correctly under all conditions.
By the end of this page, you will understand: the historical unreliable signal problem and its solutions, why standard signals don't queue, race condition patterns in signal handling, reliable signal programming techniques, when to use signals vs. other IPC, and real-time signal reliability guarantees.
The original signal implementation in Version 7 Unix (1979) had fundamental reliability problems that caused intermittent, hard-to-diagnose bugs.
In V7 Unix, when a signal handler was invoked, the signal disposition was automatically reset to SIG_DFL before the handler ran. If the same signal arrived during handler execution, the default action (usually terminate) would occur.
/* V7-style signal handler - UNRELIABLE */
void sigint_handler(int sig) {
/* DANGER: Signal disposition is now SIG_DFL! */
/* If SIGINT arrives NOW, process terminates! */
signal(SIGINT, sigint_handler); /* Re-register */
/* ... but there was a window ... */
/* Handle the signal */
}
The race condition was real, not theoretical:
Under heavy signal load, this race triggered regularly. Worse, it was probabilistic—the bug was nearly impossible to reproduce consistently in testing but appeared in production.
No signal blocking: There was no way to temporarily block signals during critical sections. Code like:
/* Critical section - nothing protected it from signals */
lock();
modify_shared_data();
unlock();
Could be interrupted at any point, with the handler potentially touching the same data.
System call interruption without restart: Blocked system calls were interrupted by signals and returned EINTR, but there was no automatic restart. Every blocking call needed retry loops.
Lost signals: If the same signal occurred multiple times while blocked or during handler execution, only the last instance was 'remembered' (pending). All intermediate occurrences were lost.
BSD addressed these issues with "reliable signals":
- sigblock() and sigsetmask() for blocking
- sigpause() to atomically unblock and wait
- SA_RESTART semantics for interrupted syscalls

But BSD's interface differed from System V, causing portability nightmares until POSIX unified signal handling in 1988.
Modern systems implement signal() with BSD-style reliable semantics (handlers stay installed). However, POSIX doesn't mandate this—signal() behavior is implementation-defined. Always use sigaction() for guaranteed reliability and portability.
Even with modern POSIX signals, a fundamental reliability limitation remains: standard signals (1-31) do not queue. If the same signal is generated multiple times while blocked or pending, only one instance is recorded.
For each signal, the kernel maintains a single pending bit:
/* Conceptual kernel data structure */
struct process {
uint32_t pending_signals; /* Bitmask, not a queue! */
/* Signal 5 pending? Check bit 5. That's ALL the state. */
};
When signal N is generated, the kernel sets bit N in pending_signals. If the bit is already set, nothing changes: the new occurrence is merged into the one already pending.
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

/*
 * Demonstrates signal counting problem with SIGCHLD.
 * Three children exit, but handler may only run once or twice!
 */

volatile sig_atomic_t sigchld_count = 0;

void sigchld_handler(int sig) {
    (void)sig;
    sigchld_count++;  /* WRONG: Handler may not run for each child! */
}

/* CORRECT approach using loop in handler */
volatile sig_atomic_t children_handled = 0;

void proper_sigchld_handler(int sig) {
    int status;
    pid_t pid;
    (void)sig;

    /*
     * Loop to reap ALL terminated children.
     * Multiple SIGCHLD may have collapsed into one.
     */
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        children_handled++;
        /* Process child exit */
    }
}

/* sleep() returns early when a handler runs, so sleep the remainder too */
static void sleep_fully(unsigned int seconds) {
    while ((seconds = sleep(seconds)) > 0)
        ;
}

int main(void) {
    struct sigaction sa;

    /* Use the incorrect handler first */
    sa.sa_handler = sigchld_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;  /* Or SA_NOCLDSTOP to not get stop signals */
    sigaction(SIGCHLD, &sa, NULL);

    /* Create three children that exit immediately */
    for (int i = 0; i < 3; i++) {
        if (fork() == 0) {
            _exit(i);  /* Child exits immediately */
        }
    }

    /* Give children time to exit */
    sleep_fully(1);

    printf("Simple counter shows %d SIGCHLD signals\n", (int)sigchld_count);
    printf("But we had 3 children! Some signals may have been lost.\n\n");

    /* Reap the first batch so it cannot inflate the second count */
    while (waitpid(-1, NULL, 0) > 0)
        ;

    /* Now demonstrate correct approach */
    sa.sa_handler = proper_sigchld_handler;
    sigaction(SIGCHLD, &sa, NULL);

    sigchld_count = 0;
    children_handled = 0;

    for (int i = 0; i < 3; i++) {
        if (fork() == 0) {
            _exit(i);
        }
    }

    sleep_fully(1);

    printf("Proper loop handled %d children\n", (int)children_handled);
    printf("Even if SIGCHLD only delivered once, we found all children.\n");

    return 0;
}

/*
 * KEY INSIGHT:
 * Never assume handler runs once per event.
 * Multiple events may collapse into one signal.
 * Use loops to drain all pending events (waitpid with WNOHANG,
 * read from pipe until EAGAIN, etc.).
 */

SIGCHLD (child termination notification) is the most commonly affected signal. Consider a server that forks a child for each connection: if several children exit close together, their SIGCHLD signals can collapse into one, and a handler that reaps only one child per signal leaves the rest as zombies.
Solution: Always loop in SIGCHLD handler with waitpid(..., WNOHANG) until no more children:
while (waitpid(-1, &status, WNOHANG) > 0) {
/* Process each terminated child */
}
If you need to count signal occurrences (not just react to them), standard signals are unsuitable:
| Scenario | Standard Signals | Real-Time Signals |
|---|---|---|
| Count exact occurrences | ❌ May lose count | ✅ All queued |
| Event notification | ✅ "Something happened" | ✅ Plus count |
| Precise synchronization | ❌ May miss some | ✅ All delivered |
Never rely on receiving one signal per event with standard signals. Design as if signals mean 'at least one event occurred, maybe more.' Always check for all pending events when handling the signal. Use real-time signals (SIGRTMIN+) if counting matters.
Signals introduce concurrency into otherwise sequential programs. Even single-threaded code must reason about race conditions when signals are involved.
A common bug pattern:
volatile sig_atomic_t got_signal = 0;
void handler(int sig) { got_signal = 1; }
int main() {
/* Race condition pattern */
while (!got_signal) { /* Check: flag is 0 */
/* SIGNAL ARRIVES HERE - flag becomes 1 */
pause(); /* Act: wait forever, signal already handled! */
/* pause() will block until NEXT signal */
}
}
The signal arrives after the check but before pause(). The handler runs and sets the flag, but pause() still blocks because the process had not yet entered pause() when the signal arrived.
sigsuspend() atomically combines unblocking and waiting:
#include <signal.h>
int sigsuspend(const sigset_t *mask);
It temporarily replaces the signal mask with mask and suspends until a signal is caught. Since these operations are atomic, no race exists.
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

volatile sig_atomic_t got_sigint = 0;

void sigint_handler(int sig) {
    /* Minimal handler - uses write(), not printf() */
    const char msg[] = "\nHandler: SIGINT received\n";
    (void)sig;
    write(STDERR_FILENO, msg, sizeof(msg) - 1);
    got_sigint = 1;
}

int main(void) {
    struct sigaction sa;
    sigset_t mask, oldmask, emptymask;

    /* Install handler */
    sa.sa_handler = sigint_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGINT, &sa, NULL);

    /* Block SIGINT */
    sigemptyset(&mask);
    sigaddset(&mask, SIGINT);
    sigprocmask(SIG_BLOCK, &mask, &oldmask);

    sigemptyset(&emptymask);  /* For sigsuspend: unblock all */

    printf("SIGINT blocked. Do some work...\n");

    /* Simulate critical work while SIGINT blocked */
    for (int i = 0; i < 3; i++) {
        printf("Working... (Ctrl+C is deferred)\n");
        sleep(1);
    }

    printf("Critical work done. Waiting for SIGINT...\n");
    printf("(Any pending SIGINT will be delivered NOW)\n");

    /*
     * CORRECT: Atomic unblock-and-wait using sigsuspend()
     * This replaces the buggy check-then-pause pattern.
     */
    while (!got_sigint) {
        /*
         * Atomically:
         * 1. Set mask to emptymask (unblocking SIGINT)
         * 2. Suspend until signal
         * 3. Restore original mask on return
         */
        sigsuspend(&emptymask);
        /* Handler ran, we woke up */
    }

    /* Restore original mask */
    sigprocmask(SIG_SETMASK, &oldmask, NULL);

    printf("Got SIGINT, exiting cleanly.\n");
    return 0;
}

/*
 * Why sigsuspend works:
 * The unblock and suspend are ATOMIC. There's no window where
 * the signal can arrive after unblocking but before suspending.
 * The signal either:
 * - Was pending (delivered immediately, sigsuspend returns)
 * - Arrives after (wakes sigsuspend, which returns)
 */

When handler and main code access the same data:
struct {
int count;
int values[100];
} data;
void handler(int sig) {
data.values[data.count] = sig; /* Read data.count */
data.count++; /* Modify data.count */
}
int main() {
while (1) {
if (data.count > 0) { /* Read data.count */
/* SIGNAL INTERRUPTS HERE */
int val = data.values[--data.count]; /* Read/modify */
process(val);
}
}
}
Main reads count, signal modifies count, main modifies based on stale read → corruption.
Solutions:
The safest strategy: handlers don't touch shared data at all. Set a volatile sig_atomic_t flag. Everything else happens in main code, which can block signals during critical sections. This eliminates nearly all race conditions.
When standard signal limitations are unacceptable, POSIX real-time signals provide stronger guarantees.
| Property | Standard Signals | Real-Time Signals |
|---|---|---|
| Queuing | No | Yes |
| Ordering | Undefined | Lowest number first |
| Payload | No | Yes (sigqueue) |
| Count | May lose | All delivered |
| Delivery | One pending bit | Full queue |
Real-time signals span from SIGRTMIN to SIGRTMAX, typically:
/* Portable real-time signal access */
int my_signal = SIGRTMIN + 2; /* Third real-time signal */
/* Check range (SIGRTMIN..SIGRTMAX is inclusive) */
if (SIGRTMAX - SIGRTMIN + 1 >= 8) {
/* At least 8 real-time signals available */
}
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

/*
 * Demonstrates real-time signal queuing.
 * Send multiple signals, ALL are delivered.
 */

volatile sig_atomic_t rt_signal_count = 0;
volatile sig_atomic_t last_received_value = -1;

void rt_handler(int sig, siginfo_t *info, void *context) {
    (void)sig;
    (void)context;
    rt_signal_count++;
    last_received_value = info->si_value.sival_int;

    /*
     * snprintf() is not async-signal-safe. It is tolerable in this demo
     * only because delivery happens at a known point (the unblocking
     * sigprocmask() call below), not in the middle of another stdio call.
     */
    char buf[100];
    int len = snprintf(buf, sizeof(buf),
                       "Received RT signal, value=%d, count=%d\n",
                       info->si_value.sival_int, (int)rt_signal_count);
    write(STDOUT_FILENO, buf, len);
}

int main(void) {
    struct sigaction sa;
    sigset_t block_mask, old_mask;
    union sigval val;

    /* Install real-time signal handler */
    sa.sa_sigaction = rt_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;  /* Required for si_value */
    sigaction(SIGRTMIN, &sa, NULL);

    /* Block SIGRTMIN to demonstrate queuing */
    sigemptyset(&block_mask);
    sigaddset(&block_mask, SIGRTMIN);
    sigprocmask(SIG_BLOCK, &block_mask, &old_mask);

    printf("SIGRTMIN blocked. Sending 5 signals...\n");

    /* Send 5 signals with different values */
    for (int i = 1; i <= 5; i++) {
        val.sival_int = i * 100;
        if (sigqueue(getpid(), SIGRTMIN, val) == -1) {
            perror("sigqueue");
            exit(1);
        }
        printf(" Queued signal with value %d\n", i * 100);
    }

    printf("\nUnblocking SIGRTMIN. All queued signals should deliver:\n\n");
    fflush(stdout);  /* keep stdio output ahead of the handler's write() */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);

    /* Give time for signals to process */
    sleep(1);

    printf("\nTotal signals received: %d\n", (int)rt_signal_count);
    printf("Compare with standard signals: they would have delivered ONCE.\n");

    return 0;
}

Real-time signals queue, but the queue has finite capacity:
# Linux: limit on queued signals per real user ID (RLIMIT_SIGPENDING)
ulimit -i
# Kernels before 2.6.8 used a system-wide limit instead
cat /proc/sys/kernel/rtsig-max
When the queue is full, sigqueue() fails with EAGAIN:
if (sigqueue(pid, sig, val) == -1) {
if (errno == EAGAIN) {
/* Queue full - signal not delivered */
/* Retry, use different IPC, or drop */
}
}
Good candidates:
Overkill for:
Real-time signal behavior is consistent across modern POSIX systems, but the number of available signals and queue sizes vary. Test on target platforms. Also note: some older systems, and some real-time extensions, define additional real-time signals with special behaviors.
When a signal is delivered while a process is blocked in a system call, the interaction is complex. Understanding this is essential for reliable I/O code.
Many system calls (read, write, wait, sleep, select, etc.) block until an event occurs. If a signal arrives during the block, the handler runs and the call then returns -1 with errno = EINTR ("Interrupted system call") instead of completing.
Without handling EINTR, your code fails mysteriously:
/* WRONG: Doesn't handle EINTR */
ssize_t n = read(fd, buf, size);
if (n == -1) {
perror("read failed"); /* Might just be EINTR - not a real error! */
exit(1);
}
#include <signal.h>
#include <unistd.h>
#include <errno.h>
#include <stdio.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Wrapper functions that handle EINTR correctly.
 * These should be used for all blocking I/O in signal-aware code.
 */

/* Read that retries on EINTR */
ssize_t safe_read(int fd, void *buf, size_t count) {
    ssize_t n;

    while ((n = read(fd, buf, count)) == -1) {
        if (errno == EINTR) {
            /* Interrupted by signal - safe to retry */
            continue;
        }
        /* Real error */
        return -1;
    }

    return n;
}

/* Write that retries on EINTR and handles partial writes */
ssize_t safe_write(int fd, const void *buf, size_t count) {
    const char *ptr = buf;
    size_t remaining = count;

    while (remaining > 0) {
        ssize_t n = write(fd, ptr, remaining);

        if (n == -1) {
            if (errno == EINTR) {
                /* Interrupted - retry */
                continue;
            }
            return -1;  /* Real error */
        }

        ptr += n;
        remaining -= (size_t)n;
    }

    return (ssize_t)count;
}

/* Wait that retries on EINTR */
pid_t safe_waitpid(pid_t pid, int *status, int options) {
    pid_t result;

    while ((result = waitpid(pid, status, options)) == -1) {
        if (errno == EINTR) {
            continue;
        }
        return -1;
    }

    return result;
}

/* Select that restarts on EINTR (with timeout adjustment for precision) */
int safe_select(int nfds, fd_set *readfds, fd_set *writefds,
                fd_set *exceptfds, struct timeval *timeout) {
    int result;

    /* Note: For precise timeout handling, should use pselect or track elapsed time */
    while ((result = select(nfds, readfds, writefds, exceptfds, timeout)) == -1) {
        if (errno == EINTR) {
            continue;  /* Timeout may need adjustment for accuracy */
        }
        return -1;
    }

    return result;
}

The SA_RESTART flag in sigaction() tells the kernel to automatically restart certain syscalls after the handler returns:
struct sigaction sa;
sa.sa_handler = handler;
sigemptyset(&sa.sa_mask);
sa.sa_flags = SA_RESTART; /* Auto-restart */
sigaction(SIGINT, &sa, NULL);
With SA_RESTART, many blocking calls transparently resume instead of returning EINTR.
Even with SA_RESTART, not all syscalls restart:
Generally restart (with SA_RESTART):
Generally DON'T restart (even with SA_RESTART):
Rationale: Syscalls that should "break out" on signal (like pause) shouldn't auto-restart. Timer syscalls don't restart because the elapsed time would be lost.
| Strategy | Pros | Cons |
|---|---|---|
| Handle EINTR explicitly | Full control, works everywhere | Verbose code |
| Use SA_RESTART | Simpler code | Some syscalls still return EINTR |
| Both approaches | Most robust | Belt-and-suspenders |
Always handle EINTR, even with SA_RESTART. The flag helps but doesn't cover all syscalls. Production code should either: (1) retry on EINTR in loops, or (2) treat EINTR as a valid 'check for shutdown' point by checking a flag before retrying.
Having explored signals in depth, it's worth stepping back to consider when signals are appropriate and when other IPC mechanisms are better choices.
Appropriate for:
Problematic for:
| Use Case | Signals | Pipes | Sockets | Shared Memory |
|---|---|---|---|---|
| Shutdown notification | ✅ Best | OK | OK | ❌ |
| Data transfer | ❌ | ✅ Good | ✅ Best | ✅ Fastest |
| High frequency events | ❌ Slow | Good | Good | ✅ Best |
| Counted events | ❌ May lose | ✅ | ✅ | ✅ |
| Cross-machine | ❌ No | ❌ No | ✅ Yes | ❌ No |
| Process management | ✅ Best | ❌ | ❌ | ❌ |
| Debugging (stop/resume) | ✅ Only option | ❌ | ❌ | ❌ |
/*
 * Modern alternatives to signal-based IPC on Linux
 * (error checking omitted for brevity)
 */

#include <signal.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>
#include <sys/signalfd.h>
#include <sys/timerfd.h>

void linux_fd_alternatives(void) {
    /*
     * eventfd: Fast, queueing event notification
     * Better than signals for high-frequency notifications
     */
    int efd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    uint64_t count = 1;
    write(efd, &count, sizeof(count)); /* Notify */
    read(efd, &count, sizeof(count));  /* Receive - count is accurate */

    /*
     * signalfd: Synchronous signal handling via fd
     * Eliminates async-signal-safety concerns entirely
     */
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGINT);
    sigaddset(&mask, SIGTERM);
    sigprocmask(SIG_BLOCK, &mask, NULL); /* Block these signals */

    int sfd = signalfd(-1, &mask, SFD_NONBLOCK | SFD_CLOEXEC);
    /* Now read from sfd to get signals - no handlers needed */

    struct signalfd_siginfo info;
    if (read(sfd, &info, sizeof(info)) > 0) {
        /* Handle info.ssi_signo synchronously, safely */
    }

    /*
     * timerfd: Timer notifications as fd events
     * Far cleaner than SIGALRM for event loops
     */
    int tfd = timerfd_create(CLOCK_MONOTONIC, TFD_NONBLOCK | TFD_CLOEXEC);
    struct itimerspec ts = {
        .it_value = { .tv_sec = 5, .tv_nsec = 0 },   /* First in 5s */
        .it_interval = { .tv_sec = 1, .tv_nsec = 0 } /* Then every 1s */
    };
    timerfd_settime(tfd, 0, &ts, NULL);
    /* Now poll/select/epoll on tfd like any other fd */
}

For new code on Linux, consider these guidelines:
For event loops: Use signalfd() + eventfd() + timerfd() with epoll(). Signals become just another event source, handled synchronously and safely.
For simple shutdown: Traditional signal handlers are fine, but keep handlers minimal (set flag only).
For data transfer: Don't use signals. Use pipes, sockets, or shared memory.
For multi-threaded apps: Block signals in every thread and let one dedicated thread accept them synchronously with sigwait().
For portable code: Stick to basic POSIX signal handling with sigaction() and proper EINTR handling.
signalfd(), eventfd(), and timerfd() are Linux-specific. For portable Unix code, you're limited to traditional signal handling, pipes, and sockets. The self-pipe trick remains the portable way to integrate signals with event loops.
Let's consolidate everything into patterns you can apply directly in production systems.
/* Pattern 1 */
volatile sig_atomic_t shutdown_requested = 0;

void handler(int sig) {
    shutdown_requested = 1;
}

void main_loop() {
    while (!shutdown_requested) {
        do_work();
    }
    cleanup();
}
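/* Pattern 2 */
#define _GNU_SOURCE  /* for pipe2() */
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

void handle_signal_safely(int sig);  /* application-defined */

static int signal_pipe[2];

void handler(int sig) {
    int saved_errno = errno;
    char s = sig;
    write(signal_pipe[1], &s, 1);
    errno = saved_errno;
}

void event_loop() {
    /* Create the pipe before installing handler() for any signal */
    pipe2(signal_pipe, O_NONBLOCK | O_CLOEXEC);

    while (1) {
        struct pollfd fds[] = {
            { .fd = signal_pipe[0], .events = POLLIN },
            /* other fds... */
        };

        int n = poll(fds, 1, -1);
        if (n == -1) {
            if (errno == EINTR)
                continue;  /* interrupted by the handler; poll again */
            break;         /* real error */
        }

        if (fds[0].revents & POLLIN) {
            char sig;
            while (read(signal_pipe[0], &sig, 1) > 0) {
                handle_signal_safely((int)sig);
            }
        }
    }
}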
/* Pattern 3 */
void sigchld_handler(int sig) {
    int saved_errno = errno;
    int status;
    pid_t pid;

    /* Reap ALL terminated children */
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        /* Log or record child exit */
    }

    errno = saved_errno;
}
/* Pattern 4 */
volatile sig_atomic_t shutdown_phase = 0;

void term_handler(int sig) {
    if (shutdown_phase == 0) {
        shutdown_phase = 1;  /* First SIGTERM: graceful */
    } else {
        shutdown_phase = 2;  /* Second: force */
    }
}

void shutdown_gracefully() {
    /* sigaction() rather than signal(), for defined semantics */
    struct sigaction sa;
    sa.sa_handler = term_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGTERM, &sa, NULL);
    sigaction(SIGINT, &sa, NULL);

    alarm(30);  /* Force exit after 30s (default SIGALRM action terminates) */

    while (shutdown_phase < 2 && work_remaining()) {
        do_remaining_work();
    }

    cleanup();
    exit(0);
}
/* Pattern 5 */
#include <pthread.h>
#include <signal.h>

void *signal_thread(void *arg) {
    sigset_t *mask = arg;
    int sig;

    while (1) {
        sigwait(mask, &sig);

        /* Handle synchronously - can use ANY function */
        switch (sig) {
        case SIGTERM:
            initiate_shutdown();
            return NULL;
        case SIGHUP:
            reload_config();
            break;
        case SIGUSR1:
            dump_stats();
            break;
        }
    }
}

int main(void) {
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGTERM);
    sigaddset(&mask, SIGHUP);
    sigaddset(&mask, SIGUSR1);

    pthread_sigmask(SIG_BLOCK, &mask, NULL);  /* Block before creating threads */

    pthread_t sig_thr;
    pthread_create(&sig_thr, NULL, signal_thread, &mask);

    /* Worker threads inherit blocked mask, won't receive signals */
    /* ... */

    pthread_join(sig_thr, NULL);  /* returns once SIGTERM has been handled */
    return 0;
}

Pattern 1: Simple single-threaded apps. Pattern 2: Event-driven servers. Pattern 3: Any code that forks children. Pattern 4: Long-running daemons. Pattern 5: Multi-threaded servers. Mix and match as needed; they're complementary.
Signal reliability is about understanding limitations and designing around them. You now have the complete picture—from historical problems to modern solutions. Let's consolidate:
Module Complete:
You've completed the Signals module. You understand signal concepts, know the common signals, can implement handlers correctly, can send signals to processes and threads, and understand reliability concerns and solutions.
Signals are a fundamental UNIX mechanism—you'll encounter them in every substantial systems programming project. The knowledge from this module will serve you throughout your career in systems development, DevOps, and backend engineering on Unix-like platforms.
Congratulations! You now have deep, practical knowledge of UNIX signals—from concepts through reliability. You can implement robust signal handling in production systems, debug signal-related issues, and make informed decisions about when signals are the right tool versus other IPC mechanisms.