Signals - Learning Module

Signal Reliability: Edge Cases and Robustness

When Signals Go Wrong

You've mastered signal concepts, learned the common signals, written proper handlers, and can send signals to any process. Yet experienced developers speak of signals with wariness—treating them as a mechanism of last resort rather than a primary communication channel. Why?

The answer lies in signal reliability. Despite decades of improvement, signals have fundamental characteristics that make them tricky in edge cases. Standard signals don't queue, handlers can miss events, and race conditions lurk in even well-designed code. Understanding these limitations—and the techniques to work around them—is what separates robust systems from ones that mostly work until they don't.

This page examines signal reliability from historical context to modern best practices, ensuring you can build systems that handle signals correctly under all conditions.

What You Will Learn

By the end of this page, you will understand: the historical unreliable signal problem and its solutions, why standard signals don't queue, race condition patterns in signal handling, reliable signal programming techniques, when to use signals vs. other IPC, and real-time signal reliability guarantees.

Historical Unreliability: The V7 Unix Problem

The original signal implementation in Version 7 Unix (1979) had fundamental reliability problems that caused intermittent, hard-to-diagnose bugs.

The Core Problem: Handler Reset

In V7 Unix, when a signal handler was invoked, the signal disposition was automatically reset to SIG_DFL before the handler ran. If the same signal arrived during handler execution, the default action (usually terminate) would occur.

/* V7-style signal handler - UNRELIABLE */
void sigint_handler(int sig) {
    /* DANGER: Signal disposition is now SIG_DFL! */
    /* If SIGINT arrives NOW, process terminates! */
    
    signal(SIGINT, sigint_handler);  /* Re-register */
    /* ... but there was a window ... */
    
    /* Handle the signal */
}

The Race Window

The race condition was real, not theoretical:

SIGINT arrives
Handler starts executing, disposition reset to SIG_DFL
Second SIGINT arrives NOW → Process terminates!
Handler would have re-registered if it had time

Under heavy signal load, this race triggered regularly. Worse, it was probabilistic—the bug was nearly impossible to reproduce consistently in testing but appeared in production.

The BSD Solution (4.2BSD, 1983)

BSD addressed these issues with "reliable signals":

Handlers remained installed after invocation
sigblock() and sigsetmask() for blocking
sigpause() to atomically unblock and wait
Automatic restart of interrupted system calls (the behavior POSIX later made selectable with the SA_RESTART flag)

But BSD's interface differed from System V, causing portability nightmares until POSIX unified signal handling in 1988.

Legacy Compatibility

Modern systems implement signal() with BSD-style reliable semantics (handlers stay installed). However, POSIX doesn't mandate this—signal() behavior is implementation-defined. Always use sigaction() for guaranteed reliability and portability.

Standard Signals Don't Queue

Even with modern POSIX signals, a fundamental reliability limitation remains: standard signals (1-31) do not queue. If the same signal is generated multiple times while blocked or pending, only one instance is recorded.

The Mechanism

For each signal, the kernel maintains a single pending bit:

/* Conceptual kernel data structure */
struct process {
    uint32_t pending_signals;  /* Bitmask, not a queue! */
    /* Signal 5 pending? Check bit 5. That's ALL the state. */
};

When signal N is generated:

Bit N is set in pending_signals
If bit N was already set, nothing changes—information is lost

Practical Implications

signal_count_problem.c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
 
/*
 * Demonstrates signal counting problem with SIGCHLD.
 * Three children exit, but handler may only run once or twice!
 */
 
volatile sig_atomic_t sigchld_count = 0;
 
void sigchld_handler(int sig) {
    sigchld_count++;
    /* WRONG: Handler may not run for each child! */
}
 
/* CORRECT approach using loop in handler */
volatile sig_atomic_t children_handled = 0;
 
void proper_sigchld_handler(int sig) {
    int status;
    pid_t pid;
    
    /* 
     * Loop to reap ALL terminated children.
     * Multiple SIGCHLD may have collapsed into one.
     */
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        children_handled++;
        /* Process child exit */
    }
}
 
int main() {
    struct sigaction sa;
    
    /* Use the incorrect handler first */
    sa.sa_handler = sigchld_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;  /* Or SA_NOCLDSTOP to not get stop signals */
    sigaction(SIGCHLD, &sa, NULL);
    
    /* Create three children that exit immediately */
    for (int i = 0; i < 3; i++) {
        if (fork() == 0) {
            _exit(i);  /* Child exits immediately */
        }
    }
    
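    /* Give children time to exit. sleep() returns early whenever the
     * handler runs, so keep sleeping until the full second has elapsed. */
    for (unsigned int left = 1; left > 0; left = sleep(left))
        ;
    
    printf("Simple counter shows %d SIGCHLD signals\n", (int)sigchld_count);
    printf("But we had 3 children! Some signals may have been lost.\n\n");
    
    /* Reap the first batch so its zombies are not counted below */
    while (waitpid(-1, NULL, 0) > 0)
        ;
    
    /* Now demonstrate correct approach */
    sa.sa_handler = proper_sigchld_handler;
    sigaction(SIGCHLD, &sa, NULL);
    sigchld_count = 0;
    children_handled = 0;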
    
    for (int i = 0; i < 3; i++) {
        if (fork() == 0) {
            _exit(i);
        }
    }
    
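    for (unsigned int left = 1; left > 0; left = sleep(left))
        ;  /* sleep() returns early when the handler runs; finish the wait */
    
    printf("Proper loop handled %d children\n", (int)children_handled);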
    printf("Even if SIGCHLD only delivered once, we found all children.\n");
    
    return 0;
}
 
/*
 * KEY INSIGHT: 
 * Never assume handler runs once per event.
 * Multiple events may collapse into one signal.
 * Use loops to drain all pending events (waitpid with WNOHANG,
 * read from pipe until EAGAIN, etc.).
 */

The SIGCHLD Problem: Classic Example

SIGCHLD (child termination notification) is the most commonly affected signal. Consider a server that forks children for each connection:

Three children terminate simultaneously
Kernel generates three SIGCHLD signals
First SIGCHLD sets pending bit
Second and third SIGCHLD arrive—bit already set, nothing changes
Handler runs once, reaps one child
Two zombies remain!

Solution: Always loop in SIGCHLD handler with waitpid(..., WNOHANG) until no more children:

while (waitpid(-1, &status, WNOHANG) > 0) {
    /* Process each terminated child */
}

When Signal Counting Matters

If you need to count signal occurrences (not just react to them), standard signals are unsuitable:

Scenario	Standard Signals	Real-Time Signals
Count exact occurrences	❌ May lose count	✅ All queued
Event notification	✅ "Something happened"	✅ Plus count
Precise synchronization	❌ May miss some	✅ All delivered

Design Rule

Never rely on receiving one signal per event with standard signals. Design as if signals mean 'at least one event occurred, maybe more.' Always check for all pending events when handling the signal. Use real-time signals (SIGRTMIN+) if counting matters.

Race Conditions in Signal Code

Signals introduce concurrency into otherwise sequential programs. Even single-threaded code must reason about race conditions when signals are involved.

Classic Race: Check-Then-Act

A common bug pattern:

volatile sig_atomic_t got_signal = 0;

void handler(int sig) { got_signal = 1; }

int main() {
    /* Race condition pattern */
    while (!got_signal) {  /* Check: flag is 0 */
        /* SIGNAL ARRIVES HERE - flag becomes 1 */
        pause();  /* Act: wait forever, signal already handled! */
        /* pause() will block until NEXT signal */
    }
}

The signal arrives after the check but before pause(). The handler runs and sets the flag, but pause() still blocks because the process had not yet entered pause() when the signal arrived. Unless another signal shows up, the loop never sees the flag.

The Solution: sigsuspend()

sigsuspend() atomically combines unblocking and waiting:

#include <signal.h>

int sigsuspend(const sigset_t *mask);

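It temporarily replaces the signal mask with mask, suspends until a signal is caught, and restores the previous mask before returning. Because the mask change and the wait happen as one atomic step, no race exists, provided the signal is already blocked when the flag is checked.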

sigsuspend_example.c
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
 
volatile sig_atomic_t got_sigint = 0;
 
void sigint_handler(int sig) {
    /* Minimal handler - uses write(), not printf() */
    const char msg[] = "\nHandler: SIGINT received\n";
    write(STDERR_FILENO, msg, sizeof(msg) - 1);
    got_sigint = 1;
}
 
int main() {
    struct sigaction sa;
    sigset_t mask, oldmask, emptymask;
    
    /* Install handler */
    sa.sa_handler = sigint_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGINT, &sa, NULL);
    
    /* Block SIGINT */
    sigemptyset(&mask);
    sigaddset(&mask, SIGINT);
    sigprocmask(SIG_BLOCK, &mask, &oldmask);
    
    sigemptyset(&emptymask);  /* For sigsuspend: unblock all */
    
    printf("SIGINT blocked. Do some work...\n");
    
    /* Simulate critical work while SIGINT blocked */
    for (int i = 0; i < 3; i++) {
        printf("Working... (Ctrl+C is deferred)\n");
        sleep(1);
    }
    
    printf("Critical work done. Waiting for SIGINT...\n");
    printf("(Any pending SIGINT will be delivered NOW)\n");
    
    /* 
     * CORRECT: Atomic unblock-and-wait using sigsuspend()
     * This replaces the buggy check-then-pause pattern.
     */
    while (!got_sigint) {
        /* 
         * Atomically:
         * 1. Set mask to emptymask (unblocking SIGINT)
         * 2. Suspend until signal
         * 3. Restore original mask on return
         */
        sigsuspend(&emptymask);
        /* Handler ran, we woke up */
    }
    
    /* Restore original mask */
    sigprocmask(SIG_SETMASK, &oldmask, NULL);
    
    printf("Got SIGINT, exiting cleanly.\n");
    return 0;
}
 
/*
 * Why sigsuspend works:
 * The unblock and suspend are ATOMIC. There's no window where
 * the signal can arrive after unblocking but before suspending.
 * The signal either:
 * - Was pending (delivered immediately, sigsuspend returns)
 * - Arrives after (wakes sigsuspend, which returns)
 */

Race: Modifying Shared Data

When handler and main code access the same data:

struct {
    int count;
    int values[100];
} data;

void handler(int sig) {
    data.values[data.count] = sig;  /* Read data.count */
    data.count++;                    /* Modify data.count */
}

int main() {
    while (1) {
        if (data.count > 0) {         /* Read data.count */
            /* SIGNAL INTERRUPTS HERE */
            int val = data.values[--data.count];  /* Read/modify */
            process(val);
        }
    }
}

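The decrement --data.count is a read-modify-write sequence, not a single step. If the handler runs after main has read count but before main writes the new value back, main stores a result computed from the stale read and the handler's update is lost → corruption.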

Solutions:

Block signals during critical sections (sigprocmask)
Use lock-free techniques for simple cases (sig_atomic_t for single values)
Avoid shared data—handler only sets flags, main does all data manipulation

The Simplicity Principle

The safest strategy: handlers don't touch shared data at all. Set a volatile sig_atomic_t flag. Everything else happens in main code, which can block signals during critical sections. This eliminates nearly all race conditions.

Real-Time Signals for Reliability

When standard signal limitations are unacceptable, POSIX real-time signals provide stronger guarantees.

Real-Time Signal Guarantees

Property	Standard Signals	Real-Time Signals
Queuing	No	Yes
Ordering	Undefined	Lowest number first
Payload	No	Yes (sigqueue)
Count	May lose	All delivered
Delivery	One pending bit	Full queue

The Range

Real-time signals span from SIGRTMIN to SIGRTMAX, typically:

Linux with glibc: SIGRTMIN (34) to SIGRTMAX (64) = 31 signals
The exact numbers vary by platform; always use symbolic constants

/* Portable real-time signal access */
int my_signal = SIGRTMIN + 2;  /* Third real-time signal */

/* Check range */
if (SIGRTMAX - SIGRTMIN >= 8) {
    /* At least 8 real-time signals available */
}

realtime_signal_queue.c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
 
/*
 * Demonstrates real-time signal queuing.
 * Send multiple signals, ALL are delivered.
 */
 
#define MAX_RECORDED 16
 
volatile sig_atomic_t rt_signal_count = 0;
volatile sig_atomic_t received_values[MAX_RECORDED];
 
void rt_handler(int sig, siginfo_t *info, void *context) {
    (void)sig;
    (void)context;
    
    /*
     * Only record the payload here. snprintf() and printf() are not
     * async-signal-safe, so all printing happens in main().
     */
    if (rt_signal_count < MAX_RECORDED) {
        received_values[rt_signal_count] = info->si_value.sival_int;
    }
    rt_signal_count++;
}
 
int main(void) {
    struct sigaction sa;
    sigset_t block_mask, old_mask;
    union sigval val;
    
    /* Install real-time signal handler */
    sa.sa_sigaction = rt_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;  /* Required for si_value */
    sigaction(SIGRTMIN, &sa, NULL);
    
    /* Block SIGRTMIN to demonstrate queuing */
    sigemptyset(&block_mask);
    sigaddset(&block_mask, SIGRTMIN);
    sigprocmask(SIG_BLOCK, &block_mask, &old_mask);
    
    printf("SIGRTMIN blocked. Sending 5 signals...\n");
    
    /* Send 5 signals with different values */
    for (int i = 1; i <= 5; i++) {
        val.sival_int = i * 100;
        
        if (sigqueue(getpid(), SIGRTMIN, val) == -1) {
            perror("sigqueue");
            exit(1);
        }
        
        printf("  Queued signal with value %d\n", i * 100);
    }
    
    printf("\nUnblocking SIGRTMIN. All queued signals should deliver:\n\n");
    
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    
    /* Give time for signals to process */
    sleep(1);
    
    /* Print what the handler recorded, in delivery order */
    for (int i = 0; i < rt_signal_count && i < MAX_RECORDED; i++) {
        printf("Received RT signal, value=%d\n", (int)received_values[i]);
    }
    
    printf("\nTotal signals received: %d\n", (int)rt_signal_count);
    printf("Compare with standard signals: they would have delivered ONCE.\n");
    
    return 0;
}

Queue Limits

Real-time signals queue, but the queue has finite capacity:

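# Linux: check the queued-signal limit (RLIMIT_SIGPENDING, counted per real user ID)
ulimit -i
# Kernels before 2.6.8 used a system-wide limit instead:
# cat /proc/sys/kernel/rtsig-max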

When the queue is full, sigqueue() fails with EAGAIN:

if (sigqueue(pid, sig, val) == -1) {
    if (errno == EAGAIN) {
        /* Queue full - signal not delivered */
        /* Retry, use different IPC, or drop */
    }
}

When to Use Real-Time Signals

Good candidates:

Event counting applications
Real-time systems requiring guaranteed delivery
Complex IPC with payload data
Applications needing multiple distinct signal types

Overkill for:

Simple termination handling
One-time notifications
Cases where 'at least one' is sufficient

Platform Considerations

Real-time signal behavior is consistent across modern POSIX systems, but the number of available signals and queue sizes vary. Test on target platforms. Also note that the C library may reserve some real-time signals for its own use: glibc's threading implementation takes the first few, which is why SIGRTMIN is a runtime value (34 on typical Linux systems) rather than 32.

Interrupted System Calls: EINTR and SA_RESTART

When a signal is delivered while a process is blocked in a system call, the interaction is complex. Understanding this is essential for reliable I/O code.

The EINTR Problem

Many system calls (read, write, wait, nanosleep, select, etc.) block until an event occurs. If a signal with an installed handler arrives during the block and the call is not restarted automatically:

Kernel delivers signal, handler runs
Handler returns
System call returns -1 with errno = EINTR ("Interrupted")

Without handling EINTR, your code fails mysteriously:

/* WRONG: Doesn't handle EINTR */
ssize_t n = read(fd, buf, size);
if (n == -1) {
    perror("read failed");  /* Might just be EINTR - not a real error! */
    exit(1);
}

eintr_handling.c
#include <signal.h>
#include <unistd.h>
#include <errno.h>
#include <stdio.h>
#include <sys/select.h>  /* select(), fd_set */
#include <sys/wait.h>    /* waitpid() */
 
/*
 * Wrapper functions that handle EINTR correctly.
 * These should be used for all blocking I/O in signal-aware code.
 */
 
/* Read that retries on EINTR */
ssize_t safe_read(int fd, void *buf, size_t count) {
    ssize_t n;
    
    while ((n = read(fd, buf, count)) == -1) {
        if (errno == EINTR) {
            /* Interrupted by signal - safe to retry */
            continue;
        }
        /* Real error */
        return -1;
    }
    
    return n;
}
 
/* Write that retries on EINTR and handles partial writes */
ssize_t safe_write(int fd, const void *buf, size_t count) {
    const char *ptr = buf;
    size_t remaining = count;
    
    while (remaining > 0) {
        ssize_t n = write(fd, ptr, remaining);
        
        if (n == -1) {
            if (errno == EINTR) {
                /* Interrupted - retry */
                continue;
            }
            return -1;  /* Real error */
        }
        
        ptr += n;
        remaining -= n;
    }
    
    return count;
}
 
/* Wait that retries on EINTR */
pid_t safe_waitpid(pid_t pid, int *status, int options) {
    pid_t result;
    
    while ((result = waitpid(pid, status, options)) == -1) {
        if (errno == EINTR) {
            continue;
        }
        return -1;
    }
    
    return result;
}
 
/* Select that retries on EINTR (the timeout is NOT recomputed) */
int safe_select(int nfds, fd_set *readfds, fd_set *writefds,
                fd_set *exceptfds, struct timeval *timeout) {
    int result;
    /*
     * Caveat: after an error, treat the contents of *timeout as undefined
     * (see select(2)). For precise timeouts, track elapsed time yourself
     * or use pselect().
     */
    
    while ((result = select(nfds, readfds, writefds, exceptfds, timeout)) == -1) {
        if (errno == EINTR) {
            continue;  /* Retry with whatever *timeout now holds */
        }
        return -1;
    }
    
    return result;
}

SA_RESTART: Automatic System Call Restart

The SA_RESTART flag in sigaction() tells the kernel to automatically restart certain syscalls after the handler returns:

struct sigaction sa;
sa.sa_handler = handler;
sigemptyset(&sa.sa_mask);
sa.sa_flags = SA_RESTART;  /* Auto-restart */
sigaction(SIGINT, &sa, NULL);

With SA_RESTART, many blocking calls transparently resume instead of returning EINTR.

Syscalls That Restart vs. Don't Restart

Even with SA_RESTART, not all syscalls restart. The lists below follow Linux behavior as documented in signal(7):

Generally restart (with SA_RESTART):

read(), write(), readv(), writev() on "slow" devices
wait(), waitpid()
accept(), connect()
fcntl() for locking (F_SETLKW)
flock()

Generally DON'T restart (even with SA_RESTART):

sleep(), nanosleep(), usleep()
select(), pselect(), poll(), epoll_wait()
pause(), sigsuspend()
semop(), msgrcv(), msgsnd()
accept() and recv() on sockets with SO_RCVTIMEO set; connect() and send() on sockets with SO_SNDTIMEO set

Rationale: Syscalls that should "break out" on signal (like pause) shouldn't auto-restart. Timer syscalls don't restart because the elapsed time would be lost.
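
Because sleeping calls are not restarted, code that must sleep for a full interval has to resume on its own. A minimal sketch using nanosleep(), which reports the unslept time through its second argument; sleep_fully is a hypothetical helper name.

```c
#include <errno.h>
#include <time.h>

/*
 * Sleeps for the full duration even if signal handlers interrupt the call.
 * Returns 0 on success, -1 on a non-EINTR error (for example EINVAL).
 */
int sleep_fully(struct timespec duration) {
    struct timespec remaining;

    while (nanosleep(&duration, &remaining) == -1) {
        if (errno != EINTR)
            return -1;
        duration = remaining;   /* continue with the unslept part only */
    }
    return 0;
}
```

Each restart adds a little drift. When an exact wake-up time matters, clock_nanosleep() with TIMER_ABSTIME avoids that by sleeping until an absolute deadline.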

Which Approach to Use?

Strategy	Pros	Cons
Handle EINTR explicitly	Full control, works everywhere	Verbose code
Use SA_RESTART	Simpler code	Some syscalls still return EINTR
Both approaches	Most robust	Slightly redundant (belt and suspenders)

Golden Rule

Always handle EINTR, even with SA_RESTART. The flag helps but doesn't cover all syscalls. Production code should either: (1) retry on EINTR in loops, or (2) treat EINTR as a valid 'check for shutdown' point by checking a flag before retrying.

When Not to Use Signals

Having explored signals in depth, it's worth stepping back to consider when signals are appropriate and when other IPC mechanisms are better choices.

Signal Strengths

Appropriate for:

Termination and shutdown requests (SIGTERM, SIGINT)
Hangup and configuration reload (SIGHUP)
Child process status monitoring (SIGCHLD)
Alarm and timer notifications (SIGALRM)
Process suspension and resumption (SIGSTOP, SIGCONT)
Abnormal termination (SIGKILL, SIGABRT)
Simple one-bit notifications (SIGUSR1/2)

Signal Weaknesses

Problematic for:

High-frequency notifications (signals are relatively expensive)
Data transfer (limited to one int via sigqueue)
Reliable counting (standard signals don't queue)
Complex protocols (no acknowledgment, difficult state management)
Cross-thread communication (masks per thread, complex)
Bidirectional communication (signals are one-way)

Choosing IPC Mechanisms
Use Case	Signals	Pipes	Sockets	Shared Memory
Shutdown notification	✅ Best	OK	OK	❌
Data transfer	❌	✅ Good	✅ Best	✅ Fastest
High frequency events	❌ Slow	Good	Good	✅ Best
Counted events	❌ May lose	✅	✅	✅
Cross-machine	❌ No	❌ No	✅ Yes	❌ No
Process management	✅ Best	❌	❌	❌
Debugging (stop/resume)	✅ Only option	❌	❌	❌

modern_alternatives.c
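/*
 * Modern alternatives to signal-based IPC on Linux.
 * The statements below are fragments meant to live inside a function;
 * error checks are omitted for brevity.
 */
 
#include <signal.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>
#include <sys/signalfd.h>
#include <sys/timerfd.h>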
 
/* 
 * eventfd: Fast, queueing event notification
 * Better than signals for high-frequency notifications
 */
int efd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
uint64_t count = 1;
write(efd, &count, sizeof(count));  /* Notify */
read(efd, &count, sizeof(count));   /* Receive - count is accurate */
 
/*
 * signalfd: Synchronous signal handling via fd
 * Eliminates async-signal-safety concerns entirely
 */
sigset_t mask;
sigemptyset(&mask);
sigaddset(&mask, SIGINT);
sigaddset(&mask, SIGTERM);
sigprocmask(SIG_BLOCK, &mask, NULL);  /* Block these signals */
 
int sfd = signalfd(-1, &mask, SFD_NONBLOCK | SFD_CLOEXEC);
/* Now read from sfd to get signals - no handlers needed */
 
struct signalfd_siginfo info;
if (read(sfd, &info, sizeof(info)) > 0) {
    /* Handle info.ssi_signo synchronously, safely */
}
 
/*
 * timerfd: Timer notifications as fd events
 * Far cleaner than SIGALRM for event loops
 */
int tfd = timerfd_create(CLOCK_MONOTONIC, TFD_NONBLOCK | TFD_CLOEXEC);
struct itimerspec ts = {
    .it_value = { .tv_sec = 5, .tv_nsec = 0 },    /* First in 5s */
    .it_interval = { .tv_sec = 1, .tv_nsec = 0 }  /* Then every 1s */
};
timerfd_settime(tfd, 0, &ts, NULL);
/* Now poll/select/epoll on tfd like any other fd */

The Modern Recommendation

For new code on Linux, consider these guidelines:

For event loops: Use signalfd() + eventfd() + timerfd() with epoll. Signals become just another event source, handled synchronously and safely.
For simple shutdown: Traditional signal handlers are fine, but keep handlers minimal (set flag only).
For data transfer: Don't use signals. Use pipes, sockets, or shared memory.
For multi-threaded apps: Block signals in all threads except a dedicated handler thread using sigwait().
For portable code: Stick to basic POSIX signal handling with sigaction() and proper EINTR handling.

Portability vs. Elegance

signalfd(), eventfd(), and timerfd() are Linux-specific. For portable Unix code, you're limited to traditional signal handling, pipes, and sockets. The self-pipe trick remains the portable way to integrate signals with event loops.

Robust Signal Handling Patterns

Let's consolidate everything into patterns you can apply directly in production systems.

Pattern 1: Minimal Handler with Flag

pattern_minimal.c
volatile sig_atomic_t shutdown_requested = 0;
 
void handler(int sig) {
    shutdown_requested = 1;
}
 
void main_loop() {
    while (!shutdown_requested) {
        do_work();
    }
    cleanup();
}

Pattern 2: Self-Pipe for Event Loops

pattern_selfpipe.c
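#define _GNU_SOURCE  /* for pipe2() on glibc */
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>
 
static int signal_pipe[2];
 
void handler(int sig) {
    int saved_errno = errno;
    char s = (char)sig;
    /* Non-blocking write: if the pipe is full, a wakeup is already pending */
    (void)write(signal_pipe[1], &s, 1);
    errno = saved_errno;
}
 
void event_loop(void) {
    /* pipe2() is not available everywhere; the portable route is pipe()
     * followed by fcntl() to set O_NONBLOCK and FD_CLOEXEC */
    if (pipe2(signal_pipe, O_NONBLOCK | O_CLOEXEC) == -1) {
        return;
    }
    /* Install handler() with sigaction() here, after the pipe exists */
    
    while (1) {
        struct pollfd fds[] = {
            { .fd = signal_pipe[0], .events = POLLIN },
            /* other fds... */
        };
        
        int n = poll(fds, 1, -1);
        if (n == -1) {
            if (errno == EINTR) {
                continue;  /* Handler ran; its byte is picked up next round */
            }
            break;         /* Real error */
        }
        
        if (fds[0].revents & POLLIN) {
            char sig;
            while (read(signal_pipe[0], &sig, 1) > 0) {
                handle_signal_safely((int)sig);
            }
        }
    }
}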

Pattern 3: SIGCHLD with Loop

pattern_sigchld.c
void sigchld_handler(int sig) {
    int saved_errno = errno;
    int status;
    pid_t pid;
    
    /* Reap ALL terminated children */
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        /* Log or record child exit */
    }
    
    errno = saved_errno;
}

Pattern 4: Graceful Shutdown with Timeout

pattern_graceful.c
volatile sig_atomic_t shutdown_phase = 0;
 
void term_handler(int sig) {
    if (shutdown_phase == 0) {
        shutdown_phase = 1;  /* First SIGTERM: graceful */
    } else {
        shutdown_phase = 2;  /* Second: force */
    }
}
 
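void shutdown_gracefully(void) {
    struct sigaction sa;
    
    /* sigaction() instead of signal(): the handler must stay installed
     * so that a second SIGTERM reaches term_handler as well */
    sa.sa_handler = term_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGTERM, &sa, NULL);
    sigaction(SIGINT, &sa, NULL);
    alarm(30);  /* Default SIGALRM action kills the process after 30s */
    
    while (shutdown_phase < 2 && work_remaining()) {
        do_remaining_work();
    }
    
    cleanup();
    exit(0);
}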

Pattern 5: Dedicated Signal Thread (Multi-threaded)

pattern_thread.c
void *signal_thread(void *arg) {
    sigset_t *mask = arg;
    int sig;
    
    while (1) {
        sigwait(mask, &sig);
        
        /* Handle synchronously - can use ANY function */
        switch (sig) {
            case SIGTERM: initiate_shutdown(); return NULL;
            case SIGHUP: reload_config(); break;
            case SIGUSR1: dump_stats(); break;
        }
    }
}
 
int main() {
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGTERM);
    sigaddset(&mask, SIGHUP);
    sigaddset(&mask, SIGUSR1);
    
    pthread_sigmask(SIG_BLOCK, &mask, NULL);  /* Block before creating threads */
    
    pthread_t sig_thr;
    pthread_create(&sig_thr, NULL, signal_thread, &mask);
    
    /* Worker threads inherit blocked mask, won't receive signals */
    /* ... */
}

Choose Your Pattern

Pattern 1: Simple single-threaded apps. Pattern 2: Event-driven servers. Pattern 3: Any code that forks children. Pattern 4: Long-running daemons. Pattern 5: Multi-threaded servers. Mix and match as needed—they're complementary.

Summary: Signal Reliability

Signal reliability is about understanding limitations and designing around them. You now have the complete picture—from historical problems to modern solutions. Let's consolidate:

Key Takeaways

•Historical problems are solved — Use sigaction(), not signal(). Modern signals are reliable in the sense of 'handlers stay installed.'
•Standard signals don't queue — Multiple occurrences collapse to one. Design for 'at least one' not 'exactly one.'
•Always loop in SIGCHLD — Multiple children may exit, but you get one signal. waitpid() with WNOHANG in a loop.
•Race conditions require sigsuspend() — Atomic unblock-and-wait eliminates the check-then-pause race.
•Real-time signals queue — When counting matters or you need payload, use SIGRTMIN+ with sigqueue().
•Handle EINTR everywhere — Even with SA_RESTART, some syscalls return EINTR. Retry or check shutdown flag.
•Consider alternatives — signalfd(), eventfd(), pipes, or dedicated signal threads often produce cleaner code.
•Keep handlers minimal — Set a flag, possibly write to a pipe. Do everything else in normal code context.

Module Complete:

You've completed the Signals module. You understand signal concepts, know the common signals, can implement handlers correctly, can send signals to processes and threads, and understand reliability concerns and solutions.

Signals are a fundamental UNIX mechanism—you'll encounter them in every substantial systems programming project. The knowledge from this module will serve you throughout your career in systems development, DevOps, and backend engineering on Unix-like platforms.

Module Complete: Signals

Congratulations! You now have deep, practical knowledge of UNIX signals—from concepts through reliability. You can implement robust signal handling in production systems, debug signal-related issues, and make informed decisions about when signals are the right tool versus other IPC mechanisms.

Signal Reliability: Edge Cases and Robustness

When Signals Go Wrong

This page examines signal reliability from historical context to modern best practices, ensuring you can build systems that handle signals correctly under all conditions.

What You Will Learn

Historical Unreliability: The V7 Unix Problem

The original signal implementation in Version 7 Unix (1979) had fundamental reliability problems that caused intermittent, hard-to-diagnose bugs.

The Core Problem: Handler Reset

/* V7-style signal handler - UNRELIABLE */
void sigint_handler(int sig) {
    /* DANGER: Signal disposition is now SIG_DFL! */
    /* If SIGINT arrives NOW, process terminates! */
    
    signal(SIGINT, sigint_handler);  /* Re-register */
    /* ... but there was a window ... */
    
    /* Handle the signal */
}

The Race Window

The race condition was real, not theoretical:

SIGINT arrives
Handler starts executing, disposition reset to SIG_DFL
Second SIGINT arrives NOW → Process terminates!
Handler would have re-registered if it had time

Under heavy signal load, this race triggered regularly. Worse, it was probabilistic—the bug was nearly impossible to reproduce consistently in testing but appeared in production.

Converting Mermaid diagram...

The BSD Solution (4.2BSD, 1983)

BSD addressed these issues with "reliable signals":

Handlers remained installed after invocation
sigblock() and sigsetmask() for blocking
sigpause() to atomically unblock and wait
SA_RESTART semantic for interrupted syscalls

But BSD's interface differed from System V, causing portability nightmares until POSIX unified signal handling in 1988.

Legacy Compatibility

Standard Signals Don't Queue

The Mechanism

For each signal, the kernel maintains a single pending bit:

/* Conceptual kernel data structure */
struct process {
    uint32_t pending_signals;  /* Bitmask, not a queue! */
    /* Signal 5 pending? Check bit 5. That's ALL the state. */
};

When signal N is generated:

Bit N is set in pending_signals
If bit N was already set, nothing changes—information is lost

Practical Implications

signal_count_problem.c
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
 
/*
 * Demonstrates signal counting problem with SIGCHLD.
 * Three children exit, but handler may only run once or twice!
 */
 
volatile sig_atomic_t sigchld_count = 0;
 
void sigchld_handler(int sig) {
    sigchld_count++;
    /* WRONG: Handler may not run for each child! */
}
 
/* CORRECT approach using loop in handler */
volatile sig_atomic_t children_handled = 0;
 
void proper_sigchld_handler(int sig) {
    int status;
    pid_t pid;
    
    /* 
     * Loop to reap ALL terminated children.
     * Multiple SIGCHLD may have collapsed into one.
     */
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        children_handled++;
        /* Process child exit */
    }
}
 
int main() {
    struct sigaction sa;
    
    /* Use the incorrect handler first */
    sa.sa_handler = sigchld_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;  /* Use SA_RESTART | SA_NOCLDSTOP to skip stop notifications */
    sigaction(SIGCHLD, &sa, NULL);
    
    /* Create three children that exit immediately */
    for (int i = 0; i < 3; i++) {
        if (fork() == 0) {
            _exit(i);  /* Child exits immediately */
        }
    }
    
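    /*
     * Give children time to exit. A caught SIGCHLD makes sleep() return
     * early (SA_RESTART does not apply to it), so sleep the remainder.
     */
    for (unsigned int left = 1; left > 0; )
        left = sleep(left);
    
    printf("Simple counter shows %d SIGCHLD signals\n", (int)sigchld_count);
    printf("But we had 3 children! Some signals may have been lost.\n\n");
    
    /* Reap the first batch so its zombies are not counted again below */
    while (waitpid(-1, NULL, 0) > 0)
        ;
    
    /* Now demonstrate correct approach */
    sa.sa_handler = proper_sigchld_handler;
    sigaction(SIGCHLD, &sa, NULL);
    sigchld_count = 0;
    children_handled = 0;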
    
    for (int i = 0; i < 3; i++) {
        if (fork() == 0) {
            _exit(i);
        }
    }
    
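    /* sleep() returns early when a handler runs; sleep the remainder */
    for (unsigned int left = 1; left > 0; )
        left = sleep(left);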
    
    printf("Proper loop handled %d children\n", children_handled);
    printf("Even if SIGCHLD only delivered once, we found all children.\n");
    
    return 0;
}
 
/*
 * KEY INSIGHT: 
 * Never assume handler runs once per event.
 * Multiple events may collapse into one signal.
 * Use loops to drain all pending events (waitpid with WNOHANG,
 * read from pipe until EAGAIN, etc.).
 */

The SIGCHLD Problem: Classic Example

SIGCHLD (child termination notification) is the most commonly affected signal. Consider a server that forks children for each connection:

Three children terminate simultaneously
Kernel generates three SIGCHLD signals
First SIGCHLD sets pending bit
Second and third SIGCHLD arrive—bit already set, nothing changes
Handler runs once, reaps one child
Two zombies remain!

Solution: Always loop in SIGCHLD handler with waitpid(..., WNOHANG) until no more children:

while (waitpid(-1, &status, WNOHANG) > 0) {
    /* Process each terminated child */
}

When Signal Counting Matters

If you need to count signal occurrences (not just react to them), standard signals are unsuitable:

Scenario	Standard Signals	Real-Time Signals
Count exact occurrences	❌ May lose count	✅ All queued
Event notification	✅ "Something happened"	✅ Plus count
Precise synchronization	❌ May miss some	✅ All delivered

Design Rule

Race Conditions in Signal Code

Signals introduce concurrency into otherwise sequential programs. Even single-threaded code must reason about race conditions when signals are involved.

Classic Race: Check-Then-Act

A common bug pattern:

volatile sig_atomic_t got_signal = 0;

void handler(int sig) { got_signal = 1; }

int main() {
    /* Race condition pattern */
    while (!got_signal) {  /* Check: flag is 0 */
        /* SIGNAL ARRIVES HERE - flag becomes 1 */
        pause();  /* Act: wait forever, signal already handled! */
        /* pause() will block until NEXT signal */
    }
}

The signal arrives after the check but before pause(). The handler runs and sets the flag, but pause() still blocks because the process had not yet entered pause() when the signal was delivered. It now waits for a second signal that may never come.

The Solution: sigsuspend()

sigsuspend() atomically combines unblocking and waiting:

#include <signal.h>

int sigsuspend(const sigset_t *mask);

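It temporarily replaces the signal mask with mask, suspends until a signal is caught, and restores the previous mask before returning. Because the mask change and the suspension happen as one atomic step, no race exists, provided the signal was already blocked (with sigprocmask()) before the flag was checked.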

sigsuspend_example.c
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
 
volatile sig_atomic_t got_sigint = 0;
 
void sigint_handler(int sig) {
    /* Minimal handler - uses write(), not printf() */
    const char msg[] = "\nHandler: SIGINT received\n";
    write(STDERR_FILENO, msg, sizeof(msg) - 1);
    got_sigint = 1;
}
 
int main() {
    struct sigaction sa;
    sigset_t mask, oldmask, emptymask;
    
    /* Install handler */
    sa.sa_handler = sigint_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGINT, &sa, NULL);
    
    /* Block SIGINT */
    sigemptyset(&mask);
    sigaddset(&mask, SIGINT);
    sigprocmask(SIG_BLOCK, &mask, &oldmask);
    
    sigemptyset(&emptymask);  /* For sigsuspend: unblock all */
    
    printf("SIGINT blocked. Do some work...\n");
    
    /* Simulate critical work while SIGINT blocked */
    for (int i = 0; i < 3; i++) {
        printf("Working... (Ctrl+C is deferred)\n");
        sleep(1);
    }
    
    printf("Critical work done. Waiting for SIGINT...\n");
    printf("(Any pending SIGINT will be delivered NOW)\n");
    
    /* 
     * CORRECT: Atomic unblock-and-wait using sigsuspend()
     * This replaces the buggy check-then-pause pattern.
     */
    while (!got_sigint) {
        /* 
         * Atomically:
         * 1. Set mask to emptymask (unblocking SIGINT)
         * 2. Suspend until signal
         * 3. Restore original mask on return
         */
        sigsuspend(&emptymask);
        /* Handler ran, we woke up */
    }
    
    /* Restore original mask */
    sigprocmask(SIG_SETMASK, &oldmask, NULL);
    
    printf("Got SIGINT, exiting cleanly.\n");
    return 0;
}
 
/*
 * Why sigsuspend works:
 * The unblock and suspend are ATOMIC. There's no window where
 * the signal can arrive after unblocking but before suspending.
 * The signal either:
 * - Was pending (delivered immediately, sigsuspend returns)
 * - Arrives after (wakes sigsuspend, which returns)
 */

Race: Modifying Shared Data

When handler and main code access the same data:

struct {
    int count;
    int values[100];
} data;

void handler(int sig) {
    data.values[data.count] = sig;  /* Read data.count */
    data.count++;                    /* Modify data.count */
}

int main() {
    while (1) {
        if (data.count > 0) {         /* Read data.count */
            /* SIGNAL INTERRUPTS HERE */
            int val = data.values[--data.count];  /* Read/modify */
            process(val);
        }
    }
}

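The decrement --data.count is a read-modify-write sequence. If the handler runs between main's load and its store, main writes back a stale count and the entry the handler just stored is silently dropped → corruption.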

Solutions:

Block signals during critical sections (sigprocmask)
Use lock-free techniques for simple cases (sig_atomic_t for single values)
Avoid shared data—handler only sets flags, main does all data manipulation

The Simplicity Principle

Real-Time Signals for Reliability

When standard signal limitations are unacceptable, POSIX real-time signals provide stronger guarantees.

Real-Time Signal Guarantees

Property	Standard Signals	Real-Time Signals
Queuing	No	Yes
Ordering	Unspecified	Lowest number first; FIFO within one signal number
Payload	No	Yes (sigqueue)
Count	May lose	All delivered
Delivery	One pending bit	Full queue
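The ordering row can be checked with a short experiment. The sketch below assumes Linux delivery behavior, a single-threaded process, and that neither real-time signal is blocked by the caller. rt_delivery_order is a hypothetical helper name. It queues one SIGRTMIN+1 first and two SIGRTMIN afterwards while both are blocked, then lets them through one at a time with sigsuspend() and records the payloads in delivery order.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t seen[3];
static volatile sig_atomic_t seen_count = 0;

static void record_value(int sig, siginfo_t *info, void *context) {
    (void)sig;
    (void)context;
    if (seen_count < 3)
        seen[seen_count++] = info->si_value.sival_int;
}

/*
 * Hypothetical helper: queue payloads 1 (SIGRTMIN+1), 2 and 3 (SIGRTMIN)
 * while both signals are blocked, then deliver them and copy the payloads
 * into out[] in delivery order. Returns 3 on success, -1 on error.
 */
int rt_delivery_order(int out[3]) {
    struct sigaction sa;
    sigset_t both, old_mask, wait_mask;
    union sigval v;
    const int low = SIGRTMIN, high = SIGRTMIN + 1;
    int ok = 1;

    sigemptyset(&both);
    sigaddset(&both, low);
    sigaddset(&both, high);

    sa.sa_sigaction = record_value;
    sa.sa_mask = both;               /* Keep the two handlers from nesting */
    sa.sa_flags = SA_SIGINFO;
    if (sigaction(low, &sa, NULL) == -1 || sigaction(high, &sa, NULL) == -1)
        return -1;

    sigprocmask(SIG_BLOCK, &both, &old_mask);
    seen_count = 0;

    v.sival_int = 1; ok &= (sigqueue(getpid(), high, v) == 0);  /* Sent first */
    v.sival_int = 2; ok &= (sigqueue(getpid(), low, v) == 0);
    v.sival_int = 3; ok &= (sigqueue(getpid(), low, v) == 0);

    if (ok) {
        /* Wait with both signals unblocked until all three were handled */
        wait_mask = old_mask;
        sigdelset(&wait_mask, low);
        sigdelset(&wait_mask, high);
        while (seen_count < 3)
            sigsuspend(&wait_mask);
    }

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    if (!ok)
        return -1;

    for (int i = 0; i < 3; i++)
        out[i] = (int)seen[i];
    return 3;
}
```

On Linux the recorded order is 2, 3, 1: both SIGRTMIN instances arrive first, in the order they were sent, and the higher-numbered signal arrives last even though it was queued first.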

The Range

Real-time signals span from SIGRTMIN to SIGRTMAX, typically:

Linux: SIGRTMIN (34) to SIGRTMAX (64) = 31 signals
The exact numbers vary by platform; always use symbolic constants

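/* Portable real-time signal access.
 * SIGRTMIN may expand to a function call (as in glibc), so compute this
 * at run time inside a function, not in a file-scope initializer. */
int my_signal = SIGRTMIN + 2;  /* Third real-time signal */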

/* Check range */
if (SIGRTMAX - SIGRTMIN >= 8) {
    /* At least 8 real-time signals available */
}

realtime_signal_queue.c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
 
/*
 * Demonstrates real-time signal queuing.
 * Send multiple signals, ALL are delivered.
 */
 
volatile sig_atomic_t rt_signal_count = 0;
volatile sig_atomic_t last_received_value = -1;
 
void rt_handler(int sig, siginfo_t *info, void *context) {
    rt_signal_count++;
    last_received_value = info->si_value.sival_int;
    
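    /* snprintf() is not on the POSIX async-signal-safe list. It is used
     * here only to keep the demo short; production handlers should
     * format by hand or defer output to the main program. */
    char buf[100];
    int len = snprintf(buf, sizeof(buf), 
                       "Received RT signal, value=%d, count=%d\n",
                       info->si_value.sival_int, rt_signal_count);
    write(STDOUT_FILENO, buf, len);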
}
 
int main() {
    struct sigaction sa;
    sigset_t block_mask, old_mask;
    union sigval val;
    
    /* Install real-time signal handler */
    sa.sa_sigaction = rt_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;  /* Required for si_value */
    sigaction(SIGRTMIN, &sa, NULL);
    
    /* Block SIGRTMIN to demonstrate queuing */
    sigemptyset(&block_mask);
    sigaddset(&block_mask, SIGRTMIN);
    sigprocmask(SIG_BLOCK, &block_mask, &old_mask);
    
    printf("SIGRTMIN blocked. Sending 5 signals...\n");
    
    /* Send 5 signals with different values */
    for (int i = 1; i <= 5; i++) {
        val.sival_int = i * 100;
        
        if (sigqueue(getpid(), SIGRTMIN, val) == -1) {
            perror("sigqueue");
            exit(1);
        }
        
        printf("  Queued signal with value %d\n", i * 100);
    }
    
    printf("\nUnblocking SIGRTMIN. All queued signals should deliver:\n\n");
    
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    
    /* Give time for signals to process */
    sleep(1);
    
    printf("\nTotal signals received: %d\n", rt_signal_count);
    printf("Compare with standard signals: they would have delivered ONCE.\n");
    
    return 0;
}

Queue Limits

Real-time signals queue, but the queue has finite capacity:

# Linux: per-user limit on queued signals (RLIMIT_SIGPENDING)
ulimit -i
# Kernels before 2.6.8 used a system-wide limit instead
cat /proc/sys/kernel/rtsig-max

When the queue is full, sigqueue() fails with EAGAIN:

if (sigqueue(pid, sig, val) == -1) {
    if (errno == EAGAIN) {
        /* Queue full - signal not delivered */
        /* Retry, use different IPC, or drop */
    }
}
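The same limit can be read from inside a program. This is a small sketch assuming Linux, where the limit is exposed as RLIMIT_SIGPENDING. The function name queued_signal_limit and its return convention are illustrative choices, not an established API.

```c
#define _GNU_SOURCE            /* RLIMIT_SIGPENDING is Linux-specific */
#include <sys/resource.h>

/*
 * Hypothetical helper: return the soft limit on queued signals,
 * -1 if the limit is unlimited, or -2 if getrlimit() failed.
 */
long queued_signal_limit(void) {
    struct rlimit rl;

    if (getrlimit(RLIMIT_SIGPENDING, &rl) == -1)
        return -2;
    if (rl.rlim_cur == RLIM_INFINITY)
        return -1;
    return (long)rl.rlim_cur;
}
```

A sender that expects bursts could compare its worst-case backlog against this value at startup and fall back to a pipe or socket if the margin is too small.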

When to Use Real-Time Signals

Good candidates:

Event counting applications
Real-time systems requiring guaranteed delivery
Complex IPC with payload data
Applications needing multiple distinct signal types

Overkill for:

Simple termination handling
One-time notifications
Cases where 'at least one' is sufficient

Platform Considerations

Interrupted System Calls: EINTR and SA_RESTART

When a signal is delivered while a process is blocked in a system call, the interaction is complex. Understanding this is essential for reliable I/O code.

The EINTR Problem

Many system calls (read, write, wait, nanosleep, select, etc.) block until an event occurs. If a signal arrives during the block:

Kernel delivers signal, handler runs
Handler returns
System call returns -1 with errno = EINTR ("Interrupted")

Without handling EINTR, your code fails mysteriously:

/* WRONG: Doesn't handle EINTR */
int n = read(fd, buf, size);
if (n == -1) {
    perror("read failed");  /* Might just be EINTR - not a real error! */
    exit(1);
}

eintr_handling.c
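#include <signal.h>
#include <unistd.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/select.h>  /* select(), fd_set */
#include <sys/wait.h>    /* waitpid() */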
 
/*
 * Wrapper functions that handle EINTR correctly.
 * These should be used for all blocking I/O in signal-aware code.
 */
 
/* Read that retries on EINTR */
ssize_t safe_read(int fd, void *buf, size_t count) {
    ssize_t n;
    
    while ((n = read(fd, buf, count)) == -1) {
        if (errno == EINTR) {
            /* Interrupted by signal - safe to retry */
            continue;
        }
        /* Real error */
        return -1;
    }
    
    return n;
}
 
/* Write that retries on EINTR and handles partial writes */
ssize_t safe_write(int fd, const void *buf, size_t count) {
    const char *ptr = buf;
    size_t remaining = count;
    
    while (remaining > 0) {
        ssize_t n = write(fd, ptr, remaining);
        
        if (n == -1) {
            if (errno == EINTR) {
                /* Interrupted - retry */
                continue;
            }
            return -1;  /* Real error */
        }
        
        ptr += n;
        remaining -= n;
    }
    
    return count;
}
 
/* Wait that retries on EINTR */
pid_t safe_waitpid(pid_t pid, int *status, int options) {
    pid_t result;
    
    while ((result = waitpid(pid, status, options)) == -1) {
        if (errno == EINTR) {
            continue;
        }
        return -1;
    }
    
    return result;
}
 
/* Select that restarts on EINTR (with timeout adjustment for precision) */
int safe_select(int nfds, fd_set *readfds, fd_set *writefds,
                fd_set *exceptfds, struct timeval *timeout) {
    int result;
    /* Note: For precise timeout handling, should use pselect or track elapsed time */
    
    while ((result = select(nfds, readfds, writefds, exceptfds, timeout)) == -1) {
        if (errno == EINTR) {
            continue;  /* Timeout may need adjustment for accuracy */
        }
        return -1;
    }
    
    return result;
}

SA_RESTART: Automatic System Call Restart

The SA_RESTART flag in sigaction() tells the kernel to automatically restart certain syscalls after the handler returns:

struct sigaction sa;
sa.sa_handler = handler;
sigemptyset(&sa.sa_mask);
sa.sa_flags = SA_RESTART;  /* Auto-restart */
sigaction(SIGINT, &sa, NULL);

With SA_RESTART, many blocking calls transparently resume instead of returning EINTR.
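The difference can be made visible with a pipe and a timer. In the sketch below, the helper name read_interrupted_by_alarm and the 50 ms delay are assumptions for illustration. read() blocks on an empty pipe, SIGALRM interrupts it, and the handler writes one byte into the pipe so that a restarted read() has something to return.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <sys/time.h>
#include <unistd.h>

static int wake_fd = -1;

static void alarm_handler(int sig) {
    int saved_errno = errno;
    ssize_t w;

    (void)sig;
    w = write(wake_fd, "x", 1);   /* Data for a restarted read() */
    (void)w;
    errno = saved_errno;
}

/*
 * Hypothetical helper: block in read() on an empty pipe and interrupt it
 * with SIGALRM after 50 ms. `flags` becomes sa_flags (0 or SA_RESTART).
 * Returns read()'s result with errno preserved, or -2 on setup failure.
 */
ssize_t read_interrupted_by_alarm(int flags) {
    int fds[2];
    char c;
    struct sigaction sa, old_sa;
    struct itimerval tv = { .it_value = { .tv_sec = 0, .tv_usec = 50000 } };
    ssize_t n;
    int saved_errno;

    if (pipe(fds) == -1)
        return -2;
    wake_fd = fds[1];

    sa.sa_handler = alarm_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = flags;
    sigaction(SIGALRM, &sa, &old_sa);

    setitimer(ITIMER_REAL, &tv, NULL);   /* One-shot timer */
    n = read(fds[0], &c, 1);             /* Blocks until the signal arrives */
    saved_errno = errno;

    sigaction(SIGALRM, &old_sa, NULL);
    close(fds[0]);
    close(fds[1]);
    errno = saved_errno;
    return n;
}
```

Called with flags set to 0, the function returns -1 with errno set to EINTR. Called with SA_RESTART, the kernel restarts read() after the handler returns and the call yields the one byte the handler wrote.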

Syscalls That Restart vs. Don't Restart

Even with SA_RESTART, not all syscalls restart:

Generally restart (with SA_RESTART):

read(), write(), readv(), writev() on "slow" devices
wait(), waitpid()
accept(), connect()
fcntl() for locking (F_SETLKW)
flock()

Generally DON'T restart (even with SA_RESTART):

sleep(), nanosleep(), usleep()
select(), pselect(), poll(), epoll_wait()
pause(), sigsuspend()
semop(), msgrcv(), msgsnd()
Socket calls such as accept(), recv(), and send() when a timeout has been set with SO_RCVTIMEO or SO_SNDTIMEO (on Linux)

Rationale: Syscalls that should "break out" on signal (like pause) shouldn't auto-restart. Timer syscalls don't restart because the elapsed time would be lost.
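For sleep-style calls, the usual workaround is to resume with the time still remaining. A minimal sketch follows, assuming nanosleep() as the underlying call. sleep_fully is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/*
 * Hypothetical helper: sleep for the whole interval even if signal
 * handlers run in between. Returns 0 on success, or -1 on a real
 * error with errno set by nanosleep().
 */
int sleep_fully(time_t sec, long nsec) {
    struct timespec req = { .tv_sec = sec, .tv_nsec = nsec };
    struct timespec rem;

    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;        /* e.g. EINVAL for an out-of-range tv_nsec */
        req = rem;            /* Interrupted: continue with what is left */
    }
    return 0;
}
```

Each interruption adds a little rounding drift. Code that needs an exact wake-up time is usually better served by clock_nanosleep() with TIMER_ABSTIME, which the sketch leaves out.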

Which Approach to Use?

Strategy	Pros	Cons
Handle EINTR explicitly	Full control, works everywhere	Verbose code
Use SA_RESTART	Simpler code	Some syscalls still return EINTR
Both approaches	Most robust	Belt-and-suspenders

Golden Rule

When Not to Use Signals

Having explored signals in depth, it's worth stepping back to consider when signals are appropriate and when other IPC mechanisms are better choices.

Signal Strengths

Appropriate for:

Termination and shutdown requests (SIGTERM, SIGINT)
Hangup and configuration reload (SIGHUP)
Child process status monitoring (SIGCHLD)
Alarm and timer notifications (SIGALRM)
Process suspension and resumption (SIGSTOP, SIGCONT)
Abnormal termination (SIGKILL, SIGABRT)
Simple one-bit notifications (SIGUSR1/2)

Signal Weaknesses

Problematic for:

High-frequency notifications (signals are relatively expensive)
Data transfer (limited to one int via sigqueue)
Reliable counting (standard signals don't queue)
Complex protocols (no acknowledgment, difficult state management)
Cross-thread communication (masks per thread, complex)
Bidirectional communication (signals are one-way)

Choosing IPC Mechanisms
Use Case	Signals	Pipes	Sockets	Shared Memory
Shutdown notification	✅ Best	OK	OK	❌
Data transfer	❌	✅ Good	✅ Best	✅ Fastest
High frequency events	❌ Slow	Good	Good	✅ Best
Counted events	❌ May lose	✅	✅	✅
Cross-machine	❌ No	❌ No	✅ Yes	❌ No
Process management	✅ Best	❌	❌	❌
Debugging (stop/resume)	✅ Only option	❌	❌	❌

modern_alternatives.c
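/*
 * Modern alternatives to signal-based IPC on Linux.
 * Fragments for illustration: in a real program these statements
 * belong inside a function.
 */
 
#include <signal.h>      /* sigset_t, sigprocmask() */
#include <stdint.h>      /* uint64_t */
#include <unistd.h>      /* read(), write() */
#include <sys/eventfd.h>
#include <sys/signalfd.h>
#include <sys/timerfd.h>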
 
/* 
 * eventfd: Fast, queueing event notification
 * Better than signals for high-frequency notifications
 */
int efd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
uint64_t count = 1;
write(efd, &count, sizeof(count));  /* Notify */
read(efd, &count, sizeof(count));   /* Receive - count is accurate */
 
/*
 * signalfd: Synchronous signal handling via fd
 * Eliminates async-signal-safety concerns entirely
 */
sigset_t mask;
sigemptyset(&mask);
sigaddset(&mask, SIGINT);
sigaddset(&mask, SIGTERM);
sigprocmask(SIG_BLOCK, &mask, NULL);  /* Block these signals */
 
int sfd = signalfd(-1, &mask, SFD_NONBLOCK | SFD_CLOEXEC);
/* Now read from sfd to get signals - no handlers needed */
 
struct signalfd_siginfo info;
if (read(sfd, &info, sizeof(info)) > 0) {
    /* Handle info.ssi_signo synchronously, safely */
}
 
/*
 * timerfd: Timer notifications as fd events
 * Far cleaner than SIGALRM for event loops
 */
int tfd = timerfd_create(CLOCK_MONOTONIC, TFD_NONBLOCK | TFD_CLOEXEC);
struct itimerspec ts = {
    .it_value = { .tv_sec = 5, .tv_nsec = 0 },    /* First in 5s */
    .it_interval = { .tv_sec = 1, .tv_nsec = 0 }  /* Then every 1s */
};
timerfd_settime(tfd, 0, &ts, NULL);
/* Now poll/select/epoll on tfd like any other fd */
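To make the eventfd claim concrete, here is a small runnable sketch (Linux only). The function name eventfd_posted_count is hypothetical. It posts several notifications and then drains the descriptor once, which is exactly the counting behavior standard signals cannot provide.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Hypothetical helper: post `events` notifications to a fresh eventfd,
 * then drain it with one read(). Returns the accumulated count, or -1
 * on error (including events == 0, where the non-blocking read fails).
 */
long eventfd_posted_count(int events) {
    uint64_t one = 1, total = 0;
    int efd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);

    if (efd == -1)
        return -1;

    for (int i = 0; i < events; i++) {
        if (write(efd, &one, sizeof one) != (ssize_t)sizeof one) {
            close(efd);
            return -1;
        }
    }

    /* One read returns the sum of all writes and resets the counter */
    if (read(efd, &total, sizeof total) != (ssize_t)sizeof total) {
        close(efd);
        return -1;
    }

    close(efd);
    return (long)total;
}
```

Three writes followed by one read yield 3, whereas three standard signals sent while the receiver is busy may yield a single handler run.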

The Modern Recommendation

For new code on Linux, consider these guidelines:

For event loops: Use signalfd() + eventfd() + timerfd with epoll (epoll_create1(), epoll_ctl(), epoll_wait()). Signals become just another event source, handled synchronously and safely.
For simple shutdown: Traditional signal handlers are fine, but keep handlers minimal (set flag only).
For data transfer: Don't use signals. Use pipes, sockets, or shared memory.
For multi-threaded apps: Block signals in all threads except a dedicated handler thread using sigwait().
For portable code: Stick to basic POSIX signal handling with sigaction() and proper EINTR handling.
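The event-loop guideline can be sketched as follows (Linux only). The struct sig_loop and the functions sig_loop_open and sig_loop_wait are hypothetical names used for illustration, and the sketch watches a single signal in a single-threaded process.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/epoll.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Hypothetical holder for the two descriptors used by this sketch */
struct sig_loop {
    int sfd;    /* signalfd descriptor */
    int epfd;   /* epoll instance watching sfd */
};

/* Block `sig`, open a signalfd for it and register it with epoll.
 * Returns 0 on success, -1 on error. */
int sig_loop_open(struct sig_loop *loop, int sig) {
    sigset_t mask;
    struct epoll_event ev = { .events = EPOLLIN };

    sigemptyset(&mask);
    sigaddset(&mask, sig);
    if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1)   /* No handler will run */
        return -1;

    loop->sfd = signalfd(-1, &mask, SFD_NONBLOCK | SFD_CLOEXEC);
    if (loop->sfd == -1)
        return -1;

    loop->epfd = epoll_create1(EPOLL_CLOEXEC);
    if (loop->epfd == -1) {
        close(loop->sfd);
        return -1;
    }

    ev.data.fd = loop->sfd;
    if (epoll_ctl(loop->epfd, EPOLL_CTL_ADD, loop->sfd, &ev) == -1) {
        close(loop->epfd);
        close(loop->sfd);
        return -1;
    }
    return 0;
}

/* Wait up to timeout_ms for a signal. Returns the signal number,
 * 0 on timeout, or -1 on error. */
int sig_loop_wait(struct sig_loop *loop, int timeout_ms) {
    struct epoll_event ev;
    struct signalfd_siginfo info;
    int n = epoll_wait(loop->epfd, &ev, 1, timeout_ms);

    if (n <= 0)
        return n;
    if (read(loop->sfd, &info, sizeof info) != (ssize_t)sizeof info)
        return -1;
    return (int)info.ssi_signo;   /* Handled in normal program context */
}
```

Because the signal stays blocked and is consumed with read(), the code that reacts to it runs in normal program context and may call any function. A real loop would register its sockets, timerfd, and eventfd descriptors on the same epoll instance.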

Portability vs. Elegance

Robust Signal Handling Patterns

Let's consolidate everything into patterns you can apply directly in production systems.

Pattern 1: Minimal Handler with Flag

pattern_minimal.c
volatile sig_atomic_t shutdown_requested = 0;
 
void handler(int sig) {
    shutdown_requested = 1;
}
 
void main_loop() {
    while (!shutdown_requested) {
        do_work();
    }
    cleanup();
}

Pattern 2: Self-Pipe for Event Loops

pattern_selfpipe.c
static int signal_pipe[2];
 
void handler(int sig) {
    int saved_errno = errno;
    char s = sig;
    write(signal_pipe[1], &s, 1);
    errno = saved_errno;
}
 
void event_loop() {
    pipe2(signal_pipe, O_NONBLOCK | O_CLOEXEC);
    
    while (1) {
        struct pollfd fds[] = {
            { .fd = signal_pipe[0], .events = POLLIN },
            /* other fds... */
        };
        
        int n = poll(fds, 1, -1);
        if (n == -1 && errno == EINTR)
            continue;  /* Interrupted by the handler itself; poll again */
        
        if (fds[0].revents & POLLIN) {
            char sig;
            while (read(signal_pipe[0], &sig, 1) > 0) {
                handle_signal_safely((int)sig);
            }
        }
    }
}

Pattern 3: SIGCHLD with Loop

pattern_sigchld.c
void sigchld_handler(int sig) {
    int saved_errno = errno;
    int status;
    pid_t pid;
    
    /* Reap ALL terminated children */
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        /* Log or record child exit */
    }
    
    errno = saved_errno;
}

Pattern 4: Graceful Shutdown with Timeout

pattern_graceful.c
volatile sig_atomic_t shutdown_phase = 0;
 
void term_handler(int sig) {
    if (shutdown_phase == 0) {
        shutdown_phase = 1;  /* First SIGTERM: graceful */
    } else {
        shutdown_phase = 2;  /* Second: force */
    }
}
 
void shutdown_gracefully() {
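    struct sigaction sa;
    sa.sa_handler = term_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGTERM, &sa, NULL);  /* sigaction(), not signal(): portable semantics */
    sigaction(SIGINT, &sa, NULL);
    alarm(30);  /* No SIGALRM handler installed: the default action ends the process after 30s */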
    
    while (shutdown_phase < 2 && work_remaining()) {
        do_remaining_work();
    }
    
    cleanup();
    exit(0);
}

Pattern 5: Dedicated Signal Thread (Multi-threaded)

pattern_thread.c
void *signal_thread(void *arg) {
    sigset_t *mask = arg;
    int sig;
    
    while (1) {
        if (sigwait(mask, &sig) != 0)
            continue;  /* sigwait() returns an error number on failure, not -1 */
        
        /* Handle synchronously - can use ANY function */
        switch (sig) {
            case SIGTERM: initiate_shutdown(); return NULL;
            case SIGHUP: reload_config(); break;
            case SIGUSR1: dump_stats(); break;
        }
    }
}
 
int main() {
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGTERM);
    sigaddset(&mask, SIGHUP);
    sigaddset(&mask, SIGUSR1);
    
    pthread_sigmask(SIG_BLOCK, &mask, NULL);  /* Block before creating threads */
    
    pthread_t sig_thr;
    pthread_create(&sig_thr, NULL, signal_thread, &mask);
    
    /* Worker threads inherit blocked mask, won't receive signals */
    /* ... */
}

Choose Your Pattern

Summary: Signal Reliability

Signal reliability is about understanding limitations and designing around them. You now have the complete picture—from historical problems to modern solutions. Let's consolidate:

Key Takeaways

•Historical problems are solved — Use sigaction(), not signal(). Modern signals are reliable in the sense of 'handlers stay installed.'
•Standard signals don't queue — Multiple occurrences collapse to one. Design for 'at least one' not 'exactly one.'
•Always loop in SIGCHLD — Multiple children may exit, but you get one signal. waitpid() with WNOHANG in a loop.
•Race conditions require sigsuspend() — Atomic unblock-and-wait eliminates the check-then-pause race.
•Real-time signals queue — When counting matters or you need payload, use SIGRTMIN+ with sigqueue().
•Handle EINTR everywhere — Even with SA_RESTART, some syscalls return EINTR. Retry or check shutdown flag.
•Consider alternatives — signalfd(), eventfd(), pipes, or dedicated signal threads often produce cleaner code.
•Keep handlers minimal — Set a flag, possibly write to a pipe. Do everything else in normal code context.

Module Complete: Signals
