Process termination is the most straightforward deadlock recovery mechanism: to break the cycle of waiting, we eliminate one or more processes entirely. This releases all resources held by the terminated process, potentially allowing other blocked processes to proceed.
But termination is not as simple as calling kill(). A process being terminated may hold locks that other processes are waiting on, open files with unflushed buffers, shared memory in a half-updated state, and in-flight transactions or network connections.
A careless termination can leave the system in a worse state than the deadlock itself. Corrupted files, inconsistent databases, orphaned resources, and undefined behavior are all possible consequences.
This page covers how to terminate processes safely, handle cleanup properly, and minimize collateral damage when using termination for deadlock recovery.
By the end of this page, you will be able to terminate processes safely (gracefully and, when necessary, forcefully), implement proper cleanup handlers, reason about cascading termination effects, prevent data corruption and resource leaks, and apply production patterns for termination-based recovery.
Operating systems provide multiple ways to terminate processes, each with different semantics and safety properties.
| Signal | Number | Behavior | Catchable? | Use Case |
|---|---|---|---|---|
| SIGTERM | 15 | Graceful termination request | Yes | Normal shutdown, allows cleanup |
| SIGINT | 2 | Interrupt (Ctrl+C) | Yes | Interactive termination |
| SIGQUIT | 3 | Quit with core dump | Yes | Debug termination |
| SIGKILL | 9 | Immediate termination | No | Force kill unresponsive process |
| SIGABRT | 6 | Abort with core dump | Yes* | Programmatic abort |
| SIGHUP | 1 | Hangup | Yes | Terminal closed, daemon reload |

*SIGABRT can be caught, but if the handler returns, abort() restores the default disposition and re-raises the signal, so the process still terminates.
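The "Catchable?" column is enforced by the kernel, not by convention: a process may install a handler for SIGTERM, but any attempt to change the disposition of SIGKILL is rejected. A minimal sketch using Python's signal module (which wraps the underlying sigaction mechanism):

```python
import signal

# SIGTERM is catchable: installing a handler succeeds.
signal.signal(signal.SIGTERM, lambda signum, frame: None)

# SIGKILL is not: the kernel rejects any change to its disposition,
# so signal.signal() raises OSError (EINVAL).
try:
    signal.signal(signal.SIGKILL, lambda signum, frame: None)
    sigkill_rejected = False
except OSError:
    sigkill_rejected = True

print(sigkill_rejected)  # True on POSIX systems
```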
```c
#include <signal.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <stdio.h>
#include <time.h>

/**
 * Safe process termination with fallback to forceful kill.
 *
 * Strategy:
 *   1. Send SIGTERM - request graceful shutdown
 *   2. Wait up to timeout_ms for process to exit
 *   3. If still alive, send SIGKILL - force termination
 *   4. Wait for kernel to clean up (zombies)
 *
 * Returns: 0 on success, -1 on error
 */
int terminate_process_safely(pid_t pid, int timeout_ms) {
    int status;
    pid_t result;

    // Step 1: Request graceful termination
    if (kill(pid, SIGTERM) == -1) {
        if (errno == ESRCH) {
            // Process already dead
            return 0;
        }
        perror("kill(SIGTERM)");
        return -1;
    }
    printf("Sent SIGTERM to PID %d, waiting for graceful exit...\n", pid);

    // Step 2: Wait for process to exit
    struct timespec start, now;
    clock_gettime(CLOCK_MONOTONIC, &start);

    while (1) {
        // Non-blocking wait
        result = waitpid(pid, &status, WNOHANG);
        if (result == pid) {
            // Process exited
            if (WIFEXITED(status)) {
                printf("Process %d exited with code %d\n",
                       pid, WEXITSTATUS(status));
            } else if (WIFSIGNALED(status)) {
                printf("Process %d killed by signal %d\n",
                       pid, WTERMSIG(status));
            }
            return 0;
        }
        if (result == -1 && errno == ECHILD) {
            // Not our child, check if still alive
            if (kill(pid, 0) == -1 && errno == ESRCH) {
                // Process gone
                return 0;
            }
        }

        // Check timeout
        clock_gettime(CLOCK_MONOTONIC, &now);
        long elapsed_ms = (now.tv_sec - start.tv_sec) * 1000 +
                          (now.tv_nsec - start.tv_nsec) / 1000000;
        if (elapsed_ms >= timeout_ms) {
            break;  // Timeout reached
        }

        // Brief sleep to avoid busy-waiting
        usleep(10000);  // 10ms
    }

    // Step 3: Force kill - no more Mr. Nice Guy
    printf("Graceful shutdown timed out, sending SIGKILL to PID %d\n", pid);
    if (kill(pid, SIGKILL) == -1) {
        if (errno == ESRCH) {
            // Died between our checks
            return 0;
        }
        perror("kill(SIGKILL)");
        return -1;
    }

    // Step 4: Wait for kernel cleanup.
    // SIGKILL cannot be caught, so the process will definitely die.
    result = waitpid(pid, &status, 0);  // Blocking wait
    if (result == pid) {
        printf("Process %d force-killed\n", pid);
        return 0;
    }
    // If we're not the parent, we can't wait - just assume it's dead
    return 0;
}

/**
 * Terminate multiple processes in dependency order.
 *
 * When terminating for deadlock recovery, consider that:
 *   1. Some processes may depend on others being alive
 *   2. Child processes may become orphans
 *   3. Process groups may need to be terminated together
 */
int terminate_process_group(pid_t pgid, int timeout_ms) {
    // Send signal to entire process group (negative PID)
    if (kill(-pgid, SIGTERM) == -1 && errno != ESRCH) {
        perror("kill(-pgid, SIGTERM)");
        return -1;
    }
    // Wait for all processes in the group
    // (Implementation would iterate through /proc or use waitpid in a loop)
    return 0;
}
```

The Two-Phase Termination Pattern:
```text
Phase 1: Graceful (SIGTERM)
  → Process catches signal
  → Runs cleanup handlers
  → Releases resources properly
  → Exits cleanly

Phase 2: Forceful (SIGKILL) - only if Phase 1 fails
  → Kernel immediately terminates process
  → No cleanup handlers run
  → Resources reclaimed by kernel
  → Potential for inconsistent state
```
Always try graceful first. Only use SIGKILL as a last resort, and be prepared to clean up any orphaned resources.
SIGKILL cannot be caught, blocked, or ignored. The kernel terminates the process immediately. Any locks held via pthread_mutex_lock(), any open file handles, any shared memory references—all are abandoned in their current state. Use SIGKILL only when the process is truly unresponsive.
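The same two-phase escalation can be expressed with any process-management API. A sketch using Python's subprocess module on POSIX, where Popen.terminate() sends SIGTERM and Popen.kill() sends SIGKILL; the stubborn child here is a contrived stand-in for an unresponsive process:

```python
import subprocess
import sys

def terminate_safely(proc: subprocess.Popen, timeout: float = 5.0) -> str:
    """Two-phase termination: SIGTERM first, SIGKILL only as a last resort."""
    proc.terminate()                 # Phase 1: polite request (SIGTERM)
    try:
        proc.wait(timeout=timeout)
        return "graceful"
    except subprocess.TimeoutExpired:
        proc.kill()                  # Phase 2: SIGKILL, cannot be caught
        proc.wait()                  # Reap the child to avoid a zombie
        return "forced"

# A child that ignores SIGTERM, forcing us into phase 2.
# It prints "ready" once the disposition is set, so we don't race it.
stubborn = subprocess.Popen(
    [sys.executable, "-u", "-c",
     "import signal, time\n"
     "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
     "print('ready', flush=True)\n"
     "time.sleep(60)"],
    stdout=subprocess.PIPE)
stubborn.stdout.readline()           # wait until SIGTERM is being ignored

result = terminate_safely(stubborn, timeout=1.0)
print(result)  # "forced" - SIGTERM was ignored, SIGKILL was not
```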
For graceful termination to work safely, processes must implement proper cleanup handlers that release resources before exiting.
```c
#include <signal.h>
#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <stdbool.h>
#include <errno.h>      // EBUSY
#include <fcntl.h>      // open() flags
#include <sys/mman.h>   // munmap()

// Global state that needs cleanup
static int log_fd = -1;
static pthread_mutex_t *shared_lock = NULL;
static void *shared_memory = NULL;
static size_t shared_memory_size = 0;
static volatile sig_atomic_t shutdown_requested = 0;

/**
 * Signal handler for graceful termination.
 *
 * IMPORTANT: Signal handlers must be async-signal-safe.
 * Only call functions from the async-signal-safe list.
 * Cannot use: malloc, printf, mutex operations (mostly).
 */
void termination_handler(int signum) {
    (void)signum;
    // Only set a flag - actual cleanup happens in the main thread
    shutdown_requested = 1;

    // If we need to wake up a sleeping thread:
    // write(self_pipe[1], "x", 1);  // Self-pipe trick
}

/**
 * Cleanup function called during graceful shutdown.
 * This runs in normal context (not signal handler).
 */
void perform_cleanup(void) {
    printf("Performing cleanup...\n");

    // 1. Release any held locks
    if (shared_lock != NULL) {
        // WARNING: If we hold this lock, we must release it.
        // If another thread holds it, we can't just destroy it.
        // Check if WE hold it before releasing.
        int trylock_result = pthread_mutex_trylock(shared_lock);
        if (trylock_result == 0) {
            // We just acquired it - release immediately
            pthread_mutex_unlock(shared_lock);
        } else if (trylock_result == EBUSY) {
            // Someone else holds it - we shouldn't destroy it.
            // But we're exiting, so log and continue.
            fprintf(stderr, "Warning: Lock held during shutdown\n");
        }
        // If EDEADLK (we hold it), just unlock.
        // Note: This is tricky and depends on the mutex type.
    }

    // 2. Flush and close file handles
    if (log_fd >= 0) {
        fsync(log_fd);   // Ensure data is on disk
        close(log_fd);
        log_fd = -1;
        printf("  Closed log file\n");
    }

    // 3. Unmap shared memory
    if (shared_memory != NULL) {
        munmap(shared_memory, shared_memory_size);
        shared_memory = NULL;
        printf("  Unmapped shared memory\n");
    }

    // 4. Notify other processes we're leaving
    //    e.g., remove ourselves from a registry, send goodbye message

    // 5. Cancel pending operations
    //    e.g., abort in-progress network requests

    printf("Cleanup complete\n");
}

/**
 * pthread cleanup handler for thread cancellation.
 * Called when a thread is cancelled while in a cancellation point.
 */
void thread_cleanup(void *arg) {
    const char *resource_name = (const char *)arg;
    printf("Thread cleanup: releasing %s\n", resource_name);
    // Release thread-local resources
}

/**
 * atexit handler for normal exit paths.
 */
void atexit_cleanup(void) {
    perform_cleanup();
}

/**
 * Main loop with graceful shutdown support.
 */
int main(int argc, char *argv[]) {
    // Install signal handlers
    struct sigaction sa;
    sa.sa_handler = termination_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGTERM, &sa, NULL);
    sigaction(SIGINT, &sa, NULL);

    // Register atexit cleanup
    atexit(atexit_cleanup);

    // Initialize resources
    log_fd = open("/tmp/app.log", O_WRONLY | O_CREAT, 0644);

    printf("Process started, PID %d\n", getpid());

    // Main loop - checks shutdown flag
    while (!shutdown_requested) {
        // Do work...
        // Example: check shutdown flag periodically
        for (int i = 0; i < 10 && !shutdown_requested; i++) {
            // Actual work
            sleep(1);
        }
    }

    printf("Shutdown requested, exiting gracefully\n");
    // perform_cleanup() will be called by atexit
    return 0;
}
```

C++ RAII Pattern for Automatic Cleanup:
```cpp
#include <mutex>
#include <memory>
#include <fstream>
#include <csignal>
#include <atomic>
#include <iostream>
#include <string>

/**
 * RAII wrappers ensure cleanup even on unexpected termination.
 *
 * When the process terminates (even via SIGTERM):
 *   - Stack unwinding destroys local objects
 *   - Destructors run cleanup code
 *   - Resources are automatically released
 *
 * CAVEAT: SIGKILL prevents destructor execution!
 */

class ManagedResource {
public:
    ManagedResource(const std::string& name)
        : name_(name), valid_(true) {
        std::cout << "Acquired: " << name_ << std::endl;
    }

    ~ManagedResource() {
        if (valid_) {
            std::cout << "Released: " << name_ << std::endl;
            // Actual cleanup: close handles, release locks, etc.
        }
    }

    // Move semantics for ownership transfer
    ManagedResource(ManagedResource&& other) noexcept
        : name_(std::move(other.name_)), valid_(other.valid_) {
        other.valid_ = false;
    }

    ManagedResource& operator=(ManagedResource&& other) noexcept {
        if (this != &other) {
            if (valid_) {
                // Release current resource
            }
            name_ = std::move(other.name_);
            valid_ = other.valid_;
            other.valid_ = false;
        }
        return *this;
    }

    // No copying
    ManagedResource(const ManagedResource&) = delete;
    ManagedResource& operator=(const ManagedResource&) = delete;

private:
    std::string name_;
    bool valid_;
};

class ScopedLock {
public:
    explicit ScopedLock(std::mutex& m) : mutex_(m), owned_(false) {
        mutex_.lock();
        owned_ = true;
    }

    ~ScopedLock() {
        if (owned_) {
            mutex_.unlock();
        }
    }

    void release() {
        if (owned_) {
            mutex_.unlock();
            owned_ = false;
        }
    }

private:
    std::mutex& mutex_;
    bool owned_;
};

// Signal handling with graceful cleanup
std::atomic<bool> shutdown_requested{false};

void signal_handler(int sig) {
    (void)sig;
    shutdown_requested = true;
}

void worker_function() {
    // RAII ensures cleanup even if we exit early
    ManagedResource file_handle("database_connection");
    ManagedResource memory_buffer("process_memory");

    static std::mutex global_mutex;

    while (!shutdown_requested) {
        // Lock is automatically released when scope exits
        std::lock_guard<std::mutex> lock(global_mutex);

        // Do protected work...

        // If we receive SIGTERM here, the handler sets the flag,
        // the loop exits, and normal function exit will:
        //   1. Release the lock_guard
        //   2. Destroy memory_buffer
        //   3. Destroy file_handle
    }

    std::cout << "Worker exiting cleanly" << std::endl;
    // Destructors run automatically as function exits
}

int main() {
    std::signal(SIGTERM, signal_handler);
    std::signal(SIGINT, signal_handler);

    worker_function();
    return 0;
}
```

The best cleanup handlers are the ones you don't have to write. Use RAII in C++, context managers in Python, try-with-resources in Java. Design your code so resources are automatically released when scope ends, whether that's normal exit, exception, or termination signal.
When a process terminates (even via SIGKILL), the kernel automatically reclaims certain resources. Understanding what the kernel handles vs what requires explicit cleanup is crucial.
| Resource | Kernel Recovers? | Notes |
|---|---|---|
| Process memory (heap, stack) | ✅ Yes | All pages reclaimed automatically |
| Open file descriptors | ✅ Yes | Closed, but unflushed buffers lost |
| Child processes | ⚠️ Partial | Inherited by init (PID 1), become orphans |
| POSIX mutexes (PTHREAD_PROCESS_PRIVATE) | ✅ Yes | Destroyed with process memory |
| POSIX mutexes (PTHREAD_PROCESS_SHARED) | ❌ No | Remain locked, other processes blocked! |
| System V semaphores | ✅ Optional | With SEM_UNDO flag only |
| POSIX semaphores (named) | ❌ No | Persist until sem_unlink() |
| Shared memory segments | ❌ No | Persist until shmctl(IPC_RMID) |
| Message queues | ❌ No | Persist until msgctl(IPC_RMID) |
| Sockets | ✅ Yes | Closed, but TIME_WAIT state may linger |
| Flock/fcntl file locks | ✅ Yes | Released automatically |
| Database connections | ❌ No | Server-side cleanup needed |
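The "released automatically" entry for file locks is easy to demonstrate: when a lock holder is SIGKILLed, no user-space cleanup runs, yet the kernel drops its flock as the dying process's file descriptors are closed. A sketch for Linux (the lock-file path is arbitrary):

```python
import fcntl
import os
import signal
import subprocess
import sys
import tempfile

lock_path = os.path.join(tempfile.mkdtemp(), "demo.lock")

# Child takes an exclusive flock, reports readiness, then sleeps.
child = subprocess.Popen(
    [sys.executable, "-u", "-c",
     "import fcntl, time\n"
     f"f = open({lock_path!r}, 'w')\n"
     "fcntl.flock(f, fcntl.LOCK_EX)\n"
     "print('locked', flush=True)\n"
     "time.sleep(60)"],
    stdout=subprocess.PIPE)
child.stdout.readline()              # wait until the child holds the lock

f = open(lock_path, "w")
try:
    fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    blocked = False
except BlockingIOError:
    blocked = True                   # the child's lock blocks us

os.kill(child.pid, signal.SIGKILL)   # no cleanup handlers run in the child...
child.wait()
fcntl.flock(f, fcntl.LOCK_EX)        # ...but the kernel released its flock
released = True
print(blocked, released)
```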
```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>     // ftruncate, close
#include <signal.h>     // kill, SIGKILL
#include <sys/wait.h>   // waitpid

void recover_inconsistent_data(int* data);  // forward declaration

/**
 * Robust mutexes: Handle owner death gracefully
 *
 * When a process holding a robust mutex dies:
 *   1. The next process to lock gets EOWNERDEAD
 *   2. That process can clean up inconsistent state
 *   3. Then call pthread_mutex_consistent()
 *   4. Then unlock, making the mutex usable again
 */

typedef struct {
    pthread_mutex_t mutex;
    int data;  // Protected data
} SharedState;

SharedState* create_shared_state(void) {
    // Create shared memory for IPC
    int fd = shm_open("/my_shared_state", O_CREAT | O_RDWR, 0666);
    ftruncate(fd, sizeof(SharedState));
    SharedState* state = mmap(NULL, sizeof(SharedState),
                              PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);

    // Initialize with ROBUST and SHARED attributes
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    pthread_mutex_init(&state->mutex, &attr);
    pthread_mutexattr_destroy(&attr);

    return state;
}

int lock_with_recovery(SharedState* state) {
    int result = pthread_mutex_lock(&state->mutex);

    if (result == EOWNERDEAD) {
        // Previous owner died while holding the lock!
        printf("Previous lock owner died. Recovering...\n");

        // Step 1: Clean up any inconsistent state.
        // The data may be half-updated; application-specific
        // recovery logic goes here.
        recover_inconsistent_data(&state->data);

        // Step 2: Mark mutex as consistent
        result = pthread_mutex_consistent(&state->mutex);
        if (result != 0) {
            printf("Failed to make mutex consistent\n");
            pthread_mutex_unlock(&state->mutex);
            return -1;
        }

        printf("Mutex recovered successfully\n");
        return 0;
    }

    if (result == ENOTRECOVERABLE) {
        // Mutex is permanently unusable.
        // Must destroy and reinitialize.
        printf("Mutex not recoverable - reinitializing\n");
        // ... reinitialize ...
        return -1;
    }

    return result;  // 0 = success, other = error
}

void recover_inconsistent_data(int* data) {
    // Application-specific logic to detect and fix inconsistencies.
    // For example: validate invariants, rollback partial updates.
    *data = 0;  // Reset to known-good state
}

/**
 * Example: Deadlock recovery killing a robust mutex holder
 */
void deadlock_recovery_with_robust_mutex(pid_t victim_pid, SharedState* state) {
    (void)state;

    // Kill the deadlock victim
    kill(victim_pid, SIGKILL);
    waitpid(victim_pid, NULL, 0);

    // The robust mutex will now return EOWNERDEAD to the next locker;
    // the next process to acquire it will handle recovery.
    printf("Victim killed. Next lock acquisition will recover.\n");
}
```

If you use PTHREAD_PROCESS_SHARED mutexes without the ROBUST attribute, killing a process holding the mutex will permanently block all other processes waiting for that mutex. Always use robust mutexes for shared resources, and always handle EOWNERDEAD in your lock code.
Terminating one process can have ripple effects throughout the system. Understanding these cascading effects is crucial for safe deadlock recovery.
```python
from dataclasses import dataclass
from typing import List, Dict, Set
from enum import Enum


class DependencyType(Enum):
    PARENT_CHILD = "parent_child"
    PIPE = "pipe"
    SOCKET = "socket"
    SHARED_MEMORY = "shared_memory"
    FILE_LOCK = "file_lock"
    IPC_MSG = "ipc_message_queue"
    SIGNAL = "signal_handler"


@dataclass
class ProcessDependency:
    """A dependency between two processes."""
    source_pid: int   # Depends on...
    target_pid: int   # ...this process
    dep_type: DependencyType
    critical: bool    # If true, source fails if target dies


@dataclass
class CascadeAnalysisResult:
    """Result of cascade analysis."""
    directly_affected: Set[int]
    transitively_affected: Set[int]
    critical_failures: Set[int]
    orphaned_processes: Set[int]
    broken_pipes: List[tuple]
    abandoned_locks: List[str]


class CascadeAnalyzer:
    """
    Analyze the cascade effects of terminating a process.

    Before terminating a deadlock victim, understand what else
    will be affected and prepare for cleanup.
    """

    def __init__(self):
        self.dependencies: List[ProcessDependency] = []
        self.process_tree: Dict[int, List[int]] = {}  # parent -> children

    def analyze_termination_impact(self, victim_pid: int) -> CascadeAnalysisResult:
        """Analyze what happens if we terminate the given process."""
        result = CascadeAnalysisResult(
            directly_affected=set(),
            transitively_affected=set(),
            critical_failures=set(),
            orphaned_processes=set(),
            broken_pipes=[],
            abandoned_locks=[]
        )

        # Find direct dependencies
        for dep in self.dependencies:
            if dep.target_pid == victim_pid:
                result.directly_affected.add(dep.source_pid)
                if dep.critical:
                    result.critical_failures.add(dep.source_pid)
                if dep.dep_type == DependencyType.PIPE:
                    result.broken_pipes.append((dep.source_pid, victim_pid))

        # Find children (will become orphans)
        children = self.process_tree.get(victim_pid, [])
        result.orphaned_processes = set(children)

        # Transitive dependencies (what depends on the directly affected?)
        visited = {victim_pid}
        queue = list(result.directly_affected)
        while queue:
            pid = queue.pop(0)
            if pid in visited:
                continue
            visited.add(pid)
            result.transitively_affected.add(pid)

            # Find what depends on this process
            for dep in self.dependencies:
                if dep.target_pid == pid and dep.critical:
                    if dep.source_pid not in visited:
                        queue.append(dep.source_pid)

        # Check for abandoned locks
        result.abandoned_locks = self._find_held_locks(victim_pid)

        return result

    def _find_held_locks(self, pid: int) -> List[str]:
        """Find locks held by a process."""
        locks = []
        # Check /proc/locks on Linux
        try:
            with open('/proc/locks', 'r') as f:
                for line in f:
                    if str(pid) in line:
                        locks.append(line.strip())
        except OSError:
            pass
        return locks

    def safe_to_terminate(self, victim_pid: int) -> tuple:
        """
        Determine if it's safe to terminate a process for deadlock recovery.

        Returns: (is_safe, reasons_if_not_safe)
        """
        impact = self.analyze_termination_impact(victim_pid)
        reasons = []

        # Check for critical failures
        if impact.critical_failures:
            reasons.append(
                f"Critical dependent processes: {impact.critical_failures}"
            )

        # Check for abandoned shared locks
        if impact.abandoned_locks:
            reasons.append(
                f"Will abandon {len(impact.abandoned_locks)} locks"
            )

        # Check transitive impact
        if len(impact.transitively_affected) > 5:
            reasons.append(
                f"Large transitive impact: "
                f"{len(impact.transitively_affected)} processes"
            )

        return (len(reasons) == 0, reasons)


# Example usage
def analyze_deadlock_recovery_candidates(deadlocked_pids: List[int],
                                         analyzer: CascadeAnalyzer):
    """
    For each deadlock victim candidate, analyze the cascade impact.
    Choose the one with minimal collateral damage.
    """
    best_victim = None
    best_impact = None
    best_score = float('inf')

    for pid in deadlocked_pids:
        impact = analyzer.analyze_termination_impact(pid)
        is_safe, reasons = analyzer.safe_to_terminate(pid)

        # Score based on impact (lower is better)
        score = (
            len(impact.directly_affected) * 10 +
            len(impact.transitively_affected) * 5 +
            len(impact.critical_failures) * 100 +
            len(impact.orphaned_processes) * 2 +
            len(impact.abandoned_locks) * 20
        )

        print(f"PID {pid}: score={score}, safe={is_safe}")
        if reasons:
            print(f"  Concerns: {reasons}")

        if score < best_score:
            best_score = score
            best_victim = pid
            best_impact = impact

    return best_victim, best_impact
```

In complex systems, terminating the 'wrong' process for deadlock recovery can cause more damage than the deadlock itself. Build dependency graphs, analyze cascade effects, and choose victims carefully. Sometimes it's better to wait for operator intervention than to trigger a cascade of failures.
A naive victim selection algorithm might always choose the same process to terminate. This leads to starvation: a process that can never complete because it's always selected as the deadlock victim.
```python
from dataclasses import dataclass
from typing import Dict, List, Optional
from datetime import datetime, timedelta
from collections import defaultdict


@dataclass
class ProcessVictimHistory:
    """Track how often a process has been victimized."""
    pid: int
    times_victimized: int = 0
    last_victimized: Optional[datetime] = None
    total_work_lost: float = 0.0  # Estimated lost computation
    consecutive_victims: int = 0


class StarvationAwareVictimSelector:
    """
    Victim selection that prevents starvation.

    Key strategies:
      1. Track victimization history
      2. Exponentially increase cost for frequently-victimized processes
      3. Implement "grace periods" after victimization
      4. Limit consecutive victimizations
    """

    def __init__(self,
                 max_consecutive_victims: int = 2,
                 grace_period_seconds: int = 60,
                 history_decay_hours: int = 24):
        self.history: Dict[int, ProcessVictimHistory] = defaultdict(
            lambda: ProcessVictimHistory(pid=0)
        )
        self.max_consecutive = max_consecutive_victims
        self.grace_period = timedelta(seconds=grace_period_seconds)
        self.history_decay = timedelta(hours=history_decay_hours)

    def select_victim(self,
                      deadlocked_pids: List[int],
                      base_costs: Dict[int, float]) -> int:
        """
        Select victim considering starvation prevention.

        Args:
            deadlocked_pids: PIDs involved in deadlock
            base_costs: Base termination cost for each PID

        Returns: Selected victim PID
        """
        now = datetime.now()
        adjusted_costs = {}

        for pid in deadlocked_pids:
            history = self.history[pid]
            base_cost = base_costs.get(pid, 0.0)

            # Check grace period
            if history.last_victimized:
                time_since = now - history.last_victimized
                if time_since < self.grace_period:
                    # In grace period - heavily penalize selecting this victim
                    adjusted_costs[pid] = base_cost + 10000
                    continue

            # Check consecutive victimization limit
            if history.consecutive_victims >= self.max_consecutive:
                # Hit the limit - cannot select this victim
                adjusted_costs[pid] = float('inf')
                continue

            # Apply exponential penalty based on history:
            # each past victimization doubles the cost.
            history_penalty = 2 ** history.times_victimized

            # Decay penalty based on time since last victimization
            if history.last_victimized:
                time_since = now - history.last_victimized
                decay_factor = min(1.0, time_since / self.history_decay)
                history_penalty *= (1 - decay_factor * 0.9)  # Reduce up to 90%

            adjusted_costs[pid] = base_cost + (history_penalty * 10)

        # Select minimum cost
        if not adjusted_costs:
            return deadlocked_pids[0]  # Fallback

        victim = min(adjusted_costs.keys(), key=lambda pid: adjusted_costs[pid])

        # Can't select anyone?
        if adjusted_costs[victim] == float('inf'):
            # Reset consecutive counts and try again
            self._reset_consecutive_counts()
            return self.select_victim(deadlocked_pids, base_costs)

        return victim

    def record_victimization(self, pid: int, work_lost: float = 0.0):
        """Record that a process was selected as victim."""
        now = datetime.now()
        history = self.history[pid]

        # Check if this is consecutive
        if history.last_victimized:
            time_since = now - history.last_victimized
            if time_since < timedelta(minutes=5):
                history.consecutive_victims += 1
            else:
                history.consecutive_victims = 1
        else:
            history.consecutive_victims = 1

        history.pid = pid
        history.times_victimized += 1
        history.last_victimized = now
        history.total_work_lost += work_lost

        # Reset consecutive count for other processes
        for other_pid, other_history in self.history.items():
            if other_pid != pid:
                other_history.consecutive_victims = 0

    def record_completion(self, pid: int):
        """Record that a process completed successfully."""
        # Completing successfully reduces future victimization penalty
        if pid in self.history:
            self.history[pid].consecutive_victims = 0

    def _reset_consecutive_counts(self):
        """Reset consecutive counts when no victim is selectable."""
        for history in self.history.values():
            history.consecutive_victims = 0

    def get_statistics(self) -> Dict:
        """Get victimization statistics for monitoring."""
        return {
            'total_processes_tracked': len(self.history),
            'total_victimizations':
                sum(h.times_victimized for h in self.history.values()),
            'total_work_lost':
                sum(h.total_work_lost for h in self.history.values()),
            'most_victimized': max(
                self.history.values(),
                key=lambda h: h.times_victimized,
                default=None
            ),
        }
```

Anti-Starvation Strategies:
| Strategy | Description | Trade-off |
|---|---|---|
| Grace period | Don't victimize recently-killed process | May delay recovery |
| Exponential backoff | Each victimization increases cost | Eventually selects anyway |
| Maximum consecutive | Hard limit on consecutive selections | Requires alternative victim |
| Priority aging | Increase priority of long-waiting processes | More complex bookkeeping |
| Round-robin fallback | Alternate between victim types | May not choose optimal victim |
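Priority aging is the one strategy in the table that the selector above does not implement. A hedged sketch of the idea, where aging_rate and the example PIDs are illustrative assumptions: each scheduling pass raises the effective priority (and thus the victimization cost) of processes in proportion to how long they have been waiting, so long-waiting processes are progressively protected:

```python
def apply_priority_aging(wait_times, base_priorities, aging_rate=0.1):
    """Return effective priorities boosted by time spent waiting.

    Higher effective priority means higher cost to victimize, so a
    process that has waited a long time gradually stops being the
    cheapest victim. (aging_rate is an assumed tuning knob.)
    """
    return {
        pid: base_priorities[pid] + aging_rate * wait_times[pid]
        for pid in base_priorities
    }

base = {101: 5.0, 102: 5.0, 103: 5.0}          # equal base priorities
waited = {101: 0.0, 102: 30.0, 103: 120.0}     # seconds spent blocked

effective = apply_priority_aging(waited, base)
# The cheapest victim is now the process that has waited least.
victim = min(effective, key=effective.get)
print(victim)  # 101
```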
Monitor victimization statistics over time. If one process type is consistently victimized, it indicates a systemic issue—perhaps that process type is genuinely low priority, or perhaps your cost function needs tuning. Use these statistics to improve your recovery strategy.
After terminating a process for deadlock recovery, we often want to restart it. A clean restart pattern ensures the respawned process starts correctly without inheriting problems.
```python
import subprocess
import os
import time
from typing import Optional, List, Dict
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class RestartPolicy:
    """Configuration for process restart behavior."""
    max_restarts: int = 3                    # Max restarts in window
    restart_window_seconds: int = 60         # Window for counting restarts
    initial_delay_seconds: float = 0.5       # Delay before first restart
    max_delay_seconds: float = 30.0          # Maximum delay (exponential backoff)
    backoff_multiplier: float = 2.0          # Backoff multiplier
    reset_after_stable_seconds: int = 300    # Reset counter after stable period


@dataclass
class ProcessState:
    """Internal state for a managed process."""
    pid: Optional[int] = None
    restart_count: int = 0
    restart_times: List[datetime] = field(default_factory=list)
    last_exit_code: Optional[int] = None
    last_exit_reason: str = ""
    started_at: Optional[datetime] = None


class ProcessManager:
    """
    Manage process lifecycle with clean restart support.

    Features:
      - Exponential backoff on rapid restarts
      - Restart budget to prevent restart loops
      - Pre-start cleanup to ensure clean state
      - Post-start verification
    """

    def __init__(self,
                 command: List[str],
                 policy: Optional[RestartPolicy] = None,
                 env: Optional[Dict[str, str]] = None,
                 cleanup_handler=None):
        self.command = command
        self.policy = policy or RestartPolicy()
        self.env = env or os.environ.copy()
        self.cleanup_handler = cleanup_handler
        self.state = ProcessState()
        self.process: Optional[subprocess.Popen] = None

    def start(self) -> int:
        """Start the process, performing pre-start cleanup."""
        # Pre-start cleanup
        self._perform_cleanup()

        # Calculate delay if restarting
        if self.state.restart_count > 0:
            delay = self._calculate_delay()
            print(f"Waiting {delay:.1f}s before restart "
                  f"(attempt {self.state.restart_count + 1})")
            time.sleep(delay)

        # Start the process
        self.process = subprocess.Popen(
            self.command,
            env=self.env,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        self.state.pid = self.process.pid
        self.state.started_at = datetime.now()

        print(f"Started process {self.state.pid}")

        # Verify process started correctly
        if not self._verify_startup():
            self.terminate()
            raise RuntimeError("Process failed startup verification")

        return self.state.pid

    def terminate(self, graceful_timeout: float = 5.0):
        """Terminate the process gracefully, then forcefully if needed."""
        if self.process is None:
            return

        # Try graceful termination
        self.process.terminate()
        try:
            self.state.last_exit_code = self.process.wait(timeout=graceful_timeout)
            self.state.last_exit_reason = "graceful"
        except subprocess.TimeoutExpired:
            # Force kill
            self.process.kill()
            self.state.last_exit_code = self.process.wait()
            self.state.last_exit_reason = "forced"

        self.process = None

    def restart_after_deadlock(self) -> Optional[int]:
        """
        Restart a process that was terminated for deadlock recovery.

        Returns: New PID or None if restart not allowed
        """
        # Check restart budget
        if not self._can_restart():
            print("Restart budget exhausted. Manual intervention required.")
            return None

        # Record restart attempt
        now = datetime.now()
        self.state.restart_count += 1
        self.state.restart_times.append(now)

        # Perform full cleanup before restart
        self._perform_cleanup()

        # Clear any problematic state
        self._reset_problem_state()

        # Start fresh
        return self.start()

    def _can_restart(self) -> bool:
        """Check if we're within restart budget."""
        now = datetime.now()
        window_start = now - timedelta(seconds=self.policy.restart_window_seconds)

        # Count restarts within window
        recent = sum(1 for t in self.state.restart_times if t > window_start)
        return recent < self.policy.max_restarts

    def _calculate_delay(self) -> float:
        """Calculate delay before restart (exponential backoff)."""
        recent_count = self.state.restart_count
        delay = self.policy.initial_delay_seconds * (
            self.policy.backoff_multiplier ** (recent_count - 1)
        )
        return min(delay, self.policy.max_delay_seconds)

    def _perform_cleanup(self):
        """Perform pre-start cleanup."""
        # 1. Custom cleanup handler
        if self.cleanup_handler:
            self.cleanup_handler()

        # 2. Clean up stale lock files
        self._cleanup_lock_files()

        # 3. Clean up temporary files
        self._cleanup_temp_files()

        # 4. Release any orphaned shared resources
        self._cleanup_shared_resources()

    def _cleanup_lock_files(self):
        """Remove stale lock files from previous run."""
        lock_patterns = [
            f"/tmp/{os.path.basename(self.command[0])}.lock",
            f"/var/run/{os.path.basename(self.command[0])}.pid",
        ]
        for pattern in lock_patterns:
            if os.path.exists(pattern):
                try:
                    os.unlink(pattern)
                    print(f"Removed stale lock file: {pattern}")
                except OSError:
                    pass

    def _cleanup_temp_files(self):
        """Clean up temporary files from previous run."""
        # Application-specific temp file cleanup
        pass

    def _cleanup_shared_resources(self):
        """Release orphaned shared resources."""
        # Application-specific: shared memory, semaphores, etc.
        pass

    def _reset_problem_state(self):
        """Reset any state that may have caused the deadlock."""
        # Application-specific state reset
        pass

    def _verify_startup(self, timeout: float = 5.0) -> bool:
        """Verify the process started correctly."""
        # Check if still running
        time.sleep(0.1)  # Brief wait
        if self.process.poll() is not None:
            return False  # Already exited

        # Application-specific health check
        # e.g., wait for health endpoint, check log file
        return True
```

If a process is being killed for deadlock and immediately deadlocks again after restart, you have a restart loop. Implement restart budgets and exponential backoff to prevent this. After the budget is exhausted, fall back to manual intervention or a different recovery strategy.
We've covered the complete lifecycle of process termination for deadlock recovery, from signal handling to clean restart.
What's Next:
The final page explores resource preemption—an alternative to termination that can sometimes recover from deadlock without killing any process. We'll examine which resources support preemption and how to implement it safely.
You now have comprehensive knowledge of process termination for deadlock recovery. You can implement safe termination with proper cleanup, prevent starvation, handle cascading effects, and restart processes cleanly. This is essential knowledge for building robust systems that recover gracefully from deadlock.