Thread Libraries - Learning Module

Thread Joining

Coordinating Thread Completion

One of the most critical aspects of multithreaded programming is knowing when threads finish and collecting their results. Thread joining is the fundamental mechanism for this coordination—it allows one thread to wait for another thread to complete execution, synchronize on its termination, and optionally retrieve its return value.

Without proper joining, concurrent programs suffer from:

Resource leaks: Terminated but unjoined threads keep their stack and bookkeeping allocated until the process exits
Premature termination: The main thread exits before workers complete, ending the whole process
Lost results: Computed values disappear when threads terminate uncollected
Undefined behavior: Code reads data that a thread owned, such as its stack variables, after that thread has exited

This page provides an exhaustive exploration of thread joining across POSIX, Windows, and Java environments, covering the semantics, patterns, edge cases, and best practices that professional engineers must master.

What You Will Master

By the end of this page, you will understand the complete thread joining semantics across platforms, the relationship between joining and detaching, timeout-based joining techniques, patterns for joining multiple threads, and how to properly structure thread lifecycles for robust resource management.

Joining Fundamentals

Thread joining is a blocking synchronization operation. When thread A joins thread B:

Thread A blocks (suspends execution)
Thread A remains blocked until thread B terminates
When B terminates, A resumes execution
A may receive B's exit status/return value
B's resources can now be reclaimed

This simple concept has profound implications for program structure and correctness.

Why Joining Matters

Memory Reclamation: Even after a thread's function returns, its kernel structures and stack may persist until joined. This is analogous to zombie processes—the thread is dead but its metadata remains allocated.

Synchronization Guarantee: A join provides a happens-before relationship: all memory writes by the joined thread are visible to the joining thread after the join completes. Without this, you might read stale data.

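Return Value Collection: A POSIX thread returns a void* and a Windows thread returns a DWORD exit code. pthread_join() is the only way to retrieve the POSIX value, and GetExitCodeThread() (after a wait) retrieves the Windows one. A Java Thread has no return value because run() returns void, so results travel through shared state or a Future.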
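The synchronization guarantee above can be illustrated with a short sketch. The worker writes to an ordinary struct with no locks or atomics, and the parent reads it only after pthread_join() succeeds. The names WorkerOutput, fill_output, and sum_after_join are hypothetical and exist only for this example.

```c
#include <pthread.h>

/* Plain data written by the worker: no mutex, no atomics. */
typedef struct {
    long values[4];
    int  count;
} WorkerOutput;

static void *fill_output(void *arg) {
    WorkerOutput *out = arg;
    for (int i = 0; i < 4; i++) {
        out->values[i] = (long)(i + 1) * 10;   /* 10, 20, 30, 40 */
    }
    out->count = 4;
    return NULL;
}

/* Returns the sum of the worker's values, or -1 if thread setup fails. */
long sum_after_join(void) {
    WorkerOutput out = { {0}, 0 };
    pthread_t tid;

    if (pthread_create(&tid, NULL, fill_output, &out) != 0) {
        return -1;
    }

    /* Reading out.count here, before the join, would be a data race. */

    if (pthread_join(tid, NULL) != 0) {
        return -1;
    }

    /* After a successful join, every write made by the worker is visible. */
    long sum = 0;
    for (int i = 0; i < out.count; i++) {
        sum += out.values[i];
    }
    return sum;
}
```

Calling sum_after_join() from main should yield 100. The struct lives on the caller's stack, which is safe here only because the caller joins before returning.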

Thread Joining API Comparison
Platform	Join Function	Timeout Support	Return Value
POSIX	pthread_join()	No (use timed alternatives)	void* (retval pointer)
POSIX	pthread_timedjoin_np()	Yes (nonportable GNU extension)	void* (retval pointer)
Windows	WaitForSingleObject()	Yes (DWORD milliseconds)	Via GetExitCodeThread()
Windows	WaitForMultipleObjects()	Yes (DWORD milliseconds)	Via GetExitCodeThread()
Java	Thread.join()	No (blocks indefinitely)	None (use shared state)
Java	Thread.join(millis)	Yes (long milliseconds)	None (use shared state)
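POSIX itself defines no timed join, and pthread_timedjoin_np() is a nonportable extension. One portable alternative is a completion flag guarded by a mutex and condition variable: the waiter calls pthread_cond_timedwait() on the flag, then pthread_join() to reclaim the thread. The following is a minimal sketch under that assumption; TimedTask, timed_worker, wait_done, and run_with_timeout are hypothetical names, not library functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper type: completion flag plus the primitives guarding it. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  finished;
    bool            done;
    unsigned        work_ms;   /* how long the worker pretends to work */
} TimedTask;

static void *timed_worker(void *arg) {
    TimedTask *task = arg;
    struct timespec nap = { .tv_sec  = task->work_ms / 1000,
                            .tv_nsec = (long)(task->work_ms % 1000) * 1000000L };
    nanosleep(&nap, NULL);                 /* simulate work */

    pthread_mutex_lock(&task->lock);
    task->done = true;                     /* publish completion */
    pthread_cond_signal(&task->finished);
    pthread_mutex_unlock(&task->lock);
    return NULL;
}

/* Waits up to timeout_ms for the flag. Returns 0 if set, ETIMEDOUT if not. */
static int wait_done(TimedTask *task, unsigned timeout_ms) {
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) { /* normalize nanoseconds */
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    int rc = 0;
    pthread_mutex_lock(&task->lock);
    while (!task->done && rc == 0) {       /* loop guards against spurious wakeups */
        rc = pthread_cond_timedwait(&task->finished, &task->lock, &deadline);
    }
    bool done = task->done;
    pthread_mutex_unlock(&task->lock);
    return done ? 0 : ETIMEDOUT;
}

/* Returns 1 if the worker finished within timeout_ms, 0 if it did not,
 * and -1 if the thread could not be created. */
int run_with_timeout(unsigned work_ms, unsigned timeout_ms) {
    TimedTask task;
    pthread_t tid;

    pthread_mutex_init(&task.lock, NULL);
    pthread_cond_init(&task.finished, NULL);
    task.done = false;
    task.work_ms = work_ms;

    if (pthread_create(&tid, NULL, timed_worker, &task) != 0) {
        pthread_cond_destroy(&task.finished);
        pthread_mutex_destroy(&task.lock);
        return -1;
    }

    int in_time = (wait_done(&task, timeout_ms) == 0);

    /* The thread stays joinable whether or not the wait timed out. This
     * sketch joins unconditionally because task lives on this stack frame. */
    pthread_join(tid, NULL);

    pthread_cond_destroy(&task.finished);
    pthread_mutex_destroy(&task.lock);
    return in_time;
}
```

A timeout here only tells the caller that the worker is late; it does not stop the worker. A real program would typically ask the worker to stop, or keep the task structure on the heap so the caller can return without joining immediately.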

Joinable vs Detached Threads

A thread is either joinable (the default) or detached, never both. A joinable thread must eventually be joined or detached; otherwise its resources are not reclaimed until the process exits. A detached thread releases its resources automatically on termination but cannot be joined. Once detached, a thread cannot be made joinable again.
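The default state can be checked without creating a thread at all, because the detach state is stored in the attribute object. The sketch below queries a freshly initialized pthread_attr_t and, optionally, one that has been switched to detached. The function name attr_is_joinable is hypothetical.

```c
#include <pthread.h>

/* Returns 1 if the attribute object reports the joinable state, 0 if it
 * reports detached, and -1 on error. When make_detached is nonzero, the
 * attribute is switched to PTHREAD_CREATE_DETACHED before the query. */
int attr_is_joinable(int make_detached) {
    pthread_attr_t attr;
    int state = PTHREAD_CREATE_JOINABLE;
    int rc = 0;

    if (pthread_attr_init(&attr) != 0) {
        return -1;
    }
    if (make_detached) {
        rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    }
    if (rc == 0) {
        rc = pthread_attr_getdetachstate(&attr, &state);
    }
    pthread_attr_destroy(&attr);

    if (rc != 0) {
        return -1;
    }
    return state == PTHREAD_CREATE_JOINABLE ? 1 : 0;
}
```

With default attributes, attr_is_joinable(0) should report joinable, and attr_is_joinable(1) should report detached.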

POSIX pthread_join

The pthread_join() function is the standard POSIX mechanism for joining threads. It blocks the calling thread until the target thread terminates.

Function Signature

int pthread_join(pthread_t thread, void **retval);

Parameters:

thread: The thread ID to wait for
retval: If non-NULL, receives the thread's return value (or PTHREAD_CANCELED if canceled)

Returns: 0 on success, error code on failure

Error Conditions

EDEADLK: A deadlock was detected (thread is joining itself, or two threads joining each other)
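EINVAL: Thread is not joinable (it is detached), or another thread is already waiting to join it
ESRCH: No thread with the given ID could be found

Joining a thread that has already been joined is undefined behavior, so do not rely on receiving an error code in that case.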

pthread_join_patterns.c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>   // intptr_t
#include <string.h>
#include <errno.h>
#include <unistd.h>   // sleep()
 
/*
 * Basic pthread_join Usage
 */
 
void *compute_sum(void *arg) {
    int limit = (int)(intptr_t)arg;
    long sum = 0;
    
    for (int i = 1; i <= limit; i++) {
        sum += i;
    }
    
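    // The result must outlive this thread's stack, so allocate it on the heap
    // (or use static/global storage). The joining thread must free() it.
    long *result = malloc(sizeof *result);
    if (result == NULL) {
        return NULL;  // Joiner must treat NULL as "no result"
    }
    *result = sum;
    
    return result;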
}
 
void basic_join_example(void) {
    pthread_t tid;
    void *retval;
    int result;
    
    // Create thread
    result = pthread_create(&tid, NULL, compute_sum, (void *)1000);
    if (result != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(result));
        exit(1);
    }
    
    printf("Main: waiting for thread to complete...\n");
    
    // Join thread - blocks until thread exits
    result = pthread_join(tid, &retval);
    if (result != 0) {
        fprintf(stderr, "pthread_join: %s\n", strerror(result));
        exit(1);
    }
    
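    // Thread has terminated; retval contains its return value.
    // A NULL return value means the worker produced no result.
    long *sum = (long *)retval;
    if (sum == NULL) {
        fprintf(stderr, "Main: thread returned no result\n");
        exit(1);
    }
    printf("Main: thread returned sum = %ld\n", *sum);
    
    // Free the heap-allocated result
    free(sum);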
}
 
/*
 * Returning Values: Common Patterns
 */
 
// Pattern 1: Return integer via pointer cast
void *return_integer(void *arg) {
    int result = 42;
    return (void *)(intptr_t)result;  // Cast int to pointer
}
 
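void join_integer(pthread_t tid) {
    void *retval;
    int rc = pthread_join(tid, &retval);
    if (rc != 0) {
        fprintf(stderr, "pthread_join: %s\n", strerror(rc));
        return;  // retval is not valid if the join failed
    }
    int result = (int)(intptr_t)retval;  // Cast pointer back to int
    printf("Thread returned: %d\n", result);
}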
 
// Pattern 2: Return status code
void *return_status(void *arg) {
    if (/* error condition */ 0) {
        return (void *)-1;  // Error
    }
    return (void *)0;  // Success
}
 
// Pattern 3: Return pointer to shared structure (careful with lifecycle!)
typedef struct {
    int computed_value;
    char message[256];
} TaskResult;
 
void *return_struct(void *arg) {
    TaskResult *task = (TaskResult *)arg;  // Get input struct
    
    // Compute and store results in same struct
    task->computed_value = 123;
    strcpy(task->message, "Task completed successfully");
    
    return task;  // Return pointer to same struct
}
 
/*
 * Joining Multiple Threads
 */
 
void join_multiple_threads(void) {
    enum { NUM_THREADS = 10 };  // Compile-time constant, so the arrays are not VLAs
    pthread_t threads[NUM_THREADS];
    void *results[NUM_THREADS];
    int created = 0;
    
    // Create all threads; stop at the first failure so only real threads are joined
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, compute_sum, 
                                (void *)(intptr_t)(i * 100 + 100));
        if (rc != 0) {
            fprintf(stderr, "pthread_create[%d]: %s\n", i, strerror(rc));
            break;
        }
        created++;
    }
    
    // Join every created thread (in creation order)
    for (int i = 0; i < created; i++) {
        int result = pthread_join(threads[i], &results[i]);
        if (result != 0) {
            fprintf(stderr, "pthread_join[%d]: %s\n", i, strerror(result));
            results[i] = NULL;
        }
    }
    
    // Process results
    long total = 0;
    for (int i = 0; i < created; i++) {
        if (results[i] != NULL) {
            long *partial = (long *)results[i];
            total += *partial;
            free(partial);
        }
    }
    
    printf("Total sum from all threads: %ld\n", total);
}
 
/*
 * Detecting Canceled Threads
 */
 
void *cancellable_work(void *arg) {
    while (1) {
        // Do some work
        sleep(1);
        pthread_testcancel();  // Cancellation point
    }
    return (void *)0;  // Never reached if canceled
}
 
void join_canceled_thread(void) {
    pthread_t tid;
    void *retval;
    
    pthread_create(&tid, NULL, cancellable_work, NULL);
    
    sleep(2);
    pthread_cancel(tid);  // Request cancellation
    
    pthread_join(tid, &retval);
    
    if (retval == PTHREAD_CANCELED) {
        printf("Thread was canceled\n");
    } else {
        printf("Thread exited normally\n");
    }
}
 
/*
 * Error Handling: Double Join Prevention
 */
 
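void prevent_double_join(void) {
    pthread_t tid;
    void *retval = NULL;
    int joined = 0;  // Tracks whether tid has already been joined
    int result;
    
    result = pthread_create(&tid, NULL, compute_sum, (void *)(intptr_t)100);
    if (result != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(result));
        return;
    }
    
    // First (and only) join
    result = pthread_join(tid, &retval);
    printf("First join: %s\n", result == 0 ? "success" : strerror(result));
    if (result == 0) {
        joined = 1;
        free(retval);  // compute_sum returns a heap-allocated long
    }
    
    // A second pthread_join(tid, ...) here would be undefined behavior.
    // Check the flag instead of calling pthread_join again.
    if (joined) {
        printf("Second join skipped: thread already joined\n");
    }
    
    /*
     * After a thread is joined:
     * - Its pthread_t value may be reused for a new thread
     * - Joining again is undefined behavior (it may return ESRCH or
     *   EINVAL, crash, or join an unrelated thread)
     * 
     * Best practice: track the joined state in a separate flag.
     * pthread_t is an opaque type, so assigning 0 to it is not portable.
     */
}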

Return Value Lifetime

Never return a pointer to a local variable from a thread function! The stack frame is deallocated when the thread exits. Options: (1) return an integer cast to void*, (2) heap-allocate the result (caller frees), (3) use a shared structure passed as argument, (4) write to global or thread-specific storage.
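Option (3) above, a structure owned by the caller and passed in as the thread argument, might look like the following sketch. The names SquareJob, square_job, and run_square_job are hypothetical. The key assumption is that the caller joins before the structure goes out of scope.

```c
#include <pthread.h>

typedef struct {
    int  input;    /* set by the caller before pthread_create */
    long output;   /* written by the thread */
    int  status;   /* 0 = success, written by the thread */
} SquareJob;

static void *square_job(void *arg) {
    SquareJob *job = arg;
    job->output = (long)job->input * job->input;
    job->status = 0;
    /* Returns the caller's pointer, never the address of a local variable. */
    return job;
}

/* Returns input squared, or -1 if the thread could not be run. */
long run_square_job(int input) {
    SquareJob job = { input, 0, -1 };   /* lives in the caller's frame */
    pthread_t tid;
    void *ret = NULL;

    if (pthread_create(&tid, NULL, square_job, &job) != 0) {
        return -1;
    }
    /* Joining before returning guarantees job outlives the thread. */
    if (pthread_join(tid, &ret) != 0) {
        return -1;
    }
    return (ret == &job && job.status == 0) ? job.output : -1;
}
```

Because the storage belongs to the caller, nothing needs to be freed and nothing dangles. The same structure would be unsafe with a detached thread unless it were heap-allocated and freed by the thread itself.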

Detached Threads

A detached thread is one that cannot be joined. When a detached thread terminates, its resources are automatically reclaimed by the system—no other thread needs to wait for it. This is useful for:

Fire-and-forget background tasks
Daemon threads that run for the process lifetime
Situations where the creating thread shouldn't wait

Creating Detached Threads

There are two ways to create detached threads:

Method 1: Via Attributes

pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
pthread_create(&tid, &attr, func, arg);
pthread_attr_destroy(&attr);

Method 2: Detach After Creation

pthread_create(&tid, NULL, func, arg);
pthread_detach(tid);  // Thread is now detached

detached_threads.c
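#define _GNU_SOURCE   // pthread_getattr_np() is a glibc extension; must precede all includes
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>   // intptr_t
#include <string.h>
#include <time.h>     // clock_gettime(), struct timespec
#include <unistd.h>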
 
/*
 * Detached Thread Patterns
 */
 
void *background_task(void *arg) {
    int id = (int)(intptr_t)arg;
    
    printf("Background task %d starting\n", id);
    sleep(2);  // Simulate work
    printf("Background task %d complete\n", id);
    
    // When this function returns, resources are automatically freed
    return NULL;
}
 
/*
 * Method 1: Create as detached via attributes
 */
void create_detached_via_attr(void) {
    pthread_t tid;
    pthread_attr_t attr;
    int result;
    
    // Initialize attributes
    result = pthread_attr_init(&attr);
    if (result != 0) {
        fprintf(stderr, "attr_init: %s\n", strerror(result));
        return;
    }
    
    // Set detached state
    result = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (result != 0) {
        fprintf(stderr, "setdetachstate: %s\n", strerror(result));
        pthread_attr_destroy(&attr);
        return;
    }
    
    // Create thread
    result = pthread_create(&tid, &attr, background_task, (void *)1);
    if (result != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(result));
    } else {
        printf("Created detached thread\n");
    }
    
    // Cannot join this thread - it's detached.
    // pthread_join(tid, NULL);  // Undefined behavior; glibc usually returns EINVAL
    
    pthread_attr_destroy(&attr);
}
 
/*
 * Method 2: Detach after creation
 */
void create_then_detach(void) {
    pthread_t tid;
    int result;
    
    // Create as joinable (default)
    result = pthread_create(&tid, NULL, background_task, (void *)2);
    if (result != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(result));
        return;
    }
    
    // Immediately detach
    result = pthread_detach(tid);
    if (result != 0) {
        fprintf(stderr, "pthread_detach: %s\n", strerror(result));
    }
    
    printf("Thread created and detached\n");
}
 
/*
 * Self-detaching thread
 */
void *self_detaching_task(void *arg) {
    // Detach ourselves so no one needs to join
    pthread_detach(pthread_self());
    
    printf("Self-detached thread running\n");
    sleep(1);
    printf("Self-detached thread exiting\n");
    
    return NULL;
}
 
/*
 * Daemon-style thread with graceful shutdown
 */
 
int shutdown_requested = 0;  // Guarded by shutdown_mutex, so volatile is not needed
pthread_mutex_t shutdown_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t shutdown_cond = PTHREAD_COND_INITIALIZER;
 
void *daemon_thread(void *arg) {
    // Detach - we manage our own lifecycle
    pthread_detach(pthread_self());
    
    printf("Daemon thread started\n");
    
    while (1) {
        // Do periodic work
        printf("Daemon: heartbeat\n");
        
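        // Wait for shutdown signal with timeout
        pthread_mutex_lock(&shutdown_mutex);
        
        struct timespec timeout;
        clock_gettime(CLOCK_REALTIME, &timeout);
        timeout.tv_sec += 1;  // 1 second timeout
        
        // Skip the wait if shutdown was requested while we were working;
        // otherwise sleep until signaled or until the timeout expires
        if (!shutdown_requested) {
            pthread_cond_timedwait(&shutdown_cond, 
                                   &shutdown_mutex, &timeout);
        }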
        
        int should_exit = shutdown_requested;
        pthread_mutex_unlock(&shutdown_mutex);
        
        if (should_exit) {
            printf("Daemon: shutdown requested, exiting\n");
            break;
        }
    }
    
    return NULL;
}
 
void request_daemon_shutdown(void) {
    pthread_mutex_lock(&shutdown_mutex);
    shutdown_requested = 1;
    pthread_cond_signal(&shutdown_cond);
    pthread_mutex_unlock(&shutdown_mutex);
}
 
/*
 * Checking if a thread is detached
 */
void check_detach_state(pthread_t tid) {
    pthread_attr_t attr;
    int detach_state;
    
    // Get thread's current attributes (Linux extension)
#ifdef __linux__
    if (pthread_getattr_np(tid, &attr) == 0) {
        pthread_attr_getdetachstate(&attr, &detach_state);
        printf("Thread is %s\n", 
               detach_state == PTHREAD_CREATE_DETACHED ? "DETACHED" : "JOINABLE");
        pthread_attr_destroy(&attr);
    }
#endif
}
 
/*
 * DANGER: Joining a detached thread
 */
void demonstrate_join_detached_error(void) {
    pthread_t tid;
    
    // Create and detach
    pthread_create(&tid, NULL, background_task, (void *)3);
    pthread_detach(tid);
    
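    // Try to join: POSIX leaves joining a detached thread undefined.
    // glibc usually reports EINVAL, but portable code must not rely on that.
    int result = pthread_join(tid, NULL);
    if (result != 0) {
        // Typically EINVAL: thread is not joinable
        fprintf(stderr, "pthread_join on detached: %s\n", strerror(result));
    }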
}

When to Detach

Detach threads when: (1) you don't need their return value, (2) you don't need to synchronize on their completion, (3) they should run independently for the process lifetime. Default to joinable threads unless you have a specific reason to detach—it's easier to debug and reason about thread lifecycles when you explicitly join.

Windows Thread Joining

Windows doesn't have a separate "join" concept—threads are joined using the unified wait functions that work on any synchronizable kernel object. A thread handle becomes signaled when the thread terminates, making WaitForSingleObject() the join equivalent.

Wait Functions for Threads

WaitForSingleObject(hThread, timeout) — Wait for one thread
WaitForMultipleObjects(count, handles, waitAll, timeout) — Wait for multiple threads
WaitForMultipleObjectsEx() — Extended version with alertable waits

Getting Exit Code

After waiting, use GetExitCodeThread() to retrieve the thread's return value:

DWORD exitCode;
WaitForSingleObject(hThread, INFINITE);
GetExitCodeThread(hThread, &exitCode);

windows_joining.c
#include <windows.h>
#include <stdio.h>
 
/*
 * Windows Thread Joining Patterns
 */
 
DWORD WINAPI ComputeTask(LPVOID lpParam) {
    int value = (int)(INT_PTR)lpParam;
    
    printf("Thread: computing with value %d\n", value);
    Sleep(1000);  // Simulate work
    
    // Return value becomes the exit code
    return value * 2;
}
 
/*
 * Basic joining with WaitForSingleObject
 */
void basic_join(void) {
    HANDLE hThread;
    DWORD threadId;
    DWORD exitCode;
    DWORD waitResult;
    
    hThread = CreateThread(NULL, 0, ComputeTask, 
                          (LPVOID)21, 0, &threadId);
    if (hThread == NULL) {
        printf("CreateThread failed: %lu\n", GetLastError());
        return;
    }
    
    printf("Main: waiting for thread %lu...\n", threadId);
    
    // Wait indefinitely for thread to terminate
    waitResult = WaitForSingleObject(hThread, INFINITE);
    
    switch (waitResult) {
        case WAIT_OBJECT_0:
            // Thread terminated
            GetExitCodeThread(hThread, &exitCode);
            printf("Thread exited with code: %lu\n", exitCode);
            break;
            
        case WAIT_FAILED:
            printf("WaitForSingleObject failed: %lu\n", GetLastError());
            break;
    }
    
    CloseHandle(hThread);
}
 
/*
 * Joining with timeout
 */
void join_with_timeout(void) {
    HANDLE hThread;
    DWORD threadId;
    DWORD waitResult;
    
    hThread = CreateThread(NULL, 0, ComputeTask, 
                          (LPVOID)100, 0, &threadId);
    if (hThread == NULL) {
        printf("CreateThread failed: %lu\n", GetLastError());
        return;
    }
    
    // Wait for 500ms
    waitResult = WaitForSingleObject(hThread, 500);
    
    switch (waitResult) {
        case WAIT_OBJECT_0:
            printf("Thread completed within timeout\n");
            break;
            
        case WAIT_TIMEOUT:
            printf("Timeout! Thread still running\n");
            // Could terminate the thread (dangerous!) or wait more
            // TerminateThread(hThread, 1);  // BAD PRACTICE
            WaitForSingleObject(hThread, INFINITE);  // Wait for completion
            break;
            
        case WAIT_FAILED:
            printf("Wait failed: %lu\n", GetLastError());
            break;
    }
    
    CloseHandle(hThread);
}
 
/*
 * Joining multiple threads
 */
void join_multiple_threads(void) {
    enum { NUM_THREADS = 5 };  // WaitForMultipleObjects accepts at most MAXIMUM_WAIT_OBJECTS (64)
    HANDLE threads[NUM_THREADS];
    DWORD created = 0;
    DWORD i;
    
    // Create multiple threads, storing only the handles that are valid
    for (i = 0; i < NUM_THREADS; i++) {
        HANDLE h = CreateThread(NULL, 0, ComputeTask,
                                (LPVOID)(INT_PTR)(i * 10), 0, NULL);
        if (h == NULL) {
            printf("Failed to create thread %lu: %lu\n", i, GetLastError());
            continue;
        }
        threads[created++] = h;
    }
    if (created == 0) {
        return;
    }
    
    printf("Waiting for all threads...\n");
    
    // Wait for ALL threads to complete
    DWORD waitResult = WaitForMultipleObjects(
        created,        // Number of valid handles
        threads,        // Handle array
        TRUE,           // Wait for ALL (FALSE = any one)
        INFINITE        // Timeout
    );
    
    if (waitResult >= WAIT_OBJECT_0 && 
        waitResult < WAIT_OBJECT_0 + created) {
        printf("All threads completed\n");
        
        // Get each thread's exit code
        for (i = 0; i < created; i++) {
            DWORD exitCode;
            if (GetExitCodeThread(threads[i], &exitCode)) {
                printf("Thread %lu exit code: %lu\n", i, exitCode);
            }
        }
    } else if (waitResult == WAIT_FAILED) {
        printf("WaitForMultipleObjects failed: %lu\n", GetLastError());
    }
    
    // Close all handles
    for (i = 0; i < created; i++) {
        CloseHandle(threads[i]);
    }
}
 
/*
 * Waiting for first completion (any thread)
 */
void join_first_completion(void) {
    const DWORD NUM_THREADS = 3;
    HANDLE threads[3];
    
    // ComputeTask always sleeps 1000 ms, so the argument only changes the
    // exit code; which thread finishes first is up to the scheduler
    threads[0] = CreateThread(NULL, 0, ComputeTask, (LPVOID)1000, 0, NULL);
    threads[1] = CreateThread(NULL, 0, ComputeTask, (LPVOID)500, 0, NULL);
    threads[2] = CreateThread(NULL, 0, ComputeTask, (LPVOID)750, 0, NULL);
    
    // Wait for ANY one thread
    DWORD waitResult = WaitForMultipleObjects(
        NUM_THREADS,
        threads,
        FALSE,          // Wait for ANY (not all)
        INFINITE
    );
    
    if (waitResult >= WAIT_OBJECT_0 && 
        waitResult < WAIT_OBJECT_0 + NUM_THREADS) {
        DWORD index = waitResult - WAIT_OBJECT_0;
        printf("Thread %lu completed first\n", index);
    }
    
    // Still need to wait for remaining threads
    WaitForMultipleObjects(NUM_THREADS, threads, TRUE, INFINITE);
    
    // Close handles
    for (DWORD i = 0; i < NUM_THREADS; i++) {
        CloseHandle(threads[i]);
    }
}
 
/*
 * Checking thread state without blocking
 */
void non_blocking_check(HANDLE hThread) {
    // Zero timeout = non-blocking
    DWORD result = WaitForSingleObject(hThread, 0);
    
    switch (result) {
        case WAIT_OBJECT_0:
            printf("Thread has terminated\n");
            break;
        case WAIT_TIMEOUT:
            printf("Thread still running\n");
            break;
        case WAIT_FAILED:
            printf("Check failed: %lu\n", GetLastError());
            break;
    }
    
    // Alternative: use GetExitCodeThread
    DWORD exitCode;
    GetExitCodeThread(hThread, &exitCode);
    if (exitCode == STILL_ACTIVE) {
        printf("Thread is STILL_ACTIVE\n");
    } else {
        printf("Thread exit code: %lu\n", exitCode);
    }
}

STILL_ACTIVE (259)

GetExitCodeThread() returns STILL_ACTIVE (value 259) if the thread hasn't terminated. However, a thread could legitimately return 259 as its exit code! For reliable state checking, use WaitForSingleObject with 0 timeout. Only use GetExitCodeThread for the actual exit value after waiting confirms termination.

Java Thread Joining

Java's Thread.join() method provides straightforward join semantics. The JVM releases a thread's native resources when the thread terminates, and the Thread object itself is garbage collected like any other object, so there are no "zombie" threads to reap as in POSIX. Joining still provides essential synchronization guarantees.

Thread.join() Methods

void join()                    // Wait indefinitely
void join(long millis)         // Wait up to millis milliseconds (0 means wait indefinitely)
void join(long millis, int nanos)  // Adds a nanosecond component; actual precision is platform-dependent

All join variants throw InterruptedException if the waiting thread is interrupted.

Memory Visibility Guarantee

A successful join() establishes a happens-before relationship: all actions in the terminated thread happen-before the join() returns. This makes all writes by the terminated thread visible to the joining thread.

JavaJoining.java
import java.util.concurrent.*;
import java.util.List;
import java.util.ArrayList;
 
/**
 * Java Thread Joining Patterns
 */
public class JavaJoining {
 
    /**
     * Basic join - wait indefinitely
     */
    public static void basicJoin() {
        Thread worker = new Thread(() -> {
            System.out.println("Worker starting");
            try {
                Thread.sleep(2000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            System.out.println("Worker complete");
        });
        
        worker.start();
        
        try {
            System.out.println("Main: waiting for worker...");
            worker.join();  // Blocks until worker terminates
            System.out.println("Main: worker has terminated");
        } catch (InterruptedException e) {
            System.out.println("Main: interrupted while waiting");
            Thread.currentThread().interrupt();
        }
    }
    
    /**
     * Join with timeout
     */
    public static void joinWithTimeout() {
        Thread slowWorker = new Thread(() -> {
            try {
                Thread.sleep(5000);  // 5 seconds
            } catch (InterruptedException e) {
                System.out.println("Worker interrupted");
                Thread.currentThread().interrupt();
            }
        });
        
        slowWorker.start();
        
        try {
            // Wait 2 seconds max
            slowWorker.join(2000);
            
            if (slowWorker.isAlive()) {
                System.out.println("Timeout! Worker still running");
                // Option 1: Let it continue
                // Option 2: Interrupt it
                slowWorker.interrupt();
                slowWorker.join();  // Wait for interrupt to take effect
            } else {
                System.out.println("Worker completed within timeout");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
    
    /**
     * Joining multiple threads
     */
    public static void joinMultipleThreads() {
        List<Thread> workers = new ArrayList<>();
        
        // Create and start workers
        for (int i = 0; i < 5; i++) {
            final int id = i;
            Thread t = new Thread(() -> {
                System.out.println("Worker " + id + " starting");
                try {
                    Thread.sleep(1000 + id * 500);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                System.out.println("Worker " + id + " done");
            }, "Worker-" + i);
            
            workers.add(t);
            t.start();
        }
        
        System.out.println("All workers started, waiting for completion...");
        
        // Join all threads
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                System.out.println("Interrupted while waiting for " + t.getName());
                Thread.currentThread().interrupt();
            }
        }
        
        System.out.println("All workers completed");
    }
    
    /**
     * Getting results: since join() returns void, use shared state
     */
    public static void getResultsFromThreads() {
        // Pattern 1: Results array
        final int NUM_THREADS = 4;
        final int[] results = new int[NUM_THREADS];
        Thread[] threads = new Thread[NUM_THREADS];
        
        for (int i = 0; i < NUM_THREADS; i++) {
            final int index = i;
            threads[i] = new Thread(() -> {
                results[index] = index * index;  // Compute result
            });
            threads[i].start();
        }
        
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        
        // Results are guaranteed visible after join
        for (int i = 0; i < NUM_THREADS; i++) {
            System.out.println("Result " + i + ": " + results[i]);
        }
    }
    
    /**
     * Using Callable/Future for return values (preferred!)
     */
    public static void useCallableForResults() throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        
        // Submit tasks that return values
        List<Future<Integer>> futures = new ArrayList<>();
        
        for (int i = 0; i < 5; i++) {
            final int value = i;
            Future<Integer> future = executor.submit(() -> {
                Thread.sleep(500);
                return value * value;
            });
            futures.add(future);
        }
        
        // Get results (blocks until each task completes)
        try {
            int total = 0;
            for (Future<Integer> future : futures) {
                total += future.get();  // Blocks until result available
            }
            
            System.out.println("Total: " + total);
        } finally {
            executor.shutdown();  // Release pool threads even if get() throws
        }
    }
    
    /**
     * Daemon threads don't need joining
     */
    public static void daemonThreads() {
        Thread daemon = new Thread(() -> {
            while (true) {
                System.out.println("Daemon heartbeat");
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    break;
                }
            }
        });
        
        daemon.setDaemon(true);  // Mark as daemon BEFORE start
        daemon.start();
        
        // Daemon threads:
        // - Don't prevent JVM shutdown
        // - Are abruptly killed when last non-daemon exits
        // - Generally shouldn't be joined (they run forever)
        
        System.out.println("Main exiting, daemon will be killed");
    }
    
    /**
     * Proper interrupt handling during join
     */
    public static void handleInterruptDuringJoin(Thread target) {
        boolean interrupted = false;
        
        while (true) {
            try {
                target.join();
                break;  // Successfully joined
            } catch (InterruptedException e) {
                // Remember we were interrupted, but keep waiting
                interrupted = true;
            }
        }
        
        // Restore interrupt status
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }
}

Prefer Executors Over Raw Threads

For most applications, use ExecutorService with Callable/Future instead of raw threads with join(). Executors provide: cleaner result handling (Future.get()), automatic thread reuse, graceful shutdown, exception handling, and timeout support. Raw Thread.join() is still useful for simple cases or when you need precise thread lifecycle control.

Advanced Joining Patterns

Real-world applications often require more sophisticated join patterns than basic one-at-a-time waiting. This section covers advanced techniques for coordinating multiple thread completions.

Pattern Categories

Scatter-Gather: Spawn N threads, collect all results
First-to-Complete: Wait for any one of N threads
Barrier Synchronization: All threads wait for each other
Cascading Joins: Thread chains where each waits for its predecessor
Conditional Joining: Join only if certain conditions are met
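As an illustration of the cascading joins pattern listed above, the sketch below builds a chain in which every stage joins its predecessor before doing its own work, and the caller joins only the last stage. Stage, stage_run, run_chain, and CHAIN_LEN are hypothetical names chosen for this example.

```c
#include <pthread.h>

#define CHAIN_LEN 4

typedef struct {
    int        id;
    pthread_t *predecessor;   /* NULL for the first stage */
    int       *order;         /* shared log of the order stages ran in */
    int       *next_slot;     /* next free index in order[] */
} Stage;

static void *stage_run(void *arg) {
    Stage *s = arg;

    /* Wait for the previous stage; this also makes its writes visible. */
    if (s->predecessor != NULL) {
        pthread_join(*s->predecessor, NULL);
    }

    /* No lock needed: the join chain means only one stage runs this
     * line at a time, and each sees the previous stage's update. */
    s->order[(*s->next_slot)++] = s->id;
    return NULL;
}

/* Fills order[] with the sequence in which stages did their work.
 * Returns 0 on success, nonzero on failure. */
int run_chain(int order[CHAIN_LEN]) {
    pthread_t threads[CHAIN_LEN];
    Stage stages[CHAIN_LEN];
    int next_slot = 0;

    for (int i = 0; i < CHAIN_LEN; i++) {
        stages[i].id = i;
        stages[i].predecessor = (i > 0) ? &threads[i - 1] : NULL;
        stages[i].order = order;
        stages[i].next_slot = &next_slot;

        if (pthread_create(&threads[i], NULL, stage_run, &stages[i]) != 0) {
            /* Joining the last started stage cascades down the chain. */
            if (i > 0) {
                pthread_join(threads[i - 1], NULL);
            }
            return -1;
        }
    }

    /* Every other thread is joined by its successor, exactly once. */
    return pthread_join(threads[CHAIN_LEN - 1], NULL);
}
```

Each thread is joined exactly once, either by its successor or by the caller, which keeps the chain clear of the double-join problem. The cost is that the stages run strictly one after another, so this pattern suits pipelines with ordering requirements rather than parallel speedup.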

advanced_joining.c
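#define _GNU_SOURCE   /* must precede every #include; enables pthread_timedjoin_np */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>   /* intptr_t */
#include <string.h>
#include <stdbool.h>
#include <time.h>
#include <errno.h>
#include <unistd.h>   /* usleep */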
 
/*
 * Pattern: Parallel Map-Reduce with Join
 * ------------------------------------
 * Scatter work across threads, gather results
 */
 
typedef struct {
    int *data;
    int start;
    int end;
    long partial_sum;
} ChunkWork;
 
void *sum_chunk(void *arg) {
    ChunkWork *work = (ChunkWork *)arg;
    work->partial_sum = 0;
    
    for (int i = work->start; i < work->end; i++) {
        work->partial_sum += work->data[i];
    }
    
    return NULL;
}
 
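long parallel_sum(int *data, int size, int num_threads) {
    pthread_t *threads = malloc(num_threads * sizeof(pthread_t));
    ChunkWork *works = malloc(num_threads * sizeof(ChunkWork));
    bool *started = calloc(num_threads, sizeof(bool));
    int chunk_size = size / num_threads;
    long total = 0;
    
    if (!threads || !works || !started) {
        // Allocation failed: fall back to a sequential sum
        for (int i = 0; i < size; i++) {
            total += data[i];
        }
        goto done;
    }
    
    // Scatter
    for (int i = 0; i < num_threads; i++) {
        works[i].data = data;
        works[i].start = i * chunk_size;
        works[i].end = (i == num_threads - 1) ? size : (i + 1) * chunk_size;
        started[i] = (pthread_create(&threads[i], NULL, sum_chunk, &works[i]) == 0);
        if (!started[i]) {
            // Could not spawn a thread: compute this chunk in the caller
            sum_chunk(&works[i]);
        }
    }
    
    // Gather (join only the threads that were actually created)
    for (int i = 0; i < num_threads; i++) {
        if (started[i]) {
            pthread_join(threads[i], NULL);
        }
        total += works[i].partial_sum;
    }
    
done:
    free(threads);
    free(works);
    free(started);
    return total;
}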
 
/*
 * Pattern: First-to-Complete (Racing Threads)
 * -------------------------------------------
 * Wait for any thread to finish, then proceed
 * Uses condition variable instead of sequential joins
 */
 
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int completed_id;
    void *result;
    bool done;
} RaceContext;
 
typedef struct {
    int id;
    RaceContext *ctx;
    int work_duration_ms;
} RaceWorker;
 
void *race_worker(void *arg) {
    RaceWorker *worker = (RaceWorker *)arg;
    
    // Simulate variable work time
    usleep(worker->work_duration_ms * 1000);
    
    pthread_mutex_lock(&worker->ctx->mutex);
    
    if (!worker->ctx->done) {
        // We're the first!
        worker->ctx->done = true;
        worker->ctx->completed_id = worker->id;
        worker->ctx->result = (void *)(intptr_t)(worker->id * 100);
        pthread_cond_signal(&worker->ctx->cond);
    }
    
    pthread_mutex_unlock(&worker->ctx->mutex);
    
    return NULL;
}
 
void *first_to_complete(int num_racers) {
    RaceContext ctx = {
        .mutex = PTHREAD_MUTEX_INITIALIZER,
        .cond = PTHREAD_COND_INITIALIZER,
        .done = false,
        .completed_id = -1,
        .result = NULL
    };
    
    pthread_t *threads = malloc(num_racers * sizeof(pthread_t));
    RaceWorker *workers = malloc(num_racers * sizeof(RaceWorker));
    
    // Start racers with random delays
    srand(time(NULL));
    for (int i = 0; i < num_racers; i++) {
        workers[i].id = i;
        workers[i].ctx = &ctx;
        workers[i].work_duration_ms = 100 + (rand() % 200);
        pthread_create(&threads[i], NULL, race_worker, &workers[i]);
    }
    
    // Wait for first completion
    pthread_mutex_lock(&ctx.mutex);
    while (!ctx.done) {
        pthread_cond_wait(&ctx.cond, &ctx.mutex);
    }
    int winner = ctx.completed_id;
    void *result = ctx.result;
    pthread_mutex_unlock(&ctx.mutex);
    
    printf("Thread %d won the race!\n", winner);
    
    // Still need to join all threads
    for (int i = 0; i < num_racers; i++) {
        pthread_join(threads[i], NULL);
    }
    
    free(threads);
    free(workers);
    return result;
}
 
/*
 * Pattern: Join with Cleanup on Failure
 * --------------------------------------
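 * If thread creation fails, cancel and join the threads already created.
 * Otherwise join every thread and count failures reported via return values.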
 */
 
typedef struct {
    pthread_t tid;
    bool created;
    bool failed;
} ThreadSlot;
 
void *fallible_worker(void *arg) {
    int id = (int)(intptr_t)arg;
    
    // Simulate possible failure
    if (id == 2) {
        return (void *)-1;  // Failure
    }
    
    usleep(100000);  // Work
    return (void *)0;  // Success
}
 
int run_with_failure_handling(int num_threads) {
    ThreadSlot *slots = calloc(num_threads, sizeof(ThreadSlot));
    int successful = 0;
    
    // Create threads
    for (int i = 0; i < num_threads; i++) {
        int result = pthread_create(&slots[i].tid, NULL, 
                                    fallible_worker, (void *)(intptr_t)i);
        if (result == 0) {
            slots[i].created = true;
        } else {
            fprintf(stderr, "Failed to create thread %d\n", i);
            goto cleanup;
        }
    }
    
    // Join all and check results
    for (int i = 0; i < num_threads; i++) {
        if (!slots[i].created) continue;
        
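        void *retval = NULL;
        if (pthread_join(slots[i].tid, &retval) != 0) {
            slots[i].failed = true;
            fprintf(stderr, "Failed to join thread %d\n", i);
            continue;
        }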
        
        if (retval == (void *)-1) {
            slots[i].failed = true;
            fprintf(stderr, "Thread %d failed\n", i);
        } else {
            successful++;
        }
    }
    
    free(slots);
    return successful;
    
cleanup:
    // Cancel and join all created threads
    for (int i = 0; i < num_threads; i++) {
        if (slots[i].created) {
            pthread_cancel(slots[i].tid);
        }
    }
    for (int i = 0; i < num_threads; i++) {
        if (slots[i].created) {
            pthread_join(slots[i].tid, NULL);
        }
    }
    free(slots);
    return -1;
}
 
/*
 * Pattern: Timed Join with pthread_timedjoin_np (Linux)
 * -----------------------------------------------------
 * Join with absolute timeout
 */
 
#ifdef __linux__
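/* pthread_timedjoin_np is a GNU extension: _GNU_SOURCE must be defined
 * before the first #include of the file (or passed as -D_GNU_SOURCE).
 * Defining it at this point would come too late to have any effect. */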
 
int timed_join_example(pthread_t tid, int timeout_seconds) {
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    ts.tv_sec += timeout_seconds;
    
    int result = pthread_timedjoin_np(tid, NULL, &ts);
    
    if (result == ETIMEDOUT) {
        printf("Join timed out\n");
        return -1;
    } else if (result != 0) {
        fprintf(stderr, "pthread_timedjoin_np: %s\n", strerror(result));
        return -1;
    }
    
    return 0;
}
#endif
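
The pattern list above also mentions cascading joins, which the listing does not cover. A minimal sketch of that idea follows. The names ChainStage, chain_stage, run_chain, and CHAIN_MAX are hypothetical and chosen only for illustration: each stage joins its predecessor before doing its own work, so the creator only needs to join the last stage.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define CHAIN_MAX 16   /* hypothetical upper bound on chain length */

/* Hypothetical per-stage record: each stage waits for its predecessor. */
typedef struct {
    pthread_t prev;    /* thread to join first (valid only if has_prev) */
    bool has_prev;
    int index;         /* position of this stage in the chain */
    int *log;          /* shared completion log */
    int *log_len;      /* number of log entries written so far */
} ChainStage;

static void *chain_stage(void *arg) {
    ChainStage *stage = arg;

    /* Block until the predecessor has terminated. The join also makes the
     * predecessor's writes to log and log_len visible to this thread. */
    if (stage->has_prev) {
        pthread_join(stage->prev, NULL);
    }

    stage->log[(*stage->log_len)++] = stage->index;
    return NULL;
}

/* Runs n chained stages and fills order[] with the completion order.
 * Returns 0 on success, -1 on an invalid n or a thread creation failure. */
int run_chain(int n, int *order) {
    pthread_t tids[CHAIN_MAX];
    ChainStage stages[CHAIN_MAX];
    int log_len = 0;
    int created = 0;

    if (n < 1 || n > CHAIN_MAX) return -1;

    for (int i = 0; i < n; i++) {
        stages[i] = (ChainStage){
            .has_prev = (i > 0), .index = i, .log = order, .log_len = &log_len
        };
        if (i > 0) stages[i].prev = tids[i - 1];
        if (pthread_create(&tids[i], NULL, chain_stage, &stages[i]) != 0) break;
        created++;
    }

    /* Only the last created stage is joined here: every earlier stage is
     * joined exactly once by its successor, so nothing is joined twice. */
    if (created > 0) pthread_join(tids[created - 1], NULL);

    return (created == n && log_len == n) ? 0 : -1;
}
```

Calling run_chain(4, order) on a four-element array should leave the stage indices in chain order, because no stage can log before its predecessor has terminated. The sketch relies on each thread being joined by exactly one other thread; a second joiner for the same stage would be a double join.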

Barrier vs Join

A barrier (pthread_barrier_t) makes all participating threads wait for each other at a synchronization point, after which they all continue running. A join is different: the waiting thread blocks until the target thread has terminated, and the target's resources are then reclaimed. Use barriers when threads need to synchronize mid-execution; use joins for final thread completion.
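
To make the contrast concrete, here is a small sketch in which workers meet at a barrier between two phases and are joined only at the very end. PhaseWorker, phase_worker, run_two_phases, and NUM_WORKERS are hypothetical names used for illustration. POSIX barriers are an optional feature, so this assumes a platform that provides them, such as Linux with glibc.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX barrier declarations */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NUM_WORKERS 4   /* hypothetical fixed worker count */

typedef struct {
    int id;
    pthread_barrier_t *barrier;
    int *phase1;   /* shared: slot i is written by worker i in phase 1 */
    int *phase2;   /* shared: slot i is written by worker i in phase 2 */
} PhaseWorker;

static void *phase_worker(void *arg) {
    PhaseWorker *w = arg;

    /* Phase 1: publish this worker's value. */
    w->phase1[w->id] = (w->id + 1) * 10;

    /* Barrier: no worker proceeds until all have published. Unlike a
     * join, every thread keeps running after the wait returns. */
    int rc = pthread_barrier_wait(w->barrier);
    if (rc != 0 && rc != PTHREAD_BARRIER_SERIAL_THREAD) return (void *)-1;

    /* Phase 2: reading a neighbour's phase-1 value is now safe. */
    int next = (w->id + 1) % NUM_WORKERS;
    w->phase2[w->id] = w->phase1[w->id] + w->phase1[next];
    return NULL;
}

/* Fills phase2_out[NUM_WORKERS]; returns 0 on success, -1 on failure. */
int run_two_phases(int *phase2_out) {
    pthread_t tids[NUM_WORKERS];
    PhaseWorker workers[NUM_WORKERS];
    int phase1[NUM_WORKERS] = {0};
    pthread_barrier_t barrier;
    int status = 0;

    if (pthread_barrier_init(&barrier, NULL, NUM_WORKERS) != 0) return -1;

    for (int i = 0; i < NUM_WORKERS; i++) {
        workers[i] = (PhaseWorker){ i, &barrier, phase1, phase2_out };
        if (pthread_create(&tids[i], NULL, phase_worker, &workers[i]) != 0) {
            /* Sketch only: peers already waiting at the barrier could
             * never be released, so creation failure is treated as fatal. */
            abort();
        }
    }

    /* Join: wait for final completion and reclaim each thread. */
    for (int i = 0; i < NUM_WORKERS; i++) {
        void *ret = NULL;
        if (pthread_join(tids[i], &ret) != 0 || ret != NULL) status = -1;
    }

    pthread_barrier_destroy(&barrier);
    return status;
}
```

The barrier orders the two phases while all four threads stay alive; the joins at the end are what actually release the threads' resources and guarantee that phase2_out is complete before the caller reads it.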

Common Pitfalls and Best Practices

Thread joining seems straightforward but harbors subtle pitfalls that cause resource leaks, deadlocks, and crashes. Understanding these issues is essential for robust concurrent programming.

Common Joining Pitfalls

•Forgetting to join joinable threads — Causes resource leaks analogous to zombie processes. Thread resources remain allocated until the thread is joined.
•Double joining — Joining a thread twice is undefined behavior. The thread ID may be reused, leading to joining the wrong thread or crashing.
•Joining detached threads — POSIX leaves this undefined; implementations such as glibc typically return EINVAL, which is easy to miss when return values are ignored. A thread that was detached cannot be joined.
•Joining yourself — A thread cannot join itself. Attempting this typically returns EDEADLK (deadlock detected).
•Returning pointers to local variables — The thread's stack frame is gone once the thread function returns, so the pointer handed back through join is dangling.
•Not handling InterruptedException (Java) — Swallowing the exception loses the interrupt signal; always restore interrupt status or propagate.
•Assuming join order equals completion order — Joining in creation order doesn't mean threads finished in that order.
•Blocking main thread unnecessarily — Joining early can serialize execution; join as late as possible.
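
The dangling-pointer pitfall above is usually avoided by returning heap memory through pthread_join. The following is a minimal sketch of that approach. SquareResult, square_worker, and compute_square are hypothetical names, and the convention that a NULL return value signals failure is an assumption of this example, not something pthreads prescribes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef struct {
    int input;
    long squared;
} SquareResult;

/* Worker: allocates its result on the heap so that it outlives the
 * thread's stack. Returning the address of a local SquareResult instead
 * would hand the joiner a pointer into a frame that no longer exists. */
static void *square_worker(void *arg) {
    int input = *(int *)arg;

    SquareResult *res = malloc(sizeof *res);
    if (res == NULL) return NULL;   /* assumed convention: NULL = failure */

    res->input = input;
    res->squared = (long)input * input;
    return res;                     /* ownership passes to the joiner */
}

/* Returns 0 and stores the square in *out, or -1 on any failure. */
int compute_square(int input, long *out) {
    pthread_t tid;
    void *retval = NULL;

    /* &input stays valid: this function does not return before the join. */
    if (pthread_create(&tid, NULL, square_worker, &input) != 0) return -1;
    if (pthread_join(tid, &retval) != 0) return -1;
    if (retval == NULL) return -1;

    SquareResult *res = retval;
    *out = res->squared;
    free(res);                      /* the joiner frees what the worker allocated */
    return 0;
}
```

The ownership rule is the important part: the worker allocates, the joiner frees. If the joiner discards the return value by passing NULL to pthread_join, the allocation leaks.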

Thread Joining Best Practices

•Pair every create with either join or detach — Make the decision at creation time and document it. No joinable thread should go unjoined.
•Check return values — pthread_join can fail (EINVAL, ESRCH, EDEADLK). Handle errors appropriately.
•Use thread IDs carefully after join — After joining, the thread ID may be reused. Clear or invalidate your stored ID.
•Prefer heap allocation for return values — Allocate results on heap (caller frees) or use shared structures passed as arguments.
•Consider timeout-based joining — For responsiveness, use timed joins and handle timeout cases (cancel, retry, or continue).
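•Join threads in reverse creation order when later threads depend on earlier ones — This mirrors stack-style teardown, so a thread is reclaimed only after the threads that rely on it have finished. Join order does not change when threads actually complete.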
•Use thread pools for task-based work — Pools manage thread lifecycles automatically; you collect results via Futures.
•Document thread ownership — Clearly specify who is responsible for joining each thread, especially in libraries.
•Test with stress — Thread timing issues are non-deterministic. Stress test join patterns under load.
•Consider structured concurrency — Newer APIs (such as Java's StructuredTaskScope, a preview API in Java 21) make join semantics safer by scoping thread lifetimes.
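
Several of these practices (pairing create with join or detach, checking return values, invalidating the ID after a join) can be bundled into a small wrapper. The sketch below is one possible shape, not a standard API: ThreadHandle, handle_start, handle_join, handle_detach, and answer_worker are hypothetical names, and the wrapper assumes a single owning thread calls these functions (the flag itself is not synchronized).

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper: records whether the thread still needs joining. */
typedef struct {
    pthread_t tid;
    bool joinable;   /* true between a successful create and join/detach */
} ThreadHandle;

/* Returns the pthread_create error code (0 on success). */
int handle_start(ThreadHandle *h, void *(*fn)(void *), void *arg) {
    h->joinable = false;
    int rc = pthread_create(&h->tid, NULL, fn, arg);
    if (rc == 0) h->joinable = true;
    return rc;
}

/* Joins at most once. A repeated call returns EINVAL instead of passing
 * a stale pthread_t to pthread_join. */
int handle_join(ThreadHandle *h, void **retval) {
    if (!h->joinable) return EINVAL;
    int rc = pthread_join(h->tid, retval);
    if (rc == 0) h->joinable = false;   /* the ID may be reused from here on */
    return rc;
}

/* Detaches at most once; afterwards the handle can no longer be joined. */
int handle_detach(ThreadHandle *h) {
    if (!h->joinable) return EINVAL;
    int rc = pthread_detach(h->tid);
    if (rc == 0) h->joinable = false;
    return rc;
}

/* Trivial worker used to exercise the wrapper. */
static void *answer_worker(void *arg) {
    (void)arg;
    return (void *)(intptr_t)42;
}
```

A typical use is handle_start followed by exactly one of handle_join or handle_detach. Because the flag is cleared on success, an accidental second join is reported as EINVAL by the wrapper rather than reaching pthread_join with an ID that may already belong to another thread.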

Page Complete

You now have comprehensive knowledge of thread joining across POSIX, Windows, and Java platforms. You understand the semantics of joining vs detaching, timeout-based waiting, advanced synchronization patterns, and common pitfalls. This completes our exploration of thread libraries—you're now equipped to write robust, well-coordinated multithreaded applications on any major platform.