One of the most critical aspects of multithreaded programming is knowing when threads finish and collecting their results. Thread joining is the fundamental mechanism for this coordination—it allows one thread to wait for another thread to complete execution, synchronize on its termination, and optionally retrieve its return value.
Without proper joining, concurrent programs suffer from leaked thread resources, reads of stale data, and lost return values.
This page provides an exhaustive exploration of thread joining across POSIX, Windows, and Java environments, covering the semantics, patterns, edge cases, and best practices that professional engineers must master.
By the end of this page, you will understand the complete thread joining semantics across platforms, the relationship between joining and detaching, timeout-based joining techniques, patterns for joining multiple threads, and how to properly structure thread lifecycles for robust resource management.
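Thread joining is a blocking synchronization operation. When thread A joins thread B, A blocks until B terminates, synchronizes on B's termination, and can optionally retrieve B's return value.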
This simple concept has profound implications for program structure and correctness.
Memory Reclamation: Even after a thread's function returns, its kernel structures and stack may persist until joined. This is analogous to zombie processes—the thread is dead but its metadata remains allocated.
Synchronization Guarantee: A join provides a happens-before relationship: all memory writes by the joined thread are visible to the joining thread after the join completes. Without this, you might read stale data.
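The visibility guarantee can be sketched as follows, assuming a POSIX system. The names `fill_buffer` and `sum_after_join` are hypothetical and exist only for this illustration. The shared array uses no mutex and no atomics, and the read is still safe because it happens strictly after the join.

```c
#include <assert.h>
#include <pthread.h>

#define N 1000

/* Plain shared data: no mutex, no atomics, no volatile. */
static int buffer[N];

/* Worker: fills the shared buffer with 1..N, then exits. */
static void *fill_buffer(void *arg)
{
    (void)arg;
    for (int i = 0; i < N; i++) {
        buffer[i] = i + 1;
    }
    return NULL;
}

/* Starts the worker, joins it, then sums the buffer. */
long sum_after_join(void)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, fill_buffer, NULL);
    assert(rc == 0);

    /* Reading buffer[] at this point would be a data race. */

    rc = pthread_join(tid, NULL);
    assert(rc == 0);

    /* After a successful join, every write made by the worker is visible. */
    long sum = 0;
    for (int i = 0; i < N; i++) {
        sum += buffer[i];
    }
    return sum;
}
```

Calling `sum_after_join()` from `main` and printing the result is enough to try it; compile with `-pthread`.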
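Return Value Collection: Threads can hand back a result (a void* in POSIX, a DWORD exit code in Windows). In POSIX, pthread_join() is the only way to retrieve that value. On Windows the exit code is read with GetExitCodeThread() after the wait, and in Java, where Thread.join() returns nothing, results travel through shared state or a Future.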
| Platform | Join Function | Timeout Support | Return Value |
|---|---|---|---|
| POSIX | pthread_join() | No (use timed alternatives) | void* (retval pointer) |
| POSIX | pthread_timedjoin_np() | Yes (Linux extension) | void* (retval pointer) |
| Windows | WaitForSingleObject() | Yes (DWORD milliseconds) | Via GetExitCodeThread() |
| Windows | WaitForMultipleObjects() | Yes (DWORD milliseconds) | Via GetExitCodeThread() |
| Java | Thread.join() | No (blocks indefinitely) | None (use shared state) |
| Java | Thread.join(millis) | Yes (long milliseconds) | None (use shared state) |
A thread can be either joinable (the default) or detached—never both. A joinable thread MUST be joined to reclaim its resources. A detached thread automatically releases resources upon termination but cannot be joined. Once detached, a thread cannot be made joinable again.
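One way to enforce the "joined or detached, exactly once" rule in code is to pair each pthread_t with a flag. The sketch below is only an illustration under that assumption; `tracked_thread`, `tracked_start`, `tracked_join`, `tracked_detach`, and `noop` are hypothetical names, not part of any pthreads API, and the wrapper is not itself thread-safe.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical wrapper: a thread ID plus a flag that records whether the
 * thread has already been joined or detached. */
typedef struct {
    pthread_t tid;
    bool released;   /* true once joined, detached, or never started */
} tracked_thread;

/* Trivial thread function used for demonstration. */
static void *noop(void *arg) { return arg; }

/* Creates a joinable thread and marks it as not yet released. */
int tracked_start(tracked_thread *t, void *(*fn)(void *), void *arg)
{
    assert(t != NULL);
    int rc = pthread_create(&t->tid, NULL, fn, arg);
    t->released = (rc != 0);   /* nothing to release if creation failed */
    return rc;
}

/* Joins the thread once; later join or detach attempts get EINVAL
 * from the wrapper instead of reaching pthread_join() again. */
int tracked_join(tracked_thread *t, void **retval)
{
    assert(t != NULL);
    if (t->released) {
        return EINVAL;
    }
    int rc = pthread_join(t->tid, retval);
    if (rc == 0) {
        t->released = true;
    }
    return rc;
}

/* Detaches the thread once; it can no longer be joined afterwards. */
int tracked_detach(tracked_thread *t)
{
    assert(t != NULL);
    if (t->released) {
        return EINVAL;
    }
    int rc = pthread_detach(t->tid);
    if (rc == 0) {
        t->released = true;
    }
    return rc;
}
```

A caller would start a thread with `tracked_start(&t, noop, NULL)` and then call either `tracked_join` or `tracked_detach`; a second call to either returns EINVAL from the wrapper without touching the stale thread ID.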
The pthread_join() function is the standard POSIX mechanism for joining threads. It blocks the calling thread until the target thread terminates.
int pthread_join(pthread_t thread, void **retval);
Parameters:
thread: The thread ID to wait for
retval: If non-NULL, receives the thread's return value (or PTHREAD_CANCELED if canceled)
Returns: 0 on success, error code on failure

EDEADLK: A deadlock was detected (the thread is joining itself, or two threads are joining each other)
EINVAL: The thread is not joinable (for example, it is detached), or another thread is already waiting to join it
ESRCH: No thread with the given ID exists

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <stdbool.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>

/*
 * Basic pthread_join Usage
 */

void *compute_sum(void *arg) {
    int limit = (int)(intptr_t)arg;
    long sum = 0;

    for (int i = 1; i <= limit; i++) {
        sum += i;
    }

    // Return value via void* (must be long-lived!)
    // Allocate on heap or use static/global
    long *result = malloc(sizeof(long));
    if (result == NULL) {
        return NULL;  // Joiner must treat NULL as failure
    }
    *result = sum;
    return (void *)result;
}

void basic_join_example(void) {
    pthread_t tid;
    void *retval;
    int result;

    // Create thread
    result = pthread_create(&tid, NULL, compute_sum, (void *)(intptr_t)1000);
    if (result != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(result));
        exit(1);
    }

    printf("Main: waiting for thread to complete...\n");

    // Join thread - blocks until thread exits
    result = pthread_join(tid, &retval);
    if (result != 0) {
        fprintf(stderr, "pthread_join: %s\n", strerror(result));
        exit(1);
    }

    // Thread has terminated; retval contains its return value
    long *sum = (long *)retval;
    if (sum != NULL) {
        printf("Main: thread returned sum = %ld\n", *sum);
    }

    // Free the heap-allocated result (free(NULL) is a no-op)
    free(sum);
}

/*
 * Returning Values: Common Patterns
 */

// Pattern 1: Return integer via pointer cast
void *return_integer(void *arg) {
    (void)arg;
    int result = 42;
    return (void *)(intptr_t)result;  // Cast int to pointer
}

void join_integer(pthread_t tid) {
    void *retval;
    pthread_join(tid, &retval);
    int result = (int)(intptr_t)retval;  // Cast pointer back to int
    printf("Thread returned: %d\n", result);
}

// Pattern 2: Return status code
void *return_status(void *arg) {
    (void)arg;
    if (/* error condition */ 0) {
        return (void *)-1;  // Error
    }
    return (void *)0;  // Success
}

// Pattern 3: Return pointer to shared structure (careful with lifecycle!)
typedef struct {
    int computed_value;
    char message[256];
} TaskResult;

void *return_struct(void *arg) {
    TaskResult *task = (TaskResult *)arg;  // Get input struct

    // Compute and store results in same struct
    task->computed_value = 123;
    strcpy(task->message, "Task completed successfully");

    return task;  // Return pointer to same struct
}

/*
 * Joining Multiple Threads
 */

void join_multiple_threads(void) {
    enum { NUM_THREADS = 10 };
    pthread_t threads[NUM_THREADS];
    void *results[NUM_THREADS];

    // Create all threads
    for (int i = 0; i < NUM_THREADS; i++) {
        int result = pthread_create(&threads[i], NULL, compute_sum,
                                    (void *)(intptr_t)(i * 100 + 100));
        if (result != 0) {
            // Joining an uninitialized pthread_t would be undefined, so stop here
            fprintf(stderr, "pthread_create[%d]: %s\n", i, strerror(result));
            exit(1);
        }
    }

    // Join all threads (in order)
    for (int i = 0; i < NUM_THREADS; i++) {
        int result = pthread_join(threads[i], &results[i]);
        if (result != 0) {
            fprintf(stderr, "pthread_join[%d]: %s\n", i, strerror(result));
            results[i] = NULL;
        }
    }

    // Process results
    long total = 0;
    for (int i = 0; i < NUM_THREADS; i++) {
        if (results[i] != NULL) {
            long *partial = (long *)results[i];
            total += *partial;
            free(partial);
        }
    }

    printf("Total sum from all threads: %ld\n", total);
}

/*
 * Detecting Canceled Threads
 */

void *cancellable_work(void *arg) {
    (void)arg;
    while (1) {
        // Do some work
        sleep(1);
        pthread_testcancel();  // Cancellation point
    }
    return (void *)0;  // Never reached if canceled
}

void join_canceled_thread(void) {
    pthread_t tid;
    void *retval;

    pthread_create(&tid, NULL, cancellable_work, NULL);

    sleep(2);
    pthread_cancel(tid);  // Request cancellation

    pthread_join(tid, &retval);

    if (retval == PTHREAD_CANCELED) {
        printf("Thread was canceled\n");
    } else {
        printf("Thread exited normally\n");
    }
}

/*
 * Error Handling: Double Join Prevention
 */

void prevent_double_join(void) {
    pthread_t tid;
    bool joined = false;
    void *retval = NULL;
    int result;

    result = pthread_create(&tid, NULL, compute_sum, (void *)(intptr_t)100);
    if (result != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(result));
        return;
    }

    // First join: success
    result = pthread_join(tid, &retval);
    printf("First join: %s\n", result == 0 ? "success" : strerror(result));
    if (result == 0) {
        joined = true;
        free(retval);  // compute_sum returns a heap-allocated result
    }

    /*
     * After a thread is joined:
     * - Its pthread_t may be reused for new threads
     * - Joining again is undefined behavior (it may return EINVAL or ESRCH,
     *   crash, or join an unrelated thread)
     *
     * Best practice: track the joined state in a flag and check it first.
     * pthread_t is an opaque type, so it cannot portably be set to 0.
     */
    if (!joined) {
        result = pthread_join(tid, NULL);
        printf("Retry join: %s\n", result == 0 ? "success" : strerror(result));
    } else {
        printf("Second join skipped: thread already joined\n");
    }
}

Never return a pointer to a local variable from a thread function! The stack frame is deallocated when the thread exits. Options: (1) return an integer cast to void*, (2) heap-allocate the result (caller frees), (3) use a shared structure passed as argument, (4) write to global or thread-specific storage.
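A detached thread is one that cannot be joined. When a detached thread terminates, its resources are automatically reclaimed by the system, so no other thread needs to wait for it. This is useful for background work whose return value is not needed and whose completion nobody has to synchronize on.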
There are two ways to create detached threads:
Method 1: Via Attributes
pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
pthread_create(&tid, &attr, func, arg);
pthread_attr_destroy(&attr);
Method 2: Detach After Creation
pthread_create(&tid, NULL, func, arg);
pthread_detach(tid); // Thread is now detached
/* Needed before any #include for pthread_getattr_np() on Linux */
#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/*
 * Detached Thread Patterns
 */

void *background_task(void *arg) {
    int id = (int)(intptr_t)arg;

    printf("Background task %d starting\n", id);
    sleep(2);  // Simulate work
    printf("Background task %d complete\n", id);

    // When this function returns, resources are automatically freed
    return NULL;
}

/*
 * Method 1: Create as detached via attributes
 */
void create_detached_via_attr(void) {
    pthread_t tid;
    pthread_attr_t attr;
    int result;

    // Initialize attributes
    result = pthread_attr_init(&attr);
    if (result != 0) {
        fprintf(stderr, "attr_init: %s\n", strerror(result));
        return;
    }

    // Set detached state
    result = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (result != 0) {
        fprintf(stderr, "setdetachstate: %s\n", strerror(result));
        pthread_attr_destroy(&attr);
        return;
    }

    // Create thread
    result = pthread_create(&tid, &attr, background_task, (void *)(intptr_t)1);
    if (result != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(result));
    } else {
        printf("Created detached thread\n");
    }

    // Cannot join this thread - it's detached
    // pthread_join(tid, NULL);  // Not allowed: typically EINVAL on Linux

    pthread_attr_destroy(&attr);
}

/*
 * Method 2: Detach after creation
 */
void create_then_detach(void) {
    pthread_t tid;
    int result;

    // Create as joinable (default)
    result = pthread_create(&tid, NULL, background_task, (void *)(intptr_t)2);
    if (result != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(result));
        return;
    }

    // Immediately detach
    result = pthread_detach(tid);
    if (result != 0) {
        fprintf(stderr, "pthread_detach: %s\n", strerror(result));
    }

    printf("Thread created and detached\n");
}

/*
 * Self-detaching thread
 */
void *self_detaching_task(void *arg) {
    (void)arg;

    // Detach ourselves so no one needs to join
    pthread_detach(pthread_self());

    printf("Self-detached thread running\n");
    sleep(1);
    printf("Self-detached thread exiting\n");

    return NULL;
}

/*
 * Daemon-style thread with graceful shutdown
 */

int shutdown_requested = 0;  // Protected by shutdown_mutex
pthread_mutex_t shutdown_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t shutdown_cond = PTHREAD_COND_INITIALIZER;

void *daemon_thread(void *arg) {
    (void)arg;

    // Detach - we manage our own lifecycle
    pthread_detach(pthread_self());

    printf("Daemon thread started\n");

    while (1) {
        // Do periodic work
        printf("Daemon: heartbeat\n");

        // Wait for shutdown signal with timeout
        pthread_mutex_lock(&shutdown_mutex);

        struct timespec timeout;
        clock_gettime(CLOCK_REALTIME, &timeout);
        timeout.tv_sec += 1;  // 1 second timeout

        // Skip the wait if shutdown was requested before we got here.
        // The return value is not needed: the flag is rechecked below
        // whether the wait timed out or was signaled.
        if (!shutdown_requested) {
            (void)pthread_cond_timedwait(&shutdown_cond, &shutdown_mutex, &timeout);
        }

        int should_exit = shutdown_requested;
        pthread_mutex_unlock(&shutdown_mutex);

        if (should_exit) {
            printf("Daemon: shutdown requested, exiting\n");
            break;
        }
    }

    return NULL;
}

void request_daemon_shutdown(void) {
    pthread_mutex_lock(&shutdown_mutex);
    shutdown_requested = 1;
    pthread_cond_signal(&shutdown_cond);
    pthread_mutex_unlock(&shutdown_mutex);
}

/*
 * Checking if a thread is detached
 */
void check_detach_state(pthread_t tid) {
    // Get thread's current attributes (Linux extension)
#ifdef __linux__
    pthread_attr_t attr;
    int detach_state;

    if (pthread_getattr_np(tid, &attr) == 0) {
        pthread_attr_getdetachstate(&attr, &detach_state);
        printf("Thread is %s\n",
               detach_state == PTHREAD_CREATE_DETACHED ? "DETACHED" : "JOINABLE");
        pthread_attr_destroy(&attr);
    }
#else
    (void)tid;
#endif
}

/*
 * DANGER: Joining a detached thread
 */
void demonstrate_join_detached_error(void) {
    pthread_t tid;

    // Create and detach
    pthread_create(&tid, NULL, background_task, (void *)(intptr_t)3);
    pthread_detach(tid);

    // Try to join. POSIX leaves this undefined; Linux typically
    // reports EINVAL. Never rely on this in real code.
    int result = pthread_join(tid, NULL);
    if (result != 0) {
        // EINVAL: thread is not joinable
        fprintf(stderr, "pthread_join on detached: %s\n", strerror(result));
    }
}

Detach threads when: (1) you don't need their return value, (2) you don't need to synchronize on their completion, (3) they should run independently for the process lifetime. Default to joinable threads unless you have a specific reason to detach; it's easier to debug and reason about thread lifecycles when you explicitly join.
Windows doesn't have a separate "join" concept—threads are joined using the unified wait functions that work on any synchronizable kernel object. A thread handle becomes signaled when the thread terminates, making WaitForSingleObject() the join equivalent.
WaitForSingleObject(hThread, timeout): Wait for one thread
WaitForMultipleObjects(count, handles, waitAll, timeout): Wait for multiple threads
WaitForMultipleObjectsEx(): Extended version with alertable waits

After waiting, use GetExitCodeThread() to retrieve the thread's return value:
DWORD exitCode;
WaitForSingleObject(hThread, INFINITE);
GetExitCodeThread(hThread, &exitCode);
#include <windows.h>
#include <stdio.h>

/*
 * Windows Thread Joining Patterns
 */

DWORD WINAPI ComputeTask(LPVOID lpParam) {
    int value = (int)(INT_PTR)lpParam;

    printf("Thread: computing with value %d\n", value);
    Sleep(1000);  // Simulate work

    // Return value becomes the exit code
    return value * 2;
}

/*
 * Basic joining with WaitForSingleObject
 */
void basic_join(void) {
    HANDLE hThread;
    DWORD threadId;
    DWORD exitCode;
    DWORD waitResult;

    hThread = CreateThread(NULL, 0, ComputeTask, (LPVOID)(INT_PTR)21, 0, &threadId);
    if (hThread == NULL) {
        printf("CreateThread failed: %lu\n", GetLastError());
        return;
    }

    printf("Main: waiting for thread %lu...\n", threadId);

    // Wait indefinitely for thread to terminate
    waitResult = WaitForSingleObject(hThread, INFINITE);

    switch (waitResult) {
    case WAIT_OBJECT_0:
        // Thread terminated
        GetExitCodeThread(hThread, &exitCode);
        printf("Thread exited with code: %lu\n", exitCode);
        break;
    case WAIT_FAILED:
        printf("WaitForSingleObject failed: %lu\n", GetLastError());
        break;
    }

    CloseHandle(hThread);
}

/*
 * Joining with timeout
 */
void join_with_timeout(void) {
    HANDLE hThread;
    DWORD threadId;
    DWORD waitResult;

    hThread = CreateThread(NULL, 0, ComputeTask, (LPVOID)(INT_PTR)100, 0, &threadId);
    if (hThread == NULL) {
        printf("CreateThread failed: %lu\n", GetLastError());
        return;
    }

    // Wait for 500ms
    waitResult = WaitForSingleObject(hThread, 500);

    switch (waitResult) {
    case WAIT_OBJECT_0:
        printf("Thread completed within timeout\n");
        break;
    case WAIT_TIMEOUT:
        printf("Timeout! Thread still running\n");
        // Could terminate the thread (dangerous!) or wait more
        // TerminateThread(hThread, 1);  // BAD PRACTICE
        WaitForSingleObject(hThread, INFINITE);  // Wait for completion
        break;
    case WAIT_FAILED:
        printf("Wait failed: %lu\n", GetLastError());
        break;
    }

    CloseHandle(hThread);
}

/*
 * Joining multiple threads
 */
void join_multiple_threads(void) {
    const DWORD NUM_THREADS = 5;
    HANDLE threads[5];
    DWORD created = 0;
    DWORD i;

    // Create multiple threads; keep the array free of NULL handles,
    // which would make WaitForMultipleObjects fail
    for (i = 0; i < NUM_THREADS; i++) {
        HANDLE h = CreateThread(NULL, 0, ComputeTask,
                                (LPVOID)(INT_PTR)(i * 10), 0, NULL);
        if (h == NULL) {
            printf("Failed to create thread %lu: %lu\n", i, GetLastError());
            continue;
        }
        threads[created++] = h;
    }

    if (created == 0) {
        return;
    }

    printf("Waiting for all threads...\n");

    // Wait for ALL threads to complete
    DWORD waitResult = WaitForMultipleObjects(
        created,   // Number of handles
        threads,   // Handle array
        TRUE,      // Wait for ALL (FALSE = any one)
        INFINITE   // Timeout
    );

    if (waitResult >= WAIT_OBJECT_0 && waitResult < WAIT_OBJECT_0 + created) {
        printf("All threads completed\n");

        // Get each thread's exit code
        for (i = 0; i < created; i++) {
            DWORD exitCode;
            GetExitCodeThread(threads[i], &exitCode);
            printf("Thread %lu exit code: %lu\n", i, exitCode);
        }
    } else if (waitResult == WAIT_FAILED) {
        printf("WaitForMultipleObjects failed: %lu\n", GetLastError());
    }

    // Close all handles
    for (i = 0; i < created; i++) {
        CloseHandle(threads[i]);
    }
}

/*
 * Waiting for first completion (any thread)
 */
void join_first_completion(void) {
    const DWORD NUM_THREADS = 3;
    HANDLE threads[3];
    DWORD i;

    // Create threads with different inputs. ComputeTask sleeps a fixed
    // 1000 ms, so which thread finishes first is up to the scheduler.
    threads[0] = CreateThread(NULL, 0, ComputeTask, (LPVOID)(INT_PTR)1000, 0, NULL);
    threads[1] = CreateThread(NULL, 0, ComputeTask, (LPVOID)(INT_PTR)500, 0, NULL);
    threads[2] = CreateThread(NULL, 0, ComputeTask, (LPVOID)(INT_PTR)750, 0, NULL);

    // If any creation failed, wait for and close the threads that did
    // start, then give up
    if (threads[0] == NULL || threads[1] == NULL || threads[2] == NULL) {
        printf("CreateThread failed: %lu\n", GetLastError());
        for (i = 0; i < NUM_THREADS; i++) {
            if (threads[i] != NULL) {
                WaitForSingleObject(threads[i], INFINITE);
                CloseHandle(threads[i]);
            }
        }
        return;
    }

    // Wait for ANY one thread
    DWORD waitResult = WaitForMultipleObjects(
        NUM_THREADS,
        threads,
        FALSE,     // Wait for ANY (not all)
        INFINITE
    );

    if (waitResult >= WAIT_OBJECT_0 && waitResult < WAIT_OBJECT_0 + NUM_THREADS) {
        DWORD index = waitResult - WAIT_OBJECT_0;
        printf("Thread %lu completed first\n", index);
    }

    // Still need to wait for remaining threads
    WaitForMultipleObjects(NUM_THREADS, threads, TRUE, INFINITE);

    // Close handles
    for (i = 0; i < NUM_THREADS; i++) {
        CloseHandle(threads[i]);
    }
}

/*
 * Checking thread state without blocking
 */
void non_blocking_check(HANDLE hThread) {
    // Zero timeout = non-blocking
    DWORD result = WaitForSingleObject(hThread, 0);

    switch (result) {
    case WAIT_OBJECT_0:
        printf("Thread has terminated\n");
        break;
    case WAIT_TIMEOUT:
        printf("Thread still running\n");
        break;
    case WAIT_FAILED:
        printf("Check failed: %lu\n", GetLastError());
        break;
    }

    // Alternative: use GetExitCodeThread
    DWORD exitCode;
    if (GetExitCodeThread(hThread, &exitCode)) {
        if (exitCode == STILL_ACTIVE) {
            printf("Thread is STILL_ACTIVE\n");
        } else {
            printf("Thread exit code: %lu\n", exitCode);
        }
    }
}

GetExitCodeThread() returns STILL_ACTIVE (value 259) if the thread hasn't terminated. However, a thread could legitimately return 259 as its exit code! For reliable state checking, use WaitForSingleObject with 0 timeout. Only use GetExitCodeThread for the actual exit value after waiting confirms termination.
Java's Thread.join() method provides straightforward join semantics. The JVM releases a thread's native resources when the thread terminates, whether or not another thread joins it, so there are no "zombie" threads as in POSIX. Joining still provides essential synchronization guarantees.
void join() // Wait indefinitely
void join(long millis) // Wait with millisecond timeout
void join(long millis, int nanos) // Wait with nanosecond precision
All join variants throw InterruptedException if the waiting thread is interrupted.
A successful join() establishes a happens-before relationship: all actions in the terminated thread happen-before the join() returns. This makes all writes by the terminated thread visible to the joining thread.
import java.util.concurrent.*;
import java.util.List;
import java.util.ArrayList;

/**
 * Java Thread Joining Patterns
 */
public class JavaJoining {

    /**
     * Basic join - wait indefinitely
     */
    public static void basicJoin() {
        Thread worker = new Thread(() -> {
            System.out.println("Worker starting");
            try {
                Thread.sleep(2000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            System.out.println("Worker complete");
        });

        worker.start();

        try {
            System.out.println("Main: waiting for worker...");
            worker.join();  // Blocks until worker terminates
            System.out.println("Main: worker has terminated");
        } catch (InterruptedException e) {
            System.out.println("Main: interrupted while waiting");
            Thread.currentThread().interrupt();
        }
    }

    /**
     * Join with timeout
     */
    public static void joinWithTimeout() {
        Thread slowWorker = new Thread(() -> {
            try {
                Thread.sleep(5000);  // 5 seconds
            } catch (InterruptedException e) {
                System.out.println("Worker interrupted");
                Thread.currentThread().interrupt();
            }
        });

        slowWorker.start();

        try {
            // Wait 2 seconds max
            slowWorker.join(2000);

            if (slowWorker.isAlive()) {
                System.out.println("Timeout! Worker still running");
                // Option 1: Let it continue
                // Option 2: Interrupt it
                slowWorker.interrupt();
                slowWorker.join();  // Wait for interrupt to take effect
            } else {
                System.out.println("Worker completed within timeout");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /**
     * Joining multiple threads
     */
    public static void joinMultipleThreads() {
        List<Thread> workers = new ArrayList<>();

        // Create and start workers
        for (int i = 0; i < 5; i++) {
            final int id = i;
            Thread t = new Thread(() -> {
                System.out.println("Worker " + id + " starting");
                try {
                    Thread.sleep(1000 + id * 500);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                System.out.println("Worker " + id + " done");
            }, "Worker-" + i);

            workers.add(t);
            t.start();
        }

        System.out.println("All workers started, waiting for completion...");

        // Join all threads
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                System.out.println("Interrupted while waiting for " + t.getName());
                Thread.currentThread().interrupt();
                return;  // With the flag set, further join() calls would throw at once
            }
        }

        System.out.println("All workers completed");
    }

    /**
     * Getting results: since join() returns void, use shared state
     */
    public static void getResultsFromThreads() {
        // Pattern 1: Results array
        final int NUM_THREADS = 4;
        final int[] results = new int[NUM_THREADS];
        Thread[] threads = new Thread[NUM_THREADS];

        for (int i = 0; i < NUM_THREADS; i++) {
            final int index = i;
            threads[i] = new Thread(() -> {
                results[index] = index * index;  // Compute result
            });
            threads[i].start();
        }

        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;  // Not every thread was joined, so results may be incomplete
            }
        }

        // Results are guaranteed visible after join
        for (int i = 0; i < NUM_THREADS; i++) {
            System.out.println("Result " + i + ": " + results[i]);
        }
    }

    /**
     * Using Callable/Future for return values (preferred!)
     */
    public static void useCallableForResults() throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(4);

        try {
            // Submit tasks that return values
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < 5; i++) {
                final int value = i;
                Future<Integer> future = executor.submit(() -> {
                    Thread.sleep(500);
                    return value * value;
                });
                futures.add(future);
            }

            // Get results (blocks until each task completes)
            int total = 0;
            for (Future<Integer> future : futures) {
                total += future.get();  // Blocks until result available
            }

            System.out.println("Total: " + total);
        } finally {
            executor.shutdown();  // Runs even if a task throws
        }
    }

    /**
     * Daemon threads don't need joining
     */
    public static void daemonThreads() {
        Thread daemon = new Thread(() -> {
            while (true) {
                System.out.println("Daemon heartbeat");
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    break;
                }
            }
        });

        daemon.setDaemon(true);  // Mark as daemon BEFORE start
        daemon.start();

        // Daemon threads:
        // - Don't prevent JVM shutdown
        // - Are abruptly killed when last non-daemon exits
        // - Generally shouldn't be joined (they run forever)

        System.out.println("Main exiting, daemon will be killed");
    }

    /**
     * Proper interrupt handling during join
     */
    public static void handleInterruptDuringJoin(Thread target) {
        boolean interrupted = false;

        while (true) {
            try {
                target.join();
                break;  // Successfully joined
            } catch (InterruptedException e) {
                // Remember we were interrupted, but keep waiting
                interrupted = true;
            }
        }

        // Restore interrupt status
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }
}

For most applications, use ExecutorService with Callable/Future instead of raw threads with join(). Executors provide: cleaner result handling (Future.get()), automatic thread reuse, graceful shutdown, exception handling, and timeout support. Raw Thread.join() is still useful for simple cases or when you need precise thread lifecycle control.
Real-world applications often require more sophisticated join patterns than basic one-at-a-time waiting. This section covers advanced techniques for coordinating multiple thread completions.
/* Must be defined before any #include to expose pthread_timedjoin_np() */
#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <stdbool.h>
#include <time.h>
#include <errno.h>
#include <unistd.h>

/*
 * Pattern: Parallel Map-Reduce with Join
 * ------------------------------------
 * Scatter work across threads, gather results
 */

typedef struct {
    int *data;
    int start;
    int end;
    long partial_sum;
    bool started;  // True if a thread was created for this chunk
} ChunkWork;

void *sum_chunk(void *arg) {
    ChunkWork *work = (ChunkWork *)arg;
    work->partial_sum = 0;

    for (int i = work->start; i < work->end; i++) {
        work->partial_sum += work->data[i];
    }

    return NULL;
}

long parallel_sum(int *data, int size, int num_threads) {
    if (num_threads < 1) {
        num_threads = 1;
    }

    pthread_t *threads = malloc(num_threads * sizeof(pthread_t));
    ChunkWork *works = malloc(num_threads * sizeof(ChunkWork));
    if (threads == NULL || works == NULL) {
        // Out of memory: fall back to a sequential sum
        free(threads);
        free(works);
        long total = 0;
        for (int i = 0; i < size; i++) {
            total += data[i];
        }
        return total;
    }

    int chunk_size = size / num_threads;

    // Scatter
    for (int i = 0; i < num_threads; i++) {
        works[i].data = data;
        works[i].start = i * chunk_size;
        works[i].end = (i == num_threads - 1) ? size : (i + 1) * chunk_size;

        works[i].started =
            (pthread_create(&threads[i], NULL, sum_chunk, &works[i]) == 0);
        if (!works[i].started) {
            sum_chunk(&works[i]);  // Creation failed: compute this chunk inline
        }
    }

    // Gather (join all and collect results)
    long total = 0;
    for (int i = 0; i < num_threads; i++) {
        if (works[i].started) {
            pthread_join(threads[i], NULL);
        }
        total += works[i].partial_sum;
    }

    free(threads);
    free(works);
    return total;
}

/*
 * Pattern: First-to-Complete (Racing Threads)
 * -------------------------------------------
 * Wait for any thread to finish, then proceed
 * Uses condition variable instead of sequential joins
 */

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int completed_id;
    void *result;
    bool done;
} RaceContext;

typedef struct {
    int id;
    RaceContext *ctx;
    int work_duration_ms;
    bool started;  // True if a thread was created for this racer
} RaceWorker;

void *race_worker(void *arg) {
    RaceWorker *worker = (RaceWorker *)arg;

    // Simulate variable work time
    usleep(worker->work_duration_ms * 1000);

    pthread_mutex_lock(&worker->ctx->mutex);
    if (!worker->ctx->done) {
        // We're the first!
        worker->ctx->done = true;
        worker->ctx->completed_id = worker->id;
        worker->ctx->result = (void *)(intptr_t)(worker->id * 100);
        pthread_cond_signal(&worker->ctx->cond);
    }
    pthread_mutex_unlock(&worker->ctx->mutex);

    return NULL;
}

void *first_to_complete(int num_racers) {
    RaceContext ctx = {
        .mutex = PTHREAD_MUTEX_INITIALIZER,
        .cond = PTHREAD_COND_INITIALIZER,
        .done = false,
        .completed_id = -1,
        .result = NULL
    };

    if (num_racers < 1) {
        return NULL;
    }

    pthread_t *threads = malloc(num_racers * sizeof(pthread_t));
    RaceWorker *workers = malloc(num_racers * sizeof(RaceWorker));
    if (threads == NULL || workers == NULL) {
        free(threads);
        free(workers);
        return NULL;
    }

    // Start racers with random delays
    srand(time(NULL));
    int started = 0;
    for (int i = 0; i < num_racers; i++) {
        workers[i].id = i;
        workers[i].ctx = &ctx;
        workers[i].work_duration_ms = 100 + (rand() % 200);
        workers[i].started =
            (pthread_create(&threads[i], NULL, race_worker, &workers[i]) == 0);
        if (workers[i].started) {
            started++;
        }
    }

    // With no racers running, nobody would ever signal the condition
    if (started == 0) {
        free(threads);
        free(workers);
        return NULL;
    }

    // Wait for first completion
    pthread_mutex_lock(&ctx.mutex);
    while (!ctx.done) {
        pthread_cond_wait(&ctx.cond, &ctx.mutex);
    }
    int winner = ctx.completed_id;
    void *result = ctx.result;
    pthread_mutex_unlock(&ctx.mutex);

    printf("Thread %d won the race!\n", winner);

    // Still need to join all threads (ctx lives on this stack frame)
    for (int i = 0; i < num_racers; i++) {
        if (workers[i].started) {
            pthread_join(threads[i], NULL);
        }
    }

    free(threads);
    free(workers);
    return result;
}

/*
 * Pattern: Join with Cleanup on Failure
 * --------------------------------------
 * If thread creation fails, cancel and join the threads already created
 */

typedef struct {
    pthread_t tid;
    bool created;
    bool failed;
} ThreadSlot;

void *fallible_worker(void *arg) {
    int id = (int)(intptr_t)arg;

    // Simulate possible failure
    if (id == 2) {
        return (void *)-1;  // Failure
    }

    usleep(100000);  // Work
    return (void *)0;  // Success
}

int run_with_failure_handling(int num_threads) {
    ThreadSlot *slots = calloc(num_threads, sizeof(ThreadSlot));
    if (slots == NULL) {
        return -1;
    }
    int successful = 0;

    // Create threads
    for (int i = 0; i < num_threads; i++) {
        int result = pthread_create(&slots[i].tid, NULL,
                                    fallible_worker, (void *)(intptr_t)i);
        if (result == 0) {
            slots[i].created = true;
        } else {
            fprintf(stderr, "Failed to create thread %d\n", i);
            goto cleanup;
        }
    }

    // Join all and check results
    for (int i = 0; i < num_threads; i++) {
        if (!slots[i].created) continue;

        void *retval;
        pthread_join(slots[i].tid, &retval);

        if (retval == (void *)-1) {
            slots[i].failed = true;
            fprintf(stderr, "Thread %d failed\n", i);
        } else {
            successful++;
        }
    }

    free(slots);
    return successful;

cleanup:
    // Cancel and join all created threads
    for (int i = 0; i < num_threads; i++) {
        if (slots[i].created) {
            pthread_cancel(slots[i].tid);
        }
    }
    for (int i = 0; i < num_threads; i++) {
        if (slots[i].created) {
            pthread_join(slots[i].tid, NULL);
        }
    }
    free(slots);
    return -1;
}

/*
 * Pattern: Timed Join with pthread_timedjoin_np (Linux)
 * -----------------------------------------------------
 * Join with absolute timeout
 */

#ifdef __linux__
int timed_join_example(pthread_t tid, int timeout_seconds) {
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    ts.tv_sec += timeout_seconds;

    int result = pthread_timedjoin_np(tid, NULL, &ts);

    if (result == ETIMEDOUT) {
        printf("Join timed out\n");
        return -1;
    } else if (result != 0) {
        fprintf(stderr, "pthread_timedjoin_np: %s\n", strerror(result));
        return -1;
    }

    return 0;
}
#endif

A barrier (pthread_barrier) makes all threads wait for each other at a synchronization point, then all continue. This is different from joining, where the joined thread terminates. Use barriers when threads need to synchronize mid-execution; use joins for final thread completion.
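The difference between a barrier and a join can be sketched with a two-phase computation. This is a minimal illustration, assuming a platform that implements the optional POSIX barrier API (Linux does; some systems do not). The names `two_phase_worker`, `run_two_phases`, `phase1`, and `phase2` are hypothetical. The barrier separates the phases while the threads are still running, and the joins collect the threads at the end.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define WORKERS 4

static pthread_barrier_t phase_barrier;
static int phase1[WORKERS];
static int phase2[WORKERS];

/* Each worker writes its own phase-1 slot, waits at the barrier until all
 * workers have done the same, then reads a neighbour's phase-1 slot. */
static void *two_phase_worker(void *arg)
{
    int id = (int)(intptr_t)arg;

    phase1[id] = (id + 1) * 10;

    /* Mid-execution synchronization: nobody proceeds until all arrive. */
    pthread_barrier_wait(&phase_barrier);

    phase2[id] = phase1[id] + phase1[(id + 1) % WORKERS];
    return NULL;
}

/* Runs both phases and returns the sum of all phase-2 values. */
int run_two_phases(void)
{
    pthread_t threads[WORKERS];

    int rc = pthread_barrier_init(&phase_barrier, NULL, WORKERS);
    assert(rc == 0);

    for (int i = 0; i < WORKERS; i++) {
        /* A sketch-level check only: if a thread were missing, the others
         * would wait at the barrier forever. */
        rc = pthread_create(&threads[i], NULL, two_phase_worker,
                            (void *)(intptr_t)i);
        assert(rc == 0);
    }

    /* Final synchronization: join waits for each thread to terminate. */
    for (int i = 0; i < WORKERS; i++) {
        rc = pthread_join(threads[i], NULL);
        assert(rc == 0);
    }

    pthread_barrier_destroy(&phase_barrier);

    int total = 0;
    for (int i = 0; i < WORKERS; i++) {
        total += phase2[i];
    }
    return total;
}
```

Without the barrier, a worker could read a neighbour's phase1 slot before it was written. Without the joins, the caller could read phase2 before the workers finished.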
Thread joining seems straightforward but harbors subtle pitfalls that cause resource leaks, deadlocks, and crashes. Understanding these issues is essential for robust concurrent programming.
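One such pitfall, a thread joining itself, can be sketched as follows. This assumes glibc on Linux, where pthread_join() detects the self-join and returns EDEADLK. POSIX only says the call may fail this way, so other implementations might block forever instead. The names `join_self` and `self_join_error` are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Worker that mistakenly joins itself and reports the resulting error code
 * as its return value. */
static void *join_self(void *arg)
{
    (void)arg;
    /* Assumption: glibc detects this and returns EDEADLK. */
    int rc = pthread_join(pthread_self(), NULL);
    return (void *)(intptr_t)rc;
}

/* Runs the faulty worker, joins it properly from the caller, and returns
 * the error code the worker observed. */
int self_join_error(void)
{
    pthread_t tid;
    void *retval = NULL;

    int rc = pthread_create(&tid, NULL, join_self, NULL);
    assert(rc == 0);

    rc = pthread_join(tid, &retval);
    assert(rc == 0);

    return (int)(intptr_t)retval;
}
```

Comparing the result of `self_join_error()` with EDEADLK shows why every pthread_join() return value deserves a check: the failed self-join is silent unless the caller looks at the error code.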
You now have comprehensive knowledge of thread joining across POSIX, Windows, and Java platforms. You understand the semantics of joining vs detaching, timeout-based waiting, advanced synchronization patterns, and common pitfalls. This completes our exploration of thread libraries—you're now equipped to write robust, well-coordinated multithreaded applications on any major platform.