While pthread_cond_signal() wakes a single waiter, many synchronization scenarios require waking all waiting threads simultaneously. This is the purpose of pthread_cond_broadcast().
Broadcast is essential when waiters check different predicates on the same condition variable, when a single state change lets several waiters proceed at once, or when a system-wide event such as shutdown must reach every waiting thread.
Broadcast is the "safe" choice when uncertain—it guarantees no waiter misses a relevant notification. However, this safety comes with performance implications when many threads wake only to recheck and re-sleep.
This page provides comprehensive coverage of pthread_cond_broadcast()—its semantics, use cases, the thundering herd problem, performance optimization, and patterns for effective application.
By completing this page, you will: (1) Master the pthread_cond_broadcast() function signature and semantics, (2) Know when broadcast is required vs. when signal suffices, (3) Understand the thundering herd problem and mitigation strategies, (4) Apply broadcast correctly in shutdown, barrier, and readers-writers patterns, (5) Optimize broadcast usage for performance-critical systems, and (6) Design condition variable strategies that minimize unnecessary wakeups.
```c
#include <pthread.h>

int pthread_cond_broadcast(pthread_cond_t *cond);
```
Parameters:
- cond: Pointer to a properly initialized condition variable.

Return Value:

- 0: Success. All waiting threads (if any) have been awakened.
- EINVAL: cond does not refer to an initialized condition variable (an error some implementations may report).

When a thread calls pthread_cond_broadcast(), all threads currently blocked on cond are awakened; they then contend for the associated mutex one at a time.

| Aspect | signal() | broadcast() |
|---|---|---|
| Wakeup count | At least one | All waiters |
| Performance (N waiters) | O(1) | O(N) |
| Use case | One waiter can proceed | Multiple/all waiters may proceed |
| Risk if wrong choice | Lost wakeups | Unnecessary wakeups |
| Default safety | Requires analysis | Always correct |
```c
#include <pthread.h>
#include <stdio.h>
#include <stdbool.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
bool shutdown = false;
int workers_waiting = 0;

// =====================================================
// Broadcast: The Shutdown Pattern
// =====================================================

// Worker threads wait for work or shutdown
void *worker_thread(void *arg) {
    int id = *(int *)arg;

    pthread_mutex_lock(&mutex);
    while (!shutdown) {
        workers_waiting++;
        printf("Worker %d: waiting...\n", id);
        pthread_cond_wait(&cond, &mutex);
        workers_waiting--;

        if (shutdown) break;
        // Do work...
    }
    printf("Worker %d: exiting due to shutdown\n", id);
    pthread_mutex_unlock(&mutex);
    return NULL;
}

// Master thread initiates shutdown
void initiate_shutdown(void) {
    pthread_mutex_lock(&mutex);
    printf("Shutdown: setting flag and broadcasting\n");
    shutdown = true;
    // MUST use broadcast - ALL workers need to know
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mutex);
}

// =====================================================
// What Happens After Broadcast
// =====================================================

/*
 * Timeline with 3 workers waiting on cond:
 *
 * Master: lock(mutex)
 * Master: shutdown = true
 * Master: broadcast(cond)
 *   -> Worker 1, 2, 3 all marked as runnable
 * Master: unlock(mutex)
 *
 * Now all workers compete for the mutex:
 *
 * Worker 1: acquires mutex, sees shutdown=true, exits
 * Worker 1: unlock(mutex)
 *
 * Worker 2: acquires mutex, sees shutdown=true, exits
 * Worker 2: unlock(mutex)
 *
 * Worker 3: acquires mutex, sees shutdown=true, exits
 *
 * All workers have exited cleanly.
 * Signal would have only woken ONE worker!
 */
```

When broadcast() is called, every thread blocked in the condition variable's wait queue at that moment is made runnable. Note that no thread can slip into the wait between the broadcast and the mutex release: pthread_cond_wait() must be called with the mutex held, and the broadcasting thread still holds it.
However, threads that call wait AFTER the broadcast find no waiters to wake—the broadcast is not 'remembered'.
Broadcast is not just a "safe default"—there are scenarios where it is mandatory for correctness. Using signal in these cases leads to missed wakeups and deadlock.
When multiple threads wait on the same condition variable but check different conditions, signal may wake the wrong thread:
```c
#include <pthread.h>
#include <stdio.h>    /* for printf */
#include <stdbool.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

int count = 0;

// Thread A: waits for count to become EVEN
void *wait_for_even(void *arg) {
    pthread_mutex_lock(&mutex);
    while (count % 2 != 0) {  // Wait while ODD
        pthread_cond_wait(&cond, &mutex);
    }
    printf("Even condition met: count = %d\n", count);
    pthread_mutex_unlock(&mutex);
    return NULL;
}

// Thread B: waits for count > 10
void *wait_for_gt_10(void *arg) {
    pthread_mutex_lock(&mutex);
    while (count <= 10) {
        pthread_cond_wait(&cond, &mutex);
    }
    printf("Greater-than-10 condition met: count = %d\n", count);
    pthread_mutex_unlock(&mutex);
    return NULL;
}

// WRONG: Using signal
void increment_WRONG(void) {
    pthread_mutex_lock(&mutex);
    count++;
    pthread_cond_signal(&cond);  // May wake wrong thread!
    pthread_mutex_unlock(&mutex);
}

// Example: count goes from 10 to 11
// - Thread A wants even (11 is odd, will re-wait)
// - Thread B wants > 10 (11 > 10, should proceed)
// - Signal wakes Thread A
// - Thread A checks: 11 % 2 != 0, goes back to sleep
// - Thread B never woken! LOST WAKEUP!

// CORRECT: Using broadcast
void increment_CORRECT(void) {
    pthread_mutex_lock(&mutex);
    count++;
    pthread_cond_broadcast(&cond);  // Both threads check their predicates
    pthread_mutex_unlock(&mutex);
}
```

When a state change enables multiple waiters to make progress simultaneously:
```c
#include <pthread.h>
#include <stdbool.h>

// Readers-Writers: Multiple readers can proceed simultaneously

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t readers_ok = PTHREAD_COND_INITIALIZER;

int active_writers = 0;
int waiting_readers = 0;
int active_readers = 0;

void *reader_thread(void *arg) {
    pthread_mutex_lock(&mutex);
    waiting_readers++;
    while (active_writers > 0) {
        pthread_cond_wait(&readers_ok, &mutex);
    }
    waiting_readers--;
    active_readers++;
    pthread_mutex_unlock(&mutex);

    // ... read ...

    pthread_mutex_lock(&mutex);
    active_readers--;
    // Signal writers if last reader (handled elsewhere)
    pthread_mutex_unlock(&mutex);
    return NULL;
}

void writer_finish(void) {
    pthread_mutex_lock(&mutex);
    active_writers = 0;
    // Multiple readers may be waiting - ALL should proceed
    // Using signal would only wake ONE reader!
    pthread_cond_broadcast(&readers_ok);
    pthread_mutex_unlock(&mutex);
}
```

Likewise, when the overall system state changes (initialization complete, shutdown, mode change), all affected components must be notified.
If you're uncertain whether signal or broadcast is correct, USE BROADCAST. An unnecessary broadcast causes extra wakeups (performance cost). A missing broadcast causes deadlock (correctness failure). Performance problems are debuggable; deadlocks are catastrophic.
While broadcast is often necessary for correctness, it introduces a performance challenge known as the thundering herd problem.
When broadcast wakes N threads but only a few (or one) can actually make progress, the rest wake up, contend for the mutex, recheck their predicate, and go straight back to sleep.
This is called "thundering herd" because it resembles a herd of animals all stampeding toward a resource, only to find most cannot access it.
```c
#include <pthread.h>
#include <stdio.h>
#include <stdbool.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

int available_connections = 0;
int waiting_threads = 0;
int wakeup_count = 0;

// =====================================================
// Thundering Herd Demonstration
// =====================================================

void *worker(void *arg) {
    int id = *(int *)arg;

    pthread_mutex_lock(&mutex);
    while (available_connections == 0) {
        waiting_threads++;
        printf("Thread %d: waiting for connection (waiters=%d)\n",
               id, waiting_threads);
        pthread_cond_wait(&cond, &mutex);
        waiting_threads--;
        wakeup_count++;
        printf("Thread %d: woken up (total wakeups=%d)\n", id, wakeup_count);
    }
    // Got a connection
    available_connections--;
    printf("Thread %d: acquired connection\n", id);
    pthread_mutex_unlock(&mutex);

    // Use connection...
    return NULL;
}

// One connection becomes available
void release_one_connection(void) {
    pthread_mutex_lock(&mutex);
    available_connections = 1;
    printf("Releasing 1 connection with %d waiting threads\n", waiting_threads);

    // BROADCAST: All waiting threads wake up
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mutex);

    // Result with 10 waiting threads:
    // - 10 threads wake up
    // - 10 threads serialize on mutex
    // - 1 thread gets connection, proceeds
    // - 9 threads find available_connections==0, go back to sleep
    // - Wasted: 9 wakeups + 10 mutex acquisitions
}

// BETTER: Use signal when only releasing one
void release_one_connection_optimized(void) {
    pthread_mutex_lock(&mutex);
    available_connections = 1;

    // SIGNAL: Only one thread can use this anyway
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&mutex);

    // Result: 1 thread wakes, 1 thread proceeds
}
```

| Waiting Threads | Broadcast Cost | Wasted Wakeups | Mutex Contentions |
|---|---|---|---|
| 10 | ~50,000 cycles | 9 | 9 |
| 100 | ~500,000 cycles | 99 | 99 |
| 1,000 | ~5,000,000 cycles | 999 | 999 |
| 10,000 | ~50,000,000 cycles | 9,999 | Severe contention |
1. Use Signal When Possible
Don't use broadcast reflexively. Analyze whether signal suffices: if all waiters check the same predicate and a state change lets exactly one of them proceed, signal is enough.
2. Separate Condition Variables
Instead of one cond var with heterogeneous waiters, use multiple cond vars with homogeneous waiters:
```c
// Instead of:
pthread_cond_t cond;                  // All workers wait here

// Use:
pthread_cond_t connection_available;  // Connection waiters
pthread_cond_t work_available;        // Work waiters
```
Now you can signal the specific condition that changed.
3. Token-Based Wakeup
For resource pools, track exactly how many resources are available:
```c
// Release N connections
available_connections += N;
for (int i = 0; i < N && waiting_threads > 0; i++) {
    pthread_cond_signal(&cond);  // Signal exactly N times
}
```
4. Staged Wakeup
Wake threads in batches rather than all at once:
```c
// Wake threads gradually to reduce contention
// (requires <sched.h> for sched_yield())
for (int batch = 0; batch < total_waiters; batch += BATCH_SIZE) {
    // Wake a batch
    for (int i = 0; i < BATCH_SIZE && i + batch < total_waiters; i++) {
        pthread_cond_signal(&cond);
    }
    // Brief yield to let batch proceed
    sched_yield();
}
```
The best thundering herd mitigation is design-level: structure your system so broadcasts are rarely needed. Use separate condition variables for separate conditions. Track state precisely so you can signal exactly the right number of waiters. Reserve broadcast for true 'everyone must know' events like shutdown.
Understanding how pthread_cond_broadcast() is implemented helps predict and optimize performance.
```c
// =====================================================
// Conceptual Implementation of pthread_cond_broadcast
// (Simplified for understanding)
// =====================================================

int pthread_cond_broadcast(pthread_cond_t *cond) {
    struct __pthread_cond_t *c = (struct __pthread_cond_t *)cond;

    // Quick check: any waiters?
    if (__atomic_load_n(&c->__nwaiters, __ATOMIC_RELAXED) == 0) {
        return 0;  // No waiters, nothing to do
    }

    // Acquire internal lock
    __lock(&c->__internal_lock);

    // Wake ALL waiting threads
    unsigned int n = c->__nwaiters;
    if (n > 0) {
        // Increment wakeup sequence by number of waiters
        c->__wakeup_seq += n;

        // Use futex to wake all
        // FUTEX_WAKE with count = INT_MAX (or n)
        futex_wake(&c->__futex_word, INT_MAX);
    }

    __unlock(&c->__internal_lock);
    return 0;
}

// =====================================================
// Why Broadcast Is O(N)
// =====================================================

/*
 * The broadcast operation itself is relatively cheap:
 * - One system call (futex_wake with count=ALL)
 * - One atomic increment
 *
 * The O(N) cost comes from what happens AFTER broadcast:
 *
 * 1. Kernel marks all N threads as runnable
 *    - Kernel work: O(N) queue operations
 *
 * 2. All N threads get scheduled (eventually)
 *    - Context switches: potentially N, across CPUs
 *
 * 3. All N threads try to acquire mutex
 *    - Serialization: N-1 threads block
 *    - Cache bouncing: mutex line ping-pongs
 *
 * 4. Each thread runs, checks condition, mostly re-sleeps
 *    - Wasted work: N-1 useless wakeups
 *
 * The system call is O(1), but total impact is O(N).
 */
```

Modern Linux implementations (glibc 2.3.3+) use the FUTEX_CMP_REQUEUE operation to optimize broadcast:
Without Requeue: the broadcast wakes all N waiters at once; all N immediately race for the mutex, and N-1 of them block on it again before making any progress.

With Requeue: the broadcast wakes at most one waiter and moves the remaining N-1 directly from the condition variable's wait queue to the mutex's wait queue without waking them; each is woken later, one at a time, as the mutex becomes available.
This is a significant optimization: the "hurry up and wait" problem is completely eliminated.
| Implementation | Broadcast System Calls | Mutex Contention | Total Overhead |
|---|---|---|---|
| Naive (wake all) | 1 (wake N) | High (N-1 block) | High |
| Requeue (Linux) | 1 (requeue) | Minimal (serialized) | Low |
| Wait morphing | 0 (queue move) | Minimal | Lowest |
The requeue optimization is Linux-specific (futex feature). Other platforms (macOS, FreeBSD, Windows) may not have equivalent optimizations. Code that relies on broadcast being cheap may perform poorly on other systems. Design for correctness first, then measure and optimize.
Let's examine production-quality implementations of common patterns that require broadcast.
```c
#include <pthread.h>
#include <stdio.h>

// =====================================================
// Thread Barrier: All Threads Wait for Each Other
// =====================================================

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int threshold;   // Number of threads to wait for
    int count;       // Current number arrived
    int generation;  // Barrier reuse tracking
} barrier_t;

void barrier_init(barrier_t *b, int n) {
    pthread_mutex_init(&b->mutex, NULL);
    pthread_cond_init(&b->cond, NULL);
    b->threshold = n;
    b->count = 0;
    b->generation = 0;
}

void barrier_wait(barrier_t *b) {
    pthread_mutex_lock(&b->mutex);

    int my_generation = b->generation;
    b->count++;

    if (b->count == b->threshold) {
        // Last thread to arrive
        b->count = 0;
        b->generation++;  // Prepare for reuse

        // BROADCAST: Release ALL waiting threads
        pthread_cond_broadcast(&b->cond);
    } else {
        // Wait for last thread
        // Use while + generation to handle spurious wakeups correctly
        while (my_generation == b->generation) {
            pthread_cond_wait(&b->cond, &b->mutex);
        }
    }

    pthread_mutex_unlock(&b->mutex);
}

// Usage: parallel computation with phases
void do_phase1_work(void);  // provided elsewhere
void do_phase2_work(void);  // provided elsewhere

void *parallel_worker(void *arg) {
    barrier_t *barrier = (barrier_t *)arg;

    do_phase1_work();
    barrier_wait(barrier);  // All threads sync here

    do_phase2_work();
    barrier_wait(barrier);  // All threads sync again

    return NULL;
}
```
```c
#include <pthread.h>
#include <stdbool.h>

// =====================================================
// One-Shot Event: Signal Once, All Waiters Released
// =====================================================

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    bool triggered;
} latch_t;

void latch_init(latch_t *l) {
    pthread_mutex_init(&l->mutex, NULL);
    pthread_cond_init(&l->cond, NULL);
    l->triggered = false;
}

// Wait for event (blocks until triggered)
void latch_wait(latch_t *l) {
    pthread_mutex_lock(&l->mutex);
    while (!l->triggered) {
        pthread_cond_wait(&l->cond, &l->mutex);
    }
    pthread_mutex_unlock(&l->mutex);
}

// Trigger event (releases all current and future waiters)
void latch_trigger(latch_t *l) {
    pthread_mutex_lock(&l->mutex);
    if (!l->triggered) {
        l->triggered = true;
        // Wake ALL current waiters
        pthread_cond_broadcast(&l->cond);
    }
    pthread_mutex_unlock(&l->mutex);
}

// Usage: initialization completion
latch_t app_initialized;

void do_work(void);               // provided elsewhere
void start_all_components(void);  // provided elsewhere
void load_config(void);           // provided elsewhere
void connect_to_database(void);   // provided elsewhere

void *component_thread(void *arg) {
    // Wait for app to be ready
    latch_wait(&app_initialized);

    // Now safe to proceed
    do_work();
    return NULL;
}

void main_init(void) {
    latch_init(&app_initialized);

    // Start components (they'll wait)
    start_all_components();

    // Do initialization
    load_config();
    connect_to_database();

    // Signal all components: "ready to go!"
    latch_trigger(&app_initialized);
}
```
```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

// =====================================================
// Graceful Shutdown: Wait for All Workers to Exit
// =====================================================

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t work_cond;      // Workers wait here for work
    pthread_cond_t shutdown_cond;  // Shutdown waits here for workers
    bool shutdown_requested;
    int active_workers;
    int total_workers;
    // Work queue fields...
} worker_pool_t;

void *worker(void *arg) {
    worker_pool_t *pool = (worker_pool_t *)arg;

    pthread_mutex_lock(&pool->mutex);
    pool->active_workers++;

    while (!pool->shutdown_requested) {
        // Wait for work or shutdown
        pthread_cond_wait(&pool->work_cond, &pool->mutex);

        if (pool->shutdown_requested) break;

        // Process work (would unlock during processing)
    }

    pool->active_workers--;

    // If last worker, signal shutdown waiter
    if (pool->active_workers == 0) {
        pthread_cond_signal(&pool->shutdown_cond);
    }

    printf("Worker exiting, %d remaining\n", pool->active_workers);
    pthread_mutex_unlock(&pool->mutex);
    return NULL;
}

void pool_shutdown_and_wait(worker_pool_t *pool) {
    pthread_mutex_lock(&pool->mutex);

    printf("Initiating shutdown of %d workers\n", pool->active_workers);
    pool->shutdown_requested = true;

    // BROADCAST: All workers must see shutdown
    pthread_cond_broadcast(&pool->work_cond);

    // Wait for all workers to exit
    while (pool->active_workers > 0) {
        pthread_cond_wait(&pool->shutdown_cond, &pool->mutex);
    }

    printf("All workers have exited\n");
    pthread_mutex_unlock(&pool->mutex);
}
```

While correctness trumps performance, there are legitimate optimizations for broadcast-heavy code.
Only broadcast when someone might actually be waiting:
```c
#include <pthread.h>
#include <stdbool.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

int waiters = 0;  // Track waiter count explicitly

bool has_work(void);       // provided elsewhere
void add_work_item(void);  // provided elsewhere
void do_work(void);        // provided elsewhere

void *consumer(void *arg) {
    pthread_mutex_lock(&mutex);
    waiters++;
    while (!has_work()) {
        pthread_cond_wait(&cond, &mutex);
    }
    waiters--;
    do_work();
    pthread_mutex_unlock(&mutex);
    return NULL;
}

void producer_add_work(void) {
    pthread_mutex_lock(&mutex);
    add_work_item();

    // OPTIMIZATION: Only broadcast if there are waiters
    if (waiters > 0) {
        pthread_cond_broadcast(&cond);
    }
    pthread_mutex_unlock(&mutex);
}

// Analysis:
// - Without optimization: broadcast every add, even if no waiters
// - With optimization: skip broadcast when waiters == 0
// - Savings: ~5,000 cycles per no-op broadcast avoided
```

Reduce what broadcast wakes by using more specific condition variables:
```c
#include <pthread.h>

// BAD: All waiters on one cond var
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t cond;  // Everyone waits here
    int connection_count;
    int work_count;
} pool_bad_t;

void add_connection_bad(pool_bad_t *p) {
    pthread_mutex_lock(&p->mutex);
    p->connection_count++;
    // Must broadcast - both connection waiters AND work waiters are on cond
    pthread_cond_broadcast(&p->cond);
    pthread_mutex_unlock(&p->mutex);
}

// GOOD: Separate cond vars for separate conditions
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t connection_available;  // Only connection waiters
    pthread_cond_t work_available;        // Only work waiters
    int connection_count;
    int work_count;
} pool_good_t;

void add_connection_good(pool_good_t *p) {
    pthread_mutex_lock(&p->mutex);
    p->connection_count++;
    // Only wake connection waiters
    pthread_cond_signal(&p->connection_available);  // Can even use signal!
    pthread_mutex_unlock(&p->mutex);
}

void add_work_good(pool_good_t *p) {
    pthread_mutex_lock(&p->mutex);
    p->work_count++;
    // Only wake work waiters
    pthread_cond_signal(&p->work_available);
    pthread_mutex_unlock(&p->mutex);
}
```

For resource release where usually one waiter proceeds but sometimes multiple might:
```c
#include <pthread.h>

// pool_t is assumed to contain at least these fields:
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t resource_cond;
    int available;
} pool_t;

// Release one resource - usually only one waiter needs it
void release_resource(pool_t *p) {
    pthread_mutex_lock(&p->mutex);
    p->available++;
    // Common case: signal one waiter
    pthread_cond_signal(&p->resource_cond);
    pthread_mutex_unlock(&p->mutex);
}

// Release multiple resources atomically
void release_resources(pool_t *p, int n) {
    pthread_mutex_lock(&p->mutex);
    p->available += n;
    // Multiple resources: broadcast or signal n times
    if (n > 1) {
        pthread_cond_broadcast(&p->resource_cond);
    } else {
        pthread_cond_signal(&p->resource_cond);
    }
    pthread_mutex_unlock(&p->mutex);
}
```

Profile your application before micro-optimizing broadcast. Often the overhead of broadcast is dwarfed by actual work. Focus optimization effort on: (1) reducing contention overall, (2) separating predicates into separate cond vars, and (3) reducing waiter counts through better batching.
Broadcast has its own set of pitfalls beyond those of signal.
```c
#include <pthread.h>
#include <stdbool.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
bool resource_ready = false;

// State and helpers assumed to exist elsewhere:
bool error_condition;
bool error_flag;
bool ready;
void add_item(int item);
void add_work(void);
void do_more_work_holding_mutex(void);

// =====================================================
// MISTAKE 1: Broadcast for Single-Consumer Events
// =====================================================

void release_single_resource_INEFFICIENT(void) {
    pthread_mutex_lock(&mutex);
    resource_ready = true;
    // WASTEFUL: Only one thread can use the resource
    // Using broadcast wakes all, N-1 go back to sleep
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mutex);
}

void release_single_resource_OPTIMAL(void) {
    pthread_mutex_lock(&mutex);
    resource_ready = true;
    // BETTER: Signal wakes exactly one waiter
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&mutex);
}

// =====================================================
// MISTAKE 2: Broadcast Inside a Loop
// =====================================================

void add_items_WRONG(int *items, int n) {
    for (int i = 0; i < n; i++) {
        pthread_mutex_lock(&mutex);
        add_item(items[i]);
        pthread_cond_broadcast(&cond);  // WRONG: Broadcast N times!
        pthread_mutex_unlock(&mutex);
    }
    // Result: N broadcasts, each waking all waiters = N^2 wakeups
}

void add_items_RIGHT(int *items, int n) {
    pthread_mutex_lock(&mutex);
    for (int i = 0; i < n; i++) {
        add_item(items[i]);
    }
    pthread_cond_broadcast(&cond);  // ONE broadcast after all adds
    pthread_mutex_unlock(&mutex);
}

// =====================================================
// MISTAKE 3: Missing Broadcast on All Exit Paths
// =====================================================

void produce_WRONG(void) {
    pthread_mutex_lock(&mutex);
    if (error_condition) {
        pthread_mutex_unlock(&mutex);
        return;  // WRONG: No broadcast, waiters may be stuck!
    }
    add_work();
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mutex);
}

void produce_CORRECT(void) {
    pthread_mutex_lock(&mutex);
    if (error_condition) {
        error_flag = true;              // Set error state
        pthread_cond_broadcast(&cond);  // Wake waiters to see error
        pthread_mutex_unlock(&mutex);
        return;
    }
    add_work();
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mutex);
}

// =====================================================
// MISTAKE 4: Expecting Broadcast to Transfer the Mutex
// =====================================================

void wake_and_expect_immediate_WRONG(void) {
    pthread_mutex_lock(&mutex);
    ready = true;
    pthread_cond_broadcast(&cond);
    // WRONG assumption: waiters are now running
    // REALITY: waiters won't run until WE release mutex
    do_more_work_holding_mutex();  // This delays waiters
    pthread_mutex_unlock(&mutex);
}
```

We have comprehensively explored pthread_cond_broadcast()—the operation that wakes all waiting threads. It is essential for correctness in multi-waiter scenarios and system-wide state changes.
You now understand pthread_cond_broadcast() for multi-waiter scenarios. The next page consolidates everything with Proper Usage Patterns—the canonical patterns, best practices, and anti-patterns for production condition variable usage.