While pthread_cond_signal() wakes a single waiter, many synchronization scenarios require waking all waiting threads simultaneously. This is the purpose of pthread_cond_broadcast().
Broadcast is essential when waiters check different predicates on the same condition variable, when a single state change lets several waiters proceed at once, or when a system-wide event such as shutdown must reach every waiting thread.
Broadcast is the "safe" choice when uncertain—it guarantees no waiter misses a relevant notification. However, this safety comes with performance implications when many threads wake only to recheck and re-sleep.
This page provides comprehensive coverage of pthread_cond_broadcast()—its semantics, use cases, the thundering herd problem, performance optimization, and patterns for effective application.
By completing this page, you will: (1) Master the pthread_cond_broadcast() function signature and semantics, (2) Know when broadcast is required vs. when signal suffices, (3) Understand the thundering herd problem and mitigation strategies, (4) Apply broadcast correctly in shutdown, barrier, and readers-writers patterns, (5) Optimize broadcast usage for performance-critical systems, and (6) Design condition variable strategies that minimize unnecessary wakeups.
```c
#include <pthread.h>

int pthread_cond_broadcast(pthread_cond_t *cond);
```
Parameters:
- cond: Pointer to a properly initialized condition variable.

Return Value:

- 0: Success. All waiting threads (if any) have been awakened.
- EINVAL: cond does not refer to an initialized condition variable (an error some implementations may report).

When a thread calls pthread_cond_broadcast(), all threads currently blocked on cond are awakened; they then contend for the associated mutex one at a time.

| Aspect | signal() | broadcast() |
|---|---|---|
| Wakeup count | At least one | All waiters |
| Performance (N waiters) | O(1) | O(N) |
| Use case | One waiter can proceed | Multiple/all waiters may proceed |
| Risk if wrong choice | Lost wakeups | Unnecessary wakeups |
| Default safety | Requires analysis | Always correct |
```c
#include <pthread.h>
#include <stdio.h>
#include <stdbool.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
bool shutdown = false;
int workers_waiting = 0;

// =====================================================
// Broadcast: The Shutdown Pattern
// =====================================================

// Worker threads wait for work or shutdown
void *worker_thread(void *arg) {
    int id = *(int *)arg;

    pthread_mutex_lock(&mutex);
    while (!shutdown) {
        workers_waiting++;
        printf("Worker %d: waiting...\n", id);
        pthread_cond_wait(&cond, &mutex);
        workers_waiting--;

        if (shutdown) break;
        // Do work...
    }
    printf("Worker %d: exiting due to shutdown\n", id);
    pthread_mutex_unlock(&mutex);
    return NULL;
}

// Master thread initiates shutdown
void initiate_shutdown(void) {
    pthread_mutex_lock(&mutex);
    printf("Shutdown: setting flag and broadcasting\n");
    shutdown = true;
    // MUST use broadcast - ALL workers need to know
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mutex);
}

// =====================================================
// What Happens After Broadcast
// =====================================================

/*
 * Timeline with 3 workers waiting on cond:
 *
 * Master: lock(mutex)
 * Master: shutdown = true
 * Master: broadcast(cond)
 *   -> Worker 1, 2, 3 all marked as runnable
 * Master: unlock(mutex)
 *
 * Now all workers compete for the mutex:
 *
 * Worker 1: acquires mutex, sees shutdown=true, exits
 * Worker 1: unlock(mutex)
 *
 * Worker 2: acquires mutex, sees shutdown=true, exits
 * Worker 2: unlock(mutex)
 *
 * Worker 3: acquires mutex, sees shutdown=true, exits
 *
 * All workers have exited cleanly.
 * Signal would have only woken ONE worker!
 */
```

When broadcast() is called, every thread blocked in the condition variable's wait queue at that moment is made runnable. Note that no thread can slip into the wait between the broadcast and the mutex release: pthread_cond_wait() must be called with the mutex held, and the broadcasting thread still holds it.
However, threads that call wait AFTER the broadcast find no waiters to wake—the broadcast is not 'remembered'.
Broadcast is not just a "safe default"—there are scenarios where it is mandatory for correctness. Using signal in these cases leads to missed wakeups and deadlock.
When multiple threads wait on the same condition variable but check different conditions, signal may wake the wrong thread:
```c
#include <pthread.h>
#include <stdio.h>    /* for printf */
#include <stdbool.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

int count = 0;

// Thread A: waits for count to become EVEN
void *wait_for_even(void *arg) {
    pthread_mutex_lock(&mutex);
    while (count % 2 != 0) {  // Wait while ODD
        pthread_cond_wait(&cond, &mutex);
    }
    printf("Even condition met: count = %d\n", count);
    pthread_mutex_unlock(&mutex);
    return NULL;
}

// Thread B: waits for count > 10
void *wait_for_gt_10(void *arg) {
    pthread_mutex_lock(&mutex);
    while (count <= 10) {
        pthread_cond_wait(&cond, &mutex);
    }
    printf("Greater-than-10 condition met: count = %d\n", count);
    pthread_mutex_unlock(&mutex);
    return NULL;
}

// WRONG: Using signal
void increment_WRONG(void) {
    pthread_mutex_lock(&mutex);
    count++;
    pthread_cond_signal(&cond);  // May wake wrong thread!
    pthread_mutex_unlock(&mutex);
}

// Example: count goes from 10 to 11
// - Thread A wants even (11 is odd, will re-wait)
// - Thread B wants > 10 (11 > 10, should proceed)
// - Signal wakes Thread A
// - Thread A checks: 11 % 2 != 0, goes back to sleep
// - Thread B never woken! LOST WAKEUP!

// CORRECT: Using broadcast
void increment_CORRECT(void) {
    pthread_mutex_lock(&mutex);
    count++;
    pthread_cond_broadcast(&cond);  // Both threads check their predicates
    pthread_mutex_unlock(&mutex);
}
```

When a state change enables multiple waiters to make progress simultaneously:
```c
#include <pthread.h>
#include <stdbool.h>

// Readers-Writers: Multiple readers can proceed simultaneously

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t readers_ok = PTHREAD_COND_INITIALIZER;

int active_writers = 0;
int waiting_readers = 0;
int active_readers = 0;

void *reader_thread(void *arg) {
    pthread_mutex_lock(&mutex);
    waiting_readers++;
    while (active_writers > 0) {
        pthread_cond_wait(&readers_ok, &mutex);
    }
    waiting_readers--;
    active_readers++;
    pthread_mutex_unlock(&mutex);

    // ... read ...

    pthread_mutex_lock(&mutex);
    active_readers--;
    // Signal writers if last reader (handled elsewhere)
    pthread_mutex_unlock(&mutex);
    return NULL;
}

void writer_finish(void) {
    pthread_mutex_lock(&mutex);
    active_writers = 0;
    // Multiple readers may be waiting - ALL should proceed
    // Using signal would only wake ONE reader!
    pthread_cond_broadcast(&readers_ok);
    pthread_mutex_unlock(&mutex);
}
```

Likewise, when the overall system state changes (initialization complete, shutdown, mode change), all affected components must be notified.
If you're uncertain whether signal or broadcast is correct, USE BROADCAST. An unnecessary broadcast causes extra wakeups (performance cost). A missing broadcast causes deadlock (correctness failure). Performance problems are debuggable; deadlocks are catastrophic.
While broadcast is often necessary for correctness, it introduces a performance challenge known as the thundering herd problem.
When broadcast wakes N threads but only a few (or one) can actually make progress, the rest wake up, contend for the mutex, recheck their predicate, and go straight back to sleep.
This is called "thundering herd" because it resembles a herd of animals all stampeding toward a resource, only to find most cannot access it.
```c
#include <pthread.h>
#include <stdio.h>
#include <stdbool.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

int available_connections = 0;
int waiting_threads = 0;
int wakeup_count = 0;

// =====================================================
// Thundering Herd Demonstration
// =====================================================

void *worker(void *arg) {
    int id = *(int *)arg;

    pthread_mutex_lock(&mutex);
    while (available_connections == 0) {
        waiting_threads++;
        printf("Thread %d: waiting for connection (waiters=%d)\n",
               id, waiting_threads);
        pthread_cond_wait(&cond, &mutex);
        waiting_threads--;
        wakeup_count++;
        printf("Thread %d: woken up (total wakeups=%d)\n", id, wakeup_count);
    }
    // Got a connection
    available_connections--;
    printf("Thread %d: acquired connection\n", id);
    pthread_mutex_unlock(&mutex);

    // Use connection...
    return NULL;
}

// One connection becomes available
void release_one_connection(void) {
    pthread_mutex_lock(&mutex);
    available_connections = 1;
    printf("Releasing 1 connection with %d waiting threads\n", waiting_threads);

    // BROADCAST: All waiting threads wake up
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mutex);

    // Result with 10 waiting threads:
    // - 10 threads wake up
    // - 10 threads serialize on mutex
    // - 1 thread gets connection, proceeds
    // - 9 threads find available_connections==0, go back to sleep
    // - Wasted: 9 wakeups + 10 mutex acquisitions
}

// BETTER: Use signal when only releasing one
void release_one_connection_optimized(void) {
    pthread_mutex_lock(&mutex);
    available_connections = 1;

    // SIGNAL: Only one thread can use this anyway
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&mutex);

    // Result: 1 thread wakes, 1 thread proceeds
}
```

| Waiting Threads | Broadcast Cost | Wasted Wakeups | Mutex Contentions |
|---|---|---|---|
| 10 | ~50,000 cycles | 9 | 9 |
| 100 | ~500,000 cycles | 99 | 99 |
| 1,000 | ~5,000,000 cycles | 999 | 999 |
| 10,000 | ~50,000,000 cycles | 9,999 | Severe contention |
1. Use Signal When Possible
Don't use broadcast reflexively. Analyze whether signal suffices: if all waiters check the same predicate and a state change lets exactly one of them proceed, signal is enough.
2. Separate Condition Variables
Instead of one cond var with heterogeneous waiters, use multiple cond vars with homogeneous waiters:
```c
// Instead of:
pthread_cond_t cond;                  // All workers wait here

// Use:
pthread_cond_t connection_available;  // Connection waiters
pthread_cond_t work_available;        // Work waiters
```
Now you can signal the specific condition that changed.
3. Token-Based Wakeup
For resource pools, track exactly how many resources are available:
```c
// Release N connections
available_connections += N;
for (int i = 0; i < N && waiting_threads > 0; i++) {
    pthread_cond_signal(&cond);  // Signal exactly N times
}
```
4. Staged Wakeup
Wake threads in batches rather than all at once:
```c
// Wake threads gradually to reduce contention
// (requires <sched.h> for sched_yield())
for (int batch = 0; batch < total_waiters; batch += BATCH_SIZE) {
    // Wake a batch
    for (int i = 0; i < BATCH_SIZE && i + batch < total_waiters; i++) {
        pthread_cond_signal(&cond);
    }
    // Brief yield to let batch proceed
    sched_yield();
}
```
The best thundering herd mitigation is design-level: structure your system so broadcasts are rarely needed. Use separate condition variables for separate conditions. Track state precisely so you can signal exactly the right number of waiters. Reserve broadcast for true 'everyone must know' events like shutdown.
Understanding how pthread_cond_broadcast() is implemented helps predict and optimize performance.
```c
// =====================================================
// Conceptual Implementation of pthread_cond_broadcast
// (Simplified for understanding)
// =====================================================

int pthread_cond_broadcast(pthread_cond_t *cond) {
    struct __pthread_cond_t *c = (struct __pthread_cond_t *)cond;

    // Quick check: any waiters?
    if (__atomic_load_n(&c->__nwaiters, __ATOMIC_RELAXED) == 0) {
        return 0;  // No waiters, nothing to do
    }

    // Acquire internal lock
    __lock(&c->__internal_lock);

    // Wake ALL waiting threads
    unsigned int n = c->__nwaiters;
    if (n > 0) {
        // Increment wakeup sequence by number of waiters
        c->__wakeup_seq += n;

        // Use futex to wake all
        // FUTEX_WAKE with count = INT_MAX (or n)
        futex_wake(&c->__futex_word, INT_MAX);
    }

    __unlock(&c->__internal_lock);
    return 0;
}

// =====================================================
// Why Broadcast Is O(N)
// =====================================================

/*
 * The broadcast operation itself is relatively cheap:
 * - One system call (futex_wake with count=ALL)
 * - One atomic increment
 *
 * The O(N) cost comes from what happens AFTER broadcast:
 *
 * 1. Kernel marks all N threads as runnable
 *    - Kernel work: O(N) queue operations
 *
 * 2. All N threads get scheduled (eventually)
 *    - Context switches: potentially N, across CPUs
 *
 * 3. All N threads try to acquire mutex
 *    - Serialization: N-1 threads block
 *    - Cache bouncing: mutex line ping-pongs
 *
 * 4. Each thread runs, checks condition, mostly re-sleeps
 *    - Wasted work: N-1 useless wakeups
 *
 * The system call is O(1), but total impact is O(N).
 */
```

Modern Linux implementations (glibc 2.3.3+) use the FUTEX_CMP_REQUEUE operation to optimize broadcast:
Without Requeue: the broadcast wakes all N waiters at once; all N immediately race for the mutex, and N-1 of them block on it again before making any progress.

With Requeue: the broadcast wakes at most one waiter and moves the remaining N-1 directly from the condition variable's wait queue to the mutex's wait queue without waking them; each is woken later, one at a time, as the mutex becomes available.
This is a significant optimization: the "hurry up and wait" problem is completely eliminated.
| Implementation | Broadcast System Calls | Mutex Contention | Total Overhead |
|---|---|---|---|
| Naive (wake all) | 1 (wake N) | High (N-1 block) | High |
| Requeue (Linux) | 1 (requeue) | Minimal (serialized) | Low |
| Wait morphing | 0 (queue move) | Minimal | Lowest |
The requeue optimization is Linux-specific (futex feature). Other platforms (macOS, FreeBSD, Windows) may not have equivalent optimizations. Code that relies on broadcast being cheap may perform poorly on other systems. Design for correctness first, then measure and optimize.
Let's examine production-quality implementations of common patterns that require broadcast.
```c
#include <pthread.h>
#include <stdio.h>

// =====================================================
// Thread Barrier: All Threads Wait for Each Other
// =====================================================

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int threshold;   // Number of threads to wait for
    int count;       // Current number arrived
    int generation;  // Barrier reuse tracking
} barrier_t;

void barrier_init(barrier_t *b, int n) {
    pthread_mutex_init(&b->mutex, NULL);
    pthread_cond_init(&b->cond, NULL);
    b->threshold = n;
    b->count = 0;
    b->generation = 0;
}

void barrier_wait(barrier_t *b) {
    pthread_mutex_lock(&b->mutex);

    int my_generation = b->generation;
    b->count++;

    if (b->count == b->threshold) {
        // Last thread to arrive
        b->count = 0;
        b->generation++;  // Prepare for reuse

        // BROADCAST: Release ALL waiting threads
        pthread_cond_broadcast(&b->cond);
    } else {
        // Wait for last thread
        // Use while + generation to handle spurious wakeups correctly
        while (my_generation == b->generation) {
            pthread_cond_wait(&b->cond, &b->mutex);
        }
    }

    pthread_mutex_unlock(&b->mutex);
}

// Usage: parallel computation with phases
void do_phase1_work(void);  // provided elsewhere
void do_phase2_work(void);  // provided elsewhere

void *parallel_worker(void *arg) {
    barrier_t *barrier = (barrier_t *)arg;

    do_phase1_work();
    barrier_wait(barrier);  // All threads sync here

    do_phase2_work();
    barrier_wait(barrier);  // All threads sync again

    return NULL;
}
```
```c
#include <pthread.h>
#include <stdbool.h>

// =====================================================
// One-Shot Event: Signal Once, All Waiters Released
// =====================================================

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    bool triggered;
} latch_t;

void latch_init(latch_t *l) {
    pthread_mutex_init(&l->mutex, NULL);
    pthread_cond_init(&l->cond, NULL);
    l->triggered = false;
}

// Wait for event (blocks until triggered)
void latch_wait(latch_t *l) {
    pthread_mutex_lock(&l->mutex);
    while (!l->triggered) {
        pthread_cond_wait(&l->cond, &l->mutex);
    }
    pthread_mutex_unlock(&l->mutex);
}

// Trigger event (releases all current and future waiters)
void latch_trigger(latch_t *l) {
    pthread_mutex_lock(&l->mutex);
    if (!l->triggered) {
        l->triggered = true;
        // Wake ALL current waiters
        pthread_cond_broadcast(&l->cond);
    }
    pthread_mutex_unlock(&l->mutex);
}

// Usage: initialization completion
latch_t app_initialized;

void do_work(void);               // provided elsewhere
void start_all_components(void);  // provided elsewhere
void load_config(void);           // provided elsewhere
void connect_to_database(void);   // provided elsewhere

void *component_thread(void *arg) {
    // Wait for app to be ready
    latch_wait(&app_initialized);

    // Now safe to proceed
    do_work();
    return NULL;
}

void main_init(void) {
    latch_init(&app_initialized);

    // Start components (they'll wait)
    start_all_components();

    // Do initialization
    load_config();
    connect_to_database();

    // Signal all components: "ready to go!"
    latch_trigger(&app_initialized);
}
```
```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

// =====================================================
// Graceful Shutdown: Wait for All Workers to Exit
// =====================================================

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t work_cond;      // Workers wait here for work
    pthread_cond_t shutdown_cond;  // Shutdown waits here for workers
    bool shutdown_requested;
    int active_workers;
    int total_workers;
    // Work queue fields...
} worker_pool_t;

void *worker(void *arg) {
    worker_pool_t *pool = (worker_pool_t *)arg;

    pthread_mutex_lock(&pool->mutex);
    pool->active_workers++;

    while (!pool->shutdown_requested) {
        // Wait for work or shutdown
        pthread_cond_wait(&pool->work_cond, &pool->mutex);

        if (pool->shutdown_requested) break;

        // Process work (would unlock during processing)
    }

    pool->active_workers--;

    // If last worker, signal shutdown waiter
    if (pool->active_workers == 0) {
        pthread_cond_signal(&pool->shutdown_cond);
    }

    printf("Worker exiting, %d remaining\n", pool->active_workers);
    pthread_mutex_unlock(&pool->mutex);
    return NULL;
}

void pool_shutdown_and_wait(worker_pool_t *pool) {
    pthread_mutex_lock(&pool->mutex);

    printf("Initiating shutdown of %d workers\n", pool->active_workers);
    pool->shutdown_requested = true;

    // BROADCAST: All workers must see shutdown
    pthread_cond_broadcast(&pool->work_cond);

    // Wait for all workers to exit
    while (pool->active_workers > 0) {
        pthread_cond_wait(&pool->shutdown_cond, &pool->mutex);
    }

    printf("All workers have exited\n");
    pthread_mutex_unlock(&pool->mutex);
}
```

While correctness trumps performance, there are legitimate optimizations for broadcast-heavy code.
Only broadcast when someone might actually be waiting:
```c
#include <pthread.h>
#include <stdbool.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

int waiters = 0;  // Track waiter count explicitly

bool has_work(void);       // provided elsewhere
void add_work_item(void);  // provided elsewhere
void do_work(void);        // provided elsewhere

void *consumer(void *arg) {
    pthread_mutex_lock(&mutex);
    waiters++;
    while (!has_work()) {
        pthread_cond_wait(&cond, &mutex);
    }
    waiters--;
    do_work();
    pthread_mutex_unlock(&mutex);
    return NULL;
}

void producer_add_work(void) {
    pthread_mutex_lock(&mutex);
    add_work_item();

    // OPTIMIZATION: Only broadcast if there are waiters
    if (waiters > 0) {
        pthread_cond_broadcast(&cond);
    }
    pthread_mutex_unlock(&mutex);
}

// Analysis:
// - Without optimization: broadcast every add, even if no waiters
// - With optimization: skip broadcast when waiters == 0
// - Savings: ~5,000 cycles per no-op broadcast avoided
```

Reduce what broadcast wakes by using more specific condition variables:
```c
#include <pthread.h>

// BAD: All waiters on one cond var
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t cond;  // Everyone waits here
    int connection_count;
    int work_count;
} pool_bad_t;

void add_connection_bad(pool_bad_t *p) {
    pthread_mutex_lock(&p->mutex);
    p->connection_count++;
    // Must broadcast - both connection waiters AND work waiters are on cond
    pthread_cond_broadcast(&p->cond);
    pthread_mutex_unlock(&p->mutex);
}

// GOOD: Separate cond vars for separate conditions
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t connection_available;  // Only connection waiters
    pthread_cond_t work_available;        // Only work waiters
    int connection_count;
    int work_count;
} pool_good_t;

void add_connection_good(pool_good_t *p) {
    pthread_mutex_lock(&p->mutex);
    p->connection_count++;
    // Only wake connection waiters
    pthread_cond_signal(&p->connection_available);  // Can even use signal!
    pthread_mutex_unlock(&p->mutex);
}

void add_work_good(pool_good_t *p) {
    pthread_mutex_lock(&p->mutex);
    p->work_count++;
    // Only wake work waiters
    pthread_cond_signal(&p->work_available);
    pthread_mutex_unlock(&p->mutex);
}
```

For resource release where usually one waiter proceeds but sometimes multiple might:
```c
#include <pthread.h>

// pool_t is assumed to contain at least these fields:
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t resource_cond;
    int available;
} pool_t;

// Release one resource - usually only one waiter needs it
void release_resource(pool_t *p) {
    pthread_mutex_lock(&p->mutex);
    p->available++;
    // Common case: signal one waiter
    pthread_cond_signal(&p->resource_cond);
    pthread_mutex_unlock(&p->mutex);
}

// Release multiple resources atomically
void release_resources(pool_t *p, int n) {
    pthread_mutex_lock(&p->mutex);
    p->available += n;
    // Multiple resources: broadcast or signal n times
    if (n > 1) {
        pthread_cond_broadcast(&p->resource_cond);
    } else {
        pthread_cond_signal(&p->resource_cond);
    }
    pthread_mutex_unlock(&p->mutex);
}
```

Profile your application before micro-optimizing broadcast. Often the overhead of broadcast is dwarfed by actual work. Focus optimization effort on: (1) reducing contention overall, (2) separating predicates into separate cond vars, and (3) reducing waiter counts through better batching.
Broadcast has its own set of pitfalls beyond those of signal.
```c
#include <pthread.h>
#include <stdbool.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
bool resource_ready = false;

// State and helpers assumed to exist elsewhere:
bool error_condition;
bool error_flag;
bool ready;
void add_item(int item);
void add_work(void);
void do_more_work_holding_mutex(void);

// =====================================================
// MISTAKE 1: Broadcast for Single-Consumer Events
// =====================================================

void release_single_resource_INEFFICIENT(void) {
    pthread_mutex_lock(&mutex);
    resource_ready = true;
    // WASTEFUL: Only one thread can use the resource
    // Using broadcast wakes all, N-1 go back to sleep
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mutex);
}

void release_single_resource_OPTIMAL(void) {
    pthread_mutex_lock(&mutex);
    resource_ready = true;
    // BETTER: Signal wakes exactly one waiter
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&mutex);
}

// =====================================================
// MISTAKE 2: Broadcast Inside a Loop
// =====================================================

void add_items_WRONG(int *items, int n) {
    for (int i = 0; i < n; i++) {
        pthread_mutex_lock(&mutex);
        add_item(items[i]);
        pthread_cond_broadcast(&cond);  // WRONG: Broadcast N times!
        pthread_mutex_unlock(&mutex);
    }
    // Result: N broadcasts, each waking all waiters = N^2 wakeups
}

void add_items_RIGHT(int *items, int n) {
    pthread_mutex_lock(&mutex);
    for (int i = 0; i < n; i++) {
        add_item(items[i]);
    }
    pthread_cond_broadcast(&cond);  // ONE broadcast after all adds
    pthread_mutex_unlock(&mutex);
}

// =====================================================
// MISTAKE 3: Missing Broadcast on All Exit Paths
// =====================================================

void produce_WRONG(void) {
    pthread_mutex_lock(&mutex);
    if (error_condition) {
        pthread_mutex_unlock(&mutex);
        return;  // WRONG: No broadcast, waiters may be stuck!
    }
    add_work();
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mutex);
}

void produce_CORRECT(void) {
    pthread_mutex_lock(&mutex);
    if (error_condition) {
        error_flag = true;              // Set error state
        pthread_cond_broadcast(&cond);  // Wake waiters to see error
        pthread_mutex_unlock(&mutex);
        return;
    }
    add_work();
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mutex);
}

// =====================================================
// MISTAKE 4: Expecting Broadcast to Transfer the Mutex
// =====================================================

void wake_and_expect_immediate_WRONG(void) {
    pthread_mutex_lock(&mutex);
    ready = true;
    pthread_cond_broadcast(&cond);
    // WRONG assumption: waiters are now running
    // REALITY: waiters won't run until WE release mutex
    do_more_work_holding_mutex();  // This delays waiters
    pthread_mutex_unlock(&mutex);
}
```

We have comprehensively explored pthread_cond_broadcast()—the operation that wakes all waiting threads. It is essential for correctness in multi-waiter scenarios and system-wide state changes.
You now understand pthread_cond_broadcast() for multi-waiter scenarios. The next page consolidates everything with Proper Usage Patterns—the canonical patterns, best practices, and anti-patterns for production condition variable usage.