Imagine a manufacturing assembly line. Workers at one station produce components, placing them on a conveyor belt. Workers at the next station consume these components, assembling them into final products. The belt serves as a buffer—it absorbs differences in production and consumption rates, preventing both bottlenecks and wasted effort.
This simple industrial metaphor captures one of the most fundamental challenges in concurrent programming: How do we coordinate work between entities that produce data and entities that consume it? This is the essence of the producer-consumer pattern, also known as the bounded-buffer problem—a pattern so pervasive that you'll find it embedded in everything from operating system kernels to web servers, message queues to video streaming pipelines.
By the end of this page, you will deeply understand the producer-consumer pattern: its theoretical foundations, the synchronization challenges it addresses, multiple implementation strategies using different primitives, performance considerations, and its manifestation across real-world systems. You'll be equipped to recognize, analyze, and implement this pattern in any concurrent environment.
The producer-consumer pattern is a classic concurrency design pattern that describes the interaction between two types of concurrent entities through a shared buffer or queue.
Formal Definition:
The producer-consumer pattern involves:

- One or more producers that generate data items and insert them into a shared buffer
- One or more consumers that remove items from the buffer and process them
- A bounded buffer of fixed capacity that holds items in transit
- Synchronization mechanisms that coordinate access and signal state changes
The pattern addresses a fundamental coordination problem: producers and consumers operate at different rates, have different processing times, and may be temporarily unavailable. The buffer absorbs these variations, enabling the system to function smoothly without tight coupling.
| Component | Responsibility | Synchronization Concern |
|---|---|---|
| Producer(s) | Generate data items, insert into buffer | Must wait if buffer is full (back-pressure) |
| Consumer(s) | Remove data items, process them | Must wait if buffer is empty (starvation) |
| Bounded Buffer | Store items, enforce capacity limits | Must ensure thread-safe access (mutual exclusion) |
| Synchronization | Coordinate access, signal state changes | Must prevent races, deadlocks, and lost signals |
Why 'Bounded'?
The buffer is typically bounded (finite capacity) for several critical reasons:

- Memory is finite: an unbounded buffer can exhaust memory whenever producers outpace consumers for long
- Back-pressure: a full buffer forces producers to slow down, naturally throttling the system to the consumers' rate
- Bounded latency: an item can wait behind at most a buffer's worth of predecessors, keeping end-to-end delay predictable
The choice of buffer size is a crucial design decision that affects throughput, latency, and memory usage.
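To make the capacity limit concrete, here is a minimal Python sketch using the standard library's `queue.Queue`. It uses `put_nowait`, which raises `queue.Full` instead of blocking, so the enforced bound is easy to observe; a blocking `put()` at the same point would instead wait, which is exactly the back-pressure behavior described above.

```python
import queue

# A bounded buffer with room for 3 items
buf = queue.Queue(maxsize=3)

for i in range(3):
    buf.put_nowait(i)  # Fills the buffer to capacity

# A fourth non-blocking insert is rejected: the capacity bound is enforced
try:
    buf.put_nowait(99)
    overflowed = False
except queue.Full:
    overflowed = True

assert overflowed          # The bounded buffer refused the extra item
assert buf.qsize() == 3    # Still holding exactly its capacity
```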
The producer-consumer problem was first formalized by Edsger Dijkstra in 1965 as a demonstration of his semaphore concept. It has since become a canonical example for teaching synchronization and remains directly applicable to modern systems—from operating system I/O buffers to distributed message queues like Kafka and RabbitMQ.
The producer-consumer pattern introduces three distinct synchronization requirements that must be simultaneously satisfied. Failure to address any one of these leads to correctness bugs, crashes, or undefined behavior.
The Three Synchronization Requirements:

1. Mutual exclusion: only one thread at a time may modify the buffer and its indices
2. Buffer-full protection: producers must wait when the buffer is full
3. Buffer-empty protection: consumers must wait when the buffer is empty
The Naive Approach Fails:
Consider a naive implementation using only a mutex for mutual exclusion:
```c
// BROKEN IMPLEMENTATION - DO NOT USE
#define BUFFER_SIZE 10

int buffer[BUFFER_SIZE];
int count = 0;  // Number of items in buffer
int in = 0;     // Next insert position
int out = 0;    // Next remove position
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

void producer(int item) {
    pthread_mutex_lock(&mutex);
    // PROBLEM: What if buffer is full?
    // This busy-waits while holding the lock!
    while (count == BUFFER_SIZE) {
        // Spin... but we're holding the mutex!
        // Consumer can NEVER acquire the lock to consume!
        // DEADLOCK!
    }
    buffer[in] = item;
    in = (in + 1) % BUFFER_SIZE;
    count++;
    pthread_mutex_unlock(&mutex);
}

void consumer(int *item) {
    pthread_mutex_lock(&mutex);
    // Same problem if buffer is empty
    while (count == 0) {
        // Spin while holding lock - Consumer starves
    }
    *item = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    count--;
    pthread_mutex_unlock(&mutex);
}
```

The naive approach demonstrates a critical insight: we cannot use a single lock for both mutual exclusion and condition waiting. If a thread holds a lock while waiting for a condition, no other thread can acquire the lock to change that condition—resulting in deadlock.
The Core Insight:
We need mechanisms that:

- Let a thread atomically release the lock and block while waiting for a condition
- Let other threads signal when the condition changes, waking blocked waiters
- Re-acquire the lock before the woken thread re-checks the condition
This is precisely what condition variables, semaphores, and monitors provide—primitives designed for coordination, not just exclusion.
A subtle bug occurs when a thread checks a condition, finds it false, and prepares to wait—but before it actually waits, another thread changes the condition and signals. The first thread then waits for a signal that already occurred. This 'lost wakeup' can cause indefinite blocking. Proper synchronization primitives atomically combine the condition check and wait operation to prevent this.
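The safe idiom can be sketched in Python with `threading.Condition`, whose `wait_for` re-checks the predicate while holding the lock, so a notification sent before the waiter goes to sleep is never lost. The `waiter`/`signaler` names here are illustrative, not from the original text.

```python
import threading

cond = threading.Condition()
ready = False

def waiter(log):
    with cond:
        # wait_for atomically re-checks the predicate under the lock,
        # so a signal sent before we block cannot be "lost":
        # if ready is already True, we return immediately without sleeping
        cond.wait_for(lambda: ready)
        log.append("woke")

def signaler():
    global ready
    with cond:
        ready = True   # Change the condition *while holding the lock*...
        cond.notify()  # ...then signal

log = []
t = threading.Thread(target=waiter, args=(log,))
t.start()
signaler()   # Safe regardless of whether the waiter has started waiting yet
t.join()
assert log == ["woke"]
```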
Dijkstra's original solution to the producer-consumer problem used semaphores—counting synchronization primitives that elegantly encode the buffer state. This implementation remains instructive for understanding the pattern's mechanics.
Semaphore Strategy:
We use three semaphores:
- `mutex` (binary semaphore, initialized to 1): ensures mutual exclusion for buffer access
- `empty` (counting semaphore, initialized to `BUFFER_SIZE`): counts empty slots; producers wait on this
- `full` (counting semaphore, initialized to 0): counts filled slots; consumers wait on this

The counting semaphores encode buffer capacity directly—their values always reflect the number of available slots (`empty`) and items (`full`).
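Before walking through the full C implementation, the same three-semaphore structure can be sketched compactly in Python with `threading.Semaphore`. The helper names `produce`/`consume` and the small buffer size are illustrative choices; with a single producer and a single consumer, FIFO order is preserved end to end.

```python
import threading

BUFFER_SIZE = 4
buffer = [None] * BUFFER_SIZE
in_pos = 0   # next insert position
out_pos = 0  # next remove position

mutex = threading.Semaphore(1)            # binary semaphore: mutual exclusion
empty = threading.Semaphore(BUFFER_SIZE)  # counts empty slots
full = threading.Semaphore(0)             # counts filled slots

consumed = []

def produce(item):
    global in_pos
    empty.acquire()   # wait for an empty slot (like sem_wait(&empty))
    mutex.acquire()
    buffer[in_pos] = item
    in_pos = (in_pos + 1) % BUFFER_SIZE
    mutex.release()
    full.release()    # signal a filled slot (like sem_post(&full))

def consume():
    global out_pos
    full.acquire()    # wait for a filled slot
    mutex.acquire()
    item = buffer[out_pos]
    out_pos = (out_pos + 1) % BUFFER_SIZE
    mutex.release()
    empty.release()   # signal an empty slot
    return item

p = threading.Thread(target=lambda: [produce(i) for i in range(20)])
c = threading.Thread(target=lambda: consumed.extend(consume() for _ in range(20)))
p.start(); c.start()
p.join(); c.join()
assert consumed == list(range(20))  # single producer/consumer: FIFO preserved
```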
```c
#include <stdio.h>
#include <pthread.h>
#include <semaphore.h>

#define BUFFER_SIZE 10

int buffer[BUFFER_SIZE];
int in = 0;   // Next position for producer to insert
int out = 0;  // Next position for consumer to remove

sem_t mutex;  // Binary semaphore for mutual exclusion
sem_t empty;  // Counting semaphore: number of empty slots
sem_t full;   // Counting semaphore: number of filled slots

void init_synchronization() {
    sem_init(&mutex, 0, 1);            // Binary semaphore, initial = 1
    sem_init(&empty, 0, BUFFER_SIZE);  // Initially all slots are empty
    sem_init(&full, 0, 0);             // Initially no slots are full
}

void producer(int item) {
    // Step 1: Wait for an empty slot
    // Decrements 'empty' count; blocks if 'empty' == 0
    sem_wait(&empty);

    // Step 2: Acquire mutual exclusion to modify buffer
    sem_wait(&mutex);

    // Step 3: Critical section - insert item
    buffer[in] = item;
    in = (in + 1) % BUFFER_SIZE;
    printf("Produced: %d at position %d\n", item, (in - 1 + BUFFER_SIZE) % BUFFER_SIZE);

    // Step 4: Release mutual exclusion
    sem_post(&mutex);

    // Step 5: Signal that a slot is now full
    // Increments 'full' count; wakes waiting consumer if any
    sem_post(&full);
}

int consumer() {
    int item;

    // Step 1: Wait for a full slot
    // Decrements 'full' count; blocks if 'full' == 0
    sem_wait(&full);

    // Step 2: Acquire mutual exclusion to modify buffer
    sem_wait(&mutex);

    // Step 3: Critical section - remove item
    item = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    printf("Consumed: %d from position %d\n", item, (out - 1 + BUFFER_SIZE) % BUFFER_SIZE);

    // Step 4: Release mutual exclusion
    sem_post(&mutex);

    // Step 5: Signal that a slot is now empty
    // Increments 'empty' count; wakes waiting producer if any
    sem_post(&empty);

    return item;
}

// Thread functions
void* producer_thread(void* arg) {
    for (int i = 0; i < 20; i++) {
        producer(i);
    }
    return NULL;
}

void* consumer_thread(void* arg) {
    for (int i = 0; i < 20; i++) {
        consumer();
    }
    return NULL;
}
```

Correctness Analysis:
Let's verify this implementation satisfies all three requirements:
1. Mutual Exclusion:
- The `mutex` semaphore ensures only one thread executes the critical section at a time
- Both producer and consumer acquire `mutex` before modifying `buffer`, `in`, or `out`

2. Buffer-Full Protection:

- Producers call `sem_wait(&empty)` before inserting
- If `empty == 0` (buffer full), the producer blocks
- Consumers call `sem_post(&empty)` after removing, incrementing the count and potentially waking a blocked producer

3. Buffer-Empty Protection:

- Consumers call `sem_wait(&full)` before removing
- If `full == 0` (buffer empty), the consumer blocks
- Producers call `sem_post(&full)` after inserting, incrementing the count and potentially waking a blocked consumer

Invariant:

- `empty + full = BUFFER_SIZE` whenever no operation is in flight (every slot is either empty or full)
- Every `sem_wait(&empty)` is paired with a `sem_post(&full)` (production), and vice versa (consumption)

The order of sem_wait operations is CRITICAL. If we swap the order and acquire mutex before checking empty/full, we recreate the deadlock from the naive solution. The producer would hold mutex while waiting for empty slots, preventing consumers from acquiring mutex to create empty slots. Always wait on condition semaphores BEFORE acquiring the mutex.
```c
// WRONG - CAUSES DEADLOCK
void producer_WRONG(int item) {
    sem_wait(&mutex);  // Acquire lock first
    sem_wait(&empty);  // Now wait for empty slot while holding lock
                       // If buffer is full, we block here holding mutex
                       // Consumer can never acquire mutex to consume!
                       // DEADLOCK!
    buffer[in] = item;
    in = (in + 1) % BUFFER_SIZE;
    sem_post(&mutex);
    sem_post(&full);
}

// CORRECT - As shown above
void producer_CORRECT(int item) {
    sem_wait(&empty);  // Wait for condition WITHOUT holding lock
    sem_wait(&mutex);  // THEN acquire lock
    // We only reach here if there's an empty slot
    // Other threads can still access buffer while we wait
    buffer[in] = item;
    in = (in + 1) % BUFFER_SIZE;
    sem_post(&mutex);
    sem_post(&full);
}
```

Modern POSIX systems typically implement producer-consumer using condition variables rather than semaphores. Condition variables provide more flexibility and integrate naturally with mutexes, making the synchronization logic more explicit.
Condition Variable Strategy:
We use:
- A single mutex protecting the shared state (`buffer`, `count`, `in`, `out`)
- Two condition variables: `not_full` (producers wait on this) and `not_empty` (consumers wait on this)
```c
#include <stdio.h>
#include <pthread.h>

#define BUFFER_SIZE 10

int buffer[BUFFER_SIZE];
int count = 0;  // Current number of items in buffer
int in = 0;     // Next insert position
int out = 0;    // Next remove position

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t not_full = PTHREAD_COND_INITIALIZER;   // Signaled when buffer becomes non-full
pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;  // Signaled when buffer becomes non-empty

void producer(int item) {
    pthread_mutex_lock(&mutex);

    // Wait while buffer is full
    // CRITICAL: Use 'while' not 'if' - handles spurious wakeups
    while (count == BUFFER_SIZE) {
        // Atomically: release mutex AND wait for signal
        // Upon return: mutex is re-acquired
        pthread_cond_wait(&not_full, &mutex);
    }

    // Invariant: count < BUFFER_SIZE (buffer has space)
    buffer[in] = item;
    in = (in + 1) % BUFFER_SIZE;
    count++;
    printf("Produced: %d (buffer count: %d)\n", item, count);

    // Signal waiting consumers that buffer is no longer empty
    pthread_cond_signal(&not_empty);

    pthread_mutex_unlock(&mutex);
}

int consumer() {
    int item;

    pthread_mutex_lock(&mutex);

    // Wait while buffer is empty
    while (count == 0) {
        pthread_cond_wait(&not_empty, &mutex);
    }

    // Invariant: count > 0 (buffer has items)
    item = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    count--;
    printf("Consumed: %d (buffer count: %d)\n", item, count);

    // Signal waiting producers that buffer is no longer full
    pthread_cond_signal(&not_full);

    pthread_mutex_unlock(&mutex);

    return item;
}
```

Critical Detail: The while Loop for Waiting
Notice we use while (count == BUFFER_SIZE) rather than if. This is not optional—it's essential for correctness due to spurious wakeups and signal stealing:
Spurious Wakeups:
POSIX permits pthread_cond_wait to return even without a corresponding signal (for implementation efficiency). The condition must be re-checked after waking.
Signal Stealing: With multiple waiters, a signal might wake thread A, but before A re-acquires the mutex, thread B (which wasn't waiting) acquires it and consumes the item. When A finally runs, the condition is again false.
The Pattern:
```c
pthread_mutex_lock(&mutex);
while (!condition) {  // Always use while, never if
    pthread_cond_wait(&cv, &mutex);
}
// Condition is guaranteed true here
// Do work...
pthread_mutex_unlock(&mutex);
```
signal() wakes ONE waiting thread; broadcast() wakes ALL waiting threads. For producer-consumer with multiple producers/consumers, signal() is usually correct—one new slot can only satisfy one waiter. Use broadcast() when multiple waiters might proceed (e.g., if you add multiple items at once), or when conditions overlap. When in doubt, broadcast() is always safe but may cause unnecessary context switches (thundering herd).
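The same mutex-plus-two-conditions structure can be cross-checked in a higher-level setting. Here is a minimal Python sketch with `threading.Condition`, where two condition objects share one lock, mirroring `not_full`/`not_empty`; the function names are illustrative.

```python
import threading
from collections import deque

BUFFER_SIZE = 4
buf = deque()
lock = threading.Lock()
not_full = threading.Condition(lock)   # producers wait here
not_empty = threading.Condition(lock)  # consumers wait here

def producer_put(item):
    with lock:
        while len(buf) == BUFFER_SIZE:  # while, not if: re-check after waking
            not_full.wait()
        buf.append(item)
        not_empty.notify()              # wake one waiting consumer

def consumer_get():
    with lock:
        while not buf:
            not_empty.wait()
        item = buf.popleft()
        not_full.notify()               # wake one waiting producer
        return item

results = []
t1 = threading.Thread(target=lambda: [producer_put(i) for i in range(10)])
t2 = threading.Thread(target=lambda: results.extend(consumer_get() for _ in range(10)))
t1.start(); t2.start()
t1.join(); t2.join()
assert results == list(range(10))  # single producer/consumer: FIFO preserved
```

Sharing one lock between both condition objects is the Python analogue of the C pattern, where both `not_full` and `not_empty` are used with the same `mutex`.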
The underlying data structure for the bounded buffer is typically a circular buffer (also called a ring buffer). Understanding its mechanics is essential for implementing producer-consumer correctly.
Circular Buffer Concept:
A circular buffer treats a fixed-size array as if it were circular—after reaching the end, indices wrap around to the beginning. This eliminates the need for expensive data shifting that would occur in a linear buffer.
```c
// Ring Buffer Implementation Details

#define BUFFER_SIZE 8  // Should be power of 2 for optimization

typedef struct {
    int data[BUFFER_SIZE];
    int in;     // Write index (next position for producer)
    int out;    // Read index (next position for consumer)
    int count;  // Number of items currently in buffer
} RingBuffer;

// Initialize ring buffer
void rb_init(RingBuffer* rb) {
    rb->in = 0;
    rb->out = 0;
    rb->count = 0;
}

// Check if buffer is empty
int rb_is_empty(RingBuffer* rb) {
    return rb->count == 0;
}

// Check if buffer is full
int rb_is_full(RingBuffer* rb) {
    return rb->count == BUFFER_SIZE;
}

// Insert item (assumes buffer is not full)
void rb_insert(RingBuffer* rb, int item) {
    rb->data[rb->in] = item;
    rb->in = (rb->in + 1) % BUFFER_SIZE;  // Wrap around
    rb->count++;
}

// Remove item (assumes buffer is not empty)
int rb_remove(RingBuffer* rb) {
    int item = rb->data[rb->out];
    rb->out = (rb->out + 1) % BUFFER_SIZE;  // Wrap around
    rb->count--;
    return item;
}

// OPTIMIZATION: If BUFFER_SIZE is power of 2, use bitwise AND
// (rb->in + 1) & (BUFFER_SIZE - 1) is faster than % BUFFER_SIZE
```

Key Properties of Circular Buffers:

- O(1) insertion and removal with no data shifting
- Fixed memory footprint, allocated once
- FIFO ordering: items are removed in the order they were inserted
- Indices wrap around via modulo (or bitwise AND when the size is a power of two)
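A direct Python transcription of the same mechanics makes the wrap-around behavior easy to verify; it uses the power-of-two bitwise trick in place of modulo (the `RingBuffer` class name and field names here mirror the C version).

```python
BUFFER_SIZE = 8  # power of 2, so (x + 1) & (BUFFER_SIZE - 1) == (x + 1) % BUFFER_SIZE

class RingBuffer:
    def __init__(self):
        self.data = [None] * BUFFER_SIZE
        self.in_pos = 0   # write index
        self.out_pos = 0  # read index
        self.count = 0

    def insert(self, item):
        assert self.count < BUFFER_SIZE  # caller must check fullness first
        self.data[self.in_pos] = item
        self.in_pos = (self.in_pos + 1) & (BUFFER_SIZE - 1)  # bitwise wrap
        self.count += 1

    def remove(self):
        assert self.count > 0  # caller must check emptiness first
        item = self.data[self.out_pos]
        self.out_pos = (self.out_pos + 1) & (BUFFER_SIZE - 1)
        self.count -= 1
        return item

rb = RingBuffer()
for i in range(20):           # more items than capacity: indices wrap repeatedly
    rb.insert(i)
    assert rb.remove() == i   # FIFO order survives every wrap
assert rb.count == 0
```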
Alternative: Count-Free Implementation:
Some implementations avoid the count variable by using pointer positions alone. This works but wastes one slot (distinguishing full from empty):
```c
// Alternative: No count variable, but wastes one slot

#define BUFFER_SIZE 8
int buffer[BUFFER_SIZE];
int in = 0;
int out = 0;

// Buffer is empty when in == out
int is_empty() {
    return in == out;
}

// Buffer is full when next position of 'in' equals 'out'
// This means we can only store BUFFER_SIZE - 1 items
int is_full() {
    return ((in + 1) % BUFFER_SIZE) == out;
}

// Available items = (in - out) mod BUFFER_SIZE
int item_count() {
    return (in - out + BUFFER_SIZE) % BUFFER_SIZE;
}

// Tradeoff: Save one variable, waste one buffer slot
// Modern systems typically prefer the count variable approach
```

For single-producer single-consumer (SPSC) scenarios, lock-free ring buffers can be implemented using atomic operations and memory barriers. The producer updates 'in' after writing data; the consumer reads 'in' to know how far it can safely read, then updates 'out'. This achieves very high throughput by avoiding mutex overhead entirely. The Linux kernel implements this in kfifo.
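The count-free index arithmetic is easy to check in a quick Python sketch; it simulates filling the buffer and confirms that only `BUFFER_SIZE - 1` items fit, which is the cost of distinguishing full from empty without a counter.

```python
BUFFER_SIZE = 8

def is_empty(i, o):
    return i == o

def is_full(i, o):
    return (i + 1) % BUFFER_SIZE == o

def item_count(i, o):
    return (i - o + BUFFER_SIZE) % BUFFER_SIZE

# Fill until full, starting from empty (in == out == 0)
i = o = 0
stored = 0
while not is_full(i, o):
    i = (i + 1) % BUFFER_SIZE  # each insert advances 'in'
    stored += 1

assert stored == BUFFER_SIZE - 1          # one slot is sacrificed
assert item_count(i, o) == BUFFER_SIZE - 1
assert not is_empty(i, o)
```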
Real-world systems often have multiple producers and multiple consumers operating concurrently. This introduces additional considerations beyond the basic pattern.
Terminology:
- SPSC: single producer, single consumer
- MPSC: multiple producers, single consumer
- SPMC: single producer, multiple consumers
- MPMC: multiple producers, multiple consumers

Each configuration has different synchronization requirements and optimization opportunities.
| Configuration | Synchronization Needs | Example Use Case |
|---|---|---|
| SPSC | Minimal - can be lock-free with memory barriers | Audio/video pipeline stages |
| MPSC | Producer-side locking only | Logging from multiple threads to single file |
| SPMC | Consumer-side locking only | Single data source to multiple workers |
| MPMC | Full synchronization required | General work queue, thread pool |
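The MPSC row of the table can be illustrated with a short Python sketch: several producer threads feed one consumer through a bounded `queue.Queue` (the thread and item counts are arbitrary choices). Every item arrives exactly once, and each producer's own items stay in order, but the interleaving across producers is nondeterministic.

```python
import queue
import threading

q = queue.Queue(maxsize=8)  # bounded buffer shared by all producers

def producer(pid):
    for i in range(10):
        q.put((pid, i))  # blocks when the buffer is full (back-pressure)

threads = [threading.Thread(target=producer, args=(p,)) for p in range(3)]
for t in threads:
    t.start()

# Single consumer drains everything (3 producers x 10 items)
received = [q.get() for _ in range(30)]
for t in threads:
    t.join()

# Exactly-once delivery of every item
assert sorted(received) == [(p, i) for p in range(3) for i in range(10)]

# Per-producer FIFO order is preserved, even though global interleaving varies
for p in range(3):
    assert [i for (pid, i) in received if pid == p] == list(range(10))
```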
MPMC Considerations:
With multiple producers and consumers, several additional issues arise:
1. Fairness: no individual producer or consumer should starve; semaphore and condition-variable wakeups are not guaranteed to be FIFO, so under heavy contention some threads may be served far more often than others

2. Order Preservation: the queue itself is FIFO, but with multiple consumers, items may finish processing in a different order than they were enqueued

3. Batch Operations: enqueueing or dequeueing several items per lock acquisition amortizes synchronization overhead and reduces contention
```c
// MPMC Queue with separate locks (two-lock algorithm)
// Reduces contention between producers and consumers

#include <stdlib.h>
#include <pthread.h>
#include <semaphore.h>

typedef struct Node {
    int value;
    struct Node* next;
} Node;

typedef struct {
    Node* head;              // Consumer end (always points to a dummy node)
    Node* tail;              // Producer end
    pthread_mutex_t h_lock;  // Lock for head (consumers)
    pthread_mutex_t t_lock;  // Lock for tail (producers)
    sem_t items;             // Count of items
    sem_t slots;             // Count of available slots
} TwoLockQueue;

// Initialize with a dummy node so head and tail are never NULL
// and producers/consumers never touch the same node when the queue is non-empty
void queue_init(TwoLockQueue* q, int capacity) {
    Node* dummy = malloc(sizeof(Node));
    dummy->next = NULL;
    q->head = q->tail = dummy;
    pthread_mutex_init(&q->h_lock, NULL);
    pthread_mutex_init(&q->t_lock, NULL);
    sem_init(&q->items, 0, 0);
    sem_init(&q->slots, 0, capacity);
}

void enqueue(TwoLockQueue* q, int value) {
    sem_wait(&q->slots);  // Wait for available slot

    Node* node = malloc(sizeof(Node));
    node->value = value;
    node->next = NULL;

    pthread_mutex_lock(&q->t_lock);
    q->tail->next = node;
    q->tail = node;
    pthread_mutex_unlock(&q->t_lock);

    sem_post(&q->items);  // Signal item available
}

int dequeue(TwoLockQueue* q) {
    sem_wait(&q->items);  // Wait for available item

    pthread_mutex_lock(&q->h_lock);
    Node* node = q->head;         // Old dummy node
    Node* new_head = node->next;  // First real node
    int value = new_head->value;
    q->head = new_head;           // new_head becomes the new dummy
    pthread_mutex_unlock(&q->h_lock);

    free(node);
    sem_post(&q->slots);  // Signal slot available
    return value;
}

// Key insight: Producers only touch tail, consumers only touch head
// This allows concurrent produce and consume operations
// Much better scalability than single-lock approach
```

In systems with multiple consumers, 'work stealing' improves load balance. Each consumer has its own queue; when a consumer's queue is empty, it 'steals' work from another consumer's queue. This is used in Go's goroutine scheduler, Java's ForkJoinPool, and Intel TBB. It combines local access (fast) with global balancing (fair).
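The two-lock idea can also be sketched in Python; here two-element lists `[value, next]` stand in for the C `Node` struct, and the dummy-node trick keeps head and tail from ever aliasing a node that both sides touch. The class and method names simply mirror the C version.

```python
import threading

class TwoLockQueue:
    """Two-lock queue: producers touch only tail, consumers only head."""

    def __init__(self, capacity):
        dummy = [None, None]  # [value, next]; head always points at a dummy
        self.head = self.tail = dummy
        self.h_lock = threading.Lock()  # consumers
        self.t_lock = threading.Lock()  # producers
        self.items = threading.Semaphore(0)         # filled count
        self.slots = threading.Semaphore(capacity)  # empty count

    def enqueue(self, value):
        self.slots.acquire()            # wait for an available slot
        node = [value, None]
        with self.t_lock:               # only the tail is touched
            self.tail[1] = node
            self.tail = node
        self.items.release()            # signal an item is available

    def dequeue(self):
        self.items.acquire()            # wait for an available item
        with self.h_lock:               # only the head is touched
            node = self.head[1]         # first real node after the dummy
            value = node[0]
            self.head = node            # node becomes the new dummy
        self.slots.release()            # signal a slot is free
        return value

q = TwoLockQueue(4)
out = []
p = threading.Thread(target=lambda: [q.enqueue(i) for i in range(50)])
c = threading.Thread(target=lambda: out.extend(q.dequeue() for _ in range(50)))
p.start(); c.start()
p.join(); c.join()
assert out == list(range(50))  # single producer/consumer: FIFO preserved
```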
The producer-consumer pattern is ubiquitous in systems programming. Understanding where and how it appears helps you recognize the pattern and apply appropriate solutions.
The `|` operator in shells connects processes via bounded buffers: `ls | grep txt` has `ls` producing output and `grep` consuming it.
```python
# Python's queue module provides producer-consumer infrastructure
import queue
import threading

# Bounded queue - thread-safe producer-consumer
work_queue = queue.Queue(maxsize=100)  # Bounded buffer

def process(item):
    pass  # Application-specific work goes here

def producer():
    for i in range(1000):
        work_queue.put(f"work_item_{i}")  # Blocks if queue full
    work_queue.put(None)  # Sentinel to signal completion

def consumer():
    while True:
        item = work_queue.get()  # Blocks if queue empty
        if item is None:
            break
        process(item)
        work_queue.task_done()  # Mark item as processed

# Java's BlockingQueue provides similar functionality
# Go's channels are producer-consumer primitives
# Rust's std::sync::mpsc provides MPSC channels
```

Modern distributed systems extend producer-consumer across networks. Apache Kafka is essentially a distributed, persistent, replicated producer-consumer pattern with per-partition ordering and (with transactions) exactly-once processing semantics. Understanding the local pattern makes the distributed version intuitive.
The producer-consumer pattern is a foundational concurrency pattern that you'll encounter throughout your career. Let's consolidate the key insights:

- A bounded buffer decouples producers from consumers, absorbing differences in their rates and providing back-pressure
- Three requirements must hold simultaneously: mutual exclusion, buffer-full protection (producers block), and buffer-empty protection (consumers block)
- Semaphores encode buffer state directly (`empty`, `full`, `mutex`); always wait on the condition semaphores before acquiring the mutex
- Condition variables pair with a mutex; always re-check the condition in a `while` loop to handle spurious wakeups and signal stealing
- A circular buffer provides O(1) insertion and removal in fixed memory
- SPSC, MPSC, SPMC, and MPMC configurations have different synchronization needs; SPSC can even be implemented lock-free
What's Next:
Having mastered the producer-consumer pattern, we'll next explore the reader-writer pattern—another fundamental coordination problem where multiple readers can access shared data concurrently, but writers require exclusive access. This pattern appears in databases, caches, and any system with read-heavy workloads.
You now have a deep understanding of the producer-consumer pattern—from theoretical foundations through practical implementation to real-world applications. This pattern will appear repeatedly throughout your work with concurrent systems.