Every process that wishes to execute its critical section must first pass through a gatekeeper—a segment of code responsible for ensuring that only one process at a time gains access to the protected region. This gatekeeper is the entry section, and its design is one of the most subtle and critical challenges in concurrent programming.
The entry section's job appears deceptively simple: determine whether it's safe to enter the critical section, and if not, wait until it becomes safe. Yet this seemingly straightforward requirement has occupied some of the greatest minds in computer science, spawned decades of research, and continues to present challenges in modern systems.
By the end of this page, you will understand the precise role and requirements of the entry section, the design challenges it presents, common approaches to implementing it, and the failure modes that occur when it's designed incorrectly. You'll see why getting the entry section right is essential for correct synchronization.
Definition: The entry section is the code that a process executes immediately before entering its critical section. Its purpose is to request and wait for permission to enter the critical section, ensuring that the mutual exclusion property is maintained.
Formal Position in Process Structure:
while (true) {
ENTRY SECTION ← Request permission to enter CS
CRITICAL SECTION ← Access shared resources
EXIT SECTION ← Release permission, signal others
REMAINDER SECTION ← Non-critical work
}
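To ground this template, here is a minimal sketch of it using POSIX threads, where the mutex calls play the entry and exit roles (the worker function and shared_counter are illustrative names, not part of any particular system):

#include <pthread.h>

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
long shared_counter = 0;   // the shared resource the CS protects

void *worker(void *arg) {
    (void)arg;
    while (1) {
        pthread_mutex_lock(&m);     // ENTRY SECTION: request/wait for permission
        shared_counter++;           // CRITICAL SECTION: touch shared state
        pthread_mutex_unlock(&m);   // EXIT SECTION: release, signal others
        // REMAINDER SECTION: non-critical work goes here
    }
    return NULL;
}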
The entry section serves three essential functions:
Access Control: It determines whether the process may proceed to the critical section or must wait.
Waiting Mechanism: When access is denied, it provides a mechanism for the process to wait until access becomes available.
State Signaling: It updates shared state to indicate the process's intention to enter or its presence in the critical region, enabling other processes to make correct decisions.
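A minimal sketch shows how a single entry section can bundle all three functions; it uses C11's atomic_flag, anticipating the atomic instructions discussed later on this page:

#include <stdatomic.h>

atomic_flag lock = ATOMIC_FLAG_INIT;   // shared state visible to all threads

void entry_section(void) {
    // State signaling + access control: atomically record our claim and,
    // in the same indivisible step, learn whether someone already holds it
    while (atomic_flag_test_and_set(&lock)) {
        // Waiting mechanism: permission denied, retry until it is granted
    }
    // Permission granted: fall through into the critical section
}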
Here's a fascinating paradox: the entry section itself accesses shared state (flags, turn variables, locks) to coordinate entry. But if the entry section accesses shared state, doesn't it need its own critical section? The resolution lies in using atomic operations at the lowest level—hardware instructions that cannot be interrupted—as the foundation upon which higher-level entry sections are built.
The entry section cannot be designed arbitrarily. It must satisfy specific requirements to ensure the overall solution is correct. These requirements are formally stated in the critical section problem, and the entry section (combined with the exit section) must satisfy all of them.
The Three Requirements:
Any solution to the critical section problem—and therefore any entry section design—must satisfy:
| Requirement | Formal Definition | Entry Section Implication |
|---|---|---|
| Mutual Exclusion | If process P_i is executing in its critical section, no other process P_j (j ≠ i) can be executing in its critical section. | The entry section must block processes when another is already in the CS. It must not allow two processes to pass simultaneously. |
| Progress | If no process is in its CS and some processes wish to enter, the selection of which process enters next cannot be postponed indefinitely. Only processes not in their remainder section can participate in this decision. | A waiting process must eventually be allowed to enter when the CS is free. The entry section must not deadlock or livelock. |
| Bounded Waiting | There must exist a bound on the number of times other processes are allowed to enter their CS after a process has made a request and before that request is granted. | A process must not wait indefinitely while other processes repeatedly enter. Fairness must be ensured. |
Why All Three Matter:
Mutual Exclusion alone could be trivially achieved by never letting anyone enter. But that violates progress.
Progress alone could be achieved by letting everyone enter freely. But that violates mutual exclusion.
Bounded Waiting prevents starvation. A solution could satisfy mutual exclusion and progress but still allow one unlucky process to wait forever while others take turns. Bounded waiting prevents this.
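The first two degenerate cases are easy to write down as deliberately broken sketches:

// Mutual exclusion without progress: nobody ever enters
void enter_never(void)  { while (1) { /* wait forever */ } }

// Progress without mutual exclusion: everyone enters at once
void enter_always(void) { /* no check at all - just fall through */ }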
The entry section's design must be such that, combined with the exit section, all three properties are guaranteed under all possible interleavings of process execution.
Many brilliant computer scientists proposed solutions to the critical section problem that were later found to be incorrect. The road to Dekker's algorithm was littered with subtly flawed attempts, and Peterson's simpler solution did not appear until nearly two decades later. This underscores the difficulty: reasoning about concurrent execution is notoriously error-prone, and intuition often fails in the face of non-deterministic interleaving.
Over decades of research and practice, several patterns for implementing entry sections have emerged. Each has distinct characteristics, trade-offs, and use cases.
Busy Waiting (also called spinning) is the simplest form of entry section. The process continuously tests a condition in a loop, consuming CPU cycles while waiting for permission to enter.
Structure:
while (condition_not_satisfied) {
// Do nothing, keep checking
}
// Condition satisfied, enter critical section
The characteristics of busy waiting are easiest to see in a minimal test-and-set spinlock:

// Simplified spinlock entry section using test-and-set
// (test_and_set is the atomic primitive defined later on this page)
volatile int lock = 0;  // 0 = unlocked, 1 = locked

void enter_critical_section(void) {
    // Entry Section: Spin until we acquire the lock
    while (test_and_set(&lock) == 1) {
        // Lock was already held, keep spinning
        // test_and_set atomically:
        //   1. Reads the current value
        //   2. Sets lock to 1
        //   3. Returns the old value
        // If the old value was 0, we got the lock
        // If the old value was 1, someone else has it
    }
    // Lock acquired, proceed to critical section
}

// Note: This basic spinlock doesn't guarantee bounded waiting.
// More sophisticated variants (ticket locks, MCS locks) add fairness.

Spinlocks are appropriate when: (1) the critical section is very short (microseconds), (2) the system has multiple CPUs, so the waiting process doesn't block the lock holder, and (3) the cost of a context switch exceeds the expected spin time. Kernel code often uses spinlocks for brief critical sections protecting in-memory data structures.
When a process cannot immediately enter the critical section, the entry section must implement a waiting mechanism. The choice of waiting mechanism profoundly affects system performance and resource utilization.
Busy Waiting (Spinning):
In busy waiting, the process executes a tight loop, repeatedly testing the entry condition:
while (!can_enter()) {
// CPU is 100% utilized doing nothing useful
}
The Cost of Spinning:
// The true cost of a spinning loop
volatile int lock = 0;

void poor_spinlock(void) {
    // Naive spin - BAD for performance
    while (lock == 1) {
        // Each iteration:
        //   1. Reads 'lock' from memory (or cache)
        //   2. Compares it with 1
        //   3. Branches back to the loop
        // This can be millions of iterations per second,
        // consuming 100% of one CPU core
    }
}

#define MAX_BACKOFF 1024                                // illustrative cap
static int min(int a, int b) { return a < b ? a : b; }

void better_spinlock(void) {
    // Spin with a pause instruction and exponential backoff
    int backoff = 1;
    while (test_and_set(&lock) == 1) {
        // Pause hint - reduces power and lets other hyperthreads run
        for (int i = 0; i < backoff; i++) {
            __builtin_ia32_pause();  // x86 PAUSE instruction
        }
        backoff = min(backoff * 2, MAX_BACKOFF);  // Exponential backoff
    }
}

// PAUSE instruction benefits:
// 1. Reduces power consumption during the spin
// 2. Avoids memory-ordering violations on loop exit
// 3. Improves hyperthreading by yielding pipeline resources
// 4. Adds a small delay, reducing bus traffic

Blocking (Sleep Waiting):
In blocking, the waiting process is put to sleep, yielding the CPU:
while (!can_enter()) {
sleep(); // Remove from CPU, add to wait queue
}
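On Linux, blocking entry sections are typically built on the futex system call. The following is a simplified sketch of that idea; production locks add a separate "contended" state so the uncontended exit path can skip the wake-up syscall entirely:

#include <linux/futex.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

static atomic_int lock_state;   // 0 = free, 1 = held

void blocking_enter(void) {
    int expected = 0;
    while (!atomic_compare_exchange_strong(&lock_state, &expected, 1)) {
        // Lock is held: ask the kernel to put us to sleep, but only if
        // lock_state is still 1 when the kernel checks (avoids lost wake-ups)
        syscall(SYS_futex, &lock_state, FUTEX_WAIT, 1, NULL, NULL, 0);
        expected = 0;   // the failed CAS overwrote 'expected'; reset it
    }
}

void blocking_exit(void) {
    atomic_store(&lock_state, 0);                                     // release
    syscall(SYS_futex, &lock_state, FUTEX_WAKE, 1, NULL, NULL, 0);    // wake one waiter
}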
The Cost of Blocking:
| Aspect | Spinning | Blocking |
|---|---|---|
| CPU utilization | 100% (wasted) | 0% (process sleeps) |
| Entry latency (when lock free) | Very low (already running) | Higher (context switch) |
| Power consumption | High | Low |
| Suitable wait duration | Microseconds | Milliseconds or more |
| Memory traffic | High (repeated reads) | Low (no reads while sleeping) |
| Implementation complexity | Low | Higher (OS integration) |
| Single-CPU behavior | Problematic | Required |
| Multi-CPU behavior | Efficient for short waits | Efficient for long waits |
There's a critical threshold where spinning becomes more expensive than blocking. If the expected wait time exceeds the cost of two context switches (suspending and resuming), blocking wins. On modern systems, this threshold is typically around 10-50 microseconds, though it varies with hardware and workload. Adaptive algorithms try to detect this threshold dynamically.
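A common embodiment of this adaptive idea is a two-phase entry section: spin briefly in the hope the lock frees quickly, then fall back to blocking. Here is a sketch, reusing lock_state and blocking_enter from the futex example above (SPIN_LIMIT is an illustrative tuning knob):

#define SPIN_LIMIT 1000   // illustrative; real systems tune or adapt this

void two_phase_enter(void) {
    // Phase 1: optimistic spinning, betting that the wait is short
    for (int i = 0; i < SPIN_LIMIT; i++) {
        int expected = 0;
        if (atomic_compare_exchange_weak(&lock_state, &expected, 1))
            return;   // acquired without paying for a context switch
    }
    // Phase 2: the wait is evidently long - sleep instead of burning CPU
    blocking_enter();
}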
Before hardware support for synchronization became widespread, computer scientists developed software-only solutions for mutual exclusion. These algorithms are remarkable intellectual achievements and remain valuable for understanding the fundamental challenges.
The Challenge of Software Solutions:
With only ordinary memory reads and writes (no atomic operations), ensuring mutual exclusion is surprisingly difficult. The processor can interleave instructions from different processes at any point, and memory may have different values visible to different processors (memory consistency issues).
// Peterson's Algorithm: The elegant two-process solution
// Works for exactly two processes (0 and 1)

volatile int flag[2] = {0, 0};  // flag[i]: process i wants to enter
volatile int turn;              // Whose turn is it?

void enter_critical_section(int process_id) {
    int other = 1 - process_id;  // The other process

    // Step 1: Signal intention to enter
    flag[process_id] = 1;  // "I want to enter"

    // Step 2: Defer to the other process
    turn = other;  // "But you can go first"

    // Step 3: Wait if the other wants in AND it's their turn
    // The key insight: if both set flags simultaneously,
    // only one wins the race to set 'turn',
    // and that one waits while the other enters
    while (flag[other] == 1 && turn == other) {
        // Busy wait
    }

    // CRITICAL SECTION BEGINS HERE
}

void exit_critical_section(int process_id) {
    // Simply clear our flag
    flag[process_id] = 0;  // "I'm done"
}

// Why Peterson's works:
// - If both try to enter: turn decides; one enters, the other waits
// - If only one tries: its flag = 1, the other's flag = 0, so the while loop exits
// - Mutual exclusion: proven via invariants on flag and turn
// - Bounded waiting: if I'm waiting, the other enters at most once, sets flag = 0, then I enter

Peterson's Algorithm Correctness:
The beauty of Peterson's algorithm lies in its careful balance of two mechanisms: the flag array, which signals each process's intention to enter, and the turn variable, which breaks ties when both want in at once.
The key insight is that when setting turn = other, a process is saying, "I'm willing to let you go first." If both processes do this at nearly the same time, only one write to turn lands last, and the process whose write landed last is the one who waits.
Correctness Proof Sketch:
Mutual Exclusion: Assume both P0 and P1 are in CS. Then flag[0]=1 and flag[1]=1 (both wanted in) and neither is waiting. But if neither is waiting, each saw (at the while condition) either flag[other]=0 or turn=self. Since both flags are 1, each must have seen turn=self. But turn cannot equal both 0 and 1 simultaneously. Contradiction.
Progress: If no one is in the CS and P0 wants in, then either flag[1]=0 (P1 doesn't want in, so P0 enters immediately), or flag[1]=1 and both are in their entry sections, in which case turn holds one definite value, exactly one process's while-condition is false, and that process enters.
Bounded Waiting: Once P0 starts waiting, P1 can enter at most once. After P1 exits and if it tries again, it sets turn=0, giving P0 priority.
Peterson's algorithm, as written, assumes sequential consistency—that memory operations appear to execute in program order and all processors see the same order. Modern processors with weak memory models may reorder operations, breaking the algorithm. In practice, memory barriers (fences) must be inserted to enforce proper ordering, or compiler optimizations may break the algorithm. This is why hardware support for synchronization is essential in real systems.
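One way to repair Peterson's algorithm on such hardware is to declare the shared variables with C11 atomics, whose operations are sequentially consistent by default; a sketch of the repaired entry section:

#include <stdatomic.h>

atomic_int flag[2];   // zero-initialized at file scope
atomic_int turn;

void peterson_enter(int id) {
    int other = 1 - id;
    atomic_store(&flag[id], 1);   // seq_cst store: cannot be reordered past 'turn'
    atomic_store(&turn, other);
    while (atomic_load(&flag[other]) == 1 && atomic_load(&turn) == other) {
        // Busy wait; seq_cst loads observe the stores in one global order,
        // ruling out the reorderings that break the plain-variable version
    }
}

void peterson_exit(int id) {
    atomic_store(&flag[id], 0);
}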
Modern processors provide atomic instructions that make entry section implementation far simpler and more efficient than pure software solutions. These instructions are indivisible at the hardware level—no other processor can interleave operations with them.
Common Atomic Instructions:
| Instruction | Semantics | Use in Entry Section |
|---|---|---|
| Test-and-Set (TAS) | Atomically read old value and set to 1; return old value | If old value was 0, lock was free and we got it; if 1, keep spinning |
| Compare-and-Swap (CAS) | Atomically: if (mem == expected) mem = new; return success/failure | Try to change lock from 0 to 1; if fails, someone else got it |
| Fetch-and-Add | Atomically increment and return old value | Used in ticket locks to get a unique ticket number |
| Load-Linked/Store-Conditional (LL/SC) | LL loads a value; SC stores only if no intervening write | Foundation for CAS-like operations on RISC architectures |
// Test-and-Set based entry section
int lock = 0;

int test_and_set(int* target) {
    // Conceptually a SINGLE ATOMIC instruction at the hardware level
    int old = *target;
    *target = 1;
    return old;
    // In reality this is something like:
    //   x86: XCHG instruction
    //   ARM: LDREX/STREX sequence
}

void enter_tas(void) {
    while (test_and_set(&lock) == 1) {
        // Lock was 1 (held), keep trying
    }
    // Lock was 0, we set it to 1 atomically, we own it
}

// Compare-and-Swap based entry section
// (compare_and_swap is an assumed primitive, e.g. C11's
//  atomic_compare_exchange_strong)
void enter_cas(void) {
    while (!compare_and_swap(&lock, 0, 1)) {
        // CAS failed: someone else has the lock
        // CAS(addr, expected = 0, new = 1):
        //   if *addr == 0: *addr = 1; return true
        //   else: return false
    }
    // CAS succeeded: lock was 0, now it's 1, we own it
}

// Ticket lock: Fair spinlock using fetch-and-add
// (fetch_and_add is an assumed primitive, e.g. C11's atomic_fetch_add)
volatile int next_ticket = 0;
volatile int now_serving = 0;

void enter_ticket(void) {
    int my_ticket = fetch_and_add(&next_ticket, 1);  // Get a unique ticket
    while (now_serving != my_ticket) {
        // Wait for our number to be called
        // This is FIFO fair: processes enter in ticket order
    }
    // Our turn, enter the critical section
}

void exit_ticket(void) {
    now_serving++;  // Call the next number
}

Hardware atomic instructions solve the fundamental problem that plagues software solutions: the gap between reading a value and acting on it. With TAS or CAS, the read-check-modify sequence happens in a single instruction that cannot be interrupted. This eliminates entire classes of race conditions and enables much simpler entry section implementations.
Incorrectly designed entry sections lead to serious failures. Understanding these failure modes helps recognize and prevent them.
Failure Mode 1: Mutual Exclusion Violation
Two or more processes simultaneously occupy the critical section, leading to race conditions.
// BROKEN: Race condition in entry section
int lock = 0;

void broken_enter(void) {
    while (lock == 1) {
        // Wait while locked
    }
    // DANGER ZONE: Another thread can pass the while loop here!
    lock = 1;  // Too late - not atomic with the check above
}

// Scenario causing a mutual exclusion violation:
// 1. Thread A: reads lock = 0, exits while loop
// 2. Thread B: reads lock = 0, exits while loop (interleaved!)
// 3. Thread A: sets lock = 1
// 4. Thread B: sets lock = 1
// 5. Both A and B are in the critical section - VIOLATION!

Failure Mode 2: Deadlock
Processes wait forever, unable to make progress. Each process holds something another needs.
// BROKEN: Strict alternation causes deadlock-like behavior
int turn = 0;  // Only the process whose turn it is can enter

void broken_enter(int process_id) {
    while (turn != process_id) {
        // Wait for my turn
    }
    // Enter critical section
}

void broken_exit(int process_id) {
    turn = 1 - process_id;  // Give the turn to the other process
}

// Problem: Processes must strictly alternate.
// If process 0 wants to enter twice before process 1 wants to enter:
// 1. Process 0 enters (turn = 0)
// 2. Process 0 exits (turn = 1)
// 3. Process 0 wants to enter again - but turn = 1!
// Process 0 waits forever for process 1, even though the CS is free.
// This violates the PROGRESS requirement.

Failure Mode 3: Livelock
Processes are actively executing (not blocked), but none make progress—they're all "being polite" and deferring to each other.
// BROKEN: Livelock due to over-politeness
int flag[2] = {0, 0};

void broken_enter(int process_id) {
    int other = 1 - process_id;
    flag[process_id] = 1;  // I want in
    while (flag[other] == 1) {
        // Other wants in too - let me back off
        flag[process_id] = 0;  // Retract my intention
        // Wait a bit (a randomized delay might help, but isn't guaranteed)
        delay();
        flag[process_id] = 1;  // Try again
    }
}

// Livelock scenario:
// 1. Both set flags to 1 simultaneously
// 2. Both see the other's flag = 1
// 3. Both back off (set flags to 0)
// 4. Both wait, then both set flags to 1 again
// 5. Back to step 2 - infinite loop, no progress!

Failure Mode 4: Starvation
Some processes enter the critical section while another waits indefinitely.
Simple test-and-set spinlocks provide no fairness. A process that just exited the CS can immediately re-acquire the lock while other processes have been waiting. On systems with non-uniform memory access (NUMA), processes on the same chip as the lock variable may have an unfair advantage due to faster memory access, potentially starving processes on distant chips.
The entry section is where the battle for correct synchronization is fought. We've explored its definition, requirements, implementation patterns, and failure modes.
What's Next:
The entry section grants access to the critical section, but what happens after the critical section completes? The exit section must correctly release the lock and potentially wake waiting processes. Just as the entry section has subtle requirements, the exit section must be designed to maintain all correctness properties. That's what we'll explore next.
You now understand the entry section—its role, requirements, implementation patterns, and failure modes. This knowledge is essential for understanding any synchronization mechanism, from simple spinlocks to sophisticated concurrent data structures. Next, we examine the exit section and how it completes the synchronization protocol.