We've studied the critical section, entry section, and exit section in detail. But these components represent only a fraction of what a typical process does. The vast majority of a process's execution time is often spent in the remainder section—the code that doesn't involve shared resources and can execute concurrently without any synchronization.
Understanding the remainder section is crucial for two reasons: first, it completes our conceptual model of process synchronization; second, and more practically, maximizing time spent in the remainder section relative to the critical section is the key to scalable concurrent performance.
By the end of this page, you will understand the formal definition of the remainder section, its role in process structure, why maximizing remainder section work is critical for performance, and how remainder section behavior affects the overall synchronization requirements.
Definition: The remainder section is the portion of a process's code that executes after the exit section and before the next entry section. It encompasses all computation that does not require exclusive access to shared resources.
Formal Position in Process Structure:
```
while (true) {
    ENTRY SECTION        ← Acquire exclusive access
    CRITICAL SECTION     ← Access shared resources
    EXIT SECTION         ← Release access, notify waiters
    REMAINDER SECTION    ← Independent work (no synchronization needed)
}
```
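As a concrete illustration, here is a minimal sketch of this structure using a POSIX mutex (do_independent_work is a hypothetical placeholder for remainder-section work):

```c
#include <pthread.h>

pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
int shared_count = 0;               // The shared resource

void do_independent_work(void);     // Hypothetical: thread-local computation

void* worker(void* arg) {
    for (int i = 0; i < 1000; i++) {
        pthread_mutex_lock(&lock);    // ENTRY SECTION
        shared_count++;               // CRITICAL SECTION
        pthread_mutex_unlock(&lock);  // EXIT SECTION
        do_independent_work();        // REMAINDER SECTION
    }
    return NULL;
}
```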
The remainder section is characterized by three essential properties: it accesses no shared mutable state, so it requires no synchronization; it can execute concurrently with any other process's code, including other processes' critical sections; and it may run for an arbitrarily long time (a process may even stay in it indefinitely) without affecting other processes' progress.
The term 'remainder' emphasizes that this is the 'rest of the work' that doesn't fit into the critical section. In well-designed concurrent programs, the remainder section should constitute the majority of execution time—if most work requires synchronization, concurrency provides little benefit.
It is not always obvious whether a given piece of code belongs in the critical section or the remainder section. Understanding what can safely be moved to the remainder section is a critical design skill.
Candidates for the Remainder Section:
| Code Type | Can Be in Remainder? | Rationale |
|---|---|---|
| Local variable computation | ✅ Yes | Thread-local data cannot cause races |
| Pure functions with immutable inputs | ✅ Yes | No side effects, no shared state |
| I/O to private resources | ✅ Yes | Thread-specific files, sockets |
| Sleeping/yielding | ✅ Yes | Reduces contention when no CS work needed |
| Reads of immutable shared data | ✅ Yes | Data never changes, no race possible |
| Reads of shared mutable data | ⚠️ Sometimes | Only if stale data is acceptable |
| Writes to thread-local copies | ✅ Yes | Changes visible only to current thread |
| Writes to shared data | ❌ No | Must be in critical section |
| Check-then-act on shared data | ❌ No | Race between check and act |
| Complex invariant maintenance | ❌ No | Invariant might be observed mid-update |
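To see why check-then-act cannot live in the remainder section, consider this minimal sketch (tickets_left and ticket_lock are hypothetical names, using the same generic mutex API as the examples below). Two threads can both pass the check before either acts:

```c
int tickets_left = 1;        // Shared mutable data
mutex_t ticket_lock;

// BROKEN: check-then-act outside the critical section
void buy_ticket_broken(void) {
    if (tickets_left > 0) {  // Check... (both threads may see 1)
        tickets_left--;      // ...then act: two threads can both decrement!
    }
}

// CORRECT: the check and the act happen atomically inside the CS
void buy_ticket_correct(void) {
    mutex_lock(&ticket_lock);
    if (tickets_left > 0) {
        tickets_left--;      // Nothing can interleave between check and act
    }
    mutex_unlock(&ticket_lock);
}
```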
```c
// Example: Optimizing by expanding the remainder section

// BEFORE: Excessive code in critical section
void process_request_bad(Request* req) {
    mutex_lock(&processing_mutex);
    // ====== CRITICAL SECTION (too large) ======
    validate_request(req);                  // Pure computation - no shared state!
    parse_request(req);                     // Pure computation - no shared state!
    Response resp = compute_response(req);  // Expensive, but no shared state!
    update_statistics(&global_stats, req);  // Needs synchronization
    write_to_log(req, resp);                // Could use thread-local buffer!
    // ===========================================
    mutex_unlock(&processing_mutex);
    send_response(req->client, resp);
}
// Problem: Most work doesn't need the lock!

// AFTER: Minimized critical section, expanded remainder section
void process_request_good(Request* req) {
    // ====== REMAINDER SECTION (before CS) ======
    // All this work is thread-local, runs while others work
    validate_request(req);                  // Pure function
    parse_request(req);                     // Pure function
    Response resp = compute_response(req);  // Expensive but local

    // ====== ENTRY SECTION ======
    mutex_lock(&processing_mutex);
    // ====== CRITICAL SECTION (minimal) ======
    update_statistics(&global_stats, req);  // Only this needs the lock!
    // ===========================================
    // ====== EXIT SECTION ======
    mutex_unlock(&processing_mutex);

    // ====== REMAINDER SECTION (after CS) ======
    write_to_thread_local_log(req, resp);   // Thread-local buffer
    send_response(req->client, resp);       // Per-connection, no sharing
}
// Benefit: Threads can process in parallel, only statistics update is serialized
```

Always ask: 'Does this code NEED to be in the critical section?' If the answer is no, move it out. Every instruction removed from the critical section directly improves concurrency. This is one of the highest-impact optimizations in concurrent programming.
The ratio of time spent in the remainder section versus the critical section is one of the most important factors in concurrent system performance. This relationship is captured by Amdahl's Law.
Amdahl's Law for Locks:
If a fraction s of execution time is spent in the critical section (serialized), and the remainder (1 - s) can run in parallel, then with n processors:
Speedup = 1 / (s + (1-s)/n)
As n → ∞:
Speedup_max = 1 / s
This means: if 10% of work is in the critical section, you can never achieve more than 10x speedup, no matter how many processors you add.
| % in Critical Section (s) | % in Remainder (1-s) | Max Speedup (1/s) | With 8 CPUs | With 64 CPUs |
|---|---|---|---|---|
| 1% | 99% | 100x | 7.5x | 39.3x |
| 5% | 95% | 20x | 5.9x | 15.4x |
| 10% | 90% | 10x | 4.7x | 8.8x |
| 20% | 80% | 5x | 3.3x | 4.7x |
| 50% | 50% | 2x | 1.8x | 2.0x |
| 90% | 10% | 1.1x | 1.1x | 1.1x |
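These figures follow directly from the formula; a short, self-contained C helper reproduces the table:

```c
#include <stdio.h>

// Speedup under Amdahl's Law: fraction s is serialized (critical
// section), the remaining (1 - s) parallelizes across n processors
double amdahl_speedup(double s, int n) {
    return 1.0 / (s + (1.0 - s) / n);
}

int main(void) {
    double fractions[] = {0.01, 0.05, 0.10, 0.20, 0.50, 0.90};
    for (int i = 0; i < 6; i++) {
        double s = fractions[i];
        printf("s = %4.0f%%  max = %6.1fx  n=8: %4.1fx  n=64: %4.1fx\n",
               s * 100, 1.0 / s, amdahl_speedup(s, 8), amdahl_speedup(s, 64));
    }
    return 0;
}
```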
The Real-World Implications:
This mathematical reality has profound implications:
A common design mistake is protecting all shared data with a single global lock. This effectively makes the entire system serialized—only one thread can do shared-data work at a time. Even if 8 cores are available, the system behaves like it has one core for any locked operations. Identify independent data structures and protect each with its own lock to maximize concurrency.
```c
// ANTIPATTERN: Single global lock
struct GlobalState {
    Counter counter;
    UserDatabase users;
    TransactionLog log;
    SessionCache cache;
} state;

mutex_t global_lock;  // One lock for everything!

void process_user_request(Request* r) {
    mutex_lock(&global_lock);  // Everyone waits here
    // Even if we only touch the cache,
    // we block access to counter, users, and log!
    Session* s = cache_lookup(&state.cache, r->session_id);
    mutex_unlock(&global_lock);
}

// BETTER: Fine-grained locks (replaces the version above)
struct GlobalState {
    struct { Counter counter;    mutex_t lock;  } counter_data;
    struct { UserDatabase users; rwlock_t lock; } user_data;  // Multiple readers OK
    struct { TransactionLog log; mutex_t lock;  } log_data;
    struct { SessionCache cache; mutex_t lock;  } cache_data;
} state;

void process_user_request_better(Request* r) {
    mutex_lock(&state.cache_data.lock);  // Only the cache lock
    // Other threads can access counter, users, log concurrently!
    Session* s = cache_lookup(&state.cache_data.cache, r->session_id);
    mutex_unlock(&state.cache_data.lock);
}
```

The remainder section plays a subtle but important role in the correctness requirements of synchronization primitives, particularly the progress requirement.
Recall the Progress Requirement:
If no process is in its critical section and some processes wish to enter, the selection of which process enters next cannot be postponed indefinitely. Only processes not executing in their remainder sections can participate in this decision.
The last phrase is crucial. Let's understand why.
Why Remainder Sections Are Excluded from Progress Decisions:
Consider what would happen if a process in its remainder section could influence who enters the critical section next: a process busy with a long, independent computation, one with no interest in the critical section at all, could delay the decision indefinitely, leaving processes that do want to enter stuck waiting on it. This would be absurd. A process that doesn't even want the CS would block others who do. The progress requirement explicitly excludes remainder-section processes to avoid exactly this.
```c
// HYPOTHETICAL BROKEN ALGORITHM that violates progress
// by involving remainder section processes

// Each process must "vote" before anyone enters CS
volatile int vote[N];  // 0 = no opinion, 1 = OK to enter

void broken_entry_section(int process_id) {
    // Request votes from ALL processes
    for (int i = 0; i < N; i++) {
        vote[i] = 0;  // Reset votes
    }
    // Wait until all processes vote OK
    for (int i = 0; i < N; i++) {
        while (vote[i] == 0) {
            // BLOCKED! If process i is in its remainder section doing
            // a long computation, it might never vote.
            // Progress is violated!
        }
    }
}

// In its remainder section, each process periodically:
void remainder_section_processing(int process_id) {
    // ... long computation ...

    // Maybe eventually vote
    vote[process_id] = 1;  // But this might take forever!

    // ... more computation ...
}

// FIX: Correct algorithms (like Peterson's) only involve
// processes that are actively trying to enter the CS:
volatile int flag[2];  // flag[i] == 1 means process i wants the CS
volatile int turn;

void peterson_entry(int process_id) {
    int other = 1 - process_id;
    flag[process_id] = 1;  // I WANT IN
    turn = other;
    // Only waits for the OTHER process IF:
    // - Other wants in (flag[other] == 1), AND
    // - It's other's turn
    // If other is in its remainder section, flag[other] == 0, so we enter immediately!
    while (flag[other] == 1 && turn == other)
        ;  // Busy-wait
}
```

A process in its remainder section is expressing: 'I'm not interested in the critical section right now.' Correct synchronization algorithms interpret this as implicit permission for others to proceed. This is why flag variables in Peterson's algorithm are set to 0 in the exit section—it's announcing disinterest, not blocking others.
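For completeness, that exit section is a single assignment against the flag array declared above:

```c
void peterson_exit(int process_id) {
    flag[process_id] = 0;  // Announce disinterest; the other process may now enter
}
```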
The duration of the remainder section—how long a process stays outside the critical section between accesses—has significant implications for system behavior and synchronization mechanism choice.
Short vs. Long Remainder Sections:
When processes quickly return to wanting the CS (short remainder), system behavior differs sharply from the case where processes spend most of their time outside it (long remainder):
| Characteristic | Short Remainder | Long Remainder |
|---|---|---|
| Contention level | High (processes frequently want CS) | Low (CS rarely contested) |
| Preferred lock type | Fair lock (prevents starvation) | Simple spinlock may suffice |
| Wait strategy | Blocking (CPU for others) | Spinning (quick access when free) |
| Throughput sensitivity | CS time dominates | CS time less critical |
| Cache effects | Lock line constantly bouncing | Lock line stays cold |
| Priority issues | Priority inversion more likely | Less likely |
| Scalability | Poor (Amdahl's Law) | Good (mostly parallel) |
Long Remainder Sections (Infrequent Lock Access):
When processes spend most of their time in the remainder section, lock acquisitions become infrequent: the critical section is rarely contested, a simple lock usually suffices, and the system scales well because most work runs in parallel. A common compromise between the two wait strategies in the table is to spin briefly and then block, as sketched below.
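A minimal sketch of this spin-then-block approach using POSIX calls (SPIN_LIMIT is a hypothetical tuning parameter):

```c
#include <pthread.h>
#include <sched.h>

#define SPIN_LIMIT 100  // Hypothetical: how many quick attempts before sleeping

void spin_then_block_lock(pthread_mutex_t* m) {
    // Retry briefly, hoping the holder releases soon...
    for (int i = 0; i < SPIN_LIMIT; i++) {
        if (pthread_mutex_trylock(m) == 0)
            return;       // Acquired without ever sleeping
        sched_yield();    // Let another thread run between attempts
    }
    pthread_mutex_lock(m);  // ...then fall back to a blocking wait
}
```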
Design Implication: Know Your Access Patterns
Choosing the right synchronization mechanism requires understanding how long your processes typically spend in the remainder section relative to the critical section. The following three patterns illustrate common cases:
```c
// Analyzing access patterns to choose synchronization strategy

// Pattern A: SHORT REMAINDER, SHORT CS
// Example: Updating a shared counter for every operation
void pattern_a_counter_update(void) {
    for (int i = 0; i < MILLION; i++) {
        // Remainder: almost nothing
        // CS: increment shared counter
        mutex_lock(&counter_lock);
        shared_counter++;  // 1 instruction
        mutex_unlock(&counter_lock);
        // Back to remainder immediately
    }
}
// Best approach: atomic increment (no lock at all)
// atomic_fetch_add(&shared_counter, 1);

// Pattern B: LONG REMAINDER, SHORT CS
// Example: Worker threads processing independent jobs
void pattern_b_worker_thread(void) {
    while (true) {
        Job* job;

        // ENTRY + CS: Get next job (brief)
        mutex_lock(&queue_lock);
        job = dequeue(&job_queue);
        mutex_unlock(&queue_lock);

        // REMAINDER: Do the work (long)
        process_job(job);  // Might take seconds
        save_results(job);
    }
}
// Best approach: simple mutex, contention is rare

// Pattern C: SHORT REMAINDER, LONG CS
// Example: Database transactions
void pattern_c_database_transaction(void) {
    // Remainder: prepare transaction (brief)
    Transaction* tx = prepare();

    // CS: execute transaction (long)
    db_lock();
    execute_multiple_queries(tx);  // Many operations
    update_indexes(tx);
    flush_to_disk(tx);             // I/O!
    db_unlock();

    // Remainder: notify (brief)
    notify_completion(tx);
}
// Problem: Long CS hurts concurrency
// Better approach: Fine-grained locking (per-table or per-row)
// Or: Optimistic concurrency (check conflicts at commit)
```

When many processes finish their remainder sections at similar times (e.g., timer-based batch processing), they all arrive at the critical section together, forming a 'convoy.' This creates temporary high contention even if average contention is low. Jittering (adding random delays) in the remainder section can break up convoys and smooth access patterns.
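A minimal sketch of jittering (do_remainder_work, update_shared_state, shared_lock, and MAX_JITTER_US are hypothetical placeholders):

```c
#include <stdlib.h>
#include <unistd.h>

#define MAX_JITTER_US 5000          // Hypothetical tuning parameter

void do_remainder_work(void);       // Hypothetical: long independent computation
void update_shared_state(void);     // Hypothetical: the actual CS work
extern mutex_t shared_lock;

void periodic_worker_with_jitter(void) {
    while (1) {
        do_remainder_work();              // REMAINDER: independent work
        usleep(rand() % MAX_JITTER_US);   // Random delay spreads out arrivals
        mutex_lock(&shared_lock);         // A convoy is much less likely now
        update_shared_state();
        mutex_unlock(&shared_lock);
    }
}
```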
In real-time systems, the remainder section takes on additional significance because it affects schedulability—the system's ability to meet all timing deadlines.
The Real-Time Perspective:
Real-time systems often model tasks as periodic: each task is released every Ti milliseconds (its period), performs Ci milliseconds of computation (much of it remainder-section work), and must complete before its deadline.
The remainder section contributes to the task's period—it's the gap between one critical section access and the next.
Blocking Time and Schedulability:
When a high-priority task must wait in the entry section (blocked by a low-priority task holding the lock), this blocking time adds to the task's worst-case execution time. The remainder section affects this indirectly: the longer tasks spend in their remainder sections, the less often the lock is held, and the lower the chance that a high-priority task arrives while a lower-priority task is inside its critical section.
Schedulability Analysis in Real-Time:
For a system with n tasks and blocking time Bi for task i:
U = Σ(Ci + Bi) / Ti ≤ n(2^(1/n) - 1) ≈ ln(2) for large n
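For instance, with two hypothetical tasks (C1 = 10 ms, B1 = 5 ms, T1 = 50 ms; C2 = 30 ms, B2 = 0, T2 = 100 ms):

U = (10 + 5)/50 + (30 + 0)/100 = 0.30 + 0.30 = 0.60 ≤ 2(2^(1/2) - 1) ≈ 0.828

so this made-up task set passes the bound and is schedulable under rate-monotonic scheduling.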
The blocking time Bi depends on the duration of the critical sections of lower-priority tasks that share locks with task i, and on the locking protocol in use; with protocols like priority inheritance or priority ceiling, Bi can be bounded (in the simplest case, by the longest critical section of any lower-priority task sharing the lock).
```c
// Real-time task structure example
typedef struct {
    int period_ms;            // How often task runs
    int computation_ms;       // Time when not in CS (remainder)
    int critical_section_ms;  // Time in CS
    int deadline_ms;          // When task must complete
    int priority;             // Scheduling priority (higher = more important)
    mutex_t* required_lock;   // Lock this task needs
} RealtimeTask;

// Example real-time system
RealtimeTask tasks[] = {
    // High priority: Quick sensor read, long remainder (analysis)
    {.period_ms = 100, .computation_ms = 80, .critical_section_ms = 2,
     .deadline_ms = 100, .priority = 10, .required_lock = &sensor_lock},

    // Medium priority: Control loop, balanced remainder/CS
    {.period_ms = 50, .computation_ms = 20, .critical_section_ms = 10,
     .deadline_ms = 50, .priority = 5, .required_lock = &actuator_lock},

    // Low priority: Logging, long CS (file I/O), long period
    {.period_ms = 1000, .computation_ms = 50, .critical_section_ms = 100,
     .deadline_ms = 1000, .priority = 1, .required_lock = &log_lock},
};

// Blocking time analysis:
// - High priority task uses sensor_lock: Not shared by others, no blocking
// - Medium priority uses actuator_lock: Not shared by others, no blocking
// - BUT if they shared a lock, high priority could be blocked by low
//   for up to 100ms (low's CS duration)!

// With priority inheritance mutex:
void high_priority_task_with_inheritance(void) {
    // When we block on a lock held by a low-priority task,
    // the low-priority task's priority is boosted to ours.
    // This ensures low finishes quickly, minimizing our blocking time.
    pthread_mutex_lock(&shared_lock);  // Uses PTHREAD_PRIO_INHERIT
    // ... critical section ...
    pthread_mutex_unlock(&shared_lock);
}
```

In hard real-time systems, unbounded blocking is unacceptable—it can cause missed deadlines and system failure. Protocols like Priority Inheritance and Priority Ceiling ensure that blocking time is bounded and can be factored into schedulability analysis. The remainder section patterns of all tasks must be analyzed to compute worst-case blocking.
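On POSIX systems that support priority inheritance, the mutex used above would be initialized with the PTHREAD_PRIO_INHERIT protocol; a minimal setup sketch:

```c
#include <pthread.h>

pthread_mutex_t shared_lock;

// One-time setup (error handling omitted)
void init_priority_inheritance_mutex(void) {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    // While a thread holds this mutex, its priority is boosted to that
    // of the highest-priority thread blocked on it
    pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    pthread_mutex_init(&shared_lock, &attr);
    pthread_mutexattr_destroy(&attr);
}
```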
Since maximizing remainder section work improves concurrency, let's explore strategies for moving work out of critical sections and into the remainder section.
Strategy 1: Copy-on-Entry, Merge-on-Exit
Copy data at the start of the CS, release the lock, process the copy in the remainder section, then re-acquire and update:
```c
// Copy-on-entry pattern
void process_with_copy(void) {
    DataCopy local_copy;

    // Brief CS: just copy the data
    mutex_lock(&data_lock);
    local_copy = shared_data;  // Shallow or deep copy
    mutex_unlock(&data_lock);

    // Long remainder: work on the copy
    Result result = expensive_computation(&local_copy);

    // Brief CS: merge results back
    mutex_lock(&data_lock);
    merge_result(&shared_data, &result);
    mutex_unlock(&data_lock);
}

// Trade-off: Works well for read-heavy workloads
// Caveat: May need conflict detection if data changed while computing
```

Strategy 2: Thread-Local Accumulation
Each thread maintains local state in its remainder section, periodically merging into shared state:
```c
// Thread-local accumulation pattern
__thread int local_counter = 0;  // Each thread has its own
int global_counter = 0;
mutex_t counter_lock;

void increment(void) {
    // REMAINDER: Update thread-local (no synchronization!)
    local_counter++;

    // Only occasionally merge to global
    if (local_counter >= BATCH_SIZE) {
        mutex_lock(&counter_lock);
        global_counter += local_counter;  // CS: quick bulk update
        mutex_unlock(&counter_lock);
        local_counter = 0;
    }
}

// Results:
// - 99% of increments are lock-free
// - CS occurs only every BATCH_SIZE operations
// - Dramatically reduces contention
```

Strategy 3: Read-Copy-Update (RCU)
Readers never block; writers create a new version and atomically swap pointers. Readers in their remainder section may see old or new data—both are valid:
```c
// RCU-style optimization (simplified concept)
struct Data {
    // ... lots of fields ...
};

_Atomic(struct Data*) shared_ptr;  // Atomic pointer to current data

// Readers: No lock, no waiting! (all in "remainder")
struct Data* read_data(void) {
    struct Data* ptr = atomic_load(&shared_ptr);  // Atomic read
    // Use ptr freely - the data won't be freed while we're using it
    // (Requires a grace period mechanism)
    return ptr;
}

// Writer: Does heavy work in remainder, brief atomic swap
void update_data(UpdateInfo* info) {
    // REMAINDER: Prepare new version (no lock)
    struct Data* old_ptr = atomic_load(&shared_ptr);
    struct Data* new_ptr = malloc(sizeof(struct Data));
    *new_ptr = *old_ptr;          // Copy old data
    apply_update(new_ptr, info);  // Modify the copy

    // CS: Just an atomic pointer swap!
    atomic_store(&shared_ptr, new_ptr);

    // REMAINDER: Reclaim old data after grace period
    // (Wait until all readers using old_ptr have finished)
    synchronize_rcu();
    free(old_ptr);
}

// RCU is used extensively in the Linux kernel for read-mostly structures
```

The remainder section completes our understanding of the four-part process structure for accessing shared resources. Let's consolidate the key insights:

- The remainder section is everything outside the entry, critical, and exit sections: independent work that needs no synchronization and runs fully in parallel.
- By Amdahl's Law, the serialized fraction s caps achievable speedup at 1/s, so shifting work from the critical section into the remainder section is the primary lever for scalability.
- The progress requirement deliberately ignores processes in their remainder sections: not wanting the CS must never block those who do.
- Remainder section duration shapes mechanism choice: short remainders mean high contention (favoring fair, blocking locks), while long remainders mean rare contention (where simple locks suffice).
- Practical strategies such as copy-on-entry, thread-local accumulation, and RCU deliberately move work out of the critical section and into the remainder section.
What's Next:
Now that we fully understand the structure of the critical section problem—the critical section itself, the entry and exit sections that protect it, and the remainder section that runs freely—we're ready to formalize the problem requirements. The next page presents the three requirements (mutual exclusion, progress, and bounded waiting) in rigorous detail, enabling us to evaluate any proposed solution.
You now understand the remainder section and its crucial role in concurrent system performance. Combined with your knowledge of the entry, critical, and exit sections, you have a complete conceptual model of how processes safely share resources. Next, we formalize the requirements that any correct solution must satisfy.