How do we know if a scheduling algorithm is good? The answer depends entirely on what we're trying to optimize. A datacenter operator running batch jobs cares about maximizing throughput—completing as many jobs as possible per hour. A user editing a document cares about responsiveness—that keystrokes appear instantly. A real-time system controlling a nuclear reactor cares about deadlines—that critical responses happen within strict time bounds.
These stakeholders have fundamentally different—sometimes conflicting—definitions of "good" scheduling. No single algorithm optimizes all criteria simultaneously. Understanding these scheduling criteria is essential for choosing the right algorithm for a given workload and for evaluating how well a system meets its goals.
By the end of this page, you will understand: (1) The five primary scheduling criteria and their formal definitions, (2) How to calculate each metric from scheduling data, (3) The inherent tradeoffs between criteria, (4) Which criteria matter for which workload types, and (5) How real systems balance multiple objectives.
CPU Utilization measures what fraction of time the CPU is doing useful work (executing processes) rather than sitting idle.
Formal definition:
CPU Utilization = (Total CPU Busy Time) / (Total Elapsed Time) × 100%
Where:
- CPU Busy Time: time the CPU spends executing processes (user + system time)
- Total Elapsed Time: wall-clock time over the measurement interval
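Example calculation:
If the CPU is busy for 45 minutes of a 60-minute measurement window:
CPU Utilization = 45 / 60 × 100% = 75%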
Target values:
| System Type | Typical Target | Notes |
|---|---|---|
| Lightly loaded desktop | 20-40% | User doesn't want to wait, but system often idle |
| Heavily loaded workstation | 60-80% | Some headroom for bursts |
| Production server | 70-90% | High utilization is cost-effective |
| Overloaded system | 95-100% | May indicate capacity problem |
Why CPU utilization matters: an idle CPU is wasted hardware investment, so operators generally want utilization high. But utilization near 100% leaves no headroom for load spikes and often signals a capacity problem rather than efficiency.
Measurement on Linux:
```bash
#!/bin/bash
# Measuring CPU utilization on Linux

# Method 1: Using top (real-time view)
top -bn1 | grep "Cpu(s)" | awk '{print "CPU Utilization: " 100 - $8 "%"}'
# Output: CPU Utilization: 42.3%

# Method 2: Using /proc/stat (programmatic)
# /proc/stat shows cumulative CPU time since boot
cat /proc/stat | head -1
# Output: cpu 10132153 290696 3084719 46828483 16683 0 25195 0 0 0
#         cpu user     nice   system  idle     iowait irq softirq ...

# Calculate utilization over an interval:
# 1. Read /proc/stat at time T1
# 2. Wait N seconds
# 3. Read /proc/stat at time T2
# 4. Utilization = (busy_time_delta / total_time_delta) × 100%

# Method 3: Using vmstat
vmstat 1 5   # 5 samples at 1-second intervals
# id = idle percentage (100 - id = utilization)

# Method 4: Using mpstat (per-CPU)
mpstat -P ALL 1 5   # Per-CPU utilization
# Shows if load is balanced across cores

# Breakdown components:
# - us (user): Time running user processes
# - sy (system): Time in kernel (system calls, interrupts)
# - ni (nice): Time running niced user processes
# - id (idle): Time doing nothing
# - wa (iowait): Time waiting for I/O (controversial - see note)
# - hi/si: Hardware/software interrupts
```

The 'iowait' metric is often misunderstood. It shows the percentage of time the CPU was idle while processes were waiting for I/O. High iowait doesn't mean the CPU is busy waiting—it means the CPU is idle and could run other processes if any were ready. A system with high iowait needs more I/O bandwidth or process diversity, not more CPU.
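For a programmatic version of Method 2, here is a minimal Python sketch (Linux-only; it assumes the standard /proc/stat field order and treats iowait as idle time; the helper names are our own):

```python
# Sample /proc/stat twice and compute CPU utilization over the interval.
import time

def read_cpu_times():
    """Return the cumulative CPU time counters (in jiffies) from the
    aggregate 'cpu' line of /proc/stat."""
    with open('/proc/stat') as f:
        fields = f.readline().split()
    # fields[0] is the label 'cpu'; the rest are counters:
    # user nice system idle iowait irq softirq steal ...
    return [int(x) for x in fields[1:]]

def cpu_utilization(interval=1.0):
    """Measure CPU utilization (%) over the given interval."""
    t1 = read_cpu_times()
    time.sleep(interval)
    t2 = read_cpu_times()
    deltas = [b - a for a, b in zip(t1, t2)]
    total = sum(deltas)
    # idle is index 3; iowait (index 4) also counts as idle time
    idle = deltas[3] + deltas[4]
    return 100.0 * (total - idle) / total

print(f"CPU utilization: {cpu_utilization():.1f}%")
```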
Throughput measures the amount of work completed per unit time—how many processes complete execution in a given interval.
Formal definition:
Throughput = (Number of Processes Completed) / (Total Time Interval)
Units: Jobs/second, transactions/second, requests/second, etc.
Example calculation:
If 120 processes complete in a 60-second window:
Throughput = 120 / 60 = 2 processes/second
Factors affecting throughput: the mix of short and long CPU bursts, context-switch overhead, and how often processes block for I/O all influence how many jobs complete per unit time.
Throughput vs. CPU utilization:
These metrics are related but distinct: high utilization means the CPU is busy, but not necessarily that it is completing jobs quickly.
Example: A system at 100% CPU utilization due to excessive context switching might have lower throughput than a system at 80% utilization with fewer switches.
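To make the overhead concrete (with illustrative numbers): if a context switch costs 10 µs, a 1 ms time slice loses 10 / 1010 ≈ 1% of the CPU to switching, while a 50 µs time slice loses 10 / 60 ≈ 17%. Both systems can report near-100% utilization, but the second completes noticeably fewer jobs per second.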
| Scenario | Processes | Algorithm | Completion Time | Throughput |
|---|---|---|---|---|
| 3 jobs: 24s, 3s, 3s | 3 | FCFS (long first) | 30s total | 0.1 jobs/s |
| 3 jobs: 24s, 3s, 3s | 3 | SJF | 30s total | 0.1 jobs/s |
| Mixed short/long | 10 (5 short, 5 long) | FCFS | Variable | Lower |
| Mixed short/long | 10 (5 short, 5 long) | SJF | Same total work | Higher (short jobs finish early) |
For a fixed set of processes with fixed CPU burst times, the total work to complete is constant. What changes with different algorithms is the order of completion. SJF doesn't reduce total time, but by finishing short jobs early, throughput (completions per time) is higher early in the execution—which matters for metrics like average turnaround time.
Turnaround Time measures the total elapsed time from when a process is submitted until it completes—the full lifecycle duration.
Formal definition:
Turnaround Time = Completion Time - Arrival Time
Alternatively:
Turnaround Time = Waiting Time + Burst Time (+ I/O Time, if any)
Where:
- Completion Time: the moment the process finishes execution
- Arrival Time: the moment the process enters the ready queue
- Burst Time: total CPU execution time the process requires
- Waiting Time: total time spent ready but not running
- I/O Time: time spent blocked on I/O, if any
Example calculation:
Consider three processes with the following characteristics:
| Process | Arrival Time | Burst Time |
|---|---|---|
| P1 | 0 | 24 |
| P2 | 0 | 3 |
| P3 | 0 | 3 |
Using FCFS (order: P1, P2, P3):
| Process | Arrival | Burst | Completion | Turnaround |
|---|---|---|---|---|
| P1 | 0 | 24 | 24 | 24 - 0 = 24 |
| P2 | 0 | 3 | 27 | 27 - 0 = 27 |
| P3 | 0 | 3 | 30 | 30 - 0 = 30 |
Average Turnaround Time = (24 + 27 + 30) / 3 = 27
Using SJF (order: P2, P3, P1):
| Process | Arrival | Burst | Completion | Turnaround |
|---|---|---|---|---|
| P2 | 0 | 3 | 3 | 3 - 0 = 3 |
| P3 | 0 | 3 | 6 | 6 - 0 = 6 |
| P1 | 0 | 24 | 30 | 30 - 0 = 30 |
Average Turnaround Time = (3 + 6 + 30) / 3 = 13
SJF reduces average turnaround time from 27 to 13—a 52% improvement!
Mathematical insight: When a process runs first, it contributes to the waiting time of all subsequent processes. Running short jobs first minimizes this cumulative wait. This is why SJF (Shortest Job First) is provably optimal for minimizing average turnaround time when all processes arrive at the same time.
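The arithmetic is easy to verify programmatically. Here is a minimal sketch: with all processes arriving at time 0 and running to completion, each completion time is just the cumulative sum of burst times in execution order.

```python
def avg_turnaround(bursts):
    """Average turnaround for processes arriving at time 0,
    executed non-preemptively in list order."""
    completion, total = 0, 0
    for burst in bursts:
        completion += burst      # this process finishes here
        total += completion      # turnaround = completion - arrival(0)
    return total / len(bursts)

print(avg_turnaround([24, 3, 3]))   # FCFS order -> 27.0
print(avg_turnaround([3, 3, 24]))   # SJF order  -> 13.0
```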
Who cares about turnaround time: batch users, build and CI pipelines, and anyone submitting a job whose only useful output is the finished result, since nothing matters until the job completes.
Waiting Time measures the total time a process spends in the ready queue—ready to run but not actually running.
Formal definition:
Waiting Time = Turnaround Time - Burst Time - I/O Time
Or equivalently:
Waiting Time = Sum of all periods spent in ready queue
Key insight: Waiting time excludes time actually executing on the CPU and time blocked waiting for I/O. Waiting time only counts the time a process is ready but waiting for the CPU.
Example with preemption:
Consider a Round Robin scheduler with quantum = 4:
| Process | Arrival | Burst Time |
|---|---|---|
| P1 | 0 | 10 |
| P2 | 1 | 4 |
| P3 | 2 | 2 |
Execution timeline (Gantt chart):
| P1 | P2 | P3 | P1 | P1 |
0    4    8    10   14   16
Time 0-4: P1 runs (4 units)
Time 4-8: P2 runs (4 units) — P2 complete
Time 8-10: P3 runs (2 units) — P3 complete
Time 10-14: P1 runs (4 units; the quantum expires, but P1 is alone in the queue)
Time 14-16: P1 runs (2 units) — P1 complete
Waiting time calculation:
| Process | Arrival | Burst | Periods Waiting | Total Wait |
|---|---|---|---|---|
| P1 | 0 | 10 | waits 4-10 (between its first and second slices) | 6 |
| P2 | 1 | 4 | waits 1-4 | 3 |
| P3 | 2 | 2 | waits 2-8 | 6 |
Average Waiting Time = (6 + 3 + 6) / 3 = 5.0
Waiting time isolates the scheduler's contribution to delay. A process with long I/O bursts will have high turnaround time regardless of the scheduler, but waiting time captures only the delays the scheduler could potentially reduce. This makes waiting time a purer measure of scheduler quality.
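To double-check these numbers mechanically, here is a minimal round-robin simulator (a sketch that assumes arrivals during a time slice enter the ready queue ahead of the preempted process, matching the trace above; all names are our own):

```python
from collections import deque

def round_robin_waiting(processes, quantum):
    """processes: list of (name, arrival, burst), sorted by arrival.
    Returns {name: waiting_time}."""
    time = 0
    ready = deque()
    remaining = {name: burst for name, _, burst in processes}
    completion = {}
    pending = list(processes)            # processes not yet arrived

    def admit(now):
        nonlocal pending
        for name, _, _ in [p for p in pending if p[1] <= now]:
            ready.append(name)
        pending = [p for p in pending if p[1] > now]

    admit(time)
    while ready or pending:
        if not ready:                    # CPU idles until next arrival
            time = pending[0][1]
            admit(time)
        name = ready.popleft()
        run = min(quantum, remaining[name])
        time += run
        remaining[name] -= run
        admit(time)                      # new arrivals queue first
        if remaining[name] > 0:
            ready.append(name)           # preempted process re-queues last
        else:
            completion[name] = time

    return {name: completion[name] - arrival - burst
            for name, arrival, burst in processes}

procs = [("P1", 0, 10), ("P2", 1, 4), ("P3", 2, 2)]
print(round_robin_waiting(procs, quantum=4))
# {'P1': 6, 'P2': 3, 'P3': 6}  -> average 5.0
```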
Why waiting time matters: it is the quantity scheduling algorithms most directly control, and for a fixed workload, minimizing average waiting time also minimizes average turnaround time (the two differ only by the constant average burst time).
Response Time measures the time from when a request is submitted until the first response is produced—not full completion, but initial visible output.
Formal definition:
Response Time = First Response Time - Arrival Time
Or for CPU scheduling specifically:
Response Time = First CPU Execution Start - Arrival Time
This is distinct from turnaround time: turnaround measures time to completion, while response time measures time until the process first gets the CPU and produces visible output.
For interactive systems, response time is often more important than turnaround time.
Example comparison:
Consider Process P with 10-second burst time:
In FCFS with 20 seconds of work ahead:
- Response time = 20 s (P cannot start until all queued work finishes)
- Turnaround time = 20 + 10 = 30 s
In Round Robin with quantum = 2:
- P gets its first time slice after each queued process uses at most one 2-second quantum, so the first response arrives within a few seconds
- Turnaround time may be 30 s or worse, because P's 10 seconds of work is interleaved with the other processes
Round Robin provides much better response time even if turnaround time is similar or worse.
| Application | Response Time Target | User Perception |
|---|---|---|
| Keyboard echo | < 50 ms | Instantaneous |
| Mouse cursor | < 10 ms | Lag perceptible above 10ms |
| GUI button click | < 100 ms | Feels responsive |
| Page load start | < 200 ms | Acceptable for web |
| Complex query | < 1 second | User remains engaged |
| Report generation | < 10 seconds | Progress indicator needed |
Response time distribution matters:
For interactive systems, average response time is less important than the distribution:
Example: a system with a 50 ms average response but occasional 2-second stalls is worse for user experience than a system with a consistent 100 ms response, because the first system has unacceptable outliers that frustrate users, even though its average is better.
In web-scale systems, poor tail latency (p99, p999) has outsized impact. If a page requires 100 backend calls and each has 1% chance of slow response, the page has ~63% chance of experiencing at least one slow call. Tail latency becomes the dominant user experience factor at scale.
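The arithmetic behind that figure, as a quick check (p is the per-call probability of a slow response, N the fan-out):

```python
# Probability that a request fanning out to N backends hits at least
# one slow call, if each backend is independently slow with probability p.
p, N = 0.01, 100
print(f"{1 - (1 - p) ** N:.2%}")   # 63.40%
```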
```python
# Calculating response time metrics from samples

import numpy as np

def analyze_response_times(samples):
    """
    Analyze response time distribution from collected samples.
    For scheduling, samples would be:
    first_run_time - arrival_time for each process.
    """
    samples = np.array(samples)
    metrics = {
        'mean': np.mean(samples),
        'median': np.median(samples),
        'std_dev': np.std(samples),
        'min': np.min(samples),
        'max': np.max(samples),
        'p50': np.percentile(samples, 50),
        'p90': np.percentile(samples, 90),
        'p95': np.percentile(samples, 95),
        'p99': np.percentile(samples, 99),
        'p999': np.percentile(samples, 99.9),
    }
    return metrics

# Example: Response times in milliseconds
response_times = [45, 52, 48, 51, 49, 420, 47, 53, 46, 50,
                  48, 51, 49, 52, 47, 50, 48, 890, 51, 49]

stats = analyze_response_times(response_times)

# Output interpretation:
# mean: ~110ms (inflated by outliers)
# median: ~50ms (typical experience)
# p99: ~800ms (worst case, 1 in 100)
#
# The mean is misleading here; median and percentiles
# better represent actual user experience
```

The five scheduling criteria we've examined are not independent—optimizing one often degrades another. Understanding these tradeoffs is essential for algorithm selection.
The fundamental tensions:
| Optimizing For | May Hurt | Reason |
|---|---|---|
| Throughput | Response Time | Minimizing switches increases work done but hurts interactivity |
| Response Time | Throughput | Frequent switches for responsiveness add overhead |
| Turnaround (short jobs) | Turnaround (long jobs) | Favoring short jobs delays long ones |
| CPU Utilization | Response Time | Keeping CPU busy may queue interactive requests |
| Fairness | Throughput | Equal time slices may not be globally optimal |
The SJF paradox:
Shortest Job First minimizes average waiting/turnaround time—mathematically provable. But it creates extreme unfairness: long jobs can be postponed indefinitely (starvation) as long as shorter jobs keep arriving.
In systems valuing fairness, SJF is unacceptable despite its optimal average-case metrics.
The Round Robin compromise:
Round Robin provides:
- Bounded response time (every process gets the CPU within one full cycle of the ready queue)
- Fairness (no process is starved)
But sacrifices:
- Throughput, due to context-switch overhead from frequent preemption
- Average turnaround time, which is typically worse than SJF's
This is a fundamental truth: no scheduling algorithm optimizes all criteria simultaneously. Scheduler design is about choosing which tradeoffs to make based on workload characteristics and system goals. The 'best' scheduler depends entirely on what 'best' means for your use case.
Matching criteria to workloads: batch systems prioritize throughput and turnaround time; interactive systems prioritize response time and its distribution; real-time systems prioritize meeting deadlines; general-purpose time-sharing systems balance response time with fairness.
Let's calculate all scheduling metrics for a concrete scenario, comparing two algorithms.
Problem setup:
Four processes with the following characteristics:
| Process | Arrival Time | Burst Time |
|---|---|---|
| P1 | 0 | 8 |
| P2 | 1 | 4 |
| P3 | 2 | 9 |
| P4 | 3 | 5 |
Algorithm 1: FCFS (First-Come, First-Served)
Execution order: P1 → P2 → P3 → P4
Gantt Chart:
| P1 | P2 | P3 | P4 |
0 8 12 21 26
Calculations:
| Process | Arrival | Burst | Completion | Turnaround | Waiting | Response |
|---|---|---|---|---|---|---|
| P1 | 0 | 8 | 8 | 8-0=8 | 8-8=0 | 0-0=0 |
| P2 | 1 | 4 | 12 | 12-1=11 | 11-4=7 | 8-1=7 |
| P3 | 2 | 9 | 21 | 21-2=19 | 19-9=10 | 12-2=10 |
| P4 | 3 | 5 | 26 | 26-3=23 | 23-5=18 | 21-3=18 |
| Average | | | | 15.25 | 8.75 | 8.75 |
Algorithm 2: SJF (Shortest Job First, Non-preemptive)
At time 0, only P1 is available → run P1
At time 8, P2 (4), P3 (9), P4 (5) are available → run P2 (shortest)
At time 12, P3 (9), P4 (5) are available → run P4 (shorter)
At time 17, only P3 is left → run P3
Gantt Chart:
| P1 | P2 | P4 | P3 |
0 8 12 17 26
Calculations:
| Process | Arrival | Burst | Completion | Turnaround | Waiting | Response |
|---|---|---|---|---|---|---|
| P1 | 0 | 8 | 8 | 8-0=8 | 8-8=0 | 0-0=0 |
| P2 | 1 | 4 | 12 | 12-1=11 | 11-4=7 | 8-1=7 |
| P3 | 2 | 9 | 26 | 26-2=24 | 24-9=15 | 17-2=15 |
| P4 | 3 | 5 | 17 | 17-3=14 | 14-5=9 | 12-3=9 |
| Average | | | | 14.25 | 7.75 | 7.75 |
Comparison:
| Metric | FCFS | SJF | Winner |
|---|---|---|---|
| Avg Turnaround | 15.25 | 14.25 | SJF (7% better) |
| Avg Waiting | 8.75 | 7.75 | SJF (11% better) |
| Avg Response | 8.75 | 7.75 | SJF (11% better) |
| CPU Utilization | 100% | 100% | Tie |
| Throughput | 4/26 = 0.154 | 4/26 = 0.154 | Tie |
| Fairness (P3 wait) | 10 | 15 | FCFS (P3 treated better) |
Observation: SJF wins on aggregate metrics but P3 (longest job) waits 50% longer. This illustrates the fairness tradeoff.
To calculate metrics systematically: (1) draw the Gantt chart from the algorithm's rules, (2) read each process's completion time off the chart, (3) compute turnaround = completion − arrival, (4) compute waiting = turnaround − burst, (5) compute response = first start − arrival, and (6) average each column. The sketch below automates these steps.
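Here is a minimal sketch of that procedure for non-preemptive schedulers (preemptive algorithms like Round Robin need a full simulation, as shown earlier; the function name is our own):

```python
def metrics(processes, order):
    """processes: {name: (arrival, burst)}; order: list of names in
    the order a non-preemptive scheduler runs them."""
    time, rows = 0, {}
    for name in order:
        arrival, burst = processes[name]
        start = max(time, arrival)       # CPU may sit idle until arrival
        completion = start + burst
        rows[name] = {
            'turnaround': completion - arrival,
            'waiting': start - arrival,   # = turnaround - burst
            'response': start - arrival,  # non-preemptive: same as waiting
        }
        time = completion
    return rows

procs = {'P1': (0, 8), 'P2': (1, 4), 'P3': (2, 9), 'P4': (3, 5)}
for order in (['P1', 'P2', 'P3', 'P4'],          # FCFS
              ['P1', 'P2', 'P4', 'P3']):         # non-preemptive SJF
    rows = metrics(procs, order)
    avg = sum(r['turnaround'] for r in rows.values()) / len(rows)
    print(order, f"avg turnaround = {avg}")
# FCFS: 15.25, SJF: 14.25 (matches the tables above)
```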
We've established the framework for evaluating and comparing scheduling algorithms. The key concepts: CPU utilization and throughput measure system-wide efficiency; turnaround and waiting time measure per-process delay; response time measures interactivity; and no single algorithm optimizes all five at once.
What's next:
With our evaluation framework established, we now turn to the Dispatcher—the component that actually implements context switches and hands the CPU to the selected process. Understanding the dispatcher completes our picture of scheduling mechanics before we dive into specific algorithms.
You now have a complete framework for evaluating scheduling algorithms. You can calculate CPU utilization, throughput, turnaround time, waiting time, and response time for any scheduling scenario. More importantly, you understand the inherent tradeoffs between these criteria—essential knowledge for system design and algorithm selection.