How do we know if a scheduling algorithm is good? The answer depends entirely on what we're trying to optimize. A datacenter operator running batch jobs cares about maximizing throughput—completing as many jobs as possible per hour. A user editing a document cares about responsiveness—that keystrokes appear instantly. A real-time system controlling a nuclear reactor cares about deadlines—that critical responses happen within strict time bounds.
These stakeholders have fundamentally different—sometimes conflicting—definitions of "good" scheduling. No single algorithm optimizes all criteria simultaneously. Understanding these scheduling criteria is essential for choosing the right algorithm for a given workload and for evaluating how well a system meets its goals.
By the end of this page, you will understand: (1) The five primary scheduling criteria and their formal definitions, (2) How to calculate each metric from scheduling data, (3) The inherent tradeoffs between criteria, (4) Which criteria matter for which workload types, and (5) How real systems balance multiple objectives.
CPU Utilization measures what fraction of time the CPU is doing useful work (executing processes) rather than sitting idle.
Formal definition:
CPU Utilization = (Total CPU Busy Time) / (Total Elapsed Time) × 100%
Where:
- CPU Busy Time: time the CPU spends executing processes (user + system time)
- Total Elapsed Time: wall-clock time over the measurement interval
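Example calculation:
If the CPU is busy for 45 minutes of a 60-minute measurement window:
CPU Utilization = 45 / 60 × 100% = 75%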
Target values:
| System Type | Typical Target | Notes |
|---|---|---|
| Lightly loaded desktop | 20-40% | User doesn't want to wait, but system often idle |
| Heavily loaded workstation | 60-80% | Some headroom for bursts |
| Production server | 70-90% | High utilization is cost-effective |
| Overloaded system | 95-100% | May indicate capacity problem |
Why CPU utilization matters: an idle CPU is wasted hardware investment, so operators generally want utilization high. But utilization near 100% leaves no headroom for load spikes and often signals a capacity problem rather than efficiency.
Measurement on Linux:
```bash
#!/bin/bash
# Measuring CPU utilization on Linux

# Method 1: Using top (real-time view)
top -bn1 | grep "Cpu(s)" | awk '{print "CPU Utilization: " 100 - $8 "%"}'
# Output: CPU Utilization: 42.3%

# Method 2: Using /proc/stat (programmatic)
# /proc/stat shows cumulative CPU time since boot
cat /proc/stat | head -1
# Output: cpu 10132153 290696 3084719 46828483 16683 0 25195 0 0 0
#         cpu user     nice   system  idle     iowait irq softirq ...

# Calculate utilization over an interval:
# 1. Read /proc/stat at time T1
# 2. Wait N seconds
# 3. Read /proc/stat at time T2
# 4. Utilization = (busy_time_delta / total_time_delta) × 100%

# Method 3: Using vmstat
vmstat 1 5   # 5 samples at 1-second intervals
# id = idle percentage (100 - id = utilization)

# Method 4: Using mpstat (per-CPU)
mpstat -P ALL 1 5   # Per-CPU utilization
# Shows if load is balanced across cores

# Breakdown components:
# - us (user): Time running user processes
# - sy (system): Time in kernel (system calls, interrupts)
# - ni (nice): Time running niced user processes
# - id (idle): Time doing nothing
# - wa (iowait): Time waiting for I/O (controversial - see note)
# - hi/si: Hardware/software interrupts
```

The 'iowait' metric is often misunderstood. It shows the percentage of time the CPU was idle while processes were waiting for I/O. High iowait doesn't mean the CPU is busy waiting—it means the CPU is idle and could run other processes if any were ready. A system with high iowait needs more I/O bandwidth or process diversity, not more CPU.
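For a programmatic version of Method 2, here is a minimal Python sketch (Linux-only; it assumes the standard /proc/stat field order and treats iowait as idle time; the helper names are our own):

```python
# Sample /proc/stat twice and compute CPU utilization over the interval.
import time

def read_cpu_times():
    """Return the cumulative CPU time counters (in jiffies) from the
    aggregate 'cpu' line of /proc/stat."""
    with open('/proc/stat') as f:
        fields = f.readline().split()
    # fields[0] is the label 'cpu'; the rest are counters:
    # user nice system idle iowait irq softirq steal ...
    return [int(x) for x in fields[1:]]

def cpu_utilization(interval=1.0):
    """Measure CPU utilization (%) over the given interval."""
    t1 = read_cpu_times()
    time.sleep(interval)
    t2 = read_cpu_times()
    deltas = [b - a for a, b in zip(t1, t2)]
    total = sum(deltas)
    # idle is index 3; iowait (index 4) also counts as idle time
    idle = deltas[3] + deltas[4]
    return 100.0 * (total - idle) / total

print(f"CPU utilization: {cpu_utilization():.1f}%")
```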
Throughput measures the amount of work completed per unit time—how many processes complete execution in a given interval.
Formal definition:
Throughput = (Number of Processes Completed) / (Total Time Interval)
Units: Jobs/second, transactions/second, requests/second, etc.
Example calculation:
If 120 processes complete in a 60-second window:
Throughput = 120 / 60 = 2 processes/second
Factors affecting throughput: the mix of short and long CPU bursts, context-switch overhead, and how often processes block for I/O all influence how many jobs complete per unit time.
Throughput vs. CPU utilization:
These metrics are related but distinct: high utilization means the CPU is busy, but not necessarily that it is completing jobs quickly.
Example: A system at 100% CPU utilization due to excessive context switching might have lower throughput than a system at 80% utilization with fewer switches.
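To make the overhead concrete (with illustrative numbers): if a context switch costs 10 µs, a 1 ms time slice loses 10 / 1010 ≈ 1% of the CPU to switching, while a 50 µs time slice loses 10 / 60 ≈ 17%. Both systems can report near-100% utilization, but the second completes noticeably fewer jobs per second.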
| Scenario | Processes | Algorithm | Completion Time | Throughput |
|---|---|---|---|---|
| 3 jobs: 24s, 3s, 3s | 3 | FCFS (long first) | 30s total | 0.1 jobs/s |
| 3 jobs: 24s, 3s, 3s | 3 | SJF | 30s total | 0.1 jobs/s |
| Mixed short/long | 10 (5 short, 5 long) | FCFS | Variable | Lower |
| Mixed short/long | 10 (5 short, 5 long) | SJF | Same total work | Higher (short jobs finish early) |
For a fixed set of processes with fixed CPU burst times, the total work to complete is constant. What changes with different algorithms is the order of completion. SJF doesn't reduce total time, but by finishing short jobs early, throughput (completions per time) is higher early in the execution—which matters for metrics like average turnaround time.
Turnaround Time measures the total elapsed time from when a process is submitted until it completes—the full lifecycle duration.
Formal definition:
Turnaround Time = Completion Time - Arrival Time
Alternatively:
Turnaround Time = Waiting Time + Burst Time (+ I/O Time, if any)
Where:
- Completion Time: the moment the process finishes execution
- Arrival Time: the moment the process enters the ready queue
- Burst Time: total CPU execution time the process requires
- Waiting Time: total time spent ready but not running
- I/O Time: time spent blocked on I/O, if any
Example calculation:
Consider three processes with the following characteristics:
| Process | Arrival Time | Burst Time |
|---|---|---|
| P1 | 0 | 24 |
| P2 | 0 | 3 |
| P3 | 0 | 3 |
Using FCFS (order: P1, P2, P3):
| Process | Arrival | Burst | Completion | Turnaround |
|---|---|---|---|---|
| P1 | 0 | 24 | 24 | 24 - 0 = 24 |
| P2 | 0 | 3 | 27 | 27 - 0 = 27 |
| P3 | 0 | 3 | 30 | 30 - 0 = 30 |
Average Turnaround Time = (24 + 27 + 30) / 3 = 27
Using SJF (order: P2, P3, P1):
| Process | Arrival | Burst | Completion | Turnaround |
|---|---|---|---|---|
| P2 | 0 | 3 | 3 | 3 - 0 = 3 |
| P3 | 0 | 3 | 6 | 6 - 0 = 6 |
| P1 | 0 | 24 | 30 | 30 - 0 = 30 |
Average Turnaround Time = (3 + 6 + 30) / 3 = 13
SJF reduces average turnaround time from 27 to 13—a 52% improvement!
Mathematical insight: When a process runs first, it contributes to the waiting time of all subsequent processes. Running short jobs first minimizes this cumulative wait. This is why SJF (Shortest Job First) is provably optimal for minimizing average turnaround time when all processes arrive at the same time.
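The arithmetic is easy to verify programmatically. Here is a minimal sketch: with all processes arriving at time 0 and running to completion, each completion time is just the cumulative sum of burst times in execution order.

```python
def avg_turnaround(bursts):
    """Average turnaround for processes arriving at time 0,
    executed non-preemptively in list order."""
    completion, total = 0, 0
    for burst in bursts:
        completion += burst      # this process finishes here
        total += completion      # turnaround = completion - arrival(0)
    return total / len(bursts)

print(avg_turnaround([24, 3, 3]))   # FCFS order -> 27.0
print(avg_turnaround([3, 3, 24]))   # SJF order  -> 13.0
```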
Who cares about turnaround time: batch users, build and CI pipelines, and anyone submitting a job whose only useful output is the finished result, since nothing matters until the job completes.
Waiting Time measures the total time a process spends in the ready queue—ready to run but not actually running.
Formal definition:
Waiting Time = Turnaround Time - Burst Time - I/O Time
Or equivalently:
Waiting Time = Sum of all periods spent in ready queue
Key insight: Waiting time excludes time actually executing on the CPU and time blocked waiting for I/O. Waiting time only counts the time a process is ready but waiting for the CPU.
Example with preemption:
Consider a Round Robin scheduler with quantum = 4:
| Process | Arrival | Burst Time |
|---|---|---|
| P1 | 0 | 10 |
| P2 | 1 | 4 |
| P3 | 2 | 2 |
Execution timeline (Gantt chart):
| P1 | P2 | P3 | P1 | P1 |
0    4    8    10   14   16
Time 0-4: P1 runs (4 units)
Time 4-8: P2 runs (4 units) — P2 complete
Time 8-10: P3 runs (2 units) — P3 complete
Time 10-14: P1 runs (4 units; the quantum expires, but P1 is alone in the queue)
Time 14-16: P1 runs (2 units) — P1 complete
Waiting time calculation:
| Process | Arrival | Burst | Periods Waiting | Total Wait |
|---|---|---|---|---|
| P1 | 0 | 10 | waits 4-10 (between its first and second slices) | 6 |
| P2 | 1 | 4 | waits 1-4 | 3 |
| P3 | 2 | 2 | waits 2-8 | 6 |
Average Waiting Time = (6 + 3 + 6) / 3 = 5.0
Waiting time isolates the scheduler's contribution to delay. A process with long I/O bursts will have high turnaround time regardless of the scheduler, but waiting time captures only the delays the scheduler could potentially reduce. This makes waiting time a purer measure of scheduler quality.
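To double-check these numbers mechanically, here is a minimal round-robin simulator (a sketch that assumes arrivals during a time slice enter the ready queue ahead of the preempted process, matching the trace above; all names are our own):

```python
from collections import deque

def round_robin_waiting(processes, quantum):
    """processes: list of (name, arrival, burst), sorted by arrival.
    Returns {name: waiting_time}."""
    time = 0
    ready = deque()
    remaining = {name: burst for name, _, burst in processes}
    completion = {}
    pending = list(processes)            # processes not yet arrived

    def admit(now):
        nonlocal pending
        for name, _, _ in [p for p in pending if p[1] <= now]:
            ready.append(name)
        pending = [p for p in pending if p[1] > now]

    admit(time)
    while ready or pending:
        if not ready:                    # CPU idles until next arrival
            time = pending[0][1]
            admit(time)
        name = ready.popleft()
        run = min(quantum, remaining[name])
        time += run
        remaining[name] -= run
        admit(time)                      # new arrivals queue first
        if remaining[name] > 0:
            ready.append(name)           # preempted process re-queues last
        else:
            completion[name] = time

    return {name: completion[name] - arrival - burst
            for name, arrival, burst in processes}

procs = [("P1", 0, 10), ("P2", 1, 4), ("P3", 2, 2)]
print(round_robin_waiting(procs, quantum=4))
# {'P1': 6, 'P2': 3, 'P3': 6}  -> average 5.0
```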
Why waiting time matters: it is the quantity scheduling algorithms most directly control, and for a fixed workload, minimizing average waiting time also minimizes average turnaround time (the two differ only by the constant average burst time).
Response Time measures the time from when a request is submitted until the first response is produced—not full completion, but initial visible output.
Formal definition:
Response Time = First Response Time - Arrival Time
Or for CPU scheduling specifically:
Response Time = First CPU Execution Start - Arrival Time
This is distinct from turnaround time: turnaround measures time to completion, while response time measures time until the process first gets the CPU and produces visible output.
For interactive systems, response time is often more important than turnaround time.
Example comparison:
Consider Process P with 10-second burst time:
In FCFS with 20 seconds of work ahead:
- Response time = 20 s (P cannot start until all queued work finishes)
- Turnaround time = 20 + 10 = 30 s
In Round Robin with quantum = 2:
- P gets its first time slice after each queued process uses at most one 2-second quantum, so the first response arrives within a few seconds
- Turnaround time may be 30 s or worse, because P's 10 seconds of work is interleaved with the other processes
Round Robin provides much better response time even if turnaround time is similar or worse.
| Application | Response Time Target | User Perception |
|---|---|---|
| Keyboard echo | < 50 ms | Instantaneous |
| Mouse cursor | < 10 ms | Lag perceptible above 10ms |
| GUI button click | < 100 ms | Feels responsive |
| Page load start | < 200 ms | Acceptable for web |
| Complex query | < 1 second | User remains engaged |
| Report generation | < 10 seconds | Progress indicator needed |
Response time distribution matters:
For interactive systems, average response time is less important than the distribution:
Example: a system with a 50 ms average response but occasional 2-second stalls is worse for user experience than a system with a consistent 100 ms response, because the first system has unacceptable outliers that frustrate users, even though its average is better.
In web-scale systems, poor tail latency (p99, p999) has outsized impact. If a page requires 100 backend calls and each has 1% chance of slow response, the page has ~63% chance of experiencing at least one slow call. Tail latency becomes the dominant user experience factor at scale.
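The arithmetic behind that figure, as a quick check (p is the per-call probability of a slow response, N the fan-out):

```python
# Probability that a request fanning out to N backends hits at least
# one slow call, if each backend is independently slow with probability p.
p, N = 0.01, 100
print(f"{1 - (1 - p) ** N:.2%}")   # 63.40%
```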
```python
# Calculating response time metrics from samples

import numpy as np

def analyze_response_times(samples):
    """
    Analyze response time distribution from collected samples.
    For scheduling, samples would be:
    first_run_time - arrival_time for each process.
    """
    samples = np.array(samples)
    metrics = {
        'mean': np.mean(samples),
        'median': np.median(samples),
        'std_dev': np.std(samples),
        'min': np.min(samples),
        'max': np.max(samples),
        'p50': np.percentile(samples, 50),
        'p90': np.percentile(samples, 90),
        'p95': np.percentile(samples, 95),
        'p99': np.percentile(samples, 99),
        'p999': np.percentile(samples, 99.9),
    }
    return metrics

# Example: Response times in milliseconds
response_times = [45, 52, 48, 51, 49, 420, 47, 53, 46, 50,
                  48, 51, 49, 52, 47, 50, 48, 890, 51, 49]

stats = analyze_response_times(response_times)

# Output interpretation:
# mean: ~110ms (inflated by outliers)
# median: ~50ms (typical experience)
# p99: ~800ms (worst case, 1 in 100)
#
# The mean is misleading here; median and percentiles
# better represent actual user experience
```

The five scheduling criteria we've examined are not independent—optimizing one often degrades another. Understanding these tradeoffs is essential for algorithm selection.
The fundamental tensions:
| Optimizing For | May Hurt | Reason |
|---|---|---|
| Throughput | Response Time | Minimizing switches increases work done but hurts interactivity |
| Response Time | Throughput | Frequent switches for responsiveness add overhead |
| Turnaround (short jobs) | Turnaround (long jobs) | Favoring short jobs delays long ones |
| CPU Utilization | Response Time | Keeping CPU busy may queue interactive requests |
| Fairness | Throughput | Equal time slices may not be globally optimal |
The SJF paradox:
Shortest Job First minimizes average waiting/turnaround time—mathematically provable. But it creates extreme unfairness: long jobs can be postponed indefinitely (starvation) as long as shorter jobs keep arriving.
In systems valuing fairness, SJF is unacceptable despite its optimal average-case metrics.
The Round Robin compromise:
Round Robin provides:
- Bounded response time (every process gets the CPU within one full cycle of the ready queue)
- Fairness (no process is starved)
But sacrifices:
- Throughput, due to context-switch overhead from frequent preemption
- Average turnaround time, which is typically worse than SJF's
This is a fundamental truth: no scheduling algorithm optimizes all criteria simultaneously. Scheduler design is about choosing which tradeoffs to make based on workload characteristics and system goals. The 'best' scheduler depends entirely on what 'best' means for your use case.
Matching criteria to workloads: batch systems prioritize throughput and turnaround time; interactive systems prioritize response time and its distribution; real-time systems prioritize meeting deadlines; general-purpose time-sharing systems balance response time with fairness.
Let's calculate all scheduling metrics for a concrete scenario, comparing two algorithms.
Problem setup:
Four processes with the following characteristics:
| Process | Arrival Time | Burst Time |
|---|---|---|
| P1 | 0 | 8 |
| P2 | 1 | 4 |
| P3 | 2 | 9 |
| P4 | 3 | 5 |
Algorithm 1: FCFS (First-Come, First-Served)
Execution order: P1 → P2 → P3 → P4
Gantt Chart:
| P1 | P2 | P3 | P4 |
0 8 12 21 26
Calculations:
| Process | Arrival | Burst | Completion | Turnaround | Waiting | Response |
|---|---|---|---|---|---|---|
| P1 | 0 | 8 | 8 | 8-0=8 | 8-8=0 | 0-0=0 |
| P2 | 1 | 4 | 12 | 12-1=11 | 11-4=7 | 8-1=7 |
| P3 | 2 | 9 | 21 | 21-2=19 | 19-9=10 | 12-2=10 |
| P4 | 3 | 5 | 26 | 26-3=23 | 23-5=18 | 21-3=18 |
| Average | | | | 15.25 | 8.75 | 8.75 |
Algorithm 2: SJF (Shortest Job First, Non-preemptive)
At time 0, only P1 is available → run P1
At time 8, P2 (4), P3 (9), P4 (5) are available → run P2 (shortest)
At time 12, P3 (9), P4 (5) are available → run P4 (shorter)
At time 17, only P3 is left → run P3
Gantt Chart:
| P1 | P2 | P4 | P3 |
0 8 12 17 26
Calculations:
| Process | Arrival | Burst | Completion | Turnaround | Waiting | Response |
|---|---|---|---|---|---|---|
| P1 | 0 | 8 | 8 | 8-0=8 | 8-8=0 | 0-0=0 |
| P2 | 1 | 4 | 12 | 12-1=11 | 11-4=7 | 8-1=7 |
| P3 | 2 | 9 | 26 | 26-2=24 | 24-9=15 | 17-2=15 |
| P4 | 3 | 5 | 17 | 17-3=14 | 14-5=9 | 12-3=9 |
| Average | | | | 14.25 | 7.75 | 7.75 |
Comparison:
| Metric | FCFS | SJF | Winner |
|---|---|---|---|
| Avg Turnaround | 15.25 | 14.25 | SJF (7% better) |
| Avg Waiting | 8.75 | 7.75 | SJF (11% better) |
| Avg Response | 8.75 | 7.75 | SJF (11% better) |
| CPU Utilization | 100% | 100% | Tie |
| Throughput | 4/26 = 0.154 | 4/26 = 0.154 | Tie |
| Fairness (P3 wait) | 10 | 15 | FCFS (P3 treated better) |
Observation: SJF wins on aggregate metrics but P3 (longest job) waits 50% longer. This illustrates the fairness tradeoff.
To calculate metrics systematically: (1) draw the Gantt chart from the algorithm's rules, (2) read each process's completion time off the chart, (3) compute turnaround = completion − arrival, (4) compute waiting = turnaround − burst, (5) compute response = first start − arrival, and (6) average each column. The sketch below automates these steps.
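Here is a minimal sketch of that procedure for non-preemptive schedulers (preemptive algorithms like Round Robin need a full simulation, as shown earlier; the function name is our own):

```python
def metrics(processes, order):
    """processes: {name: (arrival, burst)}; order: list of names in
    the order a non-preemptive scheduler runs them."""
    time, rows = 0, {}
    for name in order:
        arrival, burst = processes[name]
        start = max(time, arrival)       # CPU may sit idle until arrival
        completion = start + burst
        rows[name] = {
            'turnaround': completion - arrival,
            'waiting': start - arrival,   # = turnaround - burst
            'response': start - arrival,  # non-preemptive: same as waiting
        }
        time = completion
    return rows

procs = {'P1': (0, 8), 'P2': (1, 4), 'P3': (2, 9), 'P4': (3, 5)}
for order in (['P1', 'P2', 'P3', 'P4'],          # FCFS
              ['P1', 'P2', 'P4', 'P3']):         # non-preemptive SJF
    rows = metrics(procs, order)
    avg = sum(r['turnaround'] for r in rows.values()) / len(rows)
    print(order, f"avg turnaround = {avg}")
# FCFS: 15.25, SJF: 14.25 (matches the tables above)
```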
We've established the framework for evaluating and comparing scheduling algorithms. The key concepts: CPU utilization and throughput measure system-wide efficiency; turnaround and waiting time measure per-process delay; response time measures interactivity; and no single algorithm optimizes all five at once.
What's next:
With our evaluation framework established, we now turn to the Dispatcher—the component that actually implements context switches and hands the CPU to the selected process. Understanding the dispatcher completes our picture of scheduling mechanics before we dive into specific algorithms.
You now have a complete framework for evaluating scheduling algorithms. You can calculate CPU utilization, throughput, turnaround time, waiting time, and response time for any scheduling scenario. More importantly, you understand the inherent tradeoffs between these criteria—essential knowledge for system design and algorithm selection.