In the previous page, we stated that Shortest Job First (SJF) provably minimizes average waiting time among all non-preemptive scheduling algorithms. This is a remarkable claim—not a heuristic observation, not a "usually works well" guideline, but a mathematical guarantee.
Understanding why SJF is optimal matters for several reasons: it elevates the claim from heuristic to guarantee, the proof techniques involved (notably the exchange argument) transfer to many other scheduling and greedy-algorithm problems, and the proof's assumptions pinpoint exactly when the guarantee applies.
This page presents multiple approaches to understanding and proving SJF's optimality, from intuitive arguments to formal mathematical proofs.
By completing this page, you will understand the intuition behind SJF's optimality, follow a complete mathematical proof, comprehend the exchange argument technique, and recognize the precise conditions under which SJF guarantees minimal average waiting time.
Before formal proofs, let's build intuition about why scheduling shorter jobs first minimizes average waiting time.
Consider what happens when a process runs. Every process behind it in the queue must wait for it to complete. If a process with burst time B runs first, every subsequent process waits at least B time units.
This creates a cascade effect: the first job's burst time is paid as waiting time by all n−1 jobs behind it, the second job's by the n−2 jobs behind it, and so on down to the last job, which delays no one.
Mathematically, if we schedule processes in order j₁, j₂, ..., jₙ with burst times b₁, b₂, ..., bₙ:
Total Waiting Time = (n-1)·b_{j₁} + (n-2)·b_{j₂} + ... + 1·b_{j_{n-1}} + 0·b_{jₙ}
Each burst time is weighted by how many processes must wait for it.
The first job's burst time is multiplied by (n-1) because all other processes wait. The last job's burst time is multiplied by 0 because no one waits after it. To minimize the weighted sum, we should assign the largest weights to the smallest burst times. This is exactly what SJF does!
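This weighted-sum view is easy to check in code. Below is a minimal Python sketch (the function name `total_waiting_time` is ours, purely illustrative) that evaluates the formula for jobs run in a given order:

```python
def total_waiting_time(bursts):
    """Total waiting time when jobs run in the given list order.

    The job at 0-indexed position i delays the n - 1 - i jobs
    behind it, so its burst is weighted by (n - 1 - i).
    """
    n = len(bursts)
    return sum((n - 1 - i) * b for i, b in enumerate(bursts))

print(total_waiting_time([1, 2, 3]))  # SJF order: 2*1 + 1*2 + 0*3 = 4
print(total_waiting_time([3, 2, 1]))  # reverse:   2*3 + 1*2 + 0*1 = 8
```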
Consider 3 processes with bursts: A=1, B=2, C=3
Schedule A→B→C (SJF order): A waits 0, B waits 1, C waits 1+2 = 3, for a total waiting time of 4.
Schedule C→B→A (reverse order): C waits 0, B waits 3, A waits 3+2 = 5, for a total waiting time of 8.
The SJF order produces half the total waiting time!
Using our formula: Total Waiting Time = 2×1 + 1×2 + 0×3 = 4 for the SJF order, versus 2×3 + 1×2 + 0×1 = 8 for the reverse.
By assigning large weights (2, 1, 0) to small values (1, 2, 3), SJF minimizes the weighted sum.
| Schedule | Weighted Sum Calculation | Total Wait | Average Wait |
|---|---|---|---|
| 1→2→3 (SJF) | 2×1 + 1×2 + 0×3 | 4 | 1.33 |
| 1→3→2 | 2×1 + 1×3 + 0×2 | 5 | 1.67 |
| 2→1→3 | 2×2 + 1×1 + 0×3 | 5 | 1.67 |
| 2→3→1 | 2×2 + 1×3 + 0×1 | 7 | 2.33 |
| 3→1→2 | 2×3 + 1×1 + 0×2 | 7 | 2.33 |
| 3→2→1 (Worst) | 2×3 + 1×2 + 0×1 | 8 | 2.67 |
The table confirms that the SJF order (1→2→3) achieves the minimum possible average waiting time of 1.33, while the reverse order achieves the maximum of 2.67. No permutation beats SJF.
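The table can be reproduced by brute force over all permutations; a short sketch:

```python
from itertools import permutations

bursts = [1, 2, 3]
n = len(bursts)
for order in permutations(bursts):
    # Weighted-sum formula: position i is waited on by n - 1 - i jobs.
    total = sum((n - 1 - i) * b for i, b in enumerate(order))
    print("→".join(map(str, order)), total, round(total / n, 2))
# (1, 2, 3) yields the minimum total of 4; (3, 2, 1) the maximum of 8.
```

For larger n this enumeration is infeasible (n! schedules), which is exactly why we want a proof rather than a table.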
Before proving optimality, we must precisely state what we're proving.
Given: n processes p₁, p₂, ..., pₙ, all arriving at time 0, with known burst times b₁, b₂, ..., bₙ, to be run to completion (non-preemptively) on a single CPU.
Define: a schedule as a permutation π of the processes; the waiting time of a process as the sum of the burst times of all processes scheduled before it; and the average waiting time as the total waiting time divided by n.
To Prove: The schedule S* that orders processes by non-decreasing burst times (SJF order) minimizes average waiting time over all possible schedules.
Let π* be the SJF permutation where b_{π*(1)} ≤ b_{π*(2)} ≤ ... ≤ b_{π*(n)}.
For any permutation π:
Total Waiting Time(π) = Σᵢ₌₁ⁿ [Σⱼ₌₁^{i-1} b_{π(j)}]
This says: the waiting time for the i-th process in schedule order is the sum of burst times of all processes before it.
Equivalently:
Total Waiting Time(π) = Σᵢ₌₁ⁿ (n - i) · b_{π(i)}
The i-th scheduled process's burst time is weighted by how many processes come after it.
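The two forms are equivalent because each burst b_{π(i)} appears once in the waiting time of every later process. A quick numerical check of the equivalence (helper names are ours):

```python
def total_wait_prefix(bursts):
    """First form: each job waits for the sum of bursts before it."""
    total, elapsed = 0, 0
    for b in bursts:
        total += elapsed   # this job's waiting time
        elapsed += b
    return total

def total_wait_weighted(bursts):
    """Second form: the i-th burst counted once per later job."""
    n = len(bursts)
    return sum((n - 1 - i) * b for i, b in enumerate(bursts))

order = [4, 1, 3]
assert total_wait_prefix(order) == total_wait_weighted(order) == 9
```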
This proof assumes all processes arrive at time 0. With different arrival times, we can extend the analysis to show that SJF remains optimal at each decision point, selecting the shortest available job. The core insight—minimize the impact of early jobs on later jobs—still holds.
The exchange argument is a powerful proof technique. We show that any schedule not in SJF order can be improved by swapping adjacent out-of-order elements. Since every non-SJF schedule can be incrementally improved toward SJF, no non-SJF schedule can be optimal.
Lemma: If schedule S has two adjacent processes pᵢ and pⱼ where pᵢ precedes pⱼ but bᵢ > bⱼ (longer job before shorter), then swapping them produces schedule S' with strictly lower total waiting time.
Proof of Lemma:
Consider processes at positions k and k+1 in the schedule: pᵢ at position k with burst bᵢ and pⱼ at position k+1 with burst bⱼ, where bᵢ > bⱼ. Let T denote the total burst time of the processes in positions 1 through k−1.
Let W_before and W_after denote total waiting times.
Key observation: the swap affects only the waiting times of pᵢ and pⱼ themselves.
All other processes are unaffected—their relative positions and the total burst time before them remain unchanged.
Before swap: pᵢ waits T and pⱼ waits T + bᵢ, together contributing 2T + bᵢ to the total.
After swap: pⱼ waits T and pᵢ waits T + bⱼ, together contributing 2T + bⱼ.
Change in total waiting time: W_after - W_before = bⱼ - bᵢ < 0 (since bⱼ < bᵢ)
The total waiting time decreases by (bᵢ - bⱼ) > 0.
∎
When we swap a longer job with a shorter job that follows it, the longer job now waits for the shorter one (small penalty), while the shorter job no longer waits for the longer one (big benefit). The net effect is always positive because we're trading a large waiting penalty for a small one.
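The lemma is easy to verify numerically; a sketch (helper names are ours):

```python
def total_wait(order):
    n = len(order)
    return sum((n - 1 - i) * b for i, b in enumerate(order))

def swap_effect(bursts, k):
    """Change in total waiting time from swapping positions k and k+1."""
    swapped = list(bursts)
    swapped[k], swapped[k + 1] = swapped[k + 1], swapped[k]
    return total_wait(swapped) - total_wait(bursts)

# A longer job (5) directly before a shorter one (2):
print(swap_effect([5, 2, 7], 0))  # -3, exactly b_j - b_i = 2 - 5
```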
Theorem: The SJF schedule (processes ordered by non-decreasing burst times) achieves minimum total waiting time.
Proof:
Let S be any schedule that is not SJF order. Since S is not sorted by burst times, there must exist at least one pair of adjacent processes where a longer process precedes a shorter one.
Step 1: By the Lemma, we can swap these adjacent processes to reduce total waiting time, producing schedule S₁ with W(S₁) < W(S).
Step 2: If S₁ is still not in SJF order, repeat the swap process. Each swap reduces total waiting time.
Step 3: Since there are finitely many possible swaps (at most n(n-1)/2 inversions to fix), and each swap strictly decreases waiting time, this process terminates.
Step 4: When no more improving swaps are possible, the schedule must be sorted by burst times—the SJF order.
Step 5: Therefore, from any non-SJF schedule S, a chain of strictly improving swaps leads to the SJF schedule, so W(SJF) < W(S). No non-SJF schedule can match, let alone beat, SJF.
Conclusion: SJF achieves the minimum total (and hence average) waiting time.
∎
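The proof is constructive: it is, in effect, bubble sort, with total waiting time strictly decreasing at every swap. A sketch that traces the process:

```python
def improve_to_sjf(bursts):
    """Repeatedly fix adjacent inversions; each swap strictly lowers
    the total waiting time until the sorted (SJF) order remains."""
    def total_wait(order):
        n = len(order)
        return sum((n - 1 - i) * b for i, b in enumerate(order))

    order = list(bursts)
    print(order, total_wait(order))
    changed = True
    while changed:
        changed = False
        for k in range(len(order) - 1):
            if order[k] > order[k + 1]:   # longer job before shorter
                order[k], order[k + 1] = order[k + 1], order[k]
                print(order, total_wait(order))  # strictly smaller
                changed = True
    return order

improve_to_sjf([3, 1, 2])  # prints totals 7, 5, 4, ending at [1, 2, 3]
```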
A more elegant proof uses the Rearrangement Inequality, a fundamental result from mathematical analysis.
Theorem (Rearrangement Inequality): Let a₁ ≤ a₂ ≤ ... ≤ aₙ and b₁ ≤ b₂ ≤ ... ≤ bₙ be sorted sequences of real numbers. For any permutation π:
aₙb₁ + aₙ₋₁b₂ + ... + a₁bₙ ≤ a_{π(1)}b₁ + a_{π(2)}b₂ + ... + a_{π(n)}bₙ ≤ a₁b₁ + a₂b₂ + ... + aₙbₙ
In words: pairing the two sequences in opposite orders (largest with smallest) minimizes the sum, while pairing them in the same order maximizes it.
Recall our total waiting time formula:
Total Waiting Time = Σᵢ₌₁ⁿ (n - i) · b_{π(i)}
We can view this as a dot product of two sequences: the fixed, descending weight sequence (n−1, n−2, ..., 1, 0) and the burst-time sequence b_{π(1)}, b_{π(2)}, ..., b_{π(n)}, whose order we are free to choose.
To minimize the sum, by the Rearrangement Inequality, we should pair the sequences in opposite orders: the largest weight (n−1) with the smallest burst time, the next weight (n−2) with the second smallest, and so on.
This means burst times should be in ascending order — exactly the SJF order!
∎
The Rearrangement Inequality proof reveals that SJF's optimality is an instance of a deeper mathematical principle. Any problem that reduces to minimizing a weighted sum with fixed descending weights is solved by ordering items in ascending order of their values.
| Position | Weight (n-i) | SJF: Pairs With | Reverse: Pairs With |
|---|---|---|---|
| 1st | (n-1) = largest | Smallest burst | Largest burst |
| 2nd | (n-2) | 2nd smallest burst | 2nd largest burst |
| ... | ... | ... | ... |
| n-th | 0 = smallest | Largest burst | Smallest burst |
With 3 bursts [1, 2, 3] and weights [2, 1, 0]:
SJF (ascending bursts): 2×1 + 1×2 + 0×3 = 2 + 2 + 0 = 4
Reverse (descending bursts): 2×3 + 1×2 + 0×1 = 6 + 2 + 0 = 8
The Rearrangement Inequality guarantees 4 ≤ any permutation ≤ 8.
All 6 permutations produce values in [4, 8], confirming the theory.
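That confirmation is a few lines of Python away; a sketch:

```python
from itertools import permutations

weights = [2, 1, 0]   # fixed descending weights (n - i)
bursts = [1, 2, 3]

values = sorted(sum(w * b for w, b in zip(weights, p))
                for p in permutations(bursts))
print(values)  # [4, 5, 5, 7, 7, 8]: every pairing lies in [4, 8]
```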
SJF's optimality is proven under specific conditions. Understanding these conditions clarifies when the guarantee applies and when it doesn't.
| Condition Violated | Result | Mitigation |
|---|---|---|
| Burst times unknown | Only approximate optimality possible | Use prediction (exponential averaging) |
| Different arrival times | SJF at each decision point is still optimal for that point | Apply SJF incrementally as processes arrive |
| Preemption required | SJF's non-preemptive guarantee no longer applies | Use SRTF, which is optimal among preemptive algorithms |
| Multiple CPUs | SJF is not necessarily optimal | Use load-balancing heuristics |
| Fairness required | SJF causes starvation | Use aging to limit waiting times |
SJF is optimal for average waiting time, but that doesn't make it the best scheduler. If fairness matters, SJF's starvation problem is a fatal flaw. If response time matters, preemptive SRTF outperforms non-preemptive SJF. Always consider the complete set of system requirements, not just one metric.
The proof above assumes all processes arrive at time 0. When processes arrive at different times, the analysis requires extension.
With non-simultaneous arrivals, SJF becomes:
At each scheduling decision point, select the shortest job among all currently ready (arrived and waiting) processes.
This is still SJF—applied dynamically to the available process set.
Local Optimality: At each decision point, selecting the shortest available job is optimal for that decision, given the processes present.
Global Optimality: However, SJF is not necessarily globally optimal with different arrival times. Consider:
| Process | Arrival | Burst |
|---|---|---|
| P1 | 0 | 7 |
| P2 | 2 | 1 |
SJF Execution: at t=0 only P1 is ready, so P1 runs from 0 to 7; P2 arrives at t=2 and must wait until t=7, then runs from 7 to 8. Waiting times: P1 = 0, P2 = 5; average = 2.5.
If we could see the future (and wait for P2): idle from 0 to 2, run P2 from 2 to 3, then P1 from 3 to 10. Waiting times: P2 = 0, P1 = 3; average = 1.5.
Waiting for P2 produces lower average waiting time!
Optimal scheduling with different arrivals would require knowing the future—when processes will arrive and their burst times. Real schedulers can't wait for processes that haven't arrived yet. SJF among current processes is the best we can do without future knowledge, but it's not globally optimal.
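A small simulator makes the gap concrete. The sketch below (the function name is ours; process tuples match the example above) implements non-preemptive SJF over arrivals:

```python
def sjf_nonpreemptive(procs):
    """procs: list of (name, arrival, burst).
    At each decision point, run the shortest job that has already
    arrived; if nothing is ready, idle until the next arrival."""
    pending = sorted(procs, key=lambda p: p[1])   # by arrival time
    time, total_wait = 0, 0
    while pending:
        ready = [p for p in pending if p[1] <= time]
        if not ready:
            time = pending[0][1]      # CPU idles until next arrival
            continue
        job = min(ready, key=lambda p: p[2])      # shortest available
        total_wait += time - job[1]   # waited since its arrival
        time += job[2]
        pending.remove(job)
    return total_wait / len(procs)

print(sjf_nonpreemptive([("P1", 0, 7), ("P2", 2, 1)]))  # 2.5
# The clairvoyant order (idle to t=2, P2, then P1) averages 1.5.
```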
With different arrival times, SJF still provides local optimality: at every decision point, it makes the best possible choice given the processes that have actually arrived.
For truly optimal scheduling with arrivals, preemptive SRTF (covered in Page 4) provides significant improvements by allowing re-evaluation when new processes arrive.
From an algorithmic standpoint, the SJF optimality result connects to broader complexity concepts.
Finding the SJF schedule is equivalent to sorting processes by burst time: O(n log n) with any comparison sort, or O(log n) per scheduling decision if the ready queue is maintained as a min-heap keyed on burst time.
This makes SJF a remarkably efficient algorithm to compute (when burst times are known).
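A sketch of the min-heap variant mentioned above:

```python
import heapq

def sjf_order(bursts):
    """Produce the SJF schedule: heapify in O(n), then n pops
    at O(log n) each, i.e. the cost of sorting."""
    heap = list(bursts)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in bursts]

print(sjf_order([7, 3, 5, 1]))  # [1, 3, 5, 7]
```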
Many scheduling problems are NP-hard—no efficient algorithm guarantees optimal solutions:
| Problem | Complexity | SJF Applicable? |
|---|---|---|
| Single machine, minimize total wait | P (SJF is optimal, O(n log n)) | Yes |
| Multiple machines, minimize makespan | NP-hard | No |
| Single machine with deadlines | NP-hard (in general) | No |
| Job shop scheduling | NP-hard | No |
| Preemptive single machine | P (SRTF) | Related |
SJF's polynomial-time optimality is special. Most realistic scheduling problems have no efficient optimal algorithm.
Understanding that SJF optimality falls into P (polynomial time) while most scheduling is NP-hard helps explain why operating systems use heuristics. When burst times are unknown and multiple objectives compete, we're in NP-hard territory. SJF's optimality is a rare island of tractability.
SJF is a greedy algorithm—at each step, it makes the locally optimal choice (shortest job) and never reconsiders.
Greedy algorithms are optimal when the problem has two properties: the greedy-choice property (some optimal solution begins with the greedy choice) and optimal substructure (after the greedy choice, what remains is a smaller instance of the same problem).
For SJF with simultaneous arrivals, both properties hold: the exchange argument shows that some optimal schedule starts with the shortest job (greedy choice), and once that job is scheduled, the remaining processes form an identical subproblem solved the same way (optimal substructure).
This connects SJF to the broader family of greedy scheduling algorithms, including variants like weighted job scheduling and activity selection.
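The greedy structure can be written out directly. The recursive sketch below is deliberately naive (O(n²); a real scheduler would just sort) but makes the greedy choice and the subproblem explicit:

```python
def greedy_sjf(jobs):
    """Run the shortest job (greedy choice), then solve the
    remaining jobs as the same, smaller problem (optimal
    substructure). Equivalent to sorting by burst time."""
    if not jobs:
        return []
    shortest = min(jobs)          # greedy choice
    rest = list(jobs)
    rest.remove(shortest)
    return [shortest] + greedy_sjf(rest)

print(greedy_sjf([4, 2, 9, 1]))  # [1, 2, 4, 9]
```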
We have rigorously established SJF's optimality property. The key insights: total waiting time is a weighted sum in which earlier positions carry larger weights; any adjacent pair scheduled longer-before-shorter can be swapped to strictly reduce that sum (the exchange argument); the Rearrangement Inequality yields the same conclusion directly; and the guarantee holds only under specific conditions, chiefly known burst times and simultaneous arrivals.
Looking Ahead:
The optimality proof assumes we know burst times exactly—which we don't in practice. The next page tackles the critical challenge of burst time prediction: How can a scheduler estimate future CPU burst lengths using past behavior? We'll explore exponential averaging and other prediction techniques that make SJF practical.
You now possess a rigorous understanding of why SJF minimizes average waiting time. The exchange argument and rearrangement inequality proofs equip you to reason about scheduling optimality in other contexts as well. This mathematical foundation is essential for critically evaluating scheduling algorithms.