In the previous page, we established what makes a greedy choice "safe" — the existence of an optimal solution that includes our choice. But knowing the definition isn't enough. The real challenge is proving that a specific greedy choice satisfies this property.
This is where many programmers struggle. It's one thing to conjecture that a greedy approach works; it's quite another to demonstrate it rigorously. A satisfying proof not only validates our algorithm but deepens our understanding of why it works — knowledge that transfers to recognizing greedy opportunities in new problems.
This page teaches you to construct these proofs systematically. By the end, you'll be equipped with techniques that work across a wide range of greedy problems.
You will master the primary technique for proving greedy correctness: demonstrating that the greedy choice can be 'inserted' into any optimal solution. You'll see how to structure proofs, handle edge cases, and build intuition for when proofs will succeed or fail.
Greedy correctness proofs follow a recognizable pattern. Understanding this structure provides a template you can apply to new problems.
The Standard Proof Template:
State the greedy choice clearly — Precisely define what choice your greedy algorithm makes (e.g., "select the activity with earliest finish time")
Consider an arbitrary optimal solution — Let OPT be any optimal solution. Don't assume anything special about OPT except that it's optimal.
Case analysis on the greedy choice — Either the greedy choice already appears in OPT (Case A: nothing to prove), or it doesn't (Case B: we must transform OPT).
Construct the modified solution — Build a new solution OPT' that includes the greedy choice.
Prove OPT' is still feasible — The modified solution satisfies all problem constraints.
Prove OPT' is still optimal — The modified solution's objective value is at least as good as OPT's; since OPT is optimal, OPT' must be optimal as well.
Invoke induction for remaining choices — Having established the first choice is safe, apply the same reasoning to the remaining subproblem.
The heart of every greedy proof is step 3, Case B: showing that even if an optimal solution doesn't include our greedy choice, we can 'fix' it to include the greedy choice while maintaining optimality. This is the transformability that makes greedy work.
Why this template works:
The template leverages a powerful logical principle: if we can transform any optimal solution into one that includes our greedy choice, then there exists an optimal solution containing our choice. We don't need to find this optimal solution directly — we just need to show it exists through the transformation argument.
This indirect approach is elegant because optimal solutions might be hard to characterize explicitly, but the transformations are often simple swaps or substitutions.
Let's work through a complete proof for the activity selection problem. This is the canonical example of greedy correctness proofs.
Problem Statement:
Given n activities with start times s_i and finish times f_i, select the maximum number of non-overlapping activities.
Greedy Algorithm:
Repeatedly select the activity with the earliest finish time among those compatible with already-selected activities.
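The greedy rule above can be sketched in a few lines of Python (function and variable names are illustrative, not from the original text):

```python
def select_activities(activities):
    """Greedy activity selection: earliest finish time first.
    activities: list of (start, finish) pairs."""
    result = []
    last_finish = float("-inf")
    # Sorting by finish time means each accepted activity is the one
    # with the earliest finish among those still compatible.
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:  # compatible with everything selected so far
            result.append((start, finish))
            last_finish = finish
    return result
```

Sorting dominates the running time, so the whole algorithm is O(n log n).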
Theorem: The greedy algorithm produces an optimal solution.
Proof:
We prove by induction on the number of activities selected.
Base Case (First Selection):
Let A₁ be the activity with the earliest finish time among all activities. We claim that some optimal solution contains A₁.
Let OPT be any optimal solution. Consider two cases:
Case A: A₁ ∈ OPT. Then we're done — OPT already contains our greedy choice.
Case B: A₁ ∉ OPT. Let A_k be the activity in OPT with the earliest finish time. Since A₁ has the globally earliest finish time:
$$f_1 \leq f_k$$
Construct OPT' = (OPT \ {A_k}) ∪ {A₁}. That is, replace A_k with A₁ in OPT.
Claim: OPT' is feasible (no overlaps among its activities).
Proof of claim: The only potential new overlaps involve A₁. Since A_k has the earliest finish time in OPT, every other activity A_j ∈ OPT must start after A_k finishes: s_j ≥ f_k. Because f_1 ≤ f_k, we get s_j ≥ f_k ≥ f_1, so every such A_j is also compatible with A₁. Hence OPT' contains no overlaps. ∎
Claim: OPT' is optimal.
Proof: |OPT'| = |OPT \ {A_k}| + 1 = |OPT| - 1 + 1 = |OPT|. Since OPT was optimal and OPT' has the same size, OPT' is also optimal. ∎
Thus, OPT' is an optimal solution containing A₁, proving the greedy choice is safe.
Timeline Visualization of the Swap:

Original OPT (not containing A₁):

   A_k         A_j         A_m
|---------|  |------|   |---------|
          f_k        f_j           f_m

A₁ (earliest finish overall):

  A₁
|-----|
      f_1      (f_1 ≤ f_k)

Modified OPT' (after swapping A_k → A₁):

  A₁         A_j         A_m
|-----|    |------|   |---------|
      f_1        f_j           f_m

• A₁ finishes no later than A_k did
• All activities that started after f_k still start after f_1
• No new overlaps are created
• Same number of activities, so still optimal

Inductive Step (Subsequent Selections):
After selecting A₁, the subproblem contains all activities A_i with s_i ≥ f_1 (those compatible with A₁).
Let OPT' be an optimal solution containing A₁ (which we just proved exists). Then OPT' \ {A₁} is an optimal solution for the subproblem.
Why? If there were a better solution for the subproblem, adding A₁ to it would give a better overall solution, contradicting OPT' being optimal.
Thus, the subproblem has the same structure as the original, and the same greedy criterion (earliest finish time among remaining) applies. By induction, the greedy algorithm finds an optimal solution for every subproblem.
QED
The key move was swapping A_k for A₁. This works because A₁ 'fits' anywhere A_k fits (finishing no later) while maintaining solution size. This swap-based reasoning is the essence of the exchange argument.
The activity selection proof used a substitution (replacing A_k with A₁). Another common technique is insertion — showing that the greedy choice can be added to any optimal solution, sometimes requiring adjustments.
Example: Fractional Knapsack
Given items with weights w_i and values v_i, and knapsack capacity W, maximize total value while respecting capacity. Fractional items are allowed.
Greedy Choice: Take as much as possible of the item with highest value-to-weight ratio r_i = v_i / w_i.
Proof Sketch:
Let item 1 have the highest ratio r_1. Suppose OPT takes amount x_1 < min(w_1, W) of item 1 (less than the maximum possible). We claim OPT can be transformed, without losing value, into an optimal solution that takes the maximum amount of item 1.
Let δ = min(w_1, W) - x_1 > 0 be the additional amount of item 1 we could take.
If OPT leaves capacity unused, we can simply add more of item 1, which cannot decrease the value. So assume the knapsack is full; then weight at least δ is occupied by items other than item 1, each with ratio ≤ r_1 (since item 1 has the highest ratio).

Construct OPT': remove total weight δ from these other items and increase x_1 by δ. Removing weight δ of some item j loses δ · r_j in value; adding weight δ of item 1 gains δ · r_1.
Since r_1 ≥ r_j, the net change is:
δ · r_1 - δ · r_j = δ · (r_1 - r_j) ≥ 0
If r_1 > r_j, the value strictly increases, contradicting OPT being optimal.
If r_1 = r_j, the value stays the same, so OPT' is also optimal.
In either case, there exists an optimal solution taking the maximum of item 1. ∎
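The greedy rule from the proof sketch can be written as a short Python function (names are illustrative):

```python
def fractional_knapsack(items, capacity):
    """Greedy: take items in decreasing value-to-weight ratio,
    splitting the last item if it doesn't fully fit.
    items: list of (weight, value) pairs."""
    total_value = 0.0
    for weight, value in sorted(items, key=lambda it: it[1] / it[0], reverse=True):
        if capacity <= 0:
            break
        take = min(weight, capacity)       # fractions allowed, so take what fits
        total_value += take * (value / weight)
        capacity -= take
    return total_value
```

On items [(10, 60), (20, 100), (30, 120)] with capacity 50 this yields 240.0: all of the first two items plus two-thirds of the third.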
| Problem Type | Technique | Key Observation |
|---|---|---|
| Selection (count-based) | Substitution | Can swap elements while maintaining count |
| Optimization (value-based) | Insertion | Adding greedy choice can only improve value |
| Ordering (sequence-based) | Adjacent swaps | Swapping adjacent elements reveals dominance |
| Covering (set-based) | Reduction | Greedy covers more than alternatives |
When to use insertion vs. substitution:
Substitution is natural when solutions have fixed size (like activity selection where we maximize count). We swap one element for another.
Insertion is natural when we're maximizing a value that increases with additions (like knapsack value). We shift "weight" toward the greedy choice.
Adjacent swaps work for ordering problems where sequence matters (like job scheduling). We show that putting the greedy element first is never worse.
Choosing the right technique simplifies the proof significantly. The wrong technique might still work but require more complex case analysis.
The structure of feasible solutions determines which proof technique is most natural. Fixed-size solutions suggest substitution; value-based solutions suggest insertion; ordered solutions suggest swaps. Recognizing this structure is half the battle.
Even experienced algorithm designers make mistakes in greedy proofs. Being aware of common pitfalls helps you avoid them and recognize shaky proofs.
Pitfall 1: Assuming the greedy choice is unique or global
Mistake: "The greedy choice is the best choice, so it must be in every optimal solution."
Why it fails: Multiple optimal solutions may exist. The greedy choice must be in some optimal solution, not all. Also, "best" is subjective — the greedy criterion might not capture true optimality.
Pitfall 2: Proving the wrong property
Mistake: Proving that the greedy solution is "good" or "reasonable" rather than optimal.
Why it fails: A greedy algorithm might produce a solution that's 99% optimal on most inputs but arbitrarily bad on adversarial inputs. The proof must work for all inputs.
Pitfall 3: Ignoring feasibility after transformation
Mistake: Assuming that after substitution/insertion, the new solution is automatically feasible.
Why it fails: Constraints might be subtle. In activity selection, we needed to verify that swapping A₁ for A_k didn't create overlaps with other activities. This step is easy to skip but essential.
How to avoid pitfalls:
Write down constraints explicitly — List all the conditions a feasible solution must satisfy. Check each one after transformation.
Consider edge cases — Empty inputs, single-element inputs, inputs where all elements are identical. These often reveal implicit assumptions.
Look for counter-examples actively — Before completing your proof, spend time trying to break it. A failed counter-example attempt strengthens your confidence; a successful one saves you from publishing an incorrect algorithm.
Verify optimality quantitatively — Don't just claim "OPT' is at least as good"; compute the objective value and show the inequality explicitly.
Even if a greedy approach seems obviously correct, go through the proof mechanics. Many 'obvious' greedy algorithms fail on subtle edge cases. The coin change problem (greedy fails for {1, 3, 4}) is a classic example of intuition betraying us.
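The coin-change failure mentioned above is easy to check directly. Here is a small sketch (function names are illustrative) comparing greedy against a simple dynamic program:

```python
def greedy_coins(coins, amount):
    # Repeatedly take the largest coin that still fits.
    count = 0
    for c in sorted(coins, reverse=True):
        count += amount // c
        amount %= c
    return count if amount == 0 else None  # None if greedy gets stuck

def optimal_coins(coins, amount):
    # Classic DP: best[a] = minimum coins summing to a (None if impossible).
    best = [0] + [None] * amount
    for a in range(1, amount + 1):
        options = [best[a - c] for c in coins if c <= a and best[a - c] is not None]
        best[a] = min(options) + 1 if options else None
    return best[amount]
```

For coins {1, 3, 4} and amount 6, greedy takes 4 + 1 + 1 (three coins) while the DP finds 3 + 3 (two coins), a concrete counter-example.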
Let's walk through constructing a proof from scratch for a new problem. This demonstrates the thought process that leads to successful proofs.
Problem: Job Scheduling to Minimize Maximum Lateness
We have n jobs, each with processing time p_i and deadline d_i. We must process jobs one at a time (no preemption). Job i finishes at time C_i = sum of processing times of all jobs scheduled before it plus p_i. The lateness of job i is L_i = max(0, C_i - d_i). Minimize the maximum lateness: max_i L_i.
Step 1: Identify a greedy candidate
Possible greedy criteria:

• Shortest processing time first: get quick jobs out of the way
• Earliest deadline first (EDD): handle the most urgent jobs first
• Smallest slack (d_i - p_i) first: prioritize jobs with the least room to spare

Let's try EDD (earliest deadline first) and attempt to prove it.
Step 2: Formulate the claim
Claim: Scheduling jobs in order of increasing deadline minimizes the maximum lateness.
Step 3: Consider an arbitrary optimal schedule
Let OPT be any schedule that minimizes the maximum lateness. If OPT already schedules jobs by increasing deadline, we're done. Otherwise, there exist adjacent jobs i and j where j is scheduled immediately before i but d_j > d_i (an inversion).
Step 4: Analyze swapping the inverted pair

Before swap (j then i):

|----j----|----i----|
         C_j       C_i

After swap (i then j):

|----i----|----j----|
        C'_i      C'_j

Let t = completion time of the jobs scheduled before the pair.

Before swap: C_j = t + p_j, C_i = t + p_j + p_i
After swap: C'_i = t + p_i, C'_j = t + p_i + p_j = C_i

Key insight: C'_j = C_i (the pair occupies the same time interval either way), and C'_i ≤ C_i, so job i only moves earlier.

Step 5: Complete the swap analysis
Let t = finish time of the jobs scheduled before {i, j} in OPT.

Before swap: C_j = t + p_j and C_i = t + p_j + p_i.

After swap: C'_i = t + p_i and C'_j = t + p_i + p_j = C_i.
Key observation: Since d_j > d_i (the inversion condition), job i has an earlier deadline.
Compare the maximum lateness of the pair before and after the swap:
Max lateness of the pair before: max(L_j, L_i) = max(0, t + p_j + p_i - d_i) (since i finishes later and has earlier deadline, its lateness dominates)
Max lateness of the pair after: max(L'_i, L'_j). Job i moves earlier (C'_i ≤ C_i), so L'_i ≤ L_i. Job j now finishes at C'_j = C_i = t + p_i + p_j, and since d_j > d_i:

t + p_i + p_j - d_j < t + p_i + p_j - d_i

so L'_j = max(0, t + p_i + p_j - d_j) ≤ max(0, t + p_i + p_j - d_i) = L_i.

Both new latenesses are at most L_i, the maximum of the pair before the swap, so the swap doesn't increase the maximum lateness. Jobs outside the pair are unaffected, since the pair occupies the same time interval either way.
Step 6: Conclude optimality
Since every inversion can be eliminated by an adjacent swap that doesn't increase the maximum lateness, we can transform OPT into EDD order without worsening the objective. Therefore, EDD achieves the minimum possible maximum lateness.
QED
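The EDD argument is easy to sanity-check on small instances by comparing against brute force. This is a sketch; the helper names and the example data are illustrative:

```python
from itertools import permutations

def edd_order(jobs):
    # jobs: list of (processing_time, deadline); schedule by nondecreasing deadline.
    return sorted(range(len(jobs)), key=lambda i: jobs[i][1])

def max_lateness(jobs, order):
    # Maximum lateness of a schedule; L_i = max(0, C_i - d_i).
    t, worst = 0, 0
    for i in order:
        p, d = jobs[i]
        t += p
        worst = max(worst, t - d)
    return worst

def best_possible(jobs):
    # Exhaustive check over all orders; only feasible for tiny instances.
    return min(max_lateness(jobs, list(p))
               for p in permutations(range(len(jobs))))
```

On any small instance, max_lateness(jobs, edd_order(jobs)) should match best_possible(jobs), consistent with the theorem.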
For scheduling and ordering problems, the 'eliminate inversions' approach is powerful. If every adjacent swap that resolves an inversion doesn't hurt the objective, the sorted order is optimal. This ties greedy correctness to sorting-based proofs.
Different problems call for different proof strategies. Here's a toolkit of templates with examples.
Template A: Direct Substitution
Used when: Greedy selects one element from a set; solutions are subsets.
Structure:

1. Let g be the greedy choice and OPT an arbitrary optimal solution.
2. If g ∈ OPT, done.
3. Otherwise, identify an element of OPT to remove and substitute g for it.
4. Verify the new solution is feasible and no smaller than OPT.
Example: Activity selection — substitute earliest-finishing activity
Template B: Gradual Improvement
Used when: Greedy optimizes a continuous quantity; partial solutions have measurable value.
Structure:

1. Suppose OPT uses less of the greedy choice than the greedy algorithm does.
2. Shift a small quantity δ from some other part of OPT toward the greedy choice.
3. Show the objective value does not decrease.
4. Repeat until OPT matches greedy on this choice.
Example: Fractional knapsack — shift weight toward highest-ratio item
| Template | When to Use | Key Move | Example Problem |
|---|---|---|---|
| Direct Substitution | Subset selection | Swap one element | Activity Selection |
| Gradual Improvement | Continuous optimization | Shift quantity | Fractional Knapsack |
| Inversion Elimination | Ordering/Scheduling | Swap adjacent pair | Job Scheduling by Deadline |
| Greedy Stays Ahead | Step-by-step comparison | Compare partial solutions | Huffman Coding |
| Contradiction | Global property | Assume greedy fails, derive contradiction | MST Cut Property |
Template C: Inversion Elimination
Used when: Solution is a permutation/ordering; there's a natural "sorted" order.
Structure:

1. Define an inversion: an adjacent pair ordered against the greedy criterion.
2. Show that swapping an adjacent inverted pair never worsens the objective.
3. Conclude that the inversion-free (greedy-sorted) order is optimal.
Example: Scheduling to minimize lateness — sort by deadline
Template D: Greedy Stays Ahead
Used when: Solutions are built incrementally; partial solutions are comparable.
Structure:

1. Define a measure of progress for partial solutions.
2. Prove by induction that after k steps, greedy's partial solution is at least as good (by that measure) as any other algorithm's.
3. Conclude that greedy's final solution is optimal.
Example: Huffman coding — greedy tree has minimum cost
Template E: Contradiction via Cut Property
Used when: Problem has graph structure with a cut property.
Structure:

1. Assume for contradiction that the greedy choice belongs to no optimal solution.
2. Take an optimal solution and a cut that the greedy choice crosses; exchange a heavier crossing element for the greedy choice.
3. The result is feasible and at least as good, contradicting the assumption.
Example: Kruskal's algorithm, Prim's algorithm
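For concreteness, here is a minimal Kruskal sketch: each accepted edge is the lightest edge crossing the cut between the two components it joins, which is exactly where the cut property enters. Names are illustrative, and the union-find is kept deliberately simple:

```python
def kruskal(n, edges):
    """Total weight of a minimum spanning tree.
    n: number of vertices; edges: list of (weight, u, v)."""
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0
    for w, u, v in sorted(edges):   # consider edges lightest-first
        ru, rv = find(u), find(v)
        if ru != rv:                # edge crosses a cut between two components
            parent[ru] = rv         # safe to take it, by the cut property
            total += w
    return total
```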
Matching the proof template to the problem structure makes proofs cleaner and more intuitive. A scheduling problem crying out for 'inversion elimination' will be awkward to prove via 'direct substitution'. Let the problem guide your approach.
Not every proof attempt succeeds — and that's valuable information. When you can't prove a greedy approach correct, either you're missing a clever insight, or greedy genuinely doesn't work for the problem.
Signs the proof won't work:
Transformation creates infeasibility — When you try to substitute/insert the greedy choice, you keep creating constraint violations that can't be fixed locally.
Value strictly decreases — Your transformation makes the solution strictly worse, not just different. You can't find a transformation that maintains or improves value.
Counter-examples appear — While trying to prove correctness, you stumble upon inputs where greedy fails. This is definitive proof of failure.
Subproblem structure breaks — The greedy choice doesn't lead to a clean subproblem with the same structure. The remaining problem is fundamentally different.
Attempting to prove 0/1 Knapsack (greedy by ratio):

Setup:
• Item 1: w=1, v=1, ratio=1 (greedy chooses this first)
• Item 2: w=10, v=9, ratio=0.9
• Capacity: W=10

Greedy behavior: Takes item 1 (remaining capacity = 9), can't fit item 2.
Greedy solution: {item 1}, value = 1
Optimal solution: {item 2}, value = 9

Attempting the proof:
1. Let OPT = {item 2}
2. Greedy choice g = item 1
3. Can we substitute/insert g?
   • Option A: Add g to OPT. Total weight = 1 + 10 = 11 > 10 → INFEASIBLE
   • Option B: Replace item 2 with item 1. Value drops from 9 to 1 → STRICTLY WORSE
   • Option C: Take part of item 2 and add item 1. Not allowed (0/1 constraint) → INFEASIBLE

PROOF FAILS: No transformation maintains feasibility + optimality

Diagnosis: The 0/1 constraint prevents the smooth substitution that works for fractional knapsack. Taking item 1 "blocks" the optimal choice.

What to do when proof fails:
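The failed transformation above can be confirmed mechanically. This sketch (illustrative names) pits ratio-greedy against a brute-force optimum on the same instance:

```python
from itertools import combinations

def greedy_01(items, capacity):
    # Greedy by value/weight ratio; NOT optimal for 0/1 knapsack.
    value = 0
    for w, v in sorted(items, key=lambda it: it[1] / it[0], reverse=True):
        if w <= capacity:
            value += v
            capacity -= w
    return value

def brute_force_01(items, capacity):
    # Check every subset; fine for tiny instances.
    best = 0
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            if sum(w for w, _ in combo) <= capacity:
                best = max(best, sum(v for _, v in combo))
    return best

# The instance from the failed proof: greedy takes the (1, 1) item and
# blocks the (10, 9) item, scoring 1 against the optimal 9.
items = [(1, 1), (10, 9)]  # (weight, value)
```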
Try a different greedy criterion — Maybe your selection rule is wrong but a different one works. For activity selection, earliest start time fails but earliest finish time works.
Look for dynamic programming — If greedy fails because of "blocking" effects or overlapping subproblems, DP might be the right approach. 0/1 knapsack is a classic DP problem.
Consider approximation — If the optimal problem is NP-hard, greedy might still give a provable approximation ratio. Set cover is NP-hard, but greedy achieves O(log n) approximation.
Document the counter-example — A clean counter-example is valuable. It tells others (and future you) not to go down this path and reveals problem structure.
A failed proof attempt isn't wasted time. It deepens your understanding of the problem and often reveals why DP or other approaches are necessary. The insight 'greedy fails because of X' is often as valuable as 'greedy works because of Y'.
We've developed the skills to rigorously prove greedy algorithm correctness. Let's consolidate the key techniques:
What's next:
The proofs we've seen share a common technique: the exchange argument. In the next page, we'll study this technique in depth — understanding its variants, its power, and the intuition that makes it so effective across diverse greedy problems.
You now have the tools to prove (or refute) greedy algorithm correctness. You can follow the standard proof template, choose appropriate techniques based on problem structure, and recognize when greedy approaches won't work. Next, we'll master the exchange argument — the workhorse of greedy proofs.