We've implemented an elegant greedy algorithm for Activity Selection. It seems correct—our examples work, the logic feels right. But in algorithm design, intuition isn't enough. Many plausible-sounding greedy strategies fail on edge cases. How do we know ours doesn't?
This page develops a rigorous proof that the sort-by-finish-time greedy algorithm is optimal. More importantly, it teaches the exchange argument—a powerful proof technique that applies to many greedy problems. Understanding this proof transforms Activity Selection from a memorized recipe into a principled approach you can generalize.
By the end of this page, you will understand the formal structure of greedy correctness proofs, master the exchange argument technique, prove both the greedy choice property and optimal substructure for Activity Selection, and know how to apply these techniques to other problems.
Proving a greedy algorithm correct requires demonstrating two properties:
Property 1: Greedy Choice Property
There exists an optimal solution that includes the greedy choice. In other words, making the locally optimal choice doesn't exclude us from finding a globally optimal solution.
For Activity Selection: There exists an optimal solution containing the activity with the earliest finish time.
Property 2: Optimal Substructure
After making the greedy choice, the remaining subproblem is a smaller instance of the same problem, and an optimal solution to the original problem contains an optimal solution to this subproblem.
For Activity Selection: After selecting the earliest-finishing activity, combining our choice with an optimal solution to the remaining compatible activities yields an optimal overall solution.
Together, these properties enable a proof by induction: the greedy choice is safe (greedy choice property), and after making it, we solve a smaller problem optimally (optimal substructure). By induction on problem size, the entire greedy algorithm is optimal.
Proof templates:
Most greedy proofs follow one of two patterns:
Exchange Argument: Start with an arbitrary optimal solution. Show that you can exchange elements to include the greedy choice, without reducing solution quality. Conclude that an optimal solution with the greedy choice exists.
Greedy Stays Ahead: Track a measure of progress. Show that at every step, the greedy solution is "ahead of" or "as good as" any alternative. Conclude that the final greedy solution is optimal.
For Activity Selection, the exchange argument is most natural. We'll develop it in detail.
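Before diving into the proof, it helps to have the algorithm itself in front of us. Here is a minimal Python sketch of the sort-by-finish-time greedy (the function and variable names are our own, purely illustrative):

```python
def select_activities(activities):
    """Return a maximum set of mutually compatible (start, finish) activities."""
    chosen = []
    last_finish = float("-inf")
    # Sort by finish time -- the greedy ordering this page proves correct.
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:  # compatible with everything chosen so far
            chosen.append((start, finish))
            last_finish = finish
    return chosen

print(select_activities([(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9),
                         (6, 10), (8, 11), (8, 12), (2, 14), (12, 16)]))
# -> [(1, 4), (5, 7), (8, 11), (12, 16)]
```

The proof below establishes that this loop always returns a maximum-size compatible set, not merely a plausible one.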
Theorem (Greedy Choice Property for Activity Selection):
Let S be a set of activities. Let a₁ be the activity in S with the earliest finish time. Then there exists an optimal solution A* to the Activity Selection Problem on S such that a₁ ∈ A*.
Proof:
Let A be any optimal solution for S. We consider two cases:
Case 1: a₁ ∈ A
We're done—A is an optimal solution containing a₁, so set A* = A.
Case 2: a₁ ∉ A
Let a_j be the activity in A with the earliest finish time (among activities in A). Since a₁ has the earliest finish time among all activities in S, we have:
f₁ ≤ f_j
Now construct A' = (A - {a_j}) ∪ {a₁}. We replace a_j with a₁ in the optimal solution.
Claim: A' is a valid solution (all activities in A' are mutually compatible).
Justification: Every activity a_k ∈ A - {a_j} is compatible with a_j, and since a_j has the earliest finish time in A, no activity in A can end before a_j finishes. Therefore every a_k ∈ A - {a_j} satisfies s_k ≥ f_j. Combined with f₁ ≤ f_j, this gives s_k ≥ f₁, so every a_k is compatible with a₁. The activities in A - {a_j} were already mutually compatible, so all activities in A' are mutually compatible.
Claim: A' is optimal (|A'| = |A|).
Justification:
|A'| = |A - {a_j}| + 1 = (|A| - 1) + 1 = |A|
Since A was optimal and |A'| = |A|, A' is also optimal.
Conclusion:
A' is an optimal solution containing a₁. Set A* = A'. ∎
The key insight is that the earliest-finishing activity can always 'swap into' any optimal solution without breaking compatibility or reducing size. This is the exchange argument: we don't claim the greedy choice is in every optimal solution, only that we can construct one containing it.
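The exchange step can also be checked empirically. The sketch below constructs A' = (A - {a_j}) ∪ {a₁} for a small Case-2 instance and verifies both claims from the proof (the interval data and helper names are illustrative assumptions, not taken from the text):

```python
def compatible(a, b):
    """Two half-open intervals [s, f) are compatible iff neither overlaps the other."""
    return a[1] <= b[0] or b[1] <= a[0]

def mutually_compatible(solution):
    return all(compatible(a, b)
               for i, a in enumerate(solution) for b in solution[i + 1:])

# Case 2 of the proof: an optimal solution A that does NOT contain a1.
a1 = (0, 3)                        # earliest finish time overall
A = [(1, 4), (5, 7), (8, 10)]      # a_j = (1, 4) is A's earliest finisher

a_j = min(A, key=lambda a: a[1])
A_prime = [a for a in A if a != a_j] + [a1]   # the exchange (A - {a_j}) U {a1}

assert mutually_compatible(A_prime)   # claim 1: validity is preserved
assert len(A_prime) == len(A)         # claim 2: size (optimality) is preserved
```

Of course, one instance proves nothing; the point is that the swap mechanics mirror the proof line for line.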
Let's visualize why the exchange argument works:
Before Exchange:

```
Time: 0  1  2  3  4  5  6  7  8  9  10
      |--|--|--|--|--|--|--|--|--|--|
a₁:  [===]               (earliest finish, NOT in optimal A)
a_j:    [====]           (first activity in optimal A)
a_k:          [======]   (another activity in A)
a_m:                  [===]  (another activity in A)
```

Optimal A = {a_j, a_k, a_m} (|A| = 3)

After Exchange (replace a_j with a₁):

```
Time: 0  1  2  3  4  5  6  7  8  9  10
      |--|--|--|--|--|--|--|--|--|--|
a₁:  [===]               (now included)
a_k:          [======]   (still compatible!)
a_m:                  [===]  (still compatible!)
```

New A' = {a₁, a_k, a_m} (|A'| = 3, still optimal)
Why compatibility is preserved:
Since a₁ finishes no later than a_j (f₁ ≤ f_j), and a_k starts at or after a_j's finish (s_k ≥ f_j), we have:
f₁ ≤ f_j ≤ s_k
So a₁ finishes before a_k starts—they're compatible. The same logic applies to all other activities that were compatible with a_j.
Geometrically, a₁'s interval ends leftward (earlier) than a_j's. Anything that didn't overlap with a_j on the right definitely won't overlap with a₁, which extends even less to the right. This is why finishing earlier can never hurt compatibility.
Theorem (Optimal Substructure for Activity Selection):
Let S be a set of activities, and let a₁ be the activity with the earliest finish time. Let S' = {a ∈ S : s_a ≥ f₁} be the activities that start after a₁ finishes (i.e., activities compatible with a₁ that come after it).
If A is an optimal solution for S containing a₁, then A - {a₁} is an optimal solution for the subproblem S'.
Proof (by contradiction):
Suppose A - {a₁} is not optimal for S'. Then there exists a solution B for S' with |B| > |A - {a₁}| = |A| - 1.
Consider the set A' = {a₁} ∪ B.
Claim: A' is a valid solution for S.
Justification: Every activity in B starts at or after f₁ (by the definition of S'), so each is compatible with a₁. The activities in B are mutually compatible because B is a valid solution for S'. Hence all activities in A' = {a₁} ∪ B are mutually compatible, and A' is valid for S.
Claim: |A'| > |A|, contradicting optimality of A.
Justification:
|A'| = 1 + |B| > 1 + (|A| - 1) = |A|
This contradicts our assumption that A is optimal for S.
Conclusion:
Our supposition is false. Therefore, A - {a₁} must be optimal for S'. ∎
What this means:
Optimal substructure tells us that the problem decomposes correctly. After selecting the earliest-finishing activity a₁, solving the remaining subproblem S' optimally and combining with a₁ gives an optimal overall solution.
This justifies our recursive/iterative approach: after each greedy selection, we're left with a smaller instance of the same problem.
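The decomposition can be written directly as recursion: pick a₁, then recurse on S'. This is an illustrative sketch (names are our own), deliberately mirroring the theorem's statement rather than the efficient single-pass loop:

```python
def select_recursive(S):
    """Greedy Activity Selection written to mirror the optimal-substructure proof."""
    if not S:
        return []
    a1 = min(S, key=lambda a: a[1])            # greedy choice: earliest finish time
    S_prime = [a for a in S if a[0] >= a1[1]]  # subproblem S' = {a : s_a >= f1}
    return [a1] + select_recursive(S_prime)    # {a1} combined with optimum for S'

print(select_recursive([(1, 4), (3, 5), (0, 6), (5, 7), (8, 11), (12, 16)]))
# -> [(1, 4), (5, 7), (8, 11), (12, 16)]
```

Each recursive call faces a strictly smaller instance of the same problem, which is exactly what the induction in the next section relies on.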
Theorem (Correctness of Greedy Activity Selection):
The greedy algorithm that repeatedly selects the activity with the earliest finish time (among compatible remaining activities) produces an optimal solution to the Activity Selection Problem.
Proof (by strong induction on n, the number of activities):
Base Case (n = 0):
With zero activities, the greedy algorithm returns an empty set. The only solution is the empty set, which is optimal. ✓
Base Case (n = 1):
With one activity, the greedy algorithm selects it. This is optimal since selecting one activity is better than selecting none. ✓
Inductive Step:
Assume the greedy algorithm is optimal for all problems with fewer than n activities (strong induction hypothesis). Consider a problem with n activities.
Let a₁ be the activity with the earliest finish time. By the Greedy Choice Property, there exists an optimal solution A* containing a₁.
The greedy algorithm selects a₁, then solves the subproblem S' = {a ∈ S : s_a ≥ f₁}.
The subproblem S' has fewer than n activities (at least a₁ is removed, possibly more if activities overlapped with a₁).
By the induction hypothesis, the greedy algorithm produces an optimal solution B* for S'.
By Optimal Substructure, since A* is optimal for S and contains a₁, we know A* - {a₁} is optimal for S'. Therefore, |B*| = |A* - {a₁}| = |A*| - 1.
The greedy algorithm returns {a₁} ∪ B*, which has size 1 + |B*| = 1 + (|A*| - 1) = |A*|.
Since |A*| is optimal, the greedy solution is optimal. ∎
We have rigorously proven that the greedy Activity Selection algorithm is optimal. This isn't intuition—it's mathematical certainty. The algorithm will find a maximum compatible set for any valid input.
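The proof settles the question, but a randomized comparison against brute force is a useful sanity check when implementing (a sketch under our illustrative (start, finish) representation; function names are assumptions):

```python
import itertools
import random

def greedy_size(acts):
    """Size of the earliest-finish-time greedy solution."""
    count, last = 0, float("-inf")
    for s, f in sorted(acts, key=lambda a: a[1]):
        if s >= last:
            count, last = count + 1, f
    return count

def brute_force_size(acts):
    """Size of a maximum compatible subset, by exhaustive search."""
    for r in range(len(acts), 0, -1):
        for subset in itertools.combinations(acts, r):
            if all(a[1] <= b[0] or b[1] <= a[0]
                   for a, b in itertools.combinations(subset, 2)):
                return r
    return 0

random.seed(0)
for _ in range(200):
    acts = [(s, s + random.randint(1, 5))
            for s in (random.randint(0, 20) for _ in range(8))]
    assert greedy_size(acts) == brute_force_size(acts)
print("greedy matched brute force on all random instances")
```

Testing cannot replace the proof, but disagreement on even one instance would have falsified it, which is why such checks are worth running.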
For completeness, let's sketch an alternative proof using the "greedy stays ahead" technique:
Theorem (Greedy Stays Ahead):
Let G = {g₁, g₂, ..., g_k} be the greedy solution (activities in order of selection). Let O = {o₁, o₂, ..., o_m} be any optimal solution (activities in finish time order).
Then for all i ≤ min(k, m): f(g_i) ≤ f(o_i).
In other words, at every step, the i-th activity selected by greedy finishes no later than the i-th activity in any optimal solution.
Proof (by induction on i):
Base Case (i = 1):
The greedy algorithm selects g₁, the activity with the earliest finish time among all activities. Since o₁ ∈ S, we have f(g₁) ≤ f(o₁). ✓
Inductive Step:
Assume f(g_{i-1}) ≤ f(o_{i-1}) for some i > 1.
Since o_i is compatible with o_{i-1}, we have s(o_i) ≥ f(o_{i-1}) ≥ f(g_{i-1}).
Therefore, o_i is a candidate for the greedy algorithm's i-th selection (it's compatible with g_{i-1}).
The greedy algorithm selects g_i as the earliest-finishing activity compatible with g_{i-1}. Since o_i is a candidate, we have:
f(g_i) ≤ f(o_i) ✓
Conclusion:
By induction, greedy stays ahead at every step.
Corollary: k ≥ m (greedy selects at least as many activities as any optimal solution).
Proof: Suppose k < m. The greedy solution ends with g_k, and by the invariant f(g_k) ≤ f(o_k). But then o_{k+1} exists and s(o_{k+1}) ≥ f(o_k) ≥ f(g_k), so o_{k+1} is compatible with g_k. The greedy algorithm would therefore have selected another activity rather than stopping at k, a contradiction.
Since greedy selects at least as many as optimal, and can't exceed optimal (by definition), greedy is optimal. ∎
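The stays-ahead invariant is easy to observe on a concrete instance. Below we compare greedy's picks against a different, hand-chosen optimal solution (all data and names here are illustrative assumptions):

```python
def greedy(acts):
    """Earliest-finish-time greedy, returning the selected activities in order."""
    out, last = [], float("-inf")
    for s, f in sorted(acts, key=lambda a: a[1]):
        if s >= last:
            out.append((s, f))
            last = f
    return out

acts = [(1, 4), (3, 5), (0, 6), (5, 7), (6, 10), (8, 11), (12, 16)]
G = greedy(acts)                          # greedy's solution, in selection order
O = [(3, 5), (5, 7), (8, 11), (12, 16)]   # another optimal solution, finish order

assert all(g[1] <= o[1] for g, o in zip(G, O))  # f(g_i) <= f(o_i) at every i
assert len(G) >= len(O)                         # corollary: greedy is no smaller
```

Note that G and O differ as sets; the theorem only bounds finish times position by position, which is exactly enough to force |G| ≥ |O|.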
Exchange arguments are ideal when you can show swapping maintains solution quality (Activity Selection, Huffman coding). Greedy-stays-ahead works well when you can define a progress measure that greedy dominates (interval scheduling, minimum spanning trees in some formulations). Both are valid—choose whichever is more natural for the problem.
Our proof depended critically on the earliest-finish-time ordering. Let's see why other orderings can't be proven correct—because they're not correct:
| Ordering | Where Proof Breaks | Counterexample |
|---|---|---|
| Earliest Start | Exchange fails: earliest start might finish very late, making exchange incompatible | [0, 100], [1, 2], [3, 4] → greedy takes [0,100], misses [1,2] and [3,4] |
| Shortest Duration | Exchange fails: short activity might be in the middle, blocking both ends | [0, 4], [3, 6], [5, 9] → greedy takes [3,6] (duration 3), can't take either of the others |
| Fewest Conflicts | Exchange/stays-ahead both fail: local conflict count doesn't predict global compatibility | Complex examples exist where low-conflict activity blocks optimal coverage |
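The table's first two counterexamples can be run directly through a greedy loop parameterized by sort key (an illustrative sketch; the helper name is our own):

```python
def greedy_by(acts, key):
    """Generic interval greedy: scan in `key` order, keep whatever is compatible."""
    out, last = [], float("-inf")
    for s, f in sorted(acts, key=key):
        if s >= last:
            out.append((s, f))
            last = f
    return out

earliest_start = [(0, 100), (1, 2), (3, 4)]
print(greedy_by(earliest_start, key=lambda a: a[0]))   # [(0, 100)]       -- size 1
print(greedy_by(earliest_start, key=lambda a: a[1]))   # [(1, 2), (3, 4)] -- size 2

shortest = [(0, 4), (3, 6), (5, 9)]
print(greedy_by(shortest, key=lambda a: a[1] - a[0]))  # [(3, 6)]         -- size 1
print(greedy_by(shortest, key=lambda a: a[1]))         # [(0, 4), (5, 9)] -- size 2
```

On both inputs, only the earliest-finish-time key recovers the optimum; the other keys lock in a choice that blocks everything else.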
The key insight:
Earliest finish time is special because it provides a monotonic guarantee: selecting an activity that finishes earlier never reduces the set of future-compatible activities. Every activity compatible with a later-finishing activity is also compatible with an earlier-finishing one.
This monotonicity is what makes the exchange argument work. Other orderings lack this property—selecting the "best" activity under other criteria might exclude activities that would have led to larger solutions.
If you can't prove your greedy strategy works, it probably doesn't. The inability to construct a proof often signals the existence of counterexamples. When developing greedy algorithms, always attempt a proof—it's both a validation tool and a forcing function for correct design.
The exchange argument we developed for Activity Selection is a template that applies to many greedy problems. Here's the general pattern:
Exchange Argument Template:
State the greedy choice: Identify what the algorithm selects first/next.
Consider an arbitrary optimal solution: Let OPT be any optimal solution.
Case analysis: If the greedy choice is already in OPT, you're done. Otherwise, identify the element of OPT that conflicts with, or plays the role of, the greedy choice.
Construct the exchange: Create OPT' by swapping the greedy choice in.
Prove validity: Show OPT' satisfies all problem constraints.
Prove non-inferiority: Show OPT' is at least as good as OPT (|OPT'| ≥ |OPT| or cost(OPT') ≤ cost(OPT)).
Conclude: An optimal solution containing the greedy choice exists.
Practice is essential. When learning a new greedy algorithm, always work through its exchange argument yourself. Ask: What gets swapped? Why is the swap valid? Why doesn't quality decrease? With practice, constructing these arguments becomes natural.
We've completed a rigorous treatment of Activity Selection's correctness. Let's consolidate what we've learned:
You now have complete mastery of the Activity Selection Problem—from problem formulation to efficient algorithm to rigorous proof. This isn't just one algorithm; it's a template for understanding greedy thinking. Apply the exchange argument to new problems, and you'll recognize when greedy works and when it doesn't.
Connecting to the bigger picture:
Activity Selection is the gateway to understanding greedy algorithms deeply. You now have:
As you encounter new greedy problems—Huffman coding, MST algorithms, interval partitioning—apply this framework. Ask: Can I prove a greedy choice property via exchange? Does optimal substructure hold? If yes, you have a correct greedy algorithm. If not, look to dynamic programming or other techniques.
This is how expertise develops: not just memorizing algorithms, but understanding why they work and when they apply.