With formal definitions established, we now develop the algorithm that solves Activity Selection optimally. The strategy is elegantly simple: sort activities by finish time, then greedily select each activity that doesn't conflict with previously selected ones.
This page provides complete implementation, step-by-step execution traces, thorough complexity analysis, and—crucially—explains why this particular ordering works when others fail.
By the end of this page, you will be able to implement Activity Selection from scratch in any programming language, trace the algorithm's execution on any input, analyze time and space complexity, and articulate precisely why sorting by finish time (not start time, duration, or conflicts) yields optimal solutions.
The Activity Selection greedy algorithm consists of two phases:
Phase 1 (Sort): Sort all activities by their finish time in ascending order.
Phase 2 (Select): Iterate through the sorted activities, selecting each activity that starts at or after the finish time of the last selected activity.
Pseudocode:
```
ActivitySelection(activities):
    // Phase 1: Sort by finish time
    Sort activities by finish time (ascending)

    // Phase 2: Greedy selection
    selected = [activities[0]]        // First activity always selected
    lastFinish = activities[0].finish

    for i = 1 to n-1:
        if activities[i].start >= lastFinish:
            selected.append(activities[i])
            lastFinish = activities[i].finish

    return selected
```
The algorithm is remarkably concise. Its power lies not in complexity, but in the insight that this simple procedure is optimal.
By always picking the activity that finishes earliest among compatible options, we maximize the remaining time for future activities. Finishing early never hurts—it can only help or have no effect. This is the greedy choice property in action.
Here is a production-quality Python implementation that handles edge cases and returns the selected activities (their count is simply the length of the result).
```python
from typing import List, Tuple


def activity_selection(activities: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """
    Solve the Activity Selection Problem using the greedy algorithm.

    Args:
        activities: List of (start_time, finish_time) tuples

    Returns:
        List of selected activities (maximum non-overlapping set)

    Time Complexity: O(n log n) for sorting + O(n) for selection = O(n log n)
    Space Complexity: O(n) for the result list (or O(1) if modifying in place)
    """
    if not activities:
        return []

    # Phase 1: Sort by finish time
    sorted_activities = sorted(activities, key=lambda x: x[1])

    # Phase 2: Greedy selection
    selected = [sorted_activities[0]]  # First activity always optimal to include
    last_finish = sorted_activities[0][1]

    for i in range(1, len(sorted_activities)):
        start, finish = sorted_activities[i]
        # Select if compatible with the last selected activity
        if start >= last_finish:
            selected.append(sorted_activities[i])
            last_finish = finish

    return selected


def activity_selection_with_indices(activities: List[Tuple[int, int]]) -> List[int]:
    """
    Variant that returns indices of selected activities in original order.
    Useful when activities have associated metadata.
    """
    if not activities:
        return []

    # Include original indices
    indexed = [(i, start, finish) for i, (start, finish) in enumerate(activities)]
    indexed.sort(key=lambda x: x[2])  # Sort by finish time

    selected_indices = [indexed[0][0]]
    last_finish = indexed[0][2]

    for i in range(1, len(indexed)):
        idx, start, finish = indexed[i]
        if start >= last_finish:
            selected_indices.append(idx)
            last_finish = finish

    return selected_indices


# Example usage
if __name__ == "__main__":
    activities = [
        (1, 4), (3, 5), (0, 6), (5, 7), (3, 9),
        (5, 9), (6, 10), (8, 11), (8, 12), (12, 16)
    ]
    result = activity_selection(activities)
    print(f"Selected {len(result)} activities: {result}")
    # Output: Selected 4 activities: [(1, 4), (5, 7), (8, 11), (12, 16)]
```

Let's trace the algorithm on our example with 10 activities:
Input activities (unsorted):
| Index | Start | Finish |
|---|---|---|
| 0 | 1 | 4 |
| 1 | 3 | 5 |
| 2 | 0 | 6 |
| 3 | 5 | 7 |
| 4 | 3 | 9 |
| 5 | 5 | 9 |
| 6 | 6 | 10 |
| 7 | 8 | 11 |
| 8 | 8 | 12 |
| 9 | 12 | 16 |
Phase 1: Sort by Finish Time
| Index | Start | Finish | Original Index |
|---|---|---|---|
| 0 | 1 | 4 | 0 |
| 1 | 3 | 5 | 1 |
| 2 | 0 | 6 | 2 |
| 3 | 5 | 7 | 3 |
| 4 | 3 | 9 | 4 |
| 5 | 5 | 9 | 5 |
| 6 | 6 | 10 | 6 |
| 7 | 8 | 11 | 7 |
| 8 | 8 | 12 | 8 |
| 9 | 12 | 16 | 9 |
Note: Activities are already sorted by finish time in this example.
| Step | Activity | Start | Finish | Last Finish | Compatible? | Action |
|---|---|---|---|---|---|---|
| 1 | (1, 4) | 1 | 4 | — | — | SELECT (first activity) |
| 2 | (3, 5) | 3 | 5 | 4 | ❌ 3 < 4 | SKIP |
| 3 | (0, 6) | 0 | 6 | 4 | ❌ 0 < 4 | SKIP |
| 4 | (5, 7) | 5 | 7 | 4 | ✅ 5 ≥ 4 | SELECT |
| 5 | (3, 9) | 3 | 9 | 7 | ❌ 3 < 7 | SKIP |
| 6 | (5, 9) | 5 | 9 | 7 | ❌ 5 < 7 | SKIP |
| 7 | (6, 10) | 6 | 10 | 7 | ❌ 6 < 7 | SKIP |
| 8 | (8, 11) | 8 | 11 | 7 | ✅ 8 ≥ 7 | SELECT |
| 9 | (8, 12) | 8 | 12 | 11 | ❌ 8 < 11 | SKIP |
| 10 | (12, 16) | 12 | 16 | 11 | ✅ 12 ≥ 11 | SELECT |
Result: Selected 4 activities: (1,4), (5,7), (8,11), (12,16)
Visual Timeline:

```
Time:     0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16
          |--|--|--|--|--|--|--|--|--|--|--|--|--|--|--|--|

SELECTED:
(1,4)        [========]
(5,7)                    [=====]
(8,11)                            [========]
(12,16)                                       [===========]

SKIPPED (would overlap):
(3,5)              [=====]                 ← overlaps with (1,4)
(0,6)     [=================]              ← overlaps with (1,4)
(3,9)              [=================]     ← overlaps with (1,4) and (5,7)
(5,9)                    [===========]     ← overlaps with (5,7)
(6,10)                      [===========]  ← overlaps with (5,7) and (8,11)
(8,12)                            [===========]  ← overlaps with (8,11)
```
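As a sanity check, the greedy result can be compared against an exhaustive search over all subsets, which is feasible here since 2^10 = 1024 subsets. The checker below is illustrative only and not part of the algorithm; the function name is ours:

```python
from itertools import combinations


def max_compatible_brute_force(activities):
    """Exhaustively find the size of the largest non-overlapping subset.
    Exponential time; only suitable for validating small instances."""
    def compatible(subset):
        # Sorted by start time, disjointness reduces to checking
        # that each activity finishes before the next one starts
        ordered = sorted(subset)
        return all(a[1] <= b[0] for a, b in zip(ordered, ordered[1:]))

    # Try subset sizes from largest to smallest
    for k in range(len(activities), 0, -1):
        if any(compatible(c) for c in combinations(activities, k)):
            return k
    return 0


activities = [
    (1, 4), (3, 5), (0, 6), (5, 7), (3, 9),
    (5, 9), (6, 10), (8, 11), (8, 12), (12, 16)
]
print(max_compatible_brute_force(activities))  # 4, matching the greedy result
```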
The algorithm's correctness hinges on sorting by finish time. But why this ordering? Why not sort by start time, duration, or number of conflicts? Let's examine why finish time succeeds where those alternatives fail.
Strategy: Sort by finish time ascending, select first compatible.
Why it works:
Maximizes remaining capacity: By finishing early, we leave maximum time for future activities.
Never excludes better options: If activity A finishes before B, taking A instead of B never reduces future options—anything compatible with B after B's finish is also compatible with A after A's finish.
Greedy choice property: There always exists an optimal solution containing the earliest-finishing activity.
Optimal substructure: After choosing earliest finish, the remaining problem is a smaller activity selection problem.
Counterexample resistance: No input defeats this strategy; the next page proves its optimality rigorously via an exchange argument.
For maximizing activity count: sorting by finish time (ascending) guarantees optimality (its mirror image, scanning by latest start time first, works symmetrically). For different objectives (minimum lateness, meeting room allocation), different orderings become optimal. Always match the ordering to the objective.
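To see the alternatives fail concretely, here is a small experiment. The helper `greedy_by` is ours, not part of the page's implementation; it runs the same greedy scan under an arbitrary sort key:

```python
def greedy_by(activities, key):
    """Greedy scan: sort by `key`, then take each activity
    that is compatible with the last one selected."""
    selected, last_finish = [], float("-inf")
    for start, finish in sorted(activities, key=key):
        if start >= last_finish:
            selected.append((start, finish))
            last_finish = finish
    return selected


# Sorting by start time fails: one long early activity blocks two short ones.
acts = [(0, 100), (1, 2), (3, 4)]
print(len(greedy_by(acts, key=lambda a: a[0])))  # 1
print(len(greedy_by(acts, key=lambda a: a[1])))  # 2 (optimal)

# Sorting by duration fails: a short middle activity blocks two compatible ones.
acts = [(0, 10), (9, 12), (11, 20)]
print(len(greedy_by(acts, key=lambda a: a[1] - a[0])))  # 1
print(len(greedy_by(acts, key=lambda a: a[1])))  # 2 (optimal)
```

In both cases the finish-time ordering recovers the optimum while the alternative ordering gets stuck with a single activity.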
Let's rigorously analyze the algorithm's time and space complexity.
| Phase | Operation | Time Complexity | Space Complexity |
|---|---|---|---|
| Sort | Comparison-based sort | O(n log n) | O(n) or O(log n)* |
| Selection | Single pass through sorted activities | O(n) | O(1) for iteration |
| Result | Store selected activities | — | O(k) where k ≤ n |
| Total | — | O(n log n) | O(n) |
*Sort space depends on algorithm: Merge sort uses O(n), QuickSort uses O(log n) for recursion.
Detailed Time Analysis:
Sorting: Standard comparison sorts (merge sort, quicksort, Timsort) run in O(n log n) worst-case or expected time.
Selection Loop: We iterate through n activities exactly once. Each iteration performs O(1) comparisons and updates.
Total: O(n log n) + O(n) = O(n log n)
Space Analysis:
Sorting: Most implementations use O(n) auxiliary space. In-place variants like heapsort use O(1).
Result Storage: We store up to n activities in the result. In the worst case (all mutually compatible), this is O(n).
Total: O(n)
Is O(n log n) the best possible? For comparison-based algorithms, yes—any algorithm that must distinguish between all possible orderings of n activities requires Ω(n log n) comparisons. However, if activities have bounded integer coordinates, counting sort could reduce sorting to O(n + k) where k is the coordinate range.
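One way to realize the bounded-coordinate idea is to bucket activities by finish time instead of comparison-sorting. This is a sketch, assuming nonnegative integer times bounded by `max_time`; the function name is ours:

```python
def activity_selection_bucketed(activities, max_time):
    """O(n + max_time) variant: bucket activities by finish time
    instead of comparison-sorting.
    Assumes integer times in [0, max_time]."""
    buckets = [[] for _ in range(max_time + 1)]
    for start, finish in activities:
        buckets[finish].append(start)

    selected, last_finish = [], 0
    # Scanning buckets in index order visits activities
    # in ascending order of finish time
    for finish in range(max_time + 1):
        for start in buckets[finish]:
            if start >= last_finish:
                selected.append((start, finish))
                last_finish = finish
    return selected
```

On the 10-activity example with `max_time=16`, this returns the same four activities as the comparison-sort version.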
Optimizations and Variants:
Pre-sorted input: If input is already sorted by finish time, skip sorting for O(n) total.
Count only: If you only need the count (not the activities), you can avoid storing results: O(1) space.
Online variant: Activities arrive one at a time. Use an online algorithm that maintains selected activities in a balanced tree for O(log n) per insertion.
Parallelization: Sorting can be parallelized; selection is inherently sequential.
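The count-only variant from the list above can be sketched as follows (the function name is ours):

```python
def count_max_activities(activities):
    """Return only the size of the maximum compatible set.
    O(1) extra space beyond the sort itself."""
    count, last_finish = 0, float("-inf")
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:
            count += 1
            last_finish = finish
    return count
```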
Production implementations must handle edge cases gracefully:
```python
def test_edge_cases():
    # Empty input
    assert activity_selection([]) == []

    # Single activity
    assert activity_selection([(1, 5)]) == [(1, 5)]

    # All overlap: only one can be selected
    result = activity_selection([(0, 10), (1, 9), (2, 8)])
    assert len(result) == 1
    assert result[0] == (2, 8)  # Earliest finish

    # All compatible: all selected
    result = activity_selection([(1, 2), (3, 4), (5, 6)])
    assert len(result) == 3

    # Ties in finish time
    result = activity_selection([(1, 5), (2, 5), (3, 5)])
    assert len(result) == 1  # Only one can be selected

    # Zero-duration (edge touching)
    result = activity_selection([(1, 3), (3, 3), (3, 5)])
    assert len(result) == 3  # All compatible at boundaries

    # Negative times
    result = activity_selection([(-10, -5), (-4, 0), (1, 3)])
    assert len(result) == 3

    print("All edge case tests passed!")
```

Depending on requirements, the algorithm can be implemented in several equivalent ways:
```python
from typing import List, Tuple


def activity_selection_recursive(activities: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """
    Recursive implementation - mirrors the correctness proof structure.

    After the O(n log n) sort, the recursion makes O(n) calls;
    note that the list slicing adds copying overhead at each step.
    """
    if not activities:
        return []

    # Sort once at the start
    sorted_acts = sorted(activities, key=lambda x: x[1])

    def solve(acts: List[Tuple[int, int]], last_finish: int) -> List[Tuple[int, int]]:
        # Base case
        if not acts:
            return []
        # Find first compatible activity
        for i, (start, finish) in enumerate(acts):
            if start >= last_finish:
                # Greedy choice: take this activity,
                # then recursively solve for the remainder
                return [(start, finish)] + solve(acts[i + 1:], finish)
        # No compatible activity found
        return []

    # Start with the first activity (always selected)
    first = sorted_acts[0]
    return [first] + solve(sorted_acts[1:], first[1])
```

Functional programming style:
```python
from functools import reduce


def activity_selection_functional(activities):
    sorted_acts = sorted(activities, key=lambda x: x[1])

    def reducer(acc, activity):
        selected, last_finish = acc
        if activity[0] >= last_finish:
            return (selected + [activity], activity[1])
        return acc

    if not sorted_acts:
        return []
    first = sorted_acts[0]
    result, _ = reduce(reducer, sorted_acts[1:], ([first], first[1]))
    return result
```
This functional style avoids mutable state, making it suitable for parallel/distributed settings, though it's less idiomatic in most languages.
We've developed a complete, production-ready solution to the Activity Selection Problem.
You can now implement Activity Selection from scratch in any language. But implementing isn't the same as understanding. The next page provides the rigorous mathematical proof that this greedy approach is optimal—not just intuitively reasonable, but provably correct.
What's next:
The final page in this module tackles the most important question: Why does this algorithm work? We'll prove optimality using the exchange argument, a powerful technique that applies to many greedy problems. Understanding the proof transforms Activity Selection from a memorized recipe into a deeply understood principle.