With formal definitions established, we now develop the algorithm that solves Activity Selection optimally. The strategy is elegantly simple: sort activities by finish time, then greedily select each activity that doesn't conflict with previously selected ones.
This page provides complete implementation, step-by-step execution traces, thorough complexity analysis, and—crucially—explains why this particular ordering works when others fail.
By the end of this page, you will be able to implement Activity Selection from scratch in any programming language, trace the algorithm's execution on any input, analyze time and space complexity, and articulate precisely why sorting by finish time (not start time, duration, or conflicts) yields optimal solutions.
The Activity Selection greedy algorithm consists of two phases:
Phase 1 (Sort): Sort all activities by their finish time in ascending order.
Phase 2 (Select): Iterate through the sorted activities, selecting each activity that starts at or after the finish time of the last selected activity.
Pseudocode:
```
ActivitySelection(activities):
    // Phase 1: Sort by finish time
    Sort activities by finish time (ascending)

    // Phase 2: Greedy selection
    selected = [activities[0]]        // First activity always selected
    lastFinish = activities[0].finish

    for i = 1 to n-1:
        if activities[i].start >= lastFinish:
            selected.append(activities[i])
            lastFinish = activities[i].finish

    return selected
```
The algorithm is remarkably concise. Its power lies not in complexity, but in the insight that this simple procedure is optimal.
By always picking the activity that finishes earliest among compatible options, we maximize the remaining time for future activities. Finishing early never hurts—it can only help or have no effect. This is the greedy choice property in action.
Here is a production-quality Python implementation that handles edge cases and returns the selected activities (their count is simply the length of the result).
```python
from typing import List, Tuple


def activity_selection(activities: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """
    Solve the Activity Selection Problem using the greedy algorithm.

    Args:
        activities: List of (start_time, finish_time) tuples

    Returns:
        List of selected activities (maximum non-overlapping set)

    Time Complexity: O(n log n) for sorting + O(n) for selection = O(n log n)
    Space Complexity: O(n) for the result list (or O(1) if modifying in place)
    """
    if not activities:
        return []

    # Phase 1: Sort by finish time
    sorted_activities = sorted(activities, key=lambda x: x[1])

    # Phase 2: Greedy selection
    selected = [sorted_activities[0]]  # First activity always optimal to include
    last_finish = sorted_activities[0][1]

    for i in range(1, len(sorted_activities)):
        start, finish = sorted_activities[i]
        # Select if compatible with the last selected activity
        if start >= last_finish:
            selected.append(sorted_activities[i])
            last_finish = finish

    return selected


def activity_selection_with_indices(activities: List[Tuple[int, int]]) -> List[int]:
    """
    Variant that returns indices of selected activities in original order.
    Useful when activities have associated metadata.
    """
    if not activities:
        return []

    # Include original indices
    indexed = [(i, start, finish) for i, (start, finish) in enumerate(activities)]
    indexed.sort(key=lambda x: x[2])  # Sort by finish time

    selected_indices = [indexed[0][0]]
    last_finish = indexed[0][2]

    for i in range(1, len(indexed)):
        idx, start, finish = indexed[i]
        if start >= last_finish:
            selected_indices.append(idx)
            last_finish = finish

    return selected_indices


# Example usage
if __name__ == "__main__":
    activities = [
        (1, 4), (3, 5), (0, 6), (5, 7), (3, 9),
        (5, 9), (6, 10), (8, 11), (8, 12), (12, 16)
    ]
    result = activity_selection(activities)
    print(f"Selected {len(result)} activities: {result}")
    # Output: Selected 4 activities: [(1, 4), (5, 7), (8, 11), (12, 16)]
```

Let's trace the algorithm on our example with 10 activities:
Input activities (unsorted):
| Index | Start | Finish |
|---|---|---|
| 0 | 1 | 4 |
| 1 | 3 | 5 |
| 2 | 0 | 6 |
| 3 | 5 | 7 |
| 4 | 3 | 9 |
| 5 | 5 | 9 |
| 6 | 6 | 10 |
| 7 | 8 | 11 |
| 8 | 8 | 12 |
| 9 | 12 | 16 |
Phase 1: Sort by Finish Time
| Index | Start | Finish | Original Index |
|---|---|---|---|
| 0 | 1 | 4 | 0 |
| 1 | 3 | 5 | 1 |
| 2 | 0 | 6 | 2 |
| 3 | 5 | 7 | 3 |
| 4 | 3 | 9 | 4 |
| 5 | 5 | 9 | 5 |
| 6 | 6 | 10 | 6 |
| 7 | 8 | 11 | 7 |
| 8 | 8 | 12 | 8 |
| 9 | 12 | 16 | 9 |
Note: Activities are already sorted by finish time in this example.
| Step | Activity | Start | Finish | Last Finish | Compatible? | Action |
|---|---|---|---|---|---|---|
| 1 | (1, 4) | 1 | 4 | — | — | SELECT (first activity) |
| 2 | (3, 5) | 3 | 5 | 4 | ❌ 3 < 4 | SKIP |
| 3 | (0, 6) | 0 | 6 | 4 | ❌ 0 < 4 | SKIP |
| 4 | (5, 7) | 5 | 7 | 4 | ✅ 5 ≥ 4 | SELECT |
| 5 | (3, 9) | 3 | 9 | 7 | ❌ 3 < 7 | SKIP |
| 6 | (5, 9) | 5 | 9 | 7 | ❌ 5 < 7 | SKIP |
| 7 | (6, 10) | 6 | 10 | 7 | ❌ 6 < 7 | SKIP |
| 8 | (8, 11) | 8 | 11 | 7 | ✅ 8 ≥ 7 | SELECT |
| 9 | (8, 12) | 8 | 12 | 11 | ❌ 8 < 11 | SKIP |
| 10 | (12, 16) | 12 | 16 | 11 | ✅ 12 ≥ 11 | SELECT |
Result: Selected 4 activities: (1,4), (5,7), (8,11), (12,16)
Visual Timeline:

```
Time:     0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16
          |--|--|--|--|--|--|--|--|--|--|--|--|--|--|--|--|

SELECTED:
(1,4)        [========]
(5,7)                    [=====]
(8,11)                            [========]
(12,16)                                       [===========]

SKIPPED (would overlap):
(3,5)              [=====]                 ← overlaps with (1,4)
(0,6)     [=================]              ← overlaps with (1,4)
(3,9)              [=================]     ← overlaps with (1,4) and (5,7)
(5,9)                    [===========]     ← overlaps with (5,7)
(6,10)                      [===========]  ← overlaps with (5,7) and (8,11)
(8,12)                            [===========]  ← overlaps with (8,11)
```
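As a sanity check, the greedy result can be compared against an exhaustive search over all subsets, which is feasible here since 2^10 = 1024 subsets. The checker below is illustrative only and not part of the algorithm; the function name is ours:

```python
from itertools import combinations


def max_compatible_brute_force(activities):
    """Exhaustively find the size of the largest non-overlapping subset.
    Exponential time; only suitable for validating small instances."""
    def compatible(subset):
        # Sorted by start time, disjointness reduces to checking
        # that each activity finishes before the next one starts
        ordered = sorted(subset)
        return all(a[1] <= b[0] for a, b in zip(ordered, ordered[1:]))

    # Try subset sizes from largest to smallest
    for k in range(len(activities), 0, -1):
        if any(compatible(c) for c in combinations(activities, k)):
            return k
    return 0


activities = [
    (1, 4), (3, 5), (0, 6), (5, 7), (3, 9),
    (5, 9), (6, 10), (8, 11), (8, 12), (12, 16)
]
print(max_compatible_brute_force(activities))  # 4, matching the greedy result
```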
The algorithm's correctness hinges on sorting by finish time. But why this ordering? Why not sort by start time, duration, or number of conflicts? Let's examine why finish time succeeds where those alternatives fail.
Strategy: Sort by finish time ascending, select first compatible.
Why it works:
Maximizes remaining capacity: By finishing early, we leave maximum time for future activities.
Never excludes better options: If activity A finishes before B, taking A instead of B never reduces future options—anything compatible with B after B's finish is also compatible with A after A's finish.
Greedy choice property: There always exists an optimal solution containing the earliest-finishing activity.
Optimal substructure: After choosing earliest finish, the remaining problem is a smaller activity selection problem.
Counterexample resistance: No input defeats this strategy; the next page proves its optimality rigorously via an exchange argument.
For maximizing activity count: sorting by finish time (ascending) guarantees optimality (its mirror image, scanning by latest start time first, works symmetrically). For different objectives (minimum lateness, meeting room allocation), different orderings become optimal. Always match the ordering to the objective.
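To see the alternatives fail concretely, here is a small experiment. The helper `greedy_by` is ours, not part of the page's implementation; it runs the same greedy scan under an arbitrary sort key:

```python
def greedy_by(activities, key):
    """Greedy scan: sort by `key`, then take each activity
    that is compatible with the last one selected."""
    selected, last_finish = [], float("-inf")
    for start, finish in sorted(activities, key=key):
        if start >= last_finish:
            selected.append((start, finish))
            last_finish = finish
    return selected


# Sorting by start time fails: one long early activity blocks two short ones.
acts = [(0, 100), (1, 2), (3, 4)]
print(len(greedy_by(acts, key=lambda a: a[0])))  # 1
print(len(greedy_by(acts, key=lambda a: a[1])))  # 2 (optimal)

# Sorting by duration fails: a short middle activity blocks two compatible ones.
acts = [(0, 10), (9, 12), (11, 20)]
print(len(greedy_by(acts, key=lambda a: a[1] - a[0])))  # 1
print(len(greedy_by(acts, key=lambda a: a[1])))  # 2 (optimal)
```

In both cases the finish-time ordering recovers the optimum while the alternative ordering gets stuck with a single activity.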
Let's rigorously analyze the algorithm's time and space complexity.
| Phase | Operation | Time Complexity | Space Complexity |
|---|---|---|---|
| Sort | Comparison-based sort | O(n log n) | O(n) or O(log n)* |
| Selection | Single pass through sorted activities | O(n) | O(1) for iteration |
| Result | Store selected activities | — | O(k) where k ≤ n |
| Total | — | O(n log n) | O(n) |
*Sort space depends on algorithm: Merge sort uses O(n), QuickSort uses O(log n) for recursion.
Detailed Time Analysis:
Sorting: Standard comparison sorts (merge sort, quicksort, Timsort) run in O(n log n) worst-case or expected time.
Selection Loop: We iterate through n activities exactly once. Each iteration performs O(1) comparisons and updates.
Total: O(n log n) + O(n) = O(n log n)
Space Analysis:
Sorting: Most implementations use O(n) auxiliary space. In-place variants like heapsort use O(1).
Result Storage: We store up to n activities in the result. In the worst case (all mutually compatible), this is O(n).
Total: O(n)
Is O(n log n) the best possible? For comparison-based algorithms, yes—any algorithm that must distinguish between all possible orderings of n activities requires Ω(n log n) comparisons. However, if activities have bounded integer coordinates, counting sort could reduce sorting to O(n + k) where k is the coordinate range.
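One way to realize the bounded-coordinate idea is to bucket activities by finish time instead of comparison-sorting. This is a sketch, assuming nonnegative integer times bounded by `max_time`; the function name is ours:

```python
def activity_selection_bucketed(activities, max_time):
    """O(n + max_time) variant: bucket activities by finish time
    instead of comparison-sorting.
    Assumes integer times in [0, max_time]."""
    buckets = [[] for _ in range(max_time + 1)]
    for start, finish in activities:
        buckets[finish].append(start)

    selected, last_finish = [], 0
    # Scanning buckets in index order visits activities
    # in ascending order of finish time
    for finish in range(max_time + 1):
        for start in buckets[finish]:
            if start >= last_finish:
                selected.append((start, finish))
                last_finish = finish
    return selected
```

On the 10-activity example with `max_time=16`, this returns the same four activities as the comparison-sort version.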
Optimizations and Variants:
Pre-sorted input: If input is already sorted by finish time, skip sorting for O(n) total.
Count only: If you only need the count (not the activities), you can avoid storing results: O(1) space.
Online variant: Activities arrive one at a time. Use an online algorithm that maintains selected activities in a balanced tree for O(log n) per insertion.
Parallelization: Sorting can be parallelized; selection is inherently sequential.
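The count-only variant from the list above can be sketched as follows (the function name is ours):

```python
def count_max_activities(activities):
    """Return only the size of the maximum compatible set.
    O(1) extra space beyond the sort itself."""
    count, last_finish = 0, float("-inf")
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:
            count += 1
            last_finish = finish
    return count
```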
Production implementations must handle edge cases gracefully:
```python
def test_edge_cases():
    # Empty input
    assert activity_selection([]) == []

    # Single activity
    assert activity_selection([(1, 5)]) == [(1, 5)]

    # All overlap: only one can be selected
    result = activity_selection([(0, 10), (1, 9), (2, 8)])
    assert len(result) == 1
    assert result[0] == (2, 8)  # Earliest finish

    # All compatible: all selected
    result = activity_selection([(1, 2), (3, 4), (5, 6)])
    assert len(result) == 3

    # Ties in finish time
    result = activity_selection([(1, 5), (2, 5), (3, 5)])
    assert len(result) == 1  # Only one can be selected

    # Zero-duration (edge touching)
    result = activity_selection([(1, 3), (3, 3), (3, 5)])
    assert len(result) == 3  # All compatible at boundaries

    # Negative times
    result = activity_selection([(-10, -5), (-4, 0), (1, 3)])
    assert len(result) == 3

    print("All edge case tests passed!")
```

Depending on requirements, the algorithm can be implemented in several equivalent ways:
```python
from typing import List, Tuple


def activity_selection_recursive(activities: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """
    Recursive implementation - mirrors the correctness proof structure.

    After the O(n log n) sort, the recursion makes O(n) calls;
    note that the list slicing adds copying overhead at each step.
    """
    if not activities:
        return []

    # Sort once at the start
    sorted_acts = sorted(activities, key=lambda x: x[1])

    def solve(acts: List[Tuple[int, int]], last_finish: int) -> List[Tuple[int, int]]:
        # Base case
        if not acts:
            return []
        # Find first compatible activity
        for i, (start, finish) in enumerate(acts):
            if start >= last_finish:
                # Greedy choice: take this activity,
                # then recursively solve for the remainder
                return [(start, finish)] + solve(acts[i + 1:], finish)
        # No compatible activity found
        return []

    # Start with the first activity (always selected)
    first = sorted_acts[0]
    return [first] + solve(sorted_acts[1:], first[1])
```

Functional programming style:
```python
from functools import reduce


def activity_selection_functional(activities):
    sorted_acts = sorted(activities, key=lambda x: x[1])

    def reducer(acc, activity):
        selected, last_finish = acc
        if activity[0] >= last_finish:
            return (selected + [activity], activity[1])
        return acc

    if not sorted_acts:
        return []
    first = sorted_acts[0]
    result, _ = reduce(reducer, sorted_acts[1:], ([first], first[1]))
    return result
```
This functional style avoids mutable state, making it suitable for parallel/distributed settings, though it's less idiomatic in most languages.
We've developed a complete, production-ready solution to the Activity Selection Problem.
You can now implement Activity Selection from scratch in any language. But implementing isn't the same as understanding. The next page provides the rigorous mathematical proof that this greedy approach is optimal—not just intuitively reasonable, but provably correct.
What's next:
The final page in this module tackles the most important question: Why does this algorithm work? We'll prove optimality using the exchange argument, a powerful technique that applies to many greedy problems. Understanding the proof transforms Activity Selection from a memorized recipe into a deeply understood principle.