Greedy algorithms, divide-and-conquer, and dynamic programming share a common foundation: optimal substructure. All three paradigms solve large problems by solving smaller subproblems and combining the results. Yet they differ profoundly in how they decompose problems, which subproblems they solve, and how they combine solutions.
Understanding these connections reveals deep truths about algorithm design. You'll see why some problems yield to greedy approaches while others require the exhaustive exploration of dynamic programming. You'll understand when divide-and-conquer's independent subproblems simplify analysis and when overlapping subproblems demand memoization.
Most importantly, you'll develop intuition for choosing the right paradigm—a skill that separates competent programmers from exceptional algorithm designers.
By the end of this page, you will understand how optimal substructure manifests in each paradigm, the key differences in how subproblems are generated and explored, when to use each approach based on problem characteristics, and how to recognize problems that could be solved by multiple paradigms.
Before comparing paradigms, let's establish a precise definition of optimal substructure that applies across all three.
Definition: Optimal Substructure
A problem exhibits optimal substructure if an optimal solution to the problem contains within it optimal solutions to subproblems.
Formally, for problem P with optimal solution OPT(P): if OPT(P) decomposes P into subproblems P₁, ..., Pₖ and incorporates solutions S₁, ..., Sₖ to them, then each Sᵢ must itself be an optimal solution OPT(Pᵢ).
The "Cut-and-Paste" Argument:
Optimal substructure is typically proved by contradiction: suppose OPT(P) uses a suboptimal solution Sᵢ to some subproblem Pᵢ. "Cut" Sᵢ out and "paste" in the better solution OPT(Pᵢ). The result is still a valid solution to P, but with a better objective value than OPT(P)—contradicting the assumption that OPT(P) was optimal. For example, if any subpath of a shortest path were not itself shortest, replacing it with a shorter subpath would shorten the whole path.
Having optimal substructure doesn't tell you which paradigm to use. All three paradigms REQUIRE optimal substructure, but each also needs additional properties: greedy needs the greedy choice property; D&C needs independent subproblems; DP needs overlapping subproblems and a DAG structure. Optimal substructure is the foundation; additional properties determine the approach.
| Paradigm | Substructure Type | Subproblem Relationship | Additional Requirement |
|---|---|---|---|
| Greedy | Sequential | One subproblem after each choice | Greedy choice property (local optimal = global optimal) |
| Divide & Conquer | Parallel | Multiple independent subproblems | Subproblem independence (disjoint inputs) |
| Dynamic Programming | Overlapping | Multiple shared subproblems | Overlapping subproblems (reuse computation) |
The three paradigms generate subproblems in fundamentally different ways. Understanding these differences is key to matching problems to paradigms.
Greedy: Sequential Reduction
Greedy algorithms generate exactly ONE subproblem at each step:
P → P₁ → P₂ → ... → Pₙ → ∅
Each subproblem is determined by the previous greedy choice. There's no branching, no exploring alternatives. The algorithm commits to a single path through the problem space.
Key Characteristic: After each choice, exactly one smaller subproblem remains. There is no branching and no backtracking, so the total work is a single chain of steps.
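This single-subproblem chain can be sketched with coin change over a canonical coin system (the denominations here are an assumption for illustration; greedy is only optimal for canonical systems like this one):

```python
def greedy_coin_change(amount, denominations=(25, 10, 5, 1)):
    """Greedy chain: each choice leaves exactly ONE smaller subproblem."""
    chain = [amount]          # P -> P1 -> P2 -> ... -> 0
    coins = []
    while amount > 0:
        # Local decision: largest coin that fits; commit, never reconsider.
        coin = next(d for d in denominations if d <= amount)
        coins.append(coin)
        amount -= coin
        chain.append(amount)  # the single next subproblem

    return coins, chain

coins, chain = greedy_coin_change(63)
print(coins)  # [25, 25, 10, 1, 1, 1]
print(chain)  # [63, 38, 13, 3, 2, 1, 0]
```

Note that `chain` is a straight line, not a tree: every state has exactly one successor.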
D&C: Tree of Independent Subproblems
Divide-and-conquer splits into MULTIPLE disjoint subproblems:
```
          P
        /   \
      P₁     P₂
     /  \   /  \
    ..  .. ..  ..
```
Subproblems share no input and can be solved independently (even in parallel). Results combine bottom-up.
Key Characteristic: Each piece of the input belongs to exactly one branch of the recursion tree, so the same subproblem never appears twice.
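As a minimal sketch of disjoint decomposition (the function name is illustrative), here is maximum-of-list by D&C: each recursive call touches only its own half of the input, and the combine step is a single comparison.

```python
def max_dc(xs, lo=0, hi=None):
    """Divide: split the input at the midpoint into DISJOINT halves.
    Conquer: solve each half independently. Combine: one comparison."""
    if hi is None:
        hi = len(xs)
    if hi - lo == 1:
        return xs[lo]
    mid = (lo + hi) // 2
    left = max_dc(xs, lo, mid)    # touches only xs[lo:mid]
    right = max_dc(xs, mid, hi)   # touches only xs[mid:hi]
    return max(left, right)       # combine disjoint subsolutions

print(max_dc([3, 1, 4, 1, 5, 9, 2, 6]))  # 9
```

Because the halves share no input, the two recursive calls could run in parallel without synchronization.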
Dynamic Programming: DAG of Overlapping Subproblems
Dynamic programming explores a directed acyclic graph (DAG) of subproblems:
```
          P
        /   \
      P₁     P₂
     /  \   /  \
    P₃   P₄    P₅
      \   |   /
         P₆
```
Subproblems OVERLAP—the same subproblem may be reached from multiple paths. DP recognizes this and solves each unique subproblem only once.
Key Characteristic: Distinct paths through the problem converge on shared subproblems; a memo table ensures each unique subproblem is solved exactly once and its result is reused.
```python
"""
Visualizing subproblem generation across paradigms.
Example: Computing Fibonacci F(n)
"""

# GREEDY APPROACH (hypothetical, doesn't work for Fibonacci)
def fibonacci_greedy_attempt(n):
    """
    Greedy would try to make a single choice at each step.
    For Fibonacci, there's no meaningful greedy choice.

    Subproblem chain (if it existed):
    F(n) → F(n-1) → F(n-2) → ... → F(1) → F(0)

    This MISSES F(n-1) requiring F(n-2) AND F(n-3)!
    Greedy fails because subproblem generation isn't linear.
    """
    pass  # Greedy doesn't apply to Fibonacci


# DIVIDE & CONQUER (naive recursion)
def fibonacci_dc(n, subproblems=None):
    """
    D&C treats each recursive call as independent.

    Tree structure:
                F(5)
               /    \
            F(4)    F(3)
           /   \    /   \
        F(3)  F(2) F(2) F(1)
        ...

    Problem: F(3) is computed twice, F(2) three times...
    Same subproblems are solved repeatedly.
    Time: O(2^n) because we don't recognize overlap.
    """
    if subproblems is None:
        subproblems = []
    subproblems.append(f"F({n})")
    if n <= 1:
        return n
    # Independent subproblems (but they're NOT actually independent!)
    left = fibonacci_dc(n - 1, subproblems)
    right = fibonacci_dc(n - 2, subproblems)
    return left + right


# DYNAMIC PROGRAMMING (memoization)
def fibonacci_dp(n, memo=None, subproblems=None):
    """
    DP recognizes overlapping subproblems.

    DAG structure:
    F(5) → F(4) → F(3) → F(2) → F(1)
       \      \      \      \
        +------+------+------+→ F(0)

    Each F(k) is computed exactly once.
    Time: O(n)
    """
    if memo is None:
        memo = {}
    if subproblems is None:
        subproblems = set()
    if n in memo:
        subproblems.add(f"F({n}) [cached]")
        return memo[n]
    subproblems.add(f"F({n}) [computed]")
    if n <= 1:
        memo[n] = n
        return n
    memo[n] = fibonacci_dp(n - 1, memo, subproblems) + fibonacci_dp(n - 2, memo, subproblems)
    return memo[n]


# COMPARISON
def compare_subproblem_generation():
    n = 6
    print("=" * 60)
    print("SUBPROBLEM GENERATION COMPARISON: Fibonacci(6)")
    print("=" * 60)

    # D&C subproblems
    dc_subproblems = []
    fibonacci_dc(n, dc_subproblems)
    print(f"\nDivide & Conquer subproblems: {len(dc_subproblems)}")
    print("  (many duplicates due to lack of overlap recognition)")

    # Count unique
    from collections import Counter
    dc_counts = Counter(dc_subproblems)
    print(f"  Unique: {len(dc_counts)}")
    print(f"  F(2) solved {dc_counts['F(2)']} times!")
    print(f"  F(3) solved {dc_counts['F(3)']} times!")

    # DP subproblems
    dp_subproblems = set()
    fibonacci_dp(n, {}, dp_subproblems)
    print(f"\nDynamic Programming subproblems: {len(dp_subproblems)}")
    print("  Each solved ONCE (optimal)")

    print(f"\nSpeedup factor: {len(dc_subproblems)} / {len(dp_subproblems)} = "
          f"{len(dc_subproblems) / len(dp_subproblems):.1f}x")


compare_subproblem_generation()
```

Each paradigm combines subsolutions differently, reflecting its underlying philosophy.
Greedy: Incremental Accumulation
Greedy algorithms build solutions incrementally by adding one element at a time:
Solution = {} → {g₁} → {g₁, g₂} → ... → {g₁, g₂, ..., gₙ}
Combination Properties: Additions are append-only and never revisited; combining costs O(1) per step, and every partial solution is a prefix of the final solution.
Why This Works:
The greedy choice property guarantees that each gᵢ is compatible with what came before AND with what will come after. There's never a need to "undo" or reconsider.
Example: Activity Selection
```python
solution = []
for activity in sorted_by_finish_time:
    if compatible(solution, activity):
        solution.append(activity)  # Simple append
return solution
```
The final solution is just the sequence of additions. No merging, no recursion unwind.
| Paradigm | Combination Method | Typical Complexity | Key Operation |
|---|---|---|---|
| Greedy | Sequential append | O(1) per step | Add element to solution |
| D&C | Merge phase | O(n) to O(n²) per level | Merge disjoint subsolutions |
| DP | Transition function | O(1) to O(n) per cell | max/min over subproblems |
Perhaps the most fundamental difference among paradigms is HOW decisions are made—how the algorithm explores the space of possible choices.
Greedy: Commit Without Looking Back
Greedy makes one irrevocable choice per step based solely on local information:
At step i: examine available options → select locally best → commit forever
This is extremely efficient but requires the greedy choice property to be correct.
D&C: Divide Deterministically
D&C typically has no "choice"—the division into subproblems is deterministic:
At each level: split input by predetermined rule (e.g., middle) → recurse on each part
Merge sort always splits at the middle; binary search always compares with the midpoint.
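Binary search makes the "no choice" point concrete—a sketch in which the split point is fixed by rule, and the comparison deterministically selects one half:

```python
def binary_search(xs, target):
    """No 'choice' in the greedy sense: the midpoint is determined by
    rule, and the comparison deterministically discards one half."""
    lo, hi = 0, len(xs)
    while lo < hi:
        mid = (lo + hi) // 2      # predetermined split rule
        if xs[mid] == target:
            return mid
        elif xs[mid] < target:
            lo = mid + 1          # continue in the right half
        else:
            hi = mid              # continue in the left half
    return -1

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3
```

There is nothing to "get wrong" at each step—correctness comes entirely from the invariant that the target, if present, stays inside `[lo, hi)`.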
DP: Explore All, Remember Best
DP systematically explores ALL relevant subproblems and combinations:
For each subproblem: try all valid transitions → record optimal → use for larger problems
This exhaustive exploration is why DP always finds optimal solutions (for problems with optimal substructure).
```python
"""
Comparing decision-making across paradigms.
Example: Rod cutting problem
"""

def rod_cutting_comparison(prices, n):
    """
    Rod cutting: Given a rod of length n and prices for each length,
    find the maximum revenue from cutting the rod.

    This problem can be approached with all three paradigms
    (though greedy won't always work).
    """
    print("=" * 60)
    print(f"ROD CUTTING: Length = {n}")
    print(f"Prices: {prices}")
    print("=" * 60)

    # GREEDY APPROACH (heuristic: cut off piece with best price/length ratio)
    def greedy_rod_cut(length):
        """
        Greedy decision: Always cut the piece with the highest
        value/length ratio.
        WARNING: This does NOT always give the optimal solution!
        """
        decisions = []
        total_value = 0
        remaining = length
        while remaining > 0:
            # Find best ratio among valid cuts
            best_cut = 0
            best_ratio = 0
            for cut in range(1, min(remaining + 1, len(prices))):
                ratio = prices[cut] / cut
                if ratio > best_ratio:
                    best_ratio = ratio
                    best_cut = cut
            decisions.append(f"Cut {best_cut} (ratio {best_ratio:.2f})")
            total_value += prices[best_cut]
            remaining -= best_cut
        return total_value, decisions

    greedy_value, greedy_decisions = greedy_rod_cut(n)
    print("\nGREEDY APPROACH (local optimal ratio):")
    print(f"  Decisions: {greedy_decisions}")
    print(f"  Total value: {greedy_value}")

    # DIVIDE & CONQUER APPROACH (exponential without memoization)
    def dc_rod_cut(length):
        """
        D&C: Try all possible first cuts, recurse on the remainder.
        Without memoization, this explores exponentially many subproblems.
        """
        if length == 0:
            return 0, []
        best_value = 0
        best_cuts = []
        # Try each possible first cut
        for first_cut in range(1, min(length + 1, len(prices))):
            rest_value, rest_cuts = dc_rod_cut(length - first_cut)
            total = prices[first_cut] + rest_value
            if total > best_value:
                best_value = total
                best_cuts = [first_cut] + rest_cuts
        return best_value, best_cuts

    dc_value, dc_cuts = dc_rod_cut(n)
    print("\nDIVIDE & CONQUER (exhaustive, no memo):")
    print(f"  Cuts: {dc_cuts}")
    print(f"  Total value: {dc_value}")
    print("  Note: Explores the same subproblems multiple times!")

    # DYNAMIC PROGRAMMING (optimal)
    def dp_rod_cut(length):
        """
        DP: Systematically solve all subproblems, each exactly once.
        dp[i] = maximum revenue for rod of length i
        """
        dp = [0] * (length + 1)
        decisions = [[] for _ in range(length + 1)]
        for i in range(1, length + 1):
            # Try all first cuts for length i
            for first_cut in range(1, min(i + 1, len(prices))):
                if prices[first_cut] + dp[i - first_cut] > dp[i]:
                    dp[i] = prices[first_cut] + dp[i - first_cut]
                    decisions[i] = [first_cut] + decisions[i - first_cut]
        return dp[length], decisions[length]

    dp_value, dp_cuts = dp_rod_cut(n)
    print("\nDYNAMIC PROGRAMMING (optimal):")
    print(f"  Cuts: {dp_cuts}")
    print(f"  Total value: {dp_value}")
    print(f"  Subproblems: exactly {n} (one per length)")

    # ANALYSIS
    print(f"\n{'=' * 60}")
    print("ANALYSIS:")
    print(f"  Greedy: {greedy_value} "
          f"({'Optimal' if greedy_value == dp_value else 'SUBOPTIMAL'})")
    print(f"  D&C:    {dc_value} (Optimal, but exponential time)")
    print(f"  DP:     {dp_value} (Optimal, polynomial time)")
    if greedy_value != dp_value:
        print(f"\n  Greedy missed {dp_value - greedy_value} units of value!")
        print("  This is because the greedy choice property doesn't hold.")


# Example where greedy fails
prices = [0, 1, 5, 8, 9, 10, 17, 17, 20]  # prices[i] = price for length i

# For length 4: greedy by ratio cuts 3 (ratio 2.67), then 1 → 8 + 1 = 9,
# but the optimal is 2 + 2 → 5 + 5 = 10. Greedy is suboptimal.
rod_cutting_comparison(prices, 4)

# For length 8: greedy cuts 6 (ratio 2.83), then 2 → 17 + 5 = 22,
# which happens to match the optimal here.
print("\n\n")
rod_cutting_comparison(prices, 8)
```

Given a problem with optimal substructure, how do you decide which paradigm to use? Here's a decision framework:
| Characteristic | Greedy | D&C | DP |
|---|---|---|---|
| Greedy choice property holds | ✅ Use | — | Overkill |
| Subproblems independent (disjoint) | — | ✅ Use | Works but no benefit from memo |
| Subproblems overlapping | ❌ Fails | Inefficient (exponential) | ✅ Use |
| Need to try all possibilities | ❌ Fails | ✅ (if independent) | ✅ Use |
| Optimal at each step = global optimal | ✅ Use | Need to verify | Guaranteed |
| Counting/enumeration problems | Rarely works | Sometimes | ✅ Use |
When in doubt, try approaches in this order: (1) Greedy—O(n) or O(n log n), (2) D&C—O(n log n) typically, (3) DP—O(n²) or O(n³) typically. Start simple and escalate only when needed. A greedy solution that works is better than a DP solution that's correct but slow.
Cross-Paradigm Problems:
Some problems can be solved by multiple paradigms. Understanding this reveals deep connections:
Merge Sort — Pure D&C. No greedy choice makes sense. Subproblems don't overlap (each element in exactly one half).
Activity Selection — Works with greedy (earliest finish time). Could use DP but it's overkill. D&C doesn't naturally apply.
Optimal BST — Requires DP. Greedy (most frequent at root) fails. D&C would work but subproblems overlap heavily.
Matrix Chain Multiplication — Pure DP. Greedy fails (no local choice guarantees a global optimum). The DP is essentially the D&C recursion over split points, plus memoization to handle the heavy overlap.
Maximum Subarray — All three work! Greedy (Kadane's O(n)), D&C (O(n log n)), DP (O(n)). Greedy is best here.
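To make the maximum-subarray comparison concrete, here is a sketch of two of the three approaches—Kadane's single pass (readable as either greedy or DP) and the classic D&C crossing-sum recursion (function names are illustrative):

```python
from itertools import accumulate

def max_subarray_kadane(xs):
    """O(n): the best sum ending here either extends or restarts."""
    best = cur = xs[0]
    for x in xs[1:]:
        cur = max(x, cur + x)   # DP transition; also reads as a greedy choice
        best = max(best, cur)
    return best

def max_subarray_dc(xs, lo=0, hi=None):
    """O(n log n): best is entirely left, entirely right, or crosses the middle."""
    if hi is None:
        hi = len(xs)
    if hi - lo == 1:
        return xs[lo]
    mid = (lo + hi) // 2
    # Best sums that touch the midpoint from each side (disjoint halves)
    left_cross = max(accumulate(xs[lo:mid][::-1]))   # best suffix ending at mid-1
    right_cross = max(accumulate(xs[mid:hi]))        # best prefix starting at mid
    return max(max_subarray_dc(xs, lo, mid),
               max_subarray_dc(xs, mid, hi),
               left_cross + right_cross)

xs = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
print(max_subarray_kadane(xs))  # 6  (subarray [4, -1, 2, 1])
print(max_subarray_dc(xs))      # 6
```

Both return the same optimum; the D&C version pays an extra log factor because its combine step must account for subarrays crossing the split.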
Let's formalize the differences using a unified mathematical framework.
Unified Problem Formulation:
Consider an optimization problem specified by a state space: an initial state, a set of actions available in each state, a transition function mapping a state and action to a successor state, and an objective function to maximize over terminal states.
```python
"""Formal mathematical comparison of paradigms."""

from abc import ABC, abstractmethod
from typing import Generic, List, Tuple, TypeVar

State = TypeVar('State')
Action = TypeVar('Action')


class OptimizationProblem(ABC, Generic[State, Action]):
    """Abstract base for optimization problems."""

    @abstractmethod
    def initial_state(self) -> State:
        """Return the initial state."""

    @abstractmethod
    def is_terminal(self, state: State) -> bool:
        """Check if state is a complete solution."""

    @abstractmethod
    def actions(self, state: State) -> List[Action]:
        """Return available actions from state."""

    @abstractmethod
    def transition(self, state: State, action: Action) -> State:
        """Apply action to state, return new state."""

    @abstractmethod
    def objective(self, state: State) -> float:
        """Evaluate the objective function."""


class GreedySolver(Generic[State, Action]):
    """
    Greedy exploration of state space.
    Key: At each state, select ONE action based on a local criterion.
    """

    def __init__(self, problem, greedy_criterion):
        self.problem = problem
        self.criterion = greedy_criterion  # Action -> score

    def solve(self) -> Tuple[State, List[Action]]:
        state = self.problem.initial_state()
        actions_taken = []
        while not self.problem.is_terminal(state):
            available = self.problem.actions(state)
            # LOCAL decision: best by criterion
            best_action = max(available, key=self.criterion)
            state = self.problem.transition(state, best_action)
            actions_taken.append(best_action)
        # Return final state and path
        return state, actions_taken

    @property
    def exploration_pattern(self) -> str:
        return "Single path through state space"


class DivideConquerSolver(Generic[State, Action]):
    """
    D&C exploration of state space.
    Key: Decompose state into independent subproblems, solve each, combine.
    """

    def __init__(self, problem, divide_fn, combine_fn):
        self.problem = problem
        self.divide = divide_fn    # State -> List[State]
        self.combine = combine_fn  # List[State] -> State

    def solve(self, state: State = None) -> State:
        if state is None:
            state = self.problem.initial_state()
        if self.problem.is_terminal(state):
            return state
        # DIVIDE: split into independent subproblems
        substates = self.divide(state)
        # CONQUER: solve each independently
        subsolutions = [self.solve(sub) for sub in substates]
        # COMBINE: merge subsolutions
        return self.combine(subsolutions)

    @property
    def exploration_pattern(self) -> str:
        return "Tree of independent subproblems"


class DPSolver(Generic[State, Action]):
    """
    Dynamic programming exploration of state space.
    Key: Systematically enumerate all reachable states, solve each once.
    """

    def __init__(self, problem, state_ordering, transition_fn):
        self.problem = problem
        self.ordering = state_ordering   # Order to fill DP table
        self.transition = transition_fn  # How to compute dp[state]

    def solve(self) -> Tuple[float, dict]:
        dp = {}  # Memoization table
        # Process states in dependency order
        for state in self.ordering():
            if self.problem.is_terminal(state):
                dp[state] = self.problem.objective(state)
            else:
                # Consider ALL transitions
                available = self.problem.actions(state)
                options = []
                for action in available:
                    next_state = self.problem.transition(state, action)
                    if next_state in dp:
                        options.append(dp[next_state])
                # Combine (e.g., max for maximization)
                dp[state] = max(options) if options else 0
        return dp[self.problem.initial_state()], dp

    @property
    def exploration_pattern(self) -> str:
        return "DAG of overlapping subproblems (each solved once)"


# Key difference summarized
def paradigm_comparison():
    print("PARADIGM COMPARISON: State Space Exploration")
    print("=" * 60)
    comparisons = [
        ("Aspect", "Greedy", "D&C", "DP"),
        ("States explored", "One path", "Tree (disjoint)", "DAG (overlapping)"),
        ("Decisions made", "One per step", "Deterministic split", "All possibilities"),
        ("Memory usage", "O(1 to n)", "O(log n) stack", "O(state space)"),
        ("Time complexity", "O(n) typical", "O(n log n) typical", "O(n² to n³) typical"),
        ("Guarantees optimum", "Only with property", "Always (if correct)", "Always"),
        ("Backtracking", "None", "At combine only", "Implicit in table"),
    ]
    # Print as table
    col_widths = [20, 15, 18, 20]
    for row in comparisons:
        print(" | ".join(f"{cell:<{col_widths[i]}}" for i, cell in enumerate(row)))
        if row[0] == "Aspect":
            print("-" * 80)


paradigm_comparison()
```

One of the most important skills is recognizing when a greedy approach won't work and smoothly transitioning to DP. Here's how to make that transition:
Signs Greedy Will Fail (Recap): small counterexamples beat the greedy answer; a locally best choice blocks a better combination later; no exchange argument goes through.
The Transition Process:
Step 1: Identify What State You Need
In greedy, state was simple (e.g., "current time" in activity selection). For DP, ask: "What information do I need to make optimal future decisions?"
Step 2: Define Recurrence
The greedy choice "take best local" becomes "try all choices, pick best total."
Greedy: solution[i] = locally_best(available[i])
DP: dp[i] = max(dp[j] + value(i,j) for all valid j)
Step 3: Order Subproblems
Ensure you solve subproblems before they're needed. Typically smaller/earlier first.
Step 4: Add Memoization
Store solutions to avoid recomputation. This is what turns exponential D&C into polynomial DP.
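A minimal sketch of this last step: the same Fibonacci recurrence from earlier, where `functools.lru_cache` supplies the memo table without changing the recursion at all.

```python
from functools import lru_cache

def fib_plain(n):
    """Exponential: recomputes overlapping subproblems."""
    return n if n <= 1 else fib_plain(n - 1) + fib_plain(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    """Identical recurrence; the cache makes each subproblem cost O(1)
    after its first computation, turning O(2^n) into O(n)."""
    return n if n <= 1 else fib_memo(n - 1) + fib_memo(n - 2)

print(fib_memo(30))  # 832040
```

The decorator is exactly "D&C plus memoization": the recursion tree collapses into a DAG because repeated calls become cache hits.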
```python
"""
Example: Job Scheduling with Profits
- n jobs, each with (start, end, profit)
- Can't do overlapping jobs
- Maximize total profit

Greedy (by profit) fails. Transition to DP.
"""

def job_scheduling_greedy_fails():
    """Construct a counterexample where greedy-by-profit is suboptimal."""
    jobs = [
        (0, 50, 60),   # Job 0: long, modest profit
        (0, 25, 50),   # Job 1: first half, good profit
        (25, 50, 50),  # Job 2: second half, good profit
    ]
    print("JOBS: (start, end, profit)")
    for i, (s, e, p) in enumerate(jobs):
        print(f"  Job {i}: [{s}, {e}) → profit {p}")

    print("\nGREEDY BY PROFIT:")
    print("  Take Job 0 (profit 60) first")
    print("  Job 1 conflicts with Job 0 → skip")
    print("  Job 2 conflicts with Job 0 → skip")
    print("  Total: 60")

    print("\nOPTIMAL:")
    print("  Take Job 1 + Job 2 (no conflict) → 100")
    print("\n  GREEDY FAILS! Local best (60) misses global best (100).")
    return jobs


def job_scheduling_dp(jobs):
    """
    DP solution to weighted job scheduling.

    STATE IDENTIFICATION:
    - dp[i] = maximum profit considering jobs [0..i], sorted by end time

    RECURRENCE:
    - dp[i] = max(dp[i-1],                      # exclude job i
                  profit[i] + dp[last_compat])  # include job i
      where last_compat is the last job ending at or before job i's start
    """
    if not jobs:
        return 0

    # Sort by end time
    jobs = sorted(jobs, key=lambda x: x[1])
    n = len(jobs)

    def find_last_non_conflict(i):
        """Binary search for the last job ending at or before job i's start."""
        target = jobs[i][0]  # Start time of job i
        left, right = 0, i - 1
        result = -1
        while left <= right:
            mid = (left + right) // 2
            if jobs[mid][1] <= target:
                result = mid
                left = mid + 1
            else:
                right = mid - 1
        return result

    # dp[i] = max profit considering jobs [0..i]
    dp = [0] * n
    dp[0] = jobs[0][2]  # Base case

    print("\nDP SOLUTION:")
    print(f"  dp[0] = {dp[0]} (just job 0)")
    for i in range(1, n):
        # Option 1: Don't include job i
        exclude = dp[i - 1]
        # Option 2: Include job i
        last_compat = find_last_non_conflict(i)
        include = jobs[i][2] + (dp[last_compat] if last_compat >= 0 else 0)
        dp[i] = max(exclude, include)
        print(f"  dp[{i}] = max(exclude={exclude}, include={include}) = {dp[i]}")

    return dp[n - 1]


# Run the transition example
jobs = job_scheduling_greedy_fails()
print("\n" + "=" * 60)
result = job_scheduling_dp(jobs)
print(f"\nDP RESULT: {result}")
```

We've explored the deep connections between greedy, divide-and-conquer, and dynamic programming. All three rest on optimal substructure but use it differently.
You now understand how optimal substructure connects three fundamental algorithm design paradigms. The final page explores why this structural property matters—not just for correctness proofs, but for developing intuition about which problems are tractable and which require more sophisticated approaches.