Data Structures & AlgorithmsState Design & Transition Functions

State Design & Transition Functions

LevelAdvanced

Duration90 mins

TopicState Design & Transition Functions

3 / 4

Writing Transition Equations

The Heart of Recurrence

You've defined your state. You know what information to track. Now comes the moment of truth: writing the transition equation.

The transition equation (also called the recurrence relation) is the mathematical expression that relates the value of one state to the values of other states. It's the formula that makes dynamic programming work—the precise statement of how to build larger solutions from smaller ones.

A correctly written transition equation is elegant, unambiguous, and directly translatable to code. A flawed one leads to bugs that are notoriously difficult to diagnose. In this page, we'll develop the skills to write transition equations that are correct by construction.

What You Will Learn

By the end of this page, you'll understand the anatomy of a transition equation, master the systematic process of deriving transitions from problem structure, recognize common transition patterns across problem types, and learn to verify transition correctness before implementation.

Anatomy of a Transition Equation

Every DP transition equation has the same fundamental structure:

dp[current_state] = aggregate(dp[previous_state₁] + cost₁,
                              dp[previous_state₂] + cost₂,
                              ...)

Let's break down each component:

1. Left Side: Current State

The state whose value we're computing
Must be a valid state in our state space
In top-down: this is the function parameter
In bottom-up: this is the cell we're filling

2. Right Side: Previous States

States we can transition FROM to reach the current state
Must all be 'smaller' subproblems (no cycles)
Each represents one possible decision or path

3. Aggregate Function

Combines values from previous states
min() for minimization problems
max() for maximization problems
sum() for counting problems
and()/or() for feasibility problems

4. Transition Costs

Additional value gained/lost in the transition
May be 0, 1, or problem-specific values
Added to/combined with previous state values

Transition Equation Components in Classic Problems
Problem	Current State	Previous States	Aggregate	Transition Cost
Fibonacci	dp[n]	dp[n-1], dp[n-2]	sum	+0 (just add values)
Coin Change	dp[amount]	dp[amount - coin] for each coin	min	+1 (one coin)
LCS	dp[i][j]	dp[i-1][j-1], dp[i-1][j], dp[i][j-1]	max	+1 if match, +0 else
Knapsack	dp[i][w]	dp[i-1][w], dp[i-1][w-wt[i]]	max	+0 or +val[i]
Unique Paths	dp[r][c]	dp[r-1][c], dp[r][c-1]	sum	×1 (paths multiply)
Edit Distance	dp[i][j]	dp[i-1][j-1], dp[i-1][j], dp[i][j-1]	min	+0 or +1

Optimization vs. Counting

The aggregate function reveals the problem type. Optimization (min/max): pick the best among options. Counting (sum): add up all options. Feasibility (or): any path works. Getting this wrong (using min when you should sum) is a common bug that produces wrong answers.

The Choice-Based Derivation Method

The most reliable way to derive transition equations is the choice-based method. It works by asking: "What choices can I make at this state, and what state does each choice lead to (or come from)?"

Step 1: Identify All Choices

At the current state, list every possible action or decision. These might be:

Take or don't take an item
Move right or move down
Use this character or skip it
Apply operator A, B, or C

Step 2: Map Each Choice to a Sub-State

For each choice, determine which smaller subproblem it connects to:

If I take item i, what's the remaining subproblem?
If I move right, where do I end up?
If I match these characters, what's left?

Step 3: Express the Value

For each choice, write the value as: value_of_choice = dp[resulting_state] + cost_of_choice

Step 4: Combine with Aggregate

Apply the appropriate aggregate function to all choices.

choice_based_derivation.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
# ===== EXAMPLE: 0/1 Knapsack Using Choice-Based Method =====
 
def derive_knapsack_transition():
    """
    Problem: Given n items with weights and values, maximize value 
             within capacity W.
    
    State: dp[i][w] = max value using items 0..i-1 with capacity w
    
    STEP 1: Identify Choices at state (i, w)
        Choice A: Don't take item i-1
        Choice B: Take item i-1 (only if weight[i-1] <= w)
    
    STEP 2: Map Choices to Sub-States
        Choice A → dp[i-1][w] 
                   (same capacity, one less item considered)
        Choice B → dp[i-1][w - weight[i-1]]
                   (reduced capacity, one less item considered)
    
    STEP 3: Express Values
        Value of A = dp[i-1][w] + 0 (no value gained)
        Value of B = dp[i-1][w - weight[i-1]] + value[i-1]
    
    STEP 4: Aggregate (maximization)
        dp[i][w] = max(
            dp[i-1][w],                               # Choice A
            dp[i-1][w - weight[i-1]] + value[i-1]     # Choice B
        )
        
        (with condition: Choice B only valid if weight[i-1] <= w)
    """
    pass
 
 
# ===== EXAMPLE: Edit Distance Using Choice-Based Method =====
 
def derive_edit_distance_transition():
    """
    Problem: Minimum operations (insert, delete, replace) to transform 
             word1 into word2.
    
    State: dp[i][j] = min edits to transform word1[0..i-1] to word2[0..j-1]
    
    STEP 1: Identify Choices at state (i, j)
        If word1[i-1] == word2[j-1]:
            Choice A: Match characters (no operation needed)
        
        If word1[i-1] != word2[j-1]:
            Choice B: Replace word1[i-1] with word2[j-1]
            Choice C: Delete word1[i-1]
            Choice D: Insert word2[j-1] into word1
    
    STEP 2: Map Choices to Sub-States
        Choice A → dp[i-1][j-1] (both advance)
        Choice B → dp[i-1][j-1] (both advance after replace)
        Choice C → dp[i-1][j]   (word1 advances, word2 stays)
        Choice D → dp[i][j-1]   (word1 stays, word2 advances)
    
    STEP 3: Express Values
        Value of A = dp[i-1][j-1] + 0 (no edit)
        Value of B = dp[i-1][j-1] + 1 (one edit)
        Value of C = dp[i-1][j] + 1   (one edit)
        Value of D = dp[i][j-1] + 1   (one edit)
    
    STEP 4: Aggregate (minimization)
        if word1[i-1] == word2[j-1]:
            dp[i][j] = dp[i-1][j-1]
        else:
            dp[i][j] = 1 + min(
                dp[i-1][j-1],  # Replace
                dp[i-1][j],    # Delete
                dp[i][j-1]     # Insert
            )
    """
    pass
 
 
# ===== EXAMPLE: Unique Paths (Counting) Using Choice-Based Method =====
 
def derive_unique_paths_transition():
    """
    Problem: Count paths from (0,0) to (m-1, n-1) moving only right/down.
    
    State: dp[r][c] = number of paths to reach cell (r, c)
    
    STEP 1: Identify Choices (how could we ARRIVE at (r, c)?)
        Choice A: Come from above (r-1, c)
        Choice B: Come from left (r, c-1)
    
    NOTE: For counting, we think "how did we get here?" not 
          "where can we go?"
    
    STEP 2: Map Choices to Sub-States
        Choice A → dp[r-1][c]
        Choice B → dp[r][c-1]
    
    STEP 3: Express Values
        Value of A = dp[r-1][c] × 1 (all paths to (r-1,c) extend here)
        Value of B = dp[r][c-1] × 1 (all paths to (r,c-1) extend here)
    
    STEP 4: Aggregate (counting = sum)
        dp[r][c] = dp[r-1][c] + dp[r][c-1]
        
        (with boundary conditions for first row/column)
    """
    pass

Forward vs. Backward Transitions

Transitions can be written in two equivalent ways, depending on your perspective:

Backward (Pull) Transitions: "To compute state S, what states do I need to look at?"

dp[S] = aggregate(dp[prev_states] + costs)

The current state pulls values from previous states
Natural for bottom-up: fill dp[S] by reading already-filled cells
Most common formulation

Forward (Push) Transitions: "From state S, what states can I reach?"

for each next_state reachable from S:
    dp[next_state] = aggregate(dp[next_state], dp[S] + cost)

The current state pushes its value to future states
Natural when transitions are conditional/complex
Useful when a state has many incoming transitions

When to Use Each:

Use Backward (Pull) When

•Each state has a small, fixed set of predecessors
•The transition logic is straightforward
•You're implementing standard bottom-up DP
•Examples: LCS, Edit Distance, Grid paths

Use Forward (Push) When

•A state can reach many successors
•Transitions are conditional on state values
•You're processing events in order
•Examples: Interval scheduling, event processing

forward_vs_backward.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
# ===== BACKWARD (PULL) TRANSITION: Coin Change =====
def coin_change_backward(coins: list[int], amount: int) -> int:
    """
    Backward formulation: for each amount, PULL from smaller amounts.
    
    dp[a] = min over all coins c of { dp[a - c] + 1 }
    
    "To make amount a, look at each coin c and check what it would cost
     to have already made (a - c), then add one coin."
    """
    dp = [float('inf')] * (amount + 1)
    dp[0] = 0
    
    for a in range(1, amount + 1):
        for coin in coins:
            if coin <= a and dp[a - coin] != float('inf'):
                dp[a] = min(dp[a], dp[a - coin] + 1)  # PULL from dp[a - coin]
    
    return dp[amount] if dp[amount] != float('inf') else -1
 
 
# ===== FORWARD (PUSH) TRANSITION: Coin Change =====
def coin_change_forward(coins: list[int], amount: int) -> int:
    """
    Forward formulation: for each amount, PUSH to larger amounts.
    
    for each coin c:
        dp[a + c] = min(dp[a + c], dp[a] + 1)
    
    "Having made amount a with cost dp[a], I can make amount (a + c)
     by adding one coin of value c."
    """
    dp = [float('inf')] * (amount + 1)
    dp[0] = 0
    
    for a in range(amount):
        if dp[a] == float('inf'):
            continue
        for coin in coins:
            if a + coin <= amount:
                dp[a + coin] = min(dp[a + coin], dp[a] + 1)  # PUSH to dp[a + coin]
    
    return dp[amount] if dp[amount] != float('inf') else -1
 
 
# ===== FORWARD MORE NATURAL: Jump Game II =====
def jump_game_forward(nums: list[int]) -> int:
    """
    Forward is more natural here: from each position, we know
    where we CAN jump to, not necessarily where we came from.
    
    From position i, we can reach positions i+1, i+2, ..., i+nums[i]
    """
    n = len(nums)
    if n <= 1:
        return 0
    
    dp = [float('inf')] * n
    dp[0] = 0
    
    for i in range(n):
        if dp[i] == float('inf'):
            continue
        # PUSH: from i, update all reachable positions
        for j in range(i + 1, min(i + nums[i] + 1, n)):
            dp[j] = min(dp[j], dp[i] + 1)
    
    return dp[n - 1] if dp[n - 1] != float('inf') else -1
 
 
# ===== BACKWARD MORE NATURAL: Stone Game =====
def stone_game_backward(piles: list[int]) -> bool:
    """
    Backward is natural for interval DP: we pull from smaller intervals.
    
    dp[i][j] = max stones player can get from piles[i..j]
    
    "To know the best score for interval [i, j], look at what happens
     after taking from either end."
    """
    n = len(piles)
    dp = [[0] * n for _ in range(n)]
    
    # Base case: single pile
    for i in range(n):
        dp[i][i] = piles[i]
    
    # Expand intervals
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            # PULL from smaller intervals [i+1, j] and [i, j-1]
            take_left = piles[i] + (sum(piles[i+1:j+1]) - dp[i+1][j])
            take_right = piles[j] + (sum(piles[i:j]) - dp[i][j-1])
            dp[i][j] = max(take_left, take_right)
    
    total = sum(piles)
    return dp[0][n-1] > total - dp[0][n-1]

Both Are Equivalent

Forward and backward formulations compute the same answer. Choose the one that feels more natural for the problem and leads to cleaner code. If you're stuck with one, try the other—sometimes the alternative formulation is dramatically clearer.

Common Transition Patterns

Certain transition patterns recur across many DP problems. Recognizing these patterns accelerates your problem-solving dramatically.

Essential Transition Patterns

•Linear Recurrence: dp[i] = f(dp[i-1], dp[i-2], ..., dp[i-k]) — Fixed look-back at previous k states. Examples: Fibonacci, House Robber.
•Take/Skip Pattern: dp[i] = max(take(dp[i-2]) + gain, skip(dp[i-1])) — Binary choice per element. Examples: House Robber, Knapsack (per item).
•Two-Pointer Expansion: dp[i][j] = f(dp[i+1][j-1], dp[i][j-1], dp[i+1][j]) — Build intervals from smaller intervals. Examples: Palindrome, Matrix Chain.
•Grid Navigation: dp[r][c] = f(dp[r-1][c], dp[r][c-1]) — Each cell depends on left/up neighbors. Examples: Grid paths, Minimum path sum.
•Sequence Alignment: dp[i][j] = f(dp[i-1][j-1], dp[i-1][j], dp[i][j-1]) — 2D matching on sequences. Examples: LCS, Edit Distance.
•State Machine: Different equations per state variable. Examples: Stock trading with hold/not-hold states.

transition_patterns.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
# ===== PATTERN: Linear Recurrence =====
def fibonacci_pattern(n: int) -> int:
    """
    Pattern: dp[i] = dp[i-1] + dp[i-2]
    
    Fixed look-back depth (2), linear state space.
    Many problems reduce to this pattern with different coefficients.
    """
    if n <= 1:
        return n
    dp = [0] * (n + 1)
    dp[0], dp[1] = 0, 1
    for i in range(2, n + 1):
        dp[i] = dp[i-1] + dp[i-2]
    return dp[n]
 
 
# ===== PATTERN: Take/Skip =====
def house_robber_pattern(nums: list[int]) -> int:
    """
    Pattern: dp[i] = max(dp[i-1], dp[i-2] + nums[i])
    
    Binary choice at each position:
        - Skip: carry forward previous best
        - Take: use value, skip to i-2 (due to constraint)
    """
    if not nums:
        return 0
    if len(nums) == 1:
        return nums[0]
    
    dp = [0] * len(nums)
    dp[0] = nums[0]
    dp[1] = max(nums[0], nums[1])
    
    for i in range(2, len(nums)):
        dp[i] = max(dp[i-1], dp[i-2] + nums[i])
    
    return dp[-1]
 
 
# ===== PATTERN: Interval DP =====
def longest_palindrome_subsequence(s: str) -> int:
    """
    Pattern: dp[i][j] = f(dp[i+1][j-1], dp[i][j-1], dp[i+1][j])
    
    Build larger intervals from smaller ones.
    Process in order of increasing interval length.
    """
    n = len(s)
    dp = [[0] * n for _ in range(n)]
    
    # Base case: single characters
    for i in range(n):
        dp[i][i] = 1
    
    # Expand intervals
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            if s[i] == s[j]:
                dp[i][j] = dp[i+1][j-1] + 2
            else:
                dp[i][j] = max(dp[i+1][j], dp[i][j-1])
    
    return dp[0][n-1]
 
 
# ===== PATTERN: State Machine =====
def stock_with_cooldown_pattern(prices: list[int]) -> int:
    """
    Pattern: Multiple states with different transitions
    
    States: hold, sold (in cooldown), rest (can buy)
    Transitions form a state machine:
        rest → hold (buy)
        hold → sold (sell)
        sold → rest (cooldown ends)
    """
    if not prices:
        return 0
    
    n = len(prices)
    hold = [0] * n   # Holding stock
    sold = [0] * n   # Just sold, in cooldown
    rest = [0] * n   # Not holding, can buy
    
    hold[0] = -prices[0]
    sold[0] = 0
    rest[0] = 0
    
    for i in range(1, n):
        # Each state has its own transition equation:
        hold[i] = max(hold[i-1], rest[i-1] - prices[i])
        sold[i] = hold[i-1] + prices[i]
        rest[i] = max(rest[i-1], sold[i-1])
    
    return max(sold[-1], rest[-1])
 
 
# ===== PATTERN: Sequence Alignment =====
def lcs_pattern(text1: str, text2: str) -> int:
    """
    Pattern: dp[i][j] depends on dp[i-1][j-1], dp[i-1][j], dp[i][j-1]
    
    The diagonal, left, and up dependencies are characteristic of
    two-sequence alignment problems.
    """
    m, n = len(text1), len(text2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if text1[i-1] == text2[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1  # Diagonal + match
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])  # Left or up
    
    return dp[m][n]

Writing Correct Transitions: A Systematic Process

Follow this systematic process to derive correct transition equations:

Step 1: Write the Semantic Meaning

Before writing any equations, write in plain English what dp[state] means. Be precise.

dp[i][w] = "the maximum value obtainable by considering items 0
           through i-1 with a knapsack capacity of w"

Step 2: Enumerate Possible Last Decisions

To reach the current state, what was the last action or decision? List all possibilities.

Last decision for dp[i][w]:
  (a) Did NOT take item i-1
  (b) DID take item i-1 (if it fits)

Step 3: Write Each Case's Contribution

For each last decision, express the value in terms of other states.

(a) Didn't take: dp[i-1][w] + 0
(b) Took:        dp[i-1][w - weight[i-1]] + value[i-1]

Step 4: Combine with Correct Aggregate

Use max for maximization, min for minimization, sum for counting.

dp[i][w] = max(
    dp[i-1][w],
    dp[i-1][w - weight[i-1]] + value[i-1]
) if weight[i-1] <= w else dp[i-1][w]

Common Transition Errors

Error 1: Wrong aggregate — Using max when you should sum (or vice versa). Ask: am I optimizing or counting?

Error 2: Off-by-one indices — Does dp[i] include element i or not? Be consistent.

Error 3: Missing cases — Forgetting a valid transition path leads to suboptimal answers.

Error 4: Invalid state access — Accessing dp[-1] or dp[n+1] without boundary checks.

systematic_derivation.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
# ===== COMPLETE DERIVATION: Word Break Problem =====
 
def word_break_derivation(s: str, wordDict: list[str]) -> bool:
    """
    STEP 1: Semantic Meaning
        dp[i] = True if s[0..i-1] can be segmented into dictionary words
        dp[i] = False otherwise
        
        What does dp[i] represent? "Can the first i characters be segmented?"
    
    STEP 2: Last Decisions
        If dp[i] is True, what was the last word used?
        The last word must end at position i-1.
        So the last word is s[j:i] for some j where 0 <= j < i.
        
        For dp[i] to be True:
            - dp[j] must be True (prefix s[0..j-1] is segmentable)
            - s[j:i] must be in wordDict (the "bridge" is a valid word)
    
    STEP 3: Case Contribution
        dp[i] = True if there exists j such that:
            dp[j] = True AND s[j:i] in wordDict
        
        dp[i] = False otherwise (no valid segmentation)
    
    STEP 4: Aggregate
        Feasibility problem → use logical OR
        dp[i] = any(dp[j] and s[j:i] in wordDict for j in range(i))
    """
    n = len(s)
    word_set = set(wordDict)
    
    dp = [False] * (n + 1)
    dp[0] = True  # Empty string is trivially valid
    
    for i in range(1, n + 1):
        for j in range(i):
            if dp[j] and s[j:i] in word_set:
                dp[i] = True
                break  # Early termination for feasibility
    
    return dp[n]
 
 
# ===== COMPLETE DERIVATION: Decode Ways =====
 
def decode_ways_derivation(s: str) -> int:
    """
    STEP 1: Semantic Meaning
        dp[i] = number of ways to decode s[0..i-1]
        
        "How many valid letter sequences decode the first i digits?"
    
    STEP 2: Last Decisions
        At position i-1 (0-indexed), we decoded either:
        (a) One digit (s[i-1]) as a letter (1-9 → A-I)  
        (b) Two digits (s[i-2:i]) as a letter (10-26 → J-Z)
    
    STEP 3: Case Contribution
        (a) If s[i-1] in '123456789':
            Ways = dp[i-1] (all ways to decode s[0..i-2], extend each by one letter)
        (b) If s[i-2:i] forms valid 2-digit:
            Ways = dp[i-2] (all ways to decode s[0..i-3], extend each by one letter)
    
    STEP 4: Aggregate
        Counting problem → use SUM
        dp[i] = (contribution from case a) + (contribution from case b)
    """
    if not s or s[0] == '0':
        return 0
    
    n = len(s)
    dp = [0] * (n + 1)
    dp[0] = 1  # Base: empty string has one "decoding" (nothing)
    dp[1] = 1  # First character (already validated non-zero)
    
    for i in range(2, n + 1):
        # Case (a): single digit s[i-1]
        if s[i-1] != '0':
            dp[i] += dp[i-1]
        
        # Case (b): two digits s[i-2:i]
        two_digit = int(s[i-2:i])
        if 10 <= two_digit <= 26:
            dp[i] += dp[i-2]
    
    return dp[n]
 
 
# ===== COMPLETE DERIVATION: Minimum Path Sum =====
 
def min_path_sum_derivation(grid: list[list[int]]) -> int:
    """
    STEP 1: Semantic Meaning
        dp[r][c] = minimum path sum to reach cell (r, c) from (0, 0)
        
        "What's the smallest sum path to get here?"
    
    STEP 2: Last Decisions
        To reach (r, c), we came from:
        (a) Cell above: (r-1, c)
        (b) Cell to left: (r, c-1)
    
    STEP 3: Case Contribution
        (a) From above: dp[r-1][c] + grid[r][c]
        (b) From left:  dp[r][c-1] + grid[r][c]
    
    STEP 4: Aggregate
        Minimization → use MIN
        dp[r][c] = min(dp[r-1][c], dp[r][c-1]) + grid[r][c]
        
        Boundary: First row can only come from left
                  First column can only come from above
    """
    m, n = len(grid), len(grid[0])
    dp = [[0] * n for _ in range(m)]
    
    dp[0][0] = grid[0][0]
    
    # First row: only from left
    for c in range(1, n):
        dp[0][c] = dp[0][c-1] + grid[0][c]
    
    # First column: only from above
    for r in range(1, m):
        dp[r][0] = dp[r-1][0] + grid[r][0]
    
    # Interior: min of both
    for r in range(1, m):
        for c in range(1, n):
            dp[r][c] = min(dp[r-1][c], dp[r][c-1]) + grid[r][c]
    
    return dp[m-1][n-1]

Verifying Transition Correctness

Before implementing, verify your transition equation with these checks:

Check 1: Acyclicity

The state graph must be a DAG. Each transition must go to a "smaller" state. If dp[i] depends on dp[i] (directly or indirectly through cycles), the recurrence is invalid.

Check 2: Completeness

Every possible way to reach the current state should be accounted for. Missing a case means your solution finds suboptimal answers.

Check 3: Correct Aggregate

Optimization (find best) → max/min
Counting (how many) → sum
Feasibility (can we?) → any/or
All valid (every path works?) → all/and

Check 4: Base Case Coverage

The smallest subproblems (base cases) must be answerable without recursion, and all recursive paths must eventually reach base cases.

verification_examples.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
# ===== VERIFICATION EXAMPLE: Matrix Chain Multiplication =====
 
def verify_matrix_chain():
    """
    State: dp[i][j] = min cost to multiply matrices i through j
    
    CHECK 1: Acyclicity
        dp[i][j] depends on dp[i][k] and dp[k+1][j] where i <= k < j
        Since (k-i) < (j-i) and (j-k-1) < (j-i), we always access
        smaller intervals. ✓ Acyclic
    
    CHECK 2: Completeness
        For interval [i, j], we can split at any k from i to j-1.
        We try all k values: no splits are missed. ✓ Complete
    
    CHECK 3: Correct Aggregate
        We want minimum cost → use min. ✓ Correct
        
    CHECK 4: Base Cases
        dp[i][i] = 0 (single matrix, no multiplication needed)
        All intervals of length 1 are covered. ✓ Base cases defined
    """
    pass
 
 
def matrix_chain_order(dims: list[int]) -> int:
    """
    dims[i-1] × dims[i] gives dimensions of matrix i.
    Find minimum multiplications to compute product of all matrices.
    
    Transition: dp[i][j] = min over k of { dp[i][k] + dp[k+1][j] + cost(split at k) }
    """
    n = len(dims) - 1  # Number of matrices
    dp = [[0] * n for _ in range(n)]
    
    # Base case: single matrix (length 1 intervals)
    for i in range(n):
        dp[i][i] = 0
    
    # Fill by increasing interval length
    for length in range(2, n + 1):  # length of interval
        for i in range(n - length + 1):
            j = i + length - 1
            dp[i][j] = float('inf')
            for k in range(i, j):
                # Cost = left side + right side + cost of multiplying results
                cost = dp[i][k] + dp[k+1][j] + dims[i] * dims[k+1] * dims[j+1]
                dp[i][j] = min(dp[i][j], cost)
    
    return dp[0][n-1]
 
 
# ===== HAND-TRACING: The Ultimate Verification =====
def hand_trace_coin_change():
    """
    For simple inputs, trace the DP table by hand.
    
    coins = [1, 2, 5], amount = 6
    
    Initial: dp = [0, inf, inf, inf, inf, inf, inf]
    
    a=1: coin 1: dp[1] = min(inf, dp[0]+1) = 1
    a=2: coin 1: dp[2] = min(inf, dp[1]+1) = 2
         coin 2: dp[2] = min(2, dp[0]+1) = 1
    a=3: coin 1: dp[3] = min(inf, dp[2]+1) = 2
         coin 2: dp[3] = min(2, dp[1]+1) = 2
    a=4: coin 1: dp[4] = min(inf, dp[3]+1) = 3
         coin 2: dp[4] = min(3, dp[2]+1) = 2
    a=5: coin 1: dp[5] = min(inf, dp[4]+1) = 3
         coin 2: dp[5] = min(3, dp[3]+1) = 3
         coin 5: dp[5] = min(3, dp[0]+1) = 1
    a=6: coin 1: dp[6] = min(inf, dp[5]+1) = 2
         coin 2: dp[6] = min(2, dp[4]+1) = 2
         coin 5: dp[6] = min(2, dp[1]+1) = 2
    
    Final: dp = [0, 1, 1, 2, 2, 1, 2]
    Answer: dp[6] = 2 (either 5+1 or 2+2+2)
    
    Hand tracing confirms the transition is correct!
    """
    coins = [1, 2, 5]
    amount = 6
    dp = [float('inf')] * (amount + 1)
    dp[0] = 0
    
    for a in range(1, amount + 1):
        for coin in coins:
            if coin <= a and dp[a - coin] != float('inf'):
                dp[a] = min(dp[a], dp[a - coin] + 1)
    
    return dp[amount]  # Returns 2

The Hand-Trace Method

For any new transition equation, take a small example and fill the DP table by hand. This catches indexing errors, missing cases, and logic mistakes before you debug compiled code. If hand-tracing produces the expected answer, you've likely got it right.

Summary: Mastering Transition Equations

We've covered the art and science of writing DP transitions. Let's consolidate:

Key Takeaways

•Transitions have a standard anatomy — Current state = aggregate of (previous states + costs). Left side is what we compute; right side is how we compute it.
•Use choice-based derivation — Ask "what choices led here?" enumerate them, express each as a term, combine with the appropriate aggregate.
•Forward vs. backward are equivalent — Pull (backward) fills the current state from predecessors. Push (forward) updates successors from current. Choose what's natural.
•Recognize common patterns — Linear recurrence, take/skip, interval expansion, grid navigation, sequence alignment, state machine. Pattern recognition accelerates derivation.
•Follow the systematic 4-step process — (1) Semantic meaning, (2) Last decisions, (3) Case contributions, (4) Aggregate correctly.
•Verify before implementing — Check acyclicity, completeness, correct aggregate, and base case coverage. Hand-trace on small inputs.

What's Next:

With transitions written, we face one more challenge: handling edge cases. The next page covers boundary conditions, empty inputs, special values, and the subtle cases that often break otherwise-correct DP implementations.

Page Complete

You now understand transition equation anatomy, can derive transitions using the choice-based method, recognize when to use forward vs. backward formulations, and know how to verify correctness. Next, we'll tackle the edge cases that make DP implementations robust.

3 / 4

Loading learning content...

Data Structures & AlgorithmsState Design & Transition Functions

State Design & Transition Functions

LevelAdvanced

Duration90 mins

TopicState Design & Transition Functions

3 / 4

Writing Transition Equations

The Heart of Recurrence

You've defined your state. You know what information to track. Now comes the moment of truth: writing the transition equation.

What You Will Learn

Anatomy of a Transition Equation

Every DP transition equation has the same fundamental structure:

dp[current_state] = aggregate(dp[previous_state₁] + cost₁,
                              dp[previous_state₂] + cost₂,
                              ...)

Let's break down each component:

1. Left Side: Current State

The state whose value we're computing
Must be a valid state in our state space
In top-down: this is the function parameter
In bottom-up: this is the cell we're filling

2. Right Side: Previous States

States we can transition FROM to reach the current state
Must all be 'smaller' subproblems (no cycles)
Each represents one possible decision or path

3. Aggregate Function

Combines values from previous states
min() for minimization problems
max() for maximization problems
sum() for counting problems
and()/or() for feasibility problems

4. Transition Costs

Additional value gained/lost in the transition
May be 0, 1, or problem-specific values
Added to/combined with previous state values

Transition Equation Components in Classic Problems
Problem	Current State	Previous States	Aggregate	Transition Cost
Fibonacci	dp[n]	dp[n-1], dp[n-2]	sum	+0 (just add values)
Coin Change	dp[amount]	dp[amount - coin] for each coin	min	+1 (one coin)
LCS	dp[i][j]	dp[i-1][j-1], dp[i-1][j], dp[i][j-1]	max	+1 if match, +0 else
Knapsack	dp[i][w]	dp[i-1][w], dp[i-1][w-wt[i]]	max	+0 or +val[i]
Unique Paths	dp[r][c]	dp[r-1][c], dp[r][c-1]	sum	×1 (paths multiply)
Edit Distance	dp[i][j]	dp[i-1][j-1], dp[i-1][j], dp[i][j-1]	min	+0 or +1

Optimization vs. Counting

The Choice-Based Derivation Method

Step 1: Identify All Choices

At the current state, list every possible action or decision. These might be:

Take or don't take an item
Move right or move down
Use this character or skip it
Apply operator A, B, or C

Step 2: Map Each Choice to a Sub-State

For each choice, determine which smaller subproblem it connects to:

If I take item i, what's the remaining subproblem?
If I move right, where do I end up?
If I match these characters, what's left?

Step 3: Express the Value

For each choice, write the value as: value_of_choice = dp[resulting_state] + cost_of_choice

Step 4: Combine with Aggregate

Apply the appropriate aggregate function to all choices.

choice_based_derivation.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
# ===== EXAMPLE: 0/1 Knapsack Using Choice-Based Method =====
 
def derive_knapsack_transition():
    """
    Problem: Given n items with weights and values, maximize value 
             within capacity W.
    
    State: dp[i][w] = max value using items 0..i-1 with capacity w
    
    STEP 1: Identify Choices at state (i, w)
        Choice A: Don't take item i-1
        Choice B: Take item i-1 (only if weight[i-1] <= w)
    
    STEP 2: Map Choices to Sub-States
        Choice A → dp[i-1][w] 
                   (same capacity, one less item considered)
        Choice B → dp[i-1][w - weight[i-1]]
                   (reduced capacity, one less item considered)
    
    STEP 3: Express Values
        Value of A = dp[i-1][w] + 0 (no value gained)
        Value of B = dp[i-1][w - weight[i-1]] + value[i-1]
    
    STEP 4: Aggregate (maximization)
        dp[i][w] = max(
            dp[i-1][w],                               # Choice A
            dp[i-1][w - weight[i-1]] + value[i-1]     # Choice B
        )
        
        (with condition: Choice B only valid if weight[i-1] <= w)
    """
    pass
 
 
# ===== EXAMPLE: Edit Distance Using Choice-Based Method =====
 
def derive_edit_distance_transition():
    """
    Problem: Minimum operations (insert, delete, replace) to transform 
             word1 into word2.
    
    State: dp[i][j] = min edits to transform word1[0..i-1] to word2[0..j-1]
    
    STEP 1: Identify Choices at state (i, j)
        If word1[i-1] == word2[j-1]:
            Choice A: Match characters (no operation needed)
        
        If word1[i-1] != word2[j-1]:
            Choice B: Replace word1[i-1] with word2[j-1]
            Choice C: Delete word1[i-1]
            Choice D: Insert word2[j-1] into word1
    
    STEP 2: Map Choices to Sub-States
        Choice A → dp[i-1][j-1] (both advance)
        Choice B → dp[i-1][j-1] (both advance after replace)
        Choice C → dp[i-1][j]   (word1 advances, word2 stays)
        Choice D → dp[i][j-1]   (word1 stays, word2 advances)
    
    STEP 3: Express Values
        Value of A = dp[i-1][j-1] + 0 (no edit)
        Value of B = dp[i-1][j-1] + 1 (one edit)
        Value of C = dp[i-1][j] + 1   (one edit)
        Value of D = dp[i][j-1] + 1   (one edit)
    
    STEP 4: Aggregate (minimization)
        if word1[i-1] == word2[j-1]:
            dp[i][j] = dp[i-1][j-1]
        else:
            dp[i][j] = 1 + min(
                dp[i-1][j-1],  # Replace
                dp[i-1][j],    # Delete
                dp[i][j-1]     # Insert
            )
    """
    pass
 
 
# ===== EXAMPLE: Unique Paths (Counting) Using Choice-Based Method =====
 
def derive_unique_paths_transition():
    """
    Problem: Count paths from (0,0) to (m-1, n-1) moving only right/down.
    
    State: dp[r][c] = number of paths to reach cell (r, c)
    
    STEP 1: Identify Choices (how could we ARRIVE at (r, c)?)
        Choice A: Come from above (r-1, c)
        Choice B: Come from left (r, c-1)
    
    NOTE: For counting, we think "how did we get here?" not 
          "where can we go?"
    
    STEP 2: Map Choices to Sub-States
        Choice A → dp[r-1][c]
        Choice B → dp[r][c-1]
    
    STEP 3: Express Values
        Value of A = dp[r-1][c] × 1 (all paths to (r-1,c) extend here)
        Value of B = dp[r][c-1] × 1 (all paths to (r,c-1) extend here)
    
    STEP 4: Aggregate (counting = sum)
        dp[r][c] = dp[r-1][c] + dp[r][c-1]
        
        (with boundary conditions for first row/column)
    """
    pass

Forward vs. Backward Transitions

Transitions can be written in two equivalent ways, depending on your perspective:

Backward (Pull) Transitions: "To compute state S, what states do I need to look at?"

dp[S] = aggregate(dp[prev_states] + costs)

The current state pulls values from previous states
Natural for bottom-up: fill dp[S] by reading already-filled cells
Most common formulation

Forward (Push) Transitions: "From state S, what states can I reach?"

for each next_state reachable from S:
    dp[next_state] = aggregate(dp[next_state], dp[S] + cost)

The current state pushes its value to future states
Natural when transitions are conditional/complex
Useful when a state has many incoming transitions

When to Use Each:

Use Backward (Pull) When

•Each state has a small, fixed set of predecessors
•The transition logic is straightforward
•You're implementing standard bottom-up DP
•Examples: LCS, Edit Distance, Grid paths

Use Forward (Push) When

•A state can reach many successors
•Transitions are conditional on state values
•You're processing events in order
•Examples: Interval scheduling, event processing

forward_vs_backward.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
# ===== BACKWARD (PULL) TRANSITION: Coin Change =====
def coin_change_backward(coins: list[int], amount: int) -> int:
    """
    Backward formulation: for each amount, PULL from smaller amounts.
    
    dp[a] = min over all coins c of { dp[a - c] + 1 }
    
    "To make amount a, look at each coin c and check what it would cost
     to have already made (a - c), then add one coin."
    """
    dp = [float('inf')] * (amount + 1)
    dp[0] = 0
    
    for a in range(1, amount + 1):
        for coin in coins:
            if coin <= a and dp[a - coin] != float('inf'):
                dp[a] = min(dp[a], dp[a - coin] + 1)  # PULL from dp[a - coin]
    
    return dp[amount] if dp[amount] != float('inf') else -1
 
 
# ===== FORWARD (PUSH) TRANSITION: Coin Change =====
def coin_change_forward(coins: list[int], amount: int) -> int:
    """
    Forward formulation: for each amount, PUSH to larger amounts.
    
    for each coin c:
        dp[a + c] = min(dp[a + c], dp[a] + 1)
    
    "Having made amount a with cost dp[a], I can make amount (a + c)
     by adding one coin of value c."
    """
    dp = [float('inf')] * (amount + 1)
    dp[0] = 0
    
    for a in range(amount):
        if dp[a] == float('inf'):
            continue
        for coin in coins:
            if a + coin <= amount:
                dp[a + coin] = min(dp[a + coin], dp[a] + 1)  # PUSH to dp[a + coin]
    
    return dp[amount] if dp[amount] != float('inf') else -1
 
 
# ===== FORWARD MORE NATURAL: Jump Game II =====
def jump_game_forward(nums: list[int]) -> int:
    """
    Forward is more natural here: from each position, we know
    where we CAN jump to, not necessarily where we came from.
    
    From position i, we can reach positions i+1, i+2, ..., i+nums[i]
    """
    n = len(nums)
    if n <= 1:
        return 0
    
    dp = [float('inf')] * n
    dp[0] = 0
    
    for i in range(n):
        if dp[i] == float('inf'):
            continue
        # PUSH: from i, update all reachable positions
        for j in range(i + 1, min(i + nums[i] + 1, n)):
            dp[j] = min(dp[j], dp[i] + 1)
    
    return dp[n - 1] if dp[n - 1] != float('inf') else -1
 
 
# ===== BACKWARD MORE NATURAL: Stone Game =====
def stone_game_backward(piles: list[int]) -> bool:
    """
    Backward is natural for interval DP: we pull from smaller intervals.
    
    dp[i][j] = max stones player can get from piles[i..j]
    
    "To know the best score for interval [i, j], look at what happens
     after taking from either end."
    """
    n = len(piles)
    dp = [[0] * n for _ in range(n)]
    
    # Base case: single pile
    for i in range(n):
        dp[i][i] = piles[i]
    
    # Expand intervals
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            # PULL from smaller intervals [i+1, j] and [i, j-1]
            take_left = piles[i] + (sum(piles[i+1:j+1]) - dp[i+1][j])
            take_right = piles[j] + (sum(piles[i:j]) - dp[i][j-1])
            dp[i][j] = max(take_left, take_right)
    
    total = sum(piles)
    return dp[0][n-1] > total - dp[0][n-1]

Both Are Equivalent

Common Transition Patterns

Certain transition patterns recur across many DP problems. Recognizing these patterns accelerates your problem-solving dramatically.

Essential Transition Patterns

•Linear Recurrence: dp[i] = f(dp[i-1], dp[i-2], ..., dp[i-k]) — Fixed look-back at previous k states. Examples: Fibonacci, House Robber.
•Take/Skip Pattern: dp[i] = max(take(dp[i-2]) + gain, skip(dp[i-1])) — Binary choice per element. Examples: House Robber, Knapsack (per item).
•Two-Pointer Expansion: dp[i][j] = f(dp[i+1][j-1], dp[i][j-1], dp[i+1][j]) — Build intervals from smaller intervals. Examples: Palindrome, Matrix Chain.
•Grid Navigation: dp[r][c] = f(dp[r-1][c], dp[r][c-1]) — Each cell depends on left/up neighbors. Examples: Grid paths, Minimum path sum.
•Sequence Alignment: dp[i][j] = f(dp[i-1][j-1], dp[i-1][j], dp[i][j-1]) — 2D matching on sequences. Examples: LCS, Edit Distance.
•State Machine: Different equations per state variable. Examples: Stock trading with hold/not-hold states.

transition_patterns.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
# ===== PATTERN: Linear Recurrence =====
def fibonacci_pattern(n: int) -> int:
    """
    Pattern: dp[i] = dp[i-1] + dp[i-2]
    
    Fixed look-back depth (2), linear state space.
    Many problems reduce to this pattern with different coefficients.
    """
    if n <= 1:
        return n
    dp = [0] * (n + 1)
    dp[0], dp[1] = 0, 1
    for i in range(2, n + 1):
        dp[i] = dp[i-1] + dp[i-2]
    return dp[n]
 
 
# ===== PATTERN: Take/Skip =====
def house_robber_pattern(nums: list[int]) -> int:
    """
    Pattern: dp[i] = max(dp[i-1], dp[i-2] + nums[i])
    
    Binary choice at each position:
        - Skip: carry forward previous best
        - Take: use value, skip to i-2 (due to constraint)
    """
    if not nums:
        return 0
    if len(nums) == 1:
        return nums[0]
    
    dp = [0] * len(nums)
    dp[0] = nums[0]
    dp[1] = max(nums[0], nums[1])
    
    for i in range(2, len(nums)):
        dp[i] = max(dp[i-1], dp[i-2] + nums[i])
    
    return dp[-1]
 
 
# ===== PATTERN: Interval DP =====
def longest_palindrome_subsequence(s: str) -> int:
    """
    Pattern: dp[i][j] = f(dp[i+1][j-1], dp[i][j-1], dp[i+1][j])
    
    Build larger intervals from smaller ones.
    Process in order of increasing interval length.
    """
    n = len(s)
    dp = [[0] * n for _ in range(n)]
    
    # Base case: single characters
    for i in range(n):
        dp[i][i] = 1
    
    # Expand intervals
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            if s[i] == s[j]:
                dp[i][j] = dp[i+1][j-1] + 2
            else:
                dp[i][j] = max(dp[i+1][j], dp[i][j-1])
    
    return dp[0][n-1]
 
 
# ===== PATTERN: State Machine =====
def stock_with_cooldown_pattern(prices: list[int]) -> int:
    """
    Pattern: Multiple states with different transitions
    
    States: hold, sold (in cooldown), rest (can buy)
    Transitions form a state machine:
        rest → hold (buy)
        hold → sold (sell)
        sold → rest (cooldown ends)
    """
    if not prices:
        return 0
    
    n = len(prices)
    hold = [0] * n   # Holding stock
    sold = [0] * n   # Just sold, in cooldown
    rest = [0] * n   # Not holding, can buy
    
    hold[0] = -prices[0]
    sold[0] = 0
    rest[0] = 0
    
    for i in range(1, n):
        # Each state has its own transition equation:
        hold[i] = max(hold[i-1], rest[i-1] - prices[i])
        sold[i] = hold[i-1] + prices[i]
        rest[i] = max(rest[i-1], sold[i-1])
    
    return max(sold[-1], rest[-1])
 
 
# ===== PATTERN: Sequence Alignment =====
def lcs_pattern(text1: str, text2: str) -> int:
    """
    Pattern: dp[i][j] depends on dp[i-1][j-1], dp[i-1][j], dp[i][j-1]
    
    The diagonal, left, and up dependencies are characteristic of
    two-sequence alignment problems.
    """
    m, n = len(text1), len(text2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if text1[i-1] == text2[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1  # Diagonal + match
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])  # Left or up
    
    return dp[m][n]

Writing Correct Transitions: A Systematic Process

Follow this systematic process to derive correct transition equations:

Step 1: Write the Semantic Meaning

Before writing any equations, write in plain English what dp[state] means. Be precise.

dp[i][w] = "the maximum value obtainable by considering items 0
           through i-1 with a knapsack capacity of w"

Step 2: Enumerate Possible Last Decisions

To reach the current state, what was the last action or decision? List all possibilities.

Last decision for dp[i][w]:
  (a) Did NOT take item i-1
  (b) DID take item i-1 (if it fits)

Step 3: Write Each Case's Contribution

For each last decision, express the value in terms of other states.

(a) Didn't take: dp[i-1][w] + 0
(b) Took:        dp[i-1][w - weight[i-1]] + value[i-1]

Step 4: Combine with Correct Aggregate

Use max for maximization, min for minimization, sum for counting.

dp[i][w] = max(
    dp[i-1][w],
    dp[i-1][w - weight[i-1]] + value[i-1]
) if weight[i-1] <= w else dp[i-1][w]

Common Transition Errors

Error 1: Wrong aggregate — Using max when you should sum (or vice versa). Ask: am I optimizing or counting?

Error 2: Off-by-one indices — Does dp[i] include element i or not? Be consistent.

Error 3: Missing cases — Forgetting a valid transition path leads to suboptimal answers.

Error 4: Invalid state access — Accessing dp[-1] or dp[n+1] without boundary checks.

systematic_derivation.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
# ===== COMPLETE DERIVATION: Word Break Problem =====
 
def word_break_derivation(s: str, wordDict: list[str]) -> bool:
    """
    STEP 1: Semantic Meaning
        dp[i] = True if s[0..i-1] can be segmented into dictionary words
        dp[i] = False otherwise
        
        What does dp[i] represent? "Can the first i characters be segmented?"
    
    STEP 2: Last Decisions
        If dp[i] is True, what was the last word used?
        The last word must end at position i-1.
        So the last word is s[j:i] for some j where 0 <= j < i.
        
        For dp[i] to be True:
            - dp[j] must be True (prefix s[0..j-1] is segmentable)
            - s[j:i] must be in wordDict (the "bridge" is a valid word)
    
    STEP 3: Case Contribution
        dp[i] = True if there exists j such that:
            dp[j] = True AND s[j:i] in wordDict
        
        dp[i] = False otherwise (no valid segmentation)
    
    STEP 4: Aggregate
        Feasibility problem → use logical OR
        dp[i] = any(dp[j] and s[j:i] in wordDict for j in range(i))
    """
    n = len(s)
    word_set = set(wordDict)
    
    dp = [False] * (n + 1)
    dp[0] = True  # Empty string is trivially valid
    
    for i in range(1, n + 1):
        for j in range(i):
            if dp[j] and s[j:i] in word_set:
                dp[i] = True
                break  # Early termination for feasibility
    
    return dp[n]
 
 
# ===== COMPLETE DERIVATION: Decode Ways =====
 
def decode_ways_derivation(s: str) -> int:
    """
    STEP 1: Semantic Meaning
        dp[i] = number of ways to decode s[0..i-1]
        
        "How many valid letter sequences decode the first i digits?"
    
    STEP 2: Last Decisions
        At position i-1 (0-indexed), we decoded either:
        (a) One digit (s[i-1]) as a letter (1-9 → A-I)  
        (b) Two digits (s[i-2:i]) as a letter (10-26 → J-Z)
    
    STEP 3: Case Contribution
        (a) If s[i-1] in '123456789':
            Ways = dp[i-1] (all ways to decode s[0..i-2], extend each by one letter)
        (b) If s[i-2:i] forms valid 2-digit:
            Ways = dp[i-2] (all ways to decode s[0..i-3], extend each by one letter)
    
    STEP 4: Aggregate
        Counting problem → use SUM
        dp[i] = (contribution from case a) + (contribution from case b)
    """
    if not s or s[0] == '0':
        return 0
    
    n = len(s)
    dp = [0] * (n + 1)
    dp[0] = 1  # Base: empty string has one "decoding" (nothing)
    dp[1] = 1  # First character (already validated non-zero)
    
    for i in range(2, n + 1):
        # Case (a): single digit s[i-1]
        if s[i-1] != '0':
            dp[i] += dp[i-1]
        
        # Case (b): two digits s[i-2:i]
        two_digit = int(s[i-2:i])
        if 10 <= two_digit <= 26:
            dp[i] += dp[i-2]
    
    return dp[n]
 
 
# ===== COMPLETE DERIVATION: Minimum Path Sum =====
 
def min_path_sum_derivation(grid: list[list[int]]) -> int:
    """
    STEP 1: Semantic Meaning
        dp[r][c] = minimum path sum to reach cell (r, c) from (0, 0)
        
        "What's the smallest sum path to get here?"
    
    STEP 2: Last Decisions
        To reach (r, c), we came from:
        (a) Cell above: (r-1, c)
        (b) Cell to left: (r, c-1)
    
    STEP 3: Case Contribution
        (a) From above: dp[r-1][c] + grid[r][c]
        (b) From left:  dp[r][c-1] + grid[r][c]
    
    STEP 4: Aggregate
        Minimization → use MIN
        dp[r][c] = min(dp[r-1][c], dp[r][c-1]) + grid[r][c]
        
        Boundary: First row can only come from left
                  First column can only come from above
    """
    m, n = len(grid), len(grid[0])
    dp = [[0] * n for _ in range(m)]
    
    dp[0][0] = grid[0][0]
    
    # First row: only from left
    for c in range(1, n):
        dp[0][c] = dp[0][c-1] + grid[0][c]
    
    # First column: only from above
    for r in range(1, m):
        dp[r][0] = dp[r-1][0] + grid[r][0]
    
    # Interior: min of both
    for r in range(1, m):
        for c in range(1, n):
            dp[r][c] = min(dp[r-1][c], dp[r][c-1]) + grid[r][c]
    
    return dp[m-1][n-1]

Verifying Transition Correctness

Before implementing, verify your transition equation with these checks:

Check 1: Acyclicity

The state graph must be a DAG. Each transition must go to a "smaller" state. If dp[i] depends on dp[i] (directly or indirectly through cycles), the recurrence is invalid.

Check 2: Completeness

Every possible way to reach the current state should be accounted for. Missing a case means your solution finds suboptimal answers.

Check 3: Correct Aggregate

Optimization (find best) → max/min
Counting (how many) → sum
Feasibility (can we?) → any/or
All valid (every path works?) → all/and

Check 4: Base Case Coverage

The smallest subproblems (base cases) must be answerable without recursion, and all recursive paths must eventually reach base cases.

verification_examples.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
# ===== VERIFICATION EXAMPLE: Matrix Chain Multiplication =====
 
def verify_matrix_chain():
    """
    State: dp[i][j] = min cost to multiply matrices i through j
    
    CHECK 1: Acyclicity
        dp[i][j] depends on dp[i][k] and dp[k+1][j] where i <= k < j
        Since (k-i) < (j-i) and (j-k-1) < (j-i), we always access
        smaller intervals. ✓ Acyclic
    
    CHECK 2: Completeness
        For interval [i, j], we can split at any k from i to j-1.
        We try all k values: no splits are missed. ✓ Complete
    
    CHECK 3: Correct Aggregate
        We want minimum cost → use min. ✓ Correct
        
    CHECK 4: Base Cases
        dp[i][i] = 0 (single matrix, no multiplication needed)
        All intervals of length 1 are covered. ✓ Base cases defined
    """
    pass
 
 
def matrix_chain_order(dims: list[int]) -> int:
    """
    dims[i-1] × dims[i] gives dimensions of matrix i.
    Find minimum multiplications to compute product of all matrices.
    
    Transition: dp[i][j] = min over k of { dp[i][k] + dp[k+1][j] + cost(split at k) }
    """
    n = len(dims) - 1  # Number of matrices
    dp = [[0] * n for _ in range(n)]
    
    # Base case: single matrix (length 1 intervals)
    for i in range(n):
        dp[i][i] = 0
    
    # Fill by increasing interval length
    for length in range(2, n + 1):  # length of interval
        for i in range(n - length + 1):
            j = i + length - 1
            dp[i][j] = float('inf')
            for k in range(i, j):
                # Cost = left side + right side + cost of multiplying results
                cost = dp[i][k] + dp[k+1][j] + dims[i] * dims[k+1] * dims[j+1]
                dp[i][j] = min(dp[i][j], cost)
    
    return dp[0][n-1]
 
 
# ===== HAND-TRACING: The Ultimate Verification =====
def hand_trace_coin_change():
    """
    For simple inputs, trace the DP table by hand.
    
    coins = [1, 2, 5], amount = 6
    
    Initial: dp = [0, inf, inf, inf, inf, inf, inf]
    
    a=1: coin 1: dp[1] = min(inf, dp[0]+1) = 1
    a=2: coin 1: dp[2] = min(inf, dp[1]+1) = 2
         coin 2: dp[2] = min(2, dp[0]+1) = 1
    a=3: coin 1: dp[3] = min(inf, dp[2]+1) = 2
         coin 2: dp[3] = min(2, dp[1]+1) = 2
    a=4: coin 1: dp[4] = min(inf, dp[3]+1) = 3
         coin 2: dp[4] = min(3, dp[2]+1) = 2
    a=5: coin 1: dp[5] = min(inf, dp[4]+1) = 3
         coin 2: dp[5] = min(3, dp[3]+1) = 3
         coin 5: dp[5] = min(3, dp[0]+1) = 1
    a=6: coin 1: dp[6] = min(inf, dp[5]+1) = 2
         coin 2: dp[6] = min(2, dp[4]+1) = 2
         coin 5: dp[6] = min(2, dp[1]+1) = 2
    
    Final: dp = [0, 1, 1, 2, 2, 1, 2]
    Answer: dp[6] = 2 (either 5+1 or 2+2+2)
    
    Hand tracing confirms the transition is correct!
    """
    coins = [1, 2, 5]
    amount = 6
    dp = [float('inf')] * (amount + 1)
    dp[0] = 0
    
    for a in range(1, amount + 1):
        for coin in coins:
            if coin <= a and dp[a - coin] != float('inf'):
                dp[a] = min(dp[a], dp[a - coin] + 1)
    
    return dp[amount]  # Returns 2

The Hand-Trace Method

Summary: Mastering Transition Equations

We've covered the art and science of writing DP transitions. Let's consolidate:

Key Takeaways

•Transitions have a standard anatomy — Current state = aggregate of (previous states + costs). Left side is what we compute; right side is how we compute it.
•Use choice-based derivation — Ask "what choices led here?" enumerate them, express each as a term, combine with the appropriate aggregate.
•Forward vs. backward are equivalent — Pull (backward) fills the current state from predecessors. Push (forward) updates successors from current. Choose what's natural.
•Recognize common patterns — Linear recurrence, take/skip, interval expansion, grid navigation, sequence alignment, state machine. Pattern recognition accelerates derivation.
•Follow the systematic 4-step process — (1) Semantic meaning, (2) Last decisions, (3) Case contributions, (4) Aggregate correctly.
•Verify before implementing — Check acyclicity, completeness, correct aggregate, and base case coverage. Hand-trace on small inputs.

What's Next:

Page Complete

3 / 4