Data Structures & AlgorithmsState Design & Transition Functions

State Design & Transition Functions

LevelAdvanced

Duration90 mins

TopicState Design & Transition Functions

2 / 4

What Information to Track

The Information Selection Problem

You've understood that DP state is a compact representation of subproblems. But here's the million-dollar question: How do you know what information to include in your state?

This question haunts every programmer learning dynamic programming. You stare at a problem, intuit that DP might work, but can't figure out what the state should track. Should you track the last element processed? The running sum? A count of something? A boolean flag? All of the above?

The difference between expert DP practitioners and struggling beginners often comes down to this single skill: systematic identification of state-relevant information.

In this page, we'll develop a rigorous framework for answering the question "What should my state track?" for any problem.

What You Will Learn

By the end of this page, you'll have a systematic approach to identifying what information matters for DP state. You'll understand the distinction between essential and reconstructable information, learn techniques for minimizing state without losing correctness, and develop intuition for common information types across problem categories.

The Information Trichotomy

When analyzing a DP problem, all information about your current progress falls into exactly one of three categories:

Category 1: Essential Information (Must Track) Information that directly affects which decisions are legal or what value they produce. If you don't track this, you cannot correctly solve the subproblem.

Category 2: Derivable Information (Can Compute from Essential) Information that can be recomputed from the essential information plus the problem's fixed inputs. Tracking this wastes space—derive it when needed.

Category 3: Irrelevant Information (Should Ignore) Information about how you reached the current point that doesn't affect future decisions or values. Tracking this bloats your state space unnecessarily.

Information Categories in Classic DP Problems
Problem	Essential (Track)	Derivable (Compute)	Irrelevant (Ignore)
0/1 Knapsack	Items considered (index i), Capacity used (w)	Remaining capacity (W - w)	Order items were taken, specific items taken (for value computation)
LCS	Position in each string (i, j)	Current LCS characters (can reconstruct)	Which specific matching decisions were made
Edit Distance	Position in each string (i, j)	Current distance to reach (i,j)	Which edits were chosen (for reconstruction)
Climbing Stairs	Current stair number (i)	Path taken to get here	Number of 1-steps vs 2-steps taken
House Robber	Current house (i), Previous robbed?	Total money so far	Which specific houses were robbed

The Over-Tracking Trap

A common mistake: tracking what specific choices were made rather than the consequences of those choices. For 0/1 Knapsack, you don't need to know which items are in the bag—only the total weight used. The item selection is reconstruction information, not state-defining information.

The Essential Information Identification Framework

To systematically identify what your state must track, ask these four diagnostic questions about any piece of candidate information:

Question 1: Does this affect what choices I can make?

If tracking vs. not tracking this value changes the set of legal next moves, it's essential.

Example: In House Robber, whether I robbed house i-1 determines if I can rob house i. Essential.

Counter-example: In standard Knapsack, which specific items I took earlier doesn't affect whether I can take item i (only total weight matters). Not essential to track individually.

Question 2: Does this affect the value/cost of my choices?

Even if all choices remain legal, does this information change their associated values?

Example: In Coin Change, which coins I used before doesn't change the 'cost' (1 coin) of using any future coin. Not essential.

Counter-example: In a variant where consecutive same-denomination coins cost half, the last coin used affects future costs. Now it's essential.

Question 3: Can I derive this from other tracked information plus fixed inputs?

If yes, don't track it—compute it.

Example: In Knapsack, if I track total weight used w, I can compute remaining capacity as W - w. Don't track both.

Example: In grid paths, if I track position (r, c) and know I started at (0, 0), I can compute total moves made = r + c. Don't track moves separately.

Question 4: Does the optimal value at this state depend on this information?

The ultimate test: if two problem configurations differ only in this information, do they have the same optimal answer? If yes, it's irrelevant to state.

The Swap Test

Imagine two scenarios that differ only in one piece of information. If the optimal future strategy and value are identical in both scenarios, that information is NOT essential. If they differ, it IS essential. This mental 'swap test' clarifies many ambiguous cases.

Position Information: Where Are You in the Problem?

The most fundamental information to track is position—where you are in the problem's structure. This manifests differently depending on the problem type:

Linear Sequence Problems:

Track the index i of the current element
State: dp[i] = best answer considering elements 0..i (or i..n-1)
Examples: House Robber, Maximum Subarray, LIS

Two-Sequence Problems:

Track position in both sequences: (i, j)
State: dp[i][j] = best answer for prefixes A[0..i-1] and B[0..j-1]
Examples: LCS, Edit Distance, Interleaving Strings

Grid Problems:

Track row and column: (r, c)
State: dp[r][c] = best answer to reach/from position (r, c)
Examples: Unique Paths, Minimum Path Sum, Dungeon Game

position_tracking.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
# ===== POSITION IN LINEAR SEQUENCE =====
def longest_increasing_subsequence(nums: list[int]) -> int:
    """
    Position Information: index i
    
    State: dp[i] = length of longest increasing subsequence ending at index i
    
    Why track position i?
        - Each position represents a unique subproblem: "best LIS ending here"
        - Element at i determines which previous elements can connect to it
        - The position IS the subproblem identity
    
    Alternative framing (ending vs starting):
        - dp[i] = LIS ending at i (bottom-up, forward)
        - dp[i] = LIS starting at i (top-down, backward in logic)
        
    Both work; the choice affects transition direction.
    """
    if not nums:
        return 0
    
    n = len(nums)
    dp = [1] * n  # Every element itself is an LIS of length 1
    
    for i in range(1, n):
        for j in range(i):
            if nums[j] < nums[i]:
                dp[i] = max(dp[i], dp[j] + 1)
    
    return max(dp)
 
 
# ===== POSITION IN TWO SEQUENCES =====
def edit_distance(word1: str, word2: str) -> int:
    """
    Position Information: (i, j) - position in each string
    
    State: dp[i][j] = min edits to transform word1[0..i-1] to word2[0..j-1]
    
    Why track (i, j)?
        - The subproblem is uniquely identified by how much of each string
          we've processed
        - Position (i, j) defines the remaining work: transform word1[i:] to word2[j:]
        - Neither i nor j alone captures the full subproblem
        
    Insight: We never need to track WHICH edits were made, because:
        - Only the count matters for the optimal value
        - Reconstruction can determine edits by backtracking
    """
    m, n = len(word1), len(word2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    # Base cases: transforming to/from empty string
    for i in range(m + 1):
        dp[i][0] = i  # i deletions
    for j in range(n + 1):
        dp[0][j] = j  # j insertions
    
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if word1[i-1] == word2[j-1]:
                dp[i][j] = dp[i-1][j-1]  # No edit needed
            else:
                dp[i][j] = 1 + min(
                    dp[i-1][j],    # Delete from word1
                    dp[i][j-1],    # Insert into word1
                    dp[i-1][j-1]   # Replace in word1
                )
    
    return dp[m][n]
 
 
# ===== POSITION IN GRID =====
def minimum_path_sum(grid: list[list[int]]) -> int:
    """
    Position Information: (r, c) - row and column
    
    State: dp[r][c] = minimum sum to reach cell (r, c) from (0, 0)
    
    Why track (r, c)?
        - Position uniquely defines subproblem: "best path to here"
        - The grid value at (r, c) comes from the fixed input (no need to track)
        - Direction constraints limit transitions to (r-1, c) or (r, c-1)
        
    No need to track:
        - The path taken (reconstruction can find it)
        - The running sum (dp[r][c] IS the running sum to this point)
    """
    if not grid or not grid[0]:
        return 0
        
    m, n = len(grid), len(grid[0])
    dp = [[0] * n for _ in range(m)]
    
    dp[0][0] = grid[0][0]
    
    # First row: can only come from left
    for c in range(1, n):
        dp[0][c] = dp[0][c-1] + grid[0][c]
    
    # First column: can only come from above
    for r in range(1, m):
        dp[r][0] = dp[r-1][0] + grid[r][0]
    
    # Interior cells: min of coming from left or above
    for r in range(1, m):
        for c in range(1, n):
            dp[r][c] = min(dp[r-1][c], dp[r][c-1]) + grid[r][c]
    
    return dp[m-1][n-1]

Resource Information: What Have You Consumed?

Many problems involve limited resources that constrain your choices. When resources affect what you can do, you must track them.

Common Resource Types:

Capacity/Weight - Knapsack problems: "How much space remains?"
Budget/Money - Purchasing problems: "How much can I spend?"
Time - Scheduling problems: "How much time is left?"
Count Limits - Transaction limits: "How many actions remain?"

The Resource Tracking Rule: Track the resource if and only if the resource level affects:

Which actions are legal, OR
The optimal value achievable from this point

If the resource is unlimited or doesn't constrain anything, don't track it.

resource_tracking.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
# ===== RESOURCE: Capacity (Knapsack) =====
def knapsack_2d(weights: list[int], values: list[int], capacity: int) -> int:
    """
    Resource Information: remaining capacity w
    
    State: dp[i][w] = max value using items 0..i-1 with capacity w
    
    Why track capacity?
        - Capacity determines which items can still be taken
        - An item of weight 5 is only legal if w >= 5
        - Without tracking capacity, we can't enforce the constraint
        
    Optimization possible?
        - Can reduce to dp[w] since we only need previous row
        - But we still need capacity dimension for correctness
    """
    n = len(weights)
    dp = [[0] * (capacity + 1) for _ in range(n + 1)]
    
    for i in range(1, n + 1):
        for w in range(capacity + 1):
            dp[i][w] = dp[i-1][w]  # Don't take item i-1
            if weights[i-1] <= w:
                dp[i][w] = max(dp[i][w],
                               dp[i-1][w - weights[i-1]] + values[i-1])
    
    return dp[n][capacity]
 
 
# ===== RESOURCE: Transaction Count (Stock Trading) =====
def max_profit_ii_with_limit(prices: list[int], max_transactions: int) -> int:
    """
    Resource Information: transactions remaining
    
    State: dp[i][j][holding]
        - i = day
        - j = transactions used (0 to max_transactions)
        - holding = 0 or 1
    
    Why track transaction count?
        - If j == max_transactions and we're not holding, we cannot buy again
        - The count directly limits future actions
        
    Note: If transactions were unlimited, j would NOT be essential
          (see "Best Time to Buy and Sell Stock II" which needs no j dimension)
    """
    if not prices or max_transactions == 0:
        return 0
    
    n = len(prices)
    k = max_transactions
    
    # dp[j][0] = max profit using j transactions, not holding
    # dp[j][1] = max profit using j transactions, holding
    INF = float('inf')
    dp = [[-INF, -INF] for _ in range(k + 1)]
    dp[0][0] = 0  # 0 transactions, not holding, profit = 0
    
    for price in prices:
        for j in range(k, -1, -1):  # Reverse to use previous day's values
            if dp[j][1] > -INF:
                # Sell: complete transaction, not holding
                dp[j][0] = max(dp[j][0], dp[j][1] + price)
            if j > 0 and dp[j-1][0] > -INF:
                # Buy: use one more transaction, now holding
                dp[j][1] = max(dp[j][1], dp[j-1][0] - price)
    
    return max(dp[j][0] for j in range(k + 1) if dp[j][0] > -INF)
 
 
# ===== RESOURCE: When NOT to Track (Unlimited Resource) =====
def max_profit_unlimited(prices: list[int]) -> int:
    """
    NO resource tracking needed: unlimited transactions
    
    State: dp[i][holding] or just greedy
    
    Why no transaction count?
        - With unlimited transactions, count doesn't constrain anything
        - Every buy-sell opportunity is independent
        - Remove the dimension = simpler state
        
    Greedy suffices: buy before every rise, sell before every fall
    """
    profit = 0
    for i in range(1, len(prices)):
        if prices[i] > prices[i-1]:
            profit += prices[i] - prices[i-1]
    return profit

Resource Compression

When resource values are large but sparse (e.g., capacity of 10^9 but only 1000 distinct useful values), use coordinate compression. Map the large resource values to a smaller index range. This keeps state space tractable for problems with large numeric constraints.

Decision History: What Did You Just Do?

Some problems have local constraints that depend on recent history. In these cases, you must track enough history to enforce the constraints, but no more.

When to Track Decision History:

Consecutive Action Constraints: "Cannot do same action twice in a row" → Track the last action taken
K-Step Look-Back Constraints: "Cannot repeat an action within K steps" → Track the last K actions (or use clever encoding)
Accumulation Constraints: "Cannot exceed X consecutive same actions" → Track current streak count

Key Insight: Track the minimal history that lets you verify constraints. Often, you need only the last 1-2 decisions, not the entire path.

decision_history.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
# ===== HISTORY: Last Decision (Paint House) =====
def min_cost_paint_houses(costs: list[list[int]]) -> int:
    """
    Decision History: last color used
    
    Problem: Paint n houses with 3 colors. No two adjacent houses same color.
             costs[i][c] = cost to paint house i with color c
    
    State: dp[i][c] = min cost to paint houses 0..i with house i being color c
    
    Why track last color?
        - The constraint "no adjacent same color" requires knowing the
          color of house i-1 to determine valid colors for house i
        - Without tracking, we cannot enforce the constraint
        
    What NOT to track:
        - All previous colors (only adjacent matters)
        - Running total (that's what dp[i][c] represents)
    """
    if not costs:
        return 0
    
    n = len(costs)
    num_colors = len(costs[0])  # Assume 3
    
    # dp[c] = min cost to paint through current house, ending with color c
    dp = costs[0][:]  # Base case: first house
    
    for i in range(1, n):
        new_dp = [0] * num_colors
        for c in range(num_colors):
            # Minimum of all OTHER colors from previous house + this cost
            new_dp[c] = min(dp[prev_c] for prev_c in range(num_colors) 
                          if prev_c != c) + costs[i][c]
        dp = new_dp
    
    return min(dp)
 
 
# ===== HISTORY: Streak Count (Maximum Ones with K Flips) =====
def max_consecutive_ones_iii(nums: list[int], k: int) -> int:
    """
    This problem often seems to need history but actually uses sliding window.
    Let's look at a variant that truly needs streak tracking.
    """
    # Sliding window approach (no DP needed for this problem)
    left = 0
    zeros = 0
    max_len = 0
    
    for right in range(len(nums)):
        if nums[right] == 0:
            zeros += 1
        
        while zeros > k:
            if nums[left] == 0:
                zeros -= 1
            left += 1
        
        max_len = max(max_len, right - left + 1)
    
    return max_len
 
 
# ===== HISTORY: K-Step Constraint (Pizza with 3n Slices) =====
def max_size_pizza_slices(slices: list[int]) -> int:
    """
    Problem: Circular arrangement of pizza slices. If you take slice i,
             you cannot take slices i-1 or i+1 (they get removed).
             Find max sum of n/3 slices (n = len(slices)).
    
    This is like House Robber but circular and with count limit.
    
    State: dp[i][j][took_prev]
        - i = current slice index
        - j = slices taken so far
        - took_prev = did we take slice i-1? (0 or 1)
    
    For circular: compute twice (include first, exclude first) like circular House Robber
    
    Why track took_prev?
        - Cannot take current if took previous (adjacent constraint)
        - This single bit of history is sufficient
    """
    n = len(slices)
    target = n // 3  # Number of slices to take
    
    def max_sum_linear(arr: list[int], target: int) -> int:
        """Max sum picking 'target' non-adjacent elements from linear array."""
        m = len(arr)
        INF = float('-inf')
        
        # dp[i][j] = max sum using first i elements, taking exactly j
        # To save space, we track: can't take if took previous
        # Actually simpler: dp[i][j][0] = didn't take i-th, dp[i][j][1] = took i-th
        
        # dp[j][took_last]: max sum with j taken, last taken or not
        dp = [[INF, INF] for _ in range(target + 1)]
        dp[0][0] = 0  # 0 taken, last not taken (trivially true at start)
        
        for i in range(m):
            new_dp = [[INF, INF] for _ in range(target + 1)]
            for j in range(target + 1):
                if dp[j][0] != INF:
                    # Didn't take last element
                    # Don't take current
                    new_dp[j][0] = max(new_dp[j][0], dp[j][0])
                    # Take current (if we have quota)
                    if j < target:
                        new_dp[j + 1][1] = max(new_dp[j + 1][1], 
                                               dp[j][0] + arr[i])
                if dp[j][1] != INF:
                    # Took last element - cannot take this one
                    new_dp[j][0] = max(new_dp[j][0], dp[j][1])
            dp = new_dp
        
        return max(dp[target][0], dp[target][1])
    
    # Circular: either include first (exclude last) or exclude first
    return max(
        max_sum_linear(slices[:-1], target),  # Include first, exclude last
        max_sum_linear(slices[1:], target)    # Exclude first, include last
    )

History Encoding Tricks

When you need to track K recent decisions and each has C choices, the naive state space multiplies by C^K. Reduce this by: (1) Using a single integer to encode recent history as a base-C number, (2) Recognizing that only certain combinations are reachable, (3) Looking for symmetries that reduce distinct states.

Aggregate Information: Summarizing Your Progress

Sometimes you need to track an aggregate statistic about your choices—not the detailed choices themselves, but some summary.

Common Aggregates:

Sum/Total - Track when constraints involve totals (e.g., subset sum)
Count - Track when you need to count elements with a property
Min/Max Seen - Track when future decisions depend on extremes
Difference - Track when the gap between quantities matters

aggregate_tracking.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
# ===== AGGREGATE: Sum (Subset Sum / Partition) =====
def can_partition(nums: list[int]) -> bool:
    """
    Aggregate Information: current subset sum
    
    Problem: Can array be partitioned into two subsets with equal sum?
    
    State: dp[i][s] = can we form sum s using elements 0..i-1?
           Simplified: dp[s] = can we form sum s?
    
    Why track sum (aggregate) instead of subset (details)?
        - Two different subsets with the same sum are equivalent for our purpose
        - Only the sum value matters, not which elements compose it
        - This reduces exponential subsets to polynomial sums
    
    State space: O(n × total_sum) instead of O(2^n) subsets
    """
    total = sum(nums)
    if total % 2 != 0:
        return False
    target = total // 2
    
    # dp[s] = True if sum s is achievable
    dp = [False] * (target + 1)
    dp[0] = True  # Sum 0 is always achievable (empty subset)
    
    for num in nums:
        # Reverse to avoid using num multiple times
        for s in range(target, num - 1, -1):
            if dp[s - num]:
                dp[s] = True
    
    return dp[target]
 
 
# ===== AGGREGATE: Count (Target Sum with Signs) =====
def find_target_sum_ways(nums: list[int], target: int) -> int:
    """
    Aggregate Information: current running sum
    
    Problem: Assign +/- to each number to achieve target sum.
             Count the number of ways.
    
    State: dp[i][s] = number of ways to achieve sum s using first i numbers
    
    Why track sum (aggregate)?
        - Two different sign assignments with the same running sum
          have identical futures
        - Only the sum distinguishes subproblems, not the sequence of signs
    
    Trick: Shift sums to handle negatives, or use dict for sparse sums.
    """
    # Alternative formulation: P - N = target, P + N = total
    # So P = (target + total) / 2, count subsets summing to P
    total = sum(nums)
    if (target + total) % 2 != 0 or abs(target) > total:
        return 0
    
    p_sum = (target + total) // 2
    if p_sum < 0:
        return 0
    
    # Count subsets summing to p_sum
    dp = [0] * (p_sum + 1)
    dp[0] = 1
    
    for num in nums:
        for s in range(p_sum, num - 1, -1):
            dp[s] += dp[s - num]
    
    return dp[p_sum]
 
 
# ===== AGGREGATE: Difference (Balanced Partition) =====
def minimum_difference_partition(nums: list[int]) -> int:
    """
    Aggregate Information: one subset's sum (difference derivable)
    
    Problem: Partition into two subsets minimizing |sum1 - sum2|.
    
    State: dp[s] = True if subset sum s is achievable
    
    Why track sum rather than subset?
        - The minimum difference depends only on what sums are achievable
        - For achievable sum s, difference = |s - (total - s)| = |2s - total|
        - Find achievable s closest to total/2
    """
    total = sum(nums)
    half = total // 2
    
    dp = [False] * (half + 1)
    dp[0] = True
    
    for num in nums:
        for s in range(half, num - 1, -1):
            if dp[s - num]:
                dp[s] = True
    
    # Find largest achievable sum <= half
    for s in range(half, -1, -1):
        if dp[s]:
            return total - 2 * s  # |s - (total - s)| = |2s - total|
    
    return total  # Should never reach here
 
 
# ===== AGGREGATE: Max Seen (Best Time for Stock with Cooldown) =====
# Sometimes we track the "best so far" rather than current state
def max_profit_with_min_tracking(prices: list[int]) -> int:
    """
    Aggregate Information: minimum price seen so far (one-transaction version)
    
    Problem: Buy once, sell once later. Maximize profit.
    
    State: implicitly, min_price_so_far
    
    Why track min (aggregate)?
        - Current profit = price - min_price_before_today
        - Instead of comparing all pairs, maintain rolling minimum
        - This reduces O(n²) to O(n)
    
    This is a simple example where DP collapses to tracking one aggregate.
    """
    if not prices:
        return 0
    
    min_price = prices[0]
    max_profit = 0
    
    for price in prices[1:]:
        max_profit = max(max_profit, price - min_price)
        min_price = min(min_price, price)
    
    return max_profit

Aggregate Range Matters

Tracking aggregates works well when the range is bounded. Subset sums from 0 to total_sum give O(n × sum) states. But if values can be large (up to 10^9), direct sum tracking may be infeasible. Consider: (1) Coordinate compression, (2) Meet-in-the-middle, or (3) Alternative formulations that avoid large aggregates.

Worked Example: Systematic Information Analysis

Let's apply our framework to a problem that requires careful information selection.

Problem: Cherry Pickup

You have an n×n grid. Starting from (0, 0), pick up cherries while walking to (n-1, n-1) using only right or down moves. Then return to (0, 0) using only left or up moves, picking up remaining cherries. Find the maximum total cherries.

Initial Analysis:

Naively, this seems like two separate trips. But that's wrong—the cherries picked on the first trip affect what's available for the second. The trips are dependent.

Key Insight:

Instead of one person making two trips, imagine two people walking simultaneously from (0, 0) to (n-1, n-1). If they both land on the same cell, only one can pick the cherry. This transforms the problem into a single forward pass with two "cursors".

Information Analysis:

Information Classification for Cherry Pickup
Information	Essential?	Reasoning
Position of person 1: (r1, c1)	Yes (partially)	Need to know where person 1 is for valid moves
Position of person 2: (r2, c2)	Yes (partially)	Need to know where person 2 is for valid moves
Which cells have been picked	No	Implicitly tracked—if both at same cell, it's picked once
Total cherries so far	No	This is what dp[...] represents (the optimal value)
Steps taken so far	Derivable	r1 + c1 = r2 + c2 = steps, so given step count, we need only 3 variables

State Optimization:

Both persons take the same number of steps at each point (since they move in sync). If person 1 is at (r1, c1) and we're at step k = r1 + c1, then c1 = k - r1. Similarly for person 2: c2 = k - r2.

So instead of tracking 4 variables (r1, c1, r2, c2), we track 3: (k, r1, r2), deriving c1 and c2.

Final State: dp[k][r1][r2] = max cherries when both have taken k steps, person 1 at row r1, person 2 at row r2.

State Space: O(n) steps × O(n) rows for r1 × O(n) rows for r2 = O(n³)

cherry_pickup.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
def cherry_pickup(grid: list[list[int]]) -> int:
    """
    Cherry Pickup with Optimal Information Tracking
    
    INFORMATION TRACKED:
        - k: step number (0 to 2*(n-1))
        - r1: row of person 1 (0 to min(k, n-1))
        - r2: row of person 2 (0 to min(k, n-1))
    
    DERIVED INFORMATION:
        - c1 = k - r1 (column of person 1)
        - c2 = k - r2 (column of person 2)
    
    NOT TRACKED:
        - Which cells were picked (implicit: same cell means one cherry)
        - Running total (that's the dp value)
        - Path history (irrelevant to future)
    
    Time: O(n³), Space: O(n²) with optimization
    """
    n = len(grid)
    INF = float('-inf')
    
    # dp[r1][r2] = max cherries with person 1 at row r1, person 2 at row r2
    # After k steps, c1 = k - r1, c2 = k - r2
    
    # Initialize for step 0 at (0, 0)
    if grid[0][0] == -1:
        return 0
    
    dp = [[INF] * n for _ in range(n)]
    dp[0][0] = grid[0][0]  # Both at (0,0), one cherry
    
    for k in range(1, 2 * n - 1):
        new_dp = [[INF] * n for _ in range(n)]
        
        for r1 in range(max(0, k - n + 1), min(k + 1, n)):
            c1 = k - r1
            if c1 < 0 or c1 >= n or grid[r1][c1] == -1:
                continue
                
            for r2 in range(max(0, k - n + 1), min(k + 1, n)):
                c2 = k - r2
                if c2 < 0 or c2 >= n or grid[r2][c2] == -1:
                    continue
                
                # Cherries at current positions
                cherries = grid[r1][c1]
                if r1 != r2:  # If not same cell, add person 2's cherry too
                    cherries += grid[r2][c2]
                
                # Possible previous positions (came from step k-1)
                for pr1 in [r1, r1 - 1]:  # Person 1 came from (pr1, pc1)
                    if pr1 < 0 or k - 1 - pr1 < 0 or k - 1 - pr1 >= n:
                        continue
                    for pr2 in [r2, r2 - 1]:  # Person 2 came from (pr2, pc2)
                        if pr2 < 0 or k - 1 - pr2 < 0 or k - 1 - pr2 >= n:
                            continue
                        if dp[pr1][pr2] != INF:
                            new_dp[r1][r2] = max(new_dp[r1][r2],
                                                  dp[pr1][pr2] + cherries)
        
        dp = new_dp
    
    result = dp[n-1][n-1]
    return max(0, result) if result != INF else 0
 
 
# Example usage and verification
if __name__ == "__main__":
    test_grid = [
        [0, 1, -1],
        [1, 0, -1],
        [1, 1, 1]
    ]
    print(f"Max cherries: {cherry_pickup(test_grid)}")  # Expected: 5

Summary: Mastering Information Selection

We've developed a rigorous framework for identifying what information to track in DP states. Let's consolidate:

Key Takeaways

•Classify information into three categories — Essential (must track), Derivable (compute from essential), Irrelevant (ignore). Only essential information belongs in your state.
•Use the four diagnostic questions — Does it affect choices? Does it affect values? Can it be derived? Does the optimal answer depend on it? These questions distinguish essential from non-essential.
•Position is almost always essential — Where you are in sequences, grids, or trees typically defines the subproblem. Track indices, coordinates, or node identifiers.
•Resource constraints add dimensions — Capacity, budget, count limits become state dimensions when they constrain future actions or affect optimal values.
•Track minimal history for local constraints — If constraints depend on recent decisions, track just enough history to verify them—often just the last decision.
•Use aggregates instead of details — Sums, counts, and extremes often suffice where detailed choice tracking would explode the state space.
•Look for derivable relationships — When variables are linked (r + c = steps), use the relationship to reduce dimensions.

What's Next:

With state defined and the right information identified, we can now write transition functions—the equations that express how one state's value relates to others. The next page covers writing correct, efficient recurrence relations.

Page Complete

You now have a systematic framework for determining what information your DP state must track. You understand the information trichotomy, the four diagnostic questions, and how to apply them to position, resource, history, and aggregate information. Next, we'll learn to express transitions between states.

2 / 4

Loading learning content...

Data Structures & AlgorithmsState Design & Transition Functions

State Design & Transition Functions

LevelAdvanced

Duration90 mins

TopicState Design & Transition Functions

2 / 4

What Information to Track

The Information Selection Problem

You've understood that DP state is a compact representation of subproblems. But here's the million-dollar question: How do you know what information to include in your state?

The difference between expert DP practitioners and struggling beginners often comes down to this single skill: systematic identification of state-relevant information.

In this page, we'll develop a rigorous framework for answering the question "What should my state track?" for any problem.

What You Will Learn

The Information Trichotomy

When analyzing a DP problem, all information about your current progress falls into exactly one of three categories:

Information Categories in Classic DP Problems
Problem	Essential (Track)	Derivable (Compute)	Irrelevant (Ignore)
0/1 Knapsack	Items considered (index i), Capacity used (w)	Remaining capacity (W - w)	Order items were taken, specific items taken (for value computation)
LCS	Position in each string (i, j)	Current LCS characters (can reconstruct)	Which specific matching decisions were made
Edit Distance	Position in each string (i, j)	Current distance to reach (i,j)	Which edits were chosen (for reconstruction)
Climbing Stairs	Current stair number (i)	Path taken to get here	Number of 1-steps vs 2-steps taken
House Robber	Current house (i), Previous robbed?	Total money so far	Which specific houses were robbed