Dynamic programming is fundamentally about state design—choosing the right way to describe subproblems so that larger problems can be built from smaller ones. In the 0/1 Knapsack problem, the state choice might seem obvious in retrospect, but understanding why it works reveals deep insights applicable to all DP problems.
The state for 0/1 Knapsack is typically expressed as:
dp[i][c] = Maximum value using items 1 through i with remaining capacity c
But why this particular formulation? Why track "capacity remaining" rather than "capacity used"? Why do we need two dimensions at all? Could we use fewer? Could different state formulations lead to better algorithms?
These questions touch on the heart of dynamic programming design. Answering them will sharpen your ability to design DP solutions for novel problems.
By the end of this page, you will understand why 'capacity remaining' is the natural state variable for knapsack problems, how alternative formulations compare, and how to reason about state design for DP problems in general. You'll also see how the state dimensions directly determine time and space complexity.
In dynamic programming, state refers to the minimal information needed to describe a subproblem completely. A good state definition must satisfy several criteria:
1. Sufficient Information: The state must contain enough information to solve the subproblem without referring back to the full problem description. Given the state, we should be able to make optimal decisions for that subproblem.
2. Optimal Substructure: The state must be defined so that optimal solutions to subproblems combine to form optimal solutions to larger problems. The state boundaries must align with the problem's natural decomposition.
3. Overlapping Subproblems: The state space should exhibit repetition—many distinct problem instances should reduce to the same state. This enables the caching that makes DP efficient.
4. Minimality: The state should be as compact as possible while still satisfying the above. Unnecessary state dimensions increase complexity without benefit.
The state must capture exactly what matters for future decisions. Include too little, and you can't make optimal choices. Include too much, and you lose the efficiency of subproblem reuse. Finding the right balance is the art of DP.
For the 0/1 Knapsack problem, we need to ask: What information do we need to decide whether to include item i?
The answer crystallizes into two pieces:
1. Which items are still under consideration (captured by the item index).
2. How much capacity remains for those items.
This leads naturally to a two-dimensional state space indexed by (item_index, remaining_capacity).
Consider the decision to include item i. This decision depends critically on one factor: whether item i fits in the knapsack given the remaining capacity.
If we have items [A, B, C] and we've already included A, the decision about B and C doesn't depend on A itself—it depends only on how much capacity A consumed. Two different items that consume the same weight leave us in identical decision situations for subsequent items.
This observation is crucial:
The past matters only through its effect on remaining capacity.
Formally, if two partial solutions S₁ and S₂ have the same remaining capacity, then the optimal continuation from S₁ is the same as from S₂. This is the principle that enables us to collapse exponentially many partial solutions into polynomially many states.
| Items Considered | Items Included | Weight Used | Remaining Capacity |
|---|---|---|---|
| [A, B] | {A} | 3 | 7 |
| [A, B] | {B} | 3 | 7 |
| [A, B] | {} | 0 | 10 |
In this example, the first two rows represent different partial solutions (one includes A, one includes B), but if both A and B have weight 3, the remaining decisions for items C, D, etc. are identical. We don't need to distinguish between "included A" and "included B"—only the remaining capacity matters.
This insight is profound: it means the state space is O(n × W) rather than O(2^n). We've compressed an exponential space into a polynomial one.
There are 2^n possible subsets of items, but only (W+1) possible remaining capacities. By tracking capacity instead of the specific subset, we achieve massive compression. This is why DP works—it exploits the structure that makes many distinct histories equivalent.
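The equivalence of histories can be checked directly by brute force. The sketch below uses hypothetical items matching the table above (A and B both weigh 3); the specific weights and values are invented for illustration:

```python
from itertools import combinations

# Hypothetical items: A and B share weight 3, so including either one
# leaves the same remaining capacity; C and D are the "future" items.
weights = {"A": 3, "B": 3, "C": 4, "D": 5}
values = {"A": 4, "B": 9, "C": 5, "D": 6}
CAPACITY = 10

def best_continuation(remaining, future_items=("C", "D")):
    """Brute-force the best value obtainable from the future items.
    Note the past never appears: only the remaining capacity matters."""
    best = 0
    for r in range(len(future_items) + 1):
        for combo in combinations(future_items, r):
            w = sum(weights[i] for i in combo)
            if w <= remaining:
                best = max(best, sum(values[i] for i in combo))
    return best

# Two different histories, identical remaining capacity (10 - 3 = 7):
after_A = best_continuation(CAPACITY - weights["A"])
after_B = best_continuation(CAPACITY - weights["B"])
assert after_A == after_B  # the optimal continuation is the same
```

Because `best_continuation` takes only the remaining capacity as input, the two partial solutions are indistinguishable from this point on, which is exactly the compression the DP exploits.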
A natural question arises: why track "remaining capacity" rather than "used capacity"? Mathematically, they're equivalent—if we know one, we can compute the other. But the choice affects how naturally the recurrence reads and how we think about the problem.
Remaining Capacity Formulation:
dp[i][c] = max value using items 1..i with remaining capacity c
Recurrence: dp[i][c] = max(dp[i-1][c], dp[i-1][c-w[i]] + v[i])
Used Capacity Formulation:
dp'[i][u] = max value using items 1..i with exactly u capacity used
Recurrence: dp'[i][u] = max(dp'[i-1][u], dp'[i-1][u-w[i]] + v[i])
Final answer: max(dp'[n][u]) for all u ≤ W
Both formulations work, but remaining capacity is generally preferred because:
1. The answer is at a specific cell: dp[n][W] directly gives the maximum value with full capacity available. With used capacity, you'd need to find max(dp'[n][u]) across all valid u.
2. The constraint is naturally expressed: checking w[i] ≤ c (the item fits in what remains) is more intuitive than checking bounds on u + w[i].
3. Consistency with the problem statement: the problem gives us capacity W as what we have, not as what we've used.
That said, recognizing that both are equivalent helps when you encounter problems where "used resource" is more natural (e.g., "minimize cost to achieve at least value V").
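As a sketch, both recurrences can be implemented side by side and checked against each other (the item data below is made up for illustration):

```python
def knapsack_remaining(weights, values, W):
    """dp[i][c]: max value using items 1..i with remaining capacity c.
    The answer sits in a single cell, dp[n][W]."""
    n = len(weights)
    dp = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for c in range(W + 1):
            dp[i][c] = dp[i - 1][c]                      # exclude item i
            if weights[i - 1] <= c:                      # item i fits
                dp[i][c] = max(dp[i][c],
                               dp[i - 1][c - weights[i - 1]] + values[i - 1])
    return dp[n][W]

def knapsack_used(weights, values, W):
    """dp2[i][u]: max value using items 1..i with exactly u capacity used.
    The answer requires a scan over all u <= W."""
    n = len(weights)
    NEG = float("-inf")
    dp2 = [[NEG] * (W + 1) for _ in range(n + 1)]
    dp2[0][0] = 0                                        # nothing used yet
    for i in range(1, n + 1):
        for u in range(W + 1):
            dp2[i][u] = dp2[i - 1][u]                    # exclude item i
            if u >= weights[i - 1] and dp2[i - 1][u - weights[i - 1]] != NEG:
                dp2[i][u] = max(dp2[i][u],
                                dp2[i - 1][u - weights[i - 1]] + values[i - 1])
    return max(dp2[n])                                   # scan, not one cell

# Sample data (invented): both formulations agree
w, v, W = [3, 4, 5], [4, 5, 6], 10
assert knapsack_remaining(w, v, W) == knapsack_used(w, v, W)
```

Note the extra bookkeeping in the used-capacity version: unreachable states must be marked (here with -inf), and the answer is spread across a whole row rather than sitting in one cell.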
The 0/1 Knapsack state is inherently two-dimensional: (item_index, capacity). Let's understand why both dimensions are necessary and how they interact.
Dimension 1: Item Index (i)
This dimension tracks which items we've processed or considered. At state i, we've made decisions for items 1 through i and are building toward the full solution.
- At state dp[i][c], only items 1 through i have been considered
- Each transition moves from i-1 to i (processing one item)

Dimension 2: Remaining Capacity (c)
This dimension tracks the resource constraint—how much capacity remains for future items.
- At state dp[i][c], exactly c capacity is available for future items

Think of the DP table as a grid where rows are items (progressing through decisions) and columns are capacities (tracking resource consumption). We fill it row by row, and each cell depends only on cells in the previous row. This dependency structure is what enables efficient computation.
State Space Size:
The total number of states is:
$$(n + 1) \times (W + 1) = O(n \cdot W)$$
This is pseudo-polynomial complexity—polynomial in the numeric value of W, but exponential in the number of bits needed to represent W. A capacity of W = 10^9 would create a billion states per item, which is infeasible. We'll discuss this limitation and workarounds later.
| n (items) | W (capacity) | States | Feasible? |
|---|---|---|---|
| 100 | 1,000 | 100,000 | ✓ Easy |
| 1,000 | 10,000 | 10,000,000 | ✓ Reasonable |
| 100 | 1,000,000 | 100,000,000 | ⚠️ Tight |
| 100 | 10^9 | 10^11 | ✗ Infeasible |
Understanding how states depend on each other is crucial for both correctness and optimization. In 0/1 Knapsack, the dependencies are particularly clean.
Dependency Rule:
dp[i][c] depends only on:
- dp[i-1][c] (exclude item i)
- dp[i-1][c-w[i]] (include item i, if c ≥ w[i])

Both dependencies are in row i-1:

```
Row i-1:    ...  dp[i-1][c-w[i]]   ...   dp[i-1][c]  ...
                        │                     │
                  include item i        exclude item i
                  (add v[i])            (keep same value)
                        └──────────┬──────────┘
                                   ▼
Row i:                 ...     dp[i][c]     ...
```

Key observations:
- Dependencies only point to row i-1
- c-w[i] is always ≤ c (leftward or same position)
- Each cell in row i can be computed independently

Implications of the Dependency Structure:
1. Fill Order:
We must complete row i-1 before starting row i. Within a row, cells can be filled in any order (left-to-right, right-to-left, or even in parallel).
2. Space Optimization: Since each row only depends on the previous row, we can reduce space from O(nW) to O(W) by keeping only two rows (or even one row with careful indexing). This is a critical optimization we'll explore later.
3. Parallelization: Cells within the same row are independent, making this amenable to parallel computation if W is large enough to benefit.
4. Correctness Guarantee: The acyclic dependency structure ensures we never access uncomputed values—a common source of bugs in DP implementations.
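The space optimization in point 2 can be sketched with a single reusable row. The key detail is the direction of the capacity loop (sample data invented for illustration):

```python
def knapsack_1d(weights, values, W):
    """Space-optimized 0/1 knapsack: one row of the DP table, O(W) space.
    Iterating c from W down to w[i] ensures dp[c - w[i]] still holds the
    previous row's value when we read it, so each item is used at most once."""
    dp = [0] * (W + 1)
    for wt, val in zip(weights, values):
        for c in range(W, wt - 1, -1):   # descending: preserves 0/1 semantics
            dp[c] = max(dp[c], dp[c - wt] + val)
    return dp[W]

# Invented sample instance: items (w=3,v=4), (w=4,v=5), (w=5,v=6), W=10
assert knapsack_1d([3, 4, 5], [4, 5, 6], 10) == 11
```

Iterating capacities in ascending order would instead allow an item to be counted multiple times, which is the unbounded-knapsack recurrence rather than 0/1.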
While the standard (item_index, capacity) formulation is canonical, understanding alternatives deepens our grasp of state design.
Formulation A: Subset-Based State (Exponential)
dp[S] = max value using exactly the items in subset S
This works but has 2^n states—no better than brute force. It fails to exploit the "capacity is all that matters" insight.
Formulation B: Value-Indexed State (The Dual)
dp[i][v] = minimum weight to achieve exactly value v using items 1..i
This is the dual formulation—instead of maximizing value for a given weight limit, we minimize weight for a given value. It's useful when values are bounded but weights are large.
Formulation C: Suffix Items
dp[i][c] = max value using items i..n with remaining capacity c
This is equivalent to the prefix formulation but processes items in reverse. Some find it more intuitive because it matches the recursive structure more naturally.
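As a sketch, the suffix formulation maps directly onto memoized recursion: best(i, c) asks "what is the best value obtainable from items i onward with c capacity left?" (sample data invented for illustration):

```python
from functools import lru_cache

def knapsack_suffix(weights, values, W):
    """Top-down suffix formulation: best(i, c) = max value using items
    i..n-1 (0-indexed) with remaining capacity c."""
    n = len(weights)

    @lru_cache(maxsize=None)
    def best(i, c):
        if i == n:                       # no items left to consider
            return 0
        result = best(i + 1, c)          # skip item i
        if weights[i] <= c:              # take item i if it fits
            result = max(result, best(i + 1, c - weights[i]) + values[i])
        return result

    return best(0, W)

# Invented sample instance: same answer as the bottom-up prefix version
assert knapsack_suffix([3, 4, 5], [4, 5, 6], 10) == 11
```

The cache keyed on (i, c) is exactly the (item_index, remaining_capacity) state space; the recursion simply fills it in demand order instead of row by row.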
| Formulation | State Space | Best When | Complexity |
|---|---|---|---|
| (item, capacity) | O(nW) | W is moderate | O(nW) |
| Subset-based | O(2^n) | Never (dominated) | O(2^n) |
| (item, value) | O(nV) | V << W (values small) | O(nV) |
| Meet-in-middle | O(2^(n/2)) | n ≤ 40, W huge | O(2^(n/2)) |
When capacity W is huge but the sum of values Σv[i] is moderate, the dual formulation dp[i][v] = min weight for value v is more efficient. This demonstrates how the 'right' state depends on the problem parameters, not just the problem structure.
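The dual state can be sketched as a minimum-weight table indexed by value; the item data below is invented, and the deliberately huge W shows why this formulation survives where the O(nW) table would not:

```python
def knapsack_by_value(weights, values, W):
    """Dual formulation: g[v] = minimum weight needed to achieve total
    value exactly v. Runs in O(n * V) where V = sum(values), so it is
    efficient when values are small even if W is astronomically large."""
    V = sum(values)
    INF = float("inf")
    g = [INF] * (V + 1)
    g[0] = 0                                  # zero value costs zero weight
    for wt, val in zip(weights, values):
        for v in range(V, val - 1, -1):       # descending: each item once
            if g[v - val] + wt < g[v]:
                g[v] = g[v - val] + wt
    # Answer: the largest value achievable within weight W
    return max(v for v in range(V + 1) if g[v] <= W)

# Invented sample: with W = 10**9 the O(nW) table is hopeless,
# but the value-indexed table has only sum(values) + 1 = 16 columns.
assert knapsack_by_value([3, 4, 5], [4, 5, 6], 10**9) == 15
```

Note the same descending-loop trick as the capacity-indexed version; only the axis being swept has changed.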
The 0/1 Knapsack state design illustrates principles applicable to all DP problems. When facing a new problem, apply these guidelines:
Principle 1: Identify What Matters for Future Decisions
Ask: "If I'm at step i, what do I need to know to make optimal decisions for steps i+1 onward?" For knapsack, only the remaining capacity matters—not which specific items we chose.
Principle 2: Find the Natural Decomposition
Most DP problems decompose along one or two dimensions:
- A position or prefix dimension (e.g., an index into a sequence: items, characters, days).
- A resource or budget dimension (e.g., remaining capacity, remaining moves, a target sum).
Principle 3: Minimize State Dimensions
Each state dimension multiplies complexity. Ask: "Is this dimension actually necessary, or can it be derived from others?" Unnecessary dimensions waste time and space.
Principle 4: Ensure Polynomial State Count
The state space size directly determines DP complexity. Choose state variables with bounded ranges. Avoid states that enumerate subsets or permutations directly.
Mastering state design is perhaps the most transferable skill in dynamic programming. Once you can identify the right state for a problem, the recurrence and implementation often follow naturally.
The 0/1 Knapsack problem is NP-Hard, yet our DP solution runs in O(nW) time. How can this be? The answer lies in understanding pseudo-polynomial complexity.
Polynomial vs Pseudo-Polynomial:
A truly polynomial algorithm runs in time polynomial in the input size (number of bits needed to represent the input). For knapsack:
- The n items, with weights and values up to some bound M, require O(n log M) bits to list
- The capacity W requires O(log W) bits

Total input size: O(n log M + log W) bits.
Our O(nW) algorithm is polynomial in n but exponential in log W (the number of bits representing W). If W = 2^k, the algorithm runs in O(n · 2^k) time—exponential in the input size!
When W is given in binary, O(nW) is not polynomial in the input size. With W = 10^9, we have 10^9 capacity states, yet log₂(10^9) ≈ 30, so we perform on the order of 2^30 operations per item for just 30 bits of input. The running time is exponential in the size of the representation, not polynomial.
When Pseudo-Polynomial Is Acceptable:
1. Small W in practice: many real problems have moderate capacities (W ≤ 10^6), making O(nW) practical.
2. W scales with n: if W = O(n) or W = O(n²), then O(nW) = O(n²) or O(n³), which are genuinely polynomial.
3. Approximation alternative: for large W, fully polynomial-time approximation schemes (FPTAS) achieve (1+ε)-approximations in time polynomial in n and 1/ε; the classic value-scaling scheme runs in O(n³/ε).
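A minimal sketch of the value-scaling idea behind the FPTAS: round values down to multiples of εv_max/n, then solve exactly on the small scaled values. The returned bound is within a (1−ε) factor of optimal (assuming every item fits on its own); the sample instance is invented:

```python
def knapsack_fptas(weights, values, W, eps):
    """Value-scaling FPTAS sketch: scale values down, solve the scaled
    instance exactly with the value-indexed DP, and report the (scaled-back)
    result, which is within a (1 - eps) factor of the true optimum."""
    n = len(values)
    vmax = max(values)
    scale = eps * vmax / n
    scaled = [int(v // scale) for v in values]        # rounded-down values

    # Exact DP over scaled values: min weight to reach each scaled total
    V = sum(scaled)
    INF = float("inf")
    g = [INF] * (V + 1)
    g[0] = 0
    for wt, sv in zip(weights, scaled):
        for t in range(V, sv - 1, -1):                # each item at most once
            if g[t - sv] + wt < g[t]:
                g[t] = g[t - sv] + wt

    best_scaled = max(t for t in range(V + 1) if g[t] <= W)
    return best_scaled * scale                        # lower bound on OPT

# Invented instance: true optimum is 110 ({50, 60}); with eps = 0.3 the
# scaled DP guarantees a value of at least (1 - 0.3) * 110 = 77.
approx = knapsack_fptas([3, 4, 5], [40, 50, 60], 10, 0.3)
assert approx >= 0.7 * 110
```

The state space is O(n · Σ scaled values) = O(n²/ε), independent of W, which is what makes the scheme fully polynomial.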
Implications for Problem Solving:
- Check the magnitude of W, not just n, before committing to the O(nW) table.
- If W is huge but the total value is moderate, switch to the dual state over values.
- When exact DP is infeasible, consider an FPTAS or meet-in-the-middle search.
We've taken a deep dive into the state design of the 0/1 Knapsack problem, uncovering principles that extend far beyond this specific algorithm.
What's Next:
With the state design firmly understood, we'll explore a powerful variant: the Unbounded Knapsack problem, where each item type can be selected multiple times. This changes the recurrence structure in subtle but important ways.
You now understand why 'capacity remaining' is the key state variable in knapsack problems and how this choice enables dynamic programming. These state design principles will serve you well across the full spectrum of DP problems.