Loading content...
Imagine you're solving the Traveling Salesman Problem: given n cities, find the shortest route that visits every city exactly once and returns to the start. The brute force approach would examine all n! permutations—impossibly slow even for modest n. But what if we could reduce this to 2ⁿ × n operations?
The key insight is that for many optimization problems, what matters isn't the order of past decisions, but rather which elements have been included. If we can represent 'which elements are in our current subset' efficiently, we unlock a powerful technique called Bitmask DP.
A bitmask uses a binary number—an integer—to represent a set. Each bit position corresponds to an element: bit i is 1 if element i is in the set, 0 otherwise. With a single integer, we can represent any subset of up to 20-25 elements, and perform set operations (union, intersection, membership) in O(1) time using bitwise operators.
By the end of this page, you will:
• Understand how bitmasks represent subsets as integers • Master essential bitwise operations for DP • Solve the Traveling Salesman Problem using bitmask DP • Tackle assignment and subset-sum problems with this technique • Recognize when bitmask DP applies (and its limitations) • Appreciate the elegance of state compression
Before diving into DP, let's master the bit manipulation fundamentals. A bitmask is an integer where each bit represents whether a corresponding element is present (1) or absent (0).
Example: For a set of 5 elements {0, 1, 2, 3, 4}, the bitmask 10110 (binary) = 22 (decimal) represents the subset {1, 2, 4}.
Bit position: 4 3 2 1 0
Bitmask: 1 0 1 1 0
Meaning: ✓ ✗ ✓ ✓ ✗ → Elements {4, 2, 1}
Key Insight: A set of n elements has 2ⁿ possible subsets. Conveniently, an n-bit integer can represent exactly 2ⁿ different values (0 to 2ⁿ-1). The correspondence is perfect!
| Operation | Syntax | Effect | Example (mask=22, i=3) |
|---|---|---|---|
| Check if element i is in set | (mask >> i) & 1 | Returns 1 if present, 0 if absent | (22 >> 3) & 1 = 0 (absent) |
| Add element i to set | mask | (1 << i) | Sets bit i to 1 | 22 | (1 << 3) = 30 |
| Remove element i from set | mask & ~(1 << i) | Sets bit i to 0 | 22 & ~(1 << 2) = 18 |
| Toggle element i | mask ^ (1 << i) | Flips bit i | 22 ^ (1 << 1) = 20 |
| Set union | mask1 | mask2 | Elements in either set | 22 | 9 = 31 |
| Set intersection | mask1 & mask2 | Elements in both sets | 22 & 9 = 0 |
| Set difference | mask1 & ~mask2 | Elements in mask1 but not mask2 | 22 & ~9 = 22 |
| Count set bits (popcount) | bin(mask).count('1') | Number of elements | bin(22).count('1') = 3 |
| Check if empty | mask == 0 | True if no elements | 22 == 0 → False |
| All elements selected | mask == (1 << n) - 1 | All n bits are 1 | 22 == 31 → False |
Iterate over all subsets: for mask in range(1 << n)
Iterate over elements in a mask: while mask: i = (mask & -mask).bit_length() - 1 # Lowest set bit position mask &= mask - 1 # Remove lowest set bit
Iterate over all subsets of a mask: subset = mask while subset: # process subset subset = (subset - 1) & mask
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758
def demonstrate_bitmask_operations(): """ Comprehensive demonstration of bitmask operations for subset manipulation. """ n = 5 # Number of elements {0, 1, 2, 3, 4} # Create a subset {1, 2, 4} mask = 0 mask |= (1 << 1) # Add element 1 mask |= (1 << 2) # Add element 2 mask |= (1 << 4) # Add element 4 print(f"Subset {{1, 2, 4}} as bitmask: {mask} (binary: {bin(mask)})") # Check membership for i in range(n): present = (mask >> i) & 1 print(f" Element {i}: {'present' if present else 'absent'}") # Iterate over all 2^n subsets print(f"\nAll {1 << n} subsets of {{0, 1, 2, 3, 4}}:") for subset_mask in range(1 << n): elements = [i for i in range(n) if (subset_mask >> i) & 1] print(f" {subset_mask:2d} (binary: {bin(subset_mask):>7}) -> {elements}") # Iterate over elements in a specific mask print(f"\nIterating over elements in mask {mask}:") temp_mask = mask while temp_mask: # Get position of lowest set bit lowest_bit = temp_mask & -temp_mask # Isolates lowest set bit position = lowest_bit.bit_length() - 1 print(f" Element at position {position}") temp_mask &= temp_mask - 1 # Clear lowest set bit # Iterate over all subsets of a mask print(f"\nAll subsets of mask {mask} (subset {{1, 2, 4}}):") subset = mask while subset: elements = [i for i in range(n) if (subset >> i) & 1] print(f" {subset:2d} -> {elements}") subset = (subset - 1) & mask # Next subset print(f" 0 -> []") # Don't forget the empty subset def count_set_bits(mask: int) -> int: """Count number of 1-bits in mask. Also known as popcount or Hamming weight.""" count = 0 while mask: mask &= mask - 1 # Clear lowest set bit count += 1 return count # Alternative using built-in (Python 3.10+)# count = mask.bit_count() demonstrate_bitmask_operations()The Traveling Salesman Problem (TSP) is the quintessential bitmask DP problem. It perfectly illustrates why this technique is so powerful.
Problem Statement:
Given n cities and a distance matrix dist[i][j] representing the distance from city i to city j, find the shortest route that:
The Naive Approach:
Enumerate all permutations of cities: n! possibilities. For n=15, that's over 1.3 trillion routes—completely impractical.
The Bitmask DP Insight:
Consider the state: we're at city i, having visited exactly the set of cities represented by bitmask mask. What information do we actually need?
This is the critical insight! The minimum cost to reach city i with visited set mask is the same regardless of the path taken to get there. This overlapping substructure enables DP.
The number of distinct permutations is n!, but the number of distinct (subset, current_city) pairs is only 2ⁿ × n. For n=15:
• Permutations: 15! ≈ 1.3 × 10¹² • Bitmask states: 2¹⁵ × 15 ≈ 4.9 × 10⁵
This is a reduction by a factor of over 2 billion! Bitmask DP makes TSP tractable for small n.
Formulating the DP:
Let dp[mask][i] = minimum distance to reach city i, having visited exactly the cities in mask, and city i is included in mask.
Base Case:
dp[1][0] = 0 — we start at city 0, having 'visited' just city 0, with 0 distance traveled.dp[1][j] = ∞ for j ≠ 0 (we must start at city 0).Transition:
To arrive at city j with visited set mask (where j ∈ mask):
i where i ∈ mask and i ≠ jmask \ {j} = mask ^ (1 << j)dp[mask][j] = min over all i ∈ (mask \ {j}) {
dp[mask ^ (1 << j)][i] + dist[i][j]
}
Final Answer:
We want to visit all cities (mask = (1 << n) - 1 = all 1s) and return to city 0:
Result = min over all i ≠ 0 {
dp[(1 << n) - 1][i] + dist[i][0]
}
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138
def traveling_salesman(dist: list[list[int]]) -> int: """ Find the minimum cost Hamiltonian cycle (TSP) using bitmask DP. Args: dist: dist[i][j] = distance from city i to city j Returns: Minimum total distance for a round trip visiting all cities Time Complexity: O(2^n × n²) Space Complexity: O(2^n × n) Practical limit: n ≈ 20-22 on modern hardware """ n = len(dist) INF = float('inf') # dp[mask][i] = min distance to reach city i having visited exactly cities in mask dp = [[INF] * n for _ in range(1 << n)] # Base case: start at city 0 dp[1][0] = 0 # mask=1 means only city 0 visited # Process all masks in increasing order of set bits for mask in range(1 << n): if not (mask & 1): # Must include city 0 (our start) continue for last in range(n): # Skip if 'last' is not in the current mask if not (mask & (1 << last)): continue if dp[mask][last] == INF: continue # Try extending to each unvisited city for next_city in range(n): if mask & (1 << next_city): # Already visited continue new_mask = mask | (1 << next_city) new_cost = dp[mask][last] + dist[last][next_city] dp[new_mask][next_city] = min(dp[new_mask][next_city], new_cost) # Find minimum cost to complete the cycle full_mask = (1 << n) - 1 result = INF for last in range(1, n): # End at any city except 0 if dp[full_mask][last] != INF: result = min(result, dp[full_mask][last] + dist[last][0]) return result if result != INF else -1 def traveling_salesman_with_path(dist: list[list[int]]) -> tuple[int, list[int]]: """ TSP with path reconstruction. """ n = len(dist) INF = float('inf') dp = [[INF] * n for _ in range(1 << n)] parent = [[(-1, -1)] * n for _ in range(1 << n)] # (prev_mask, prev_city) dp[1][0] = 0 for mask in range(1 << n): if not (mask & 1): continue for last in range(n): if not (mask & (1 << last)): continue if dp[mask][last] == INF: continue for next_city in range(n): if mask & (1 << next_city): continue new_mask = mask | (1 << next_city) new_cost = dp[mask][last] + dist[last][next_city] if new_cost < dp[new_mask][next_city]: dp[new_mask][next_city] = new_cost parent[new_mask][next_city] = (mask, last) # Find best ending city full_mask = (1 << n) - 1 best_cost = INF best_last = -1 for last in range(1, n): if dp[full_mask][last] != INF: cost = dp[full_mask][last] + dist[last][0] if cost < best_cost: best_cost = cost best_last = last if best_last == -1: return -1, [] # Reconstruct path path = [0] # End with return to 0 mask, city = full_mask, best_last cities = [] while city != -1: cities.append(city) prev_mask, prev_city = parent[mask][city] mask, city = prev_mask, prev_city path = list(reversed(cities)) + [0] return best_cost, path # Exampleif __name__ == "__main__": # Distance matrix for 4 cities distances = [ [0, 10, 15, 20], [10, 0, 35, 25], [15, 35, 0, 30], [20, 25, 30, 0] ] min_dist = traveling_salesman(distances) print(f"Minimum tour distance: {min_dist}") min_dist, path = traveling_salesman_with_path(distances) print(f"Minimum tour distance: {min_dist}") print(f"Optimal path: {' -> '.join(map(str, path))}") # Verify total = sum(distances[path[i]][path[i+1]] for i in range(len(path)-1)) print(f"Verified distance: {total}")The Assignment Problem is another classic application of bitmask DP. Given n workers and n jobs, with a cost matrix cost[i][j] for assigning worker i to job j, find the minimum-cost assignment where each worker gets exactly one job and each job gets exactly one worker.
Why Bitmask DP?
As we assign workers to jobs (processing workers in order 0, 1, 2, ..., n-1), we need to track which jobs have already been taken. A bitmask perfectly represents this set of assigned jobs!
State Definition:
Let dp[i][mask] = minimum cost to assign workers 0, 1, ..., i-1 such that exactly the jobs in mask are taken.
Since worker i is the i-th worker (0-indexed) and mask has exactly i bits set (one job per worker), we can simplify:
Let dp[mask] = minimum cost to assign the first popcount(mask) workers, where mask indicates which jobs are taken.
Transition:
For the current worker i = popcount(mask), try assigning them to each available job j where j ∉ mask:
new_mask = mask | (1 << j)
dp[new_mask] = min(dp[new_mask], dp[mask] + cost[i][j])
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112
def min_cost_assignment(cost: list[list[int]]) -> int: """ Solve the assignment problem: assign n workers to n jobs minimizing total cost. Each worker gets exactly one job; each job gets exactly one worker. Args: cost: cost[i][j] = cost of assigning worker i to job j Returns: Minimum total cost of assignment Time Complexity: O(2^n × n) Space Complexity: O(2^n) """ n = len(cost) INF = float('inf') # dp[mask] = min cost when jobs in 'mask' are assigned to first popcount(mask) workers dp = [INF] * (1 << n) dp[0] = 0 # No jobs assigned, no cost for mask in range(1 << n): if dp[mask] == INF: continue # Current worker index = number of jobs already assigned = popcount(mask) worker = bin(mask).count('1') if worker >= n: # All workers assigned continue # Try assigning this worker to each unassigned job for job in range(n): if mask & (1 << job): # Job already taken continue new_mask = mask | (1 << job) dp[new_mask] = min(dp[new_mask], dp[mask] + cost[worker][job]) return dp[(1 << n) - 1] def min_cost_assignment_with_solution(cost: list[list[int]]) -> tuple[int, list[int]]: """ Assignment problem with assignment reconstruction. Returns: (min_cost, assignment) where assignment[i] = job assigned to worker i """ n = len(cost) INF = float('inf') dp = [INF] * (1 << n) parent = [-1] * (1 << n) # Which job was added to reach this mask dp[0] = 0 for mask in range(1 << n): if dp[mask] == INF: continue worker = bin(mask).count('1') if worker >= n: continue for job in range(n): if mask & (1 << job): continue new_mask = mask | (1 << job) new_cost = dp[mask] + cost[worker][job] if new_cost < dp[new_mask]: dp[new_mask] = new_cost parent[new_mask] = job # Reconstruct assignment final_mask = (1 << n) - 1 assignment = [0] * n mask = final_mask for worker in range(n - 1, -1, -1): job = parent[mask] assignment[worker] = job mask ^= (1 << job) # Remove this job return dp[final_mask], assignment # Exampleif __name__ == "__main__": # Cost matrix: cost[i][j] = cost for worker i to do job j cost_matrix = [ [9, 2, 7, 8], [6, 4, 3, 7], [5, 8, 1, 8], [7, 6, 9, 4] ] min_cost = min_cost_assignment(cost_matrix) print(f"Minimum assignment cost: {min_cost}") min_cost, assignment = min_cost_assignment_with_solution(cost_matrix) print(f"Minimum assignment cost: {min_cost}") print(f"Assignment (worker -> job): {assignment}") for worker, job in enumerate(assignment): print(f" Worker {worker} -> Job {job} (cost: {cost_matrix[worker][job]})") # Verify total = sum(cost_matrix[w][assignment[w]] for w in range(len(cost_matrix))) print(f"Verified total: {total}")The Assignment Problem can also be solved using the Hungarian Algorithm in O(n³) time—much better than O(2ⁿ × n) for large n. However, bitmask DP is valuable when:
• n is small (≤ 20) • The problem has additional constraints that make Hungarian inapplicable • You need to count or enumerate solutions, not just find one optimal • The cost function is complex or non-standard
While classic subset sum uses a different DP formulation (tracking achievable sums), some variants benefit from bitmask representation, especially when we need to track which elements are used or partition into multiple groups.
Partition into K Equal Sum Subsets (LeetCode 698):
Given an integer array nums and an integer k, determine if it's possible to divide the array into k non-empty subsets with equal sum.
Approach:
If total sum is divisible by k, each subset must sum to target = sum/k. We use bitmasks to track which elements have been assigned.
State: dp[mask] = True if elements in mask can be perfectly partitioned into some number of groups, each summing to target.
Alternatively, dp[mask] = the remainder left in the current partial bucket after assigning elements in mask.
The Insight:
If we process elements and track 'how full is the current bucket', whenever a bucket fills (remainder reaches target), we automatically start a new bucket. We just need to verify we can fill all buckets.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119
def can_partition_k_subsets(nums: list[int], k: int) -> bool: """ Determine if array can be partitioned into k subsets with equal sum. Approach: Bitmask DP where dp[mask] = remainder in current bucket when elements in mask have been assigned. Time Complexity: O(2^n × n) Space Complexity: O(2^n) """ total = sum(nums) # Quick checks if total % k != 0: return False target = total // k if max(nums) > target: return False n = len(nums) # dp[mask] = remainder in current bucket after using elements in mask # -1 means this state is unreachable dp = [-1] * (1 << n) dp[0] = 0 # No elements used, bucket is empty for mask in range(1 << n): if dp[mask] == -1: continue current_remainder = dp[mask] for i in range(n): if mask & (1 << i): # Element already used continue new_remainder = current_remainder + nums[i] if new_remainder > target: # Exceeds bucket capacity continue if new_remainder == target: new_remainder = 0 # Start fresh bucket new_mask = mask | (1 << i) # Only update if not yet reached (avoids issues with multiple paths) if dp[new_mask] == -1: dp[new_mask] = new_remainder # Check if we can use all elements with complete buckets full_mask = (1 << n) - 1 return dp[full_mask] == 0 def can_partition_k_subsets_optimized(nums: list[int], k: int) -> bool: """ Optimized version with sorting to enable early termination. """ total = sum(nums) if total % k != 0: return False target = total // k if max(nums) > target: return False # Sort descending - larger elements are harder to place, try them first nums = sorted(nums, reverse=True) n = len(nums) dp = [-1] * (1 << n) dp[0] = 0 for mask in range(1 << n): if dp[mask] == -1: continue current = dp[mask] for i in range(n): if mask & (1 << i): continue # Pruning: if same element as before was skipped, skip this too # (only for consecutive equal elements) if i > 0 and not (mask & (1 << (i - 1))) and nums[i] == nums[i - 1]: continue new_val = current + nums[i] if new_val > target: continue new_mask = mask | (1 << i) if dp[new_mask] == -1: dp[new_mask] = new_val % target return dp[(1 << n) - 1] == 0 # Exampleif __name__ == "__main__": # Example 1 nums1 = [4, 3, 2, 3, 5, 2, 1] k1 = 4 print(f"nums={nums1}, k={k1}: {can_partition_k_subsets(nums1, k1)}") # Target sum = 20/4 = 5 # Possible partition: {5}, {1,4}, {2,3}, {2,3} # Example 2 nums2 = [1, 2, 3, 4] k2 = 3 print(f"nums={nums2}, k={k2}: {can_partition_k_subsets(nums2, k2)}") # Sum = 10, not divisible by 3Bitmask DP is powerful but has strict applicability conditions. Learning to recognize suitable problems saves time and prevents applying the wrong technique.
2^20 ≈ 10^6 — feasible 2^25 ≈ 3.3 × 10^7 — borderline, but often okay with fast inner loops 2^30 ≈ 10^9 — typically too slow for online judges
Always check the constraint on n. If n can be 30+, look for a different approach.
| Problem Category | State Representation | Key Insight | Example Problems |
|---|---|---|---|
| Traveling Salesman | dp[mask][current] | Track visited cities, not order | Shortest superstring, Hamiltonian path |
| Assignment/Matching | dp[mask] for assigned items | Match items to positions one by one | Minimum cost assignment, special subsequences |
| Partition into subsets | dp[mask] for partition progress | Track bucket fill level | Equal sum partition, fair candy swap |
| Subset selection | dp[mask] for selected subset | Optimal subset satisfying constraints | Maximum AND, set covering |
| Constraint satisfaction | dp[mask] for satisfied constraints | Which constraints are met | Compatible groups, scheduling |
Alternatives When n Is Too Large:
If the bitmask approach seems right but n is too large:
Meet in the Middle: Split elements into two halves, compute results for each half (2^(n/2) each), then combine. Reduces O(2ⁿ) to O(2^(n/2) × log factor).
Exponential-time exact algorithms with pruning: Branch and bound, backtracking with good heuristics.
Polynomial-time special cases: Some problems that look exponential have polynomial solutions (e.g., Hungarian for assignment, Edmonds' blossom for matching).
Approximation algorithms: For NP-hard problems at scale, settle for near-optimal solutions.
Efficient bitmask DP implementation requires attention to both correctness and performance. Here are key techniques:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114
"""Generic Bitmask DP Template This template covers the most common pattern: tracking selected elementsand an auxiliary 'current' element or position.""" def bitmask_dp_template(n: int, cost_function) -> int: """ Template for bitmask DP with state dp[mask][current]. Args: n: Number of elements cost_function: cost_function(mask, current, next) returns the cost of adding 'next' when at 'current' with 'mask' selected Returns: Minimum cost to process all elements """ INF = float('inf') # dp[mask][current] = min cost to reach this state dp = [[INF] * n for _ in range(1 << n)] # Base case: start from any single element (modify as needed) for start in range(n): dp[1 << start][start] = 0 # Or initial cost for starting here # Process all masks for mask in range(1, 1 << n): for current in range(n): # Skip if current is not in mask if not (mask & (1 << current)): continue if dp[mask][current] == INF: continue # Try transitioning to each element not in mask for next_elem in range(n): if mask & (1 << next_elem): # Already selected continue new_mask = mask | (1 << next_elem) new_cost = dp[mask][current] + cost_function(mask, current, next_elem) dp[new_mask][next_elem] = min(dp[new_mask][next_elem], new_cost) # Answer: minimum over all ending positions with full mask full_mask = (1 << n) - 1 return min(dp[full_mask][end] for end in range(n)) def bitmask_dp_simple_template(n: int) -> int: """ Simpler template without current position tracking. Use when you only care about which elements are selected, not which one was added last. """ INF = float('inf') # Precompute popcount for efficiency popcount = [bin(i).count('1') for i in range(1 << n)] dp = [INF] * (1 << n) dp[0] = 0 # Base case: empty set for mask in range(1 << n): if dp[mask] == INF: continue num_selected = popcount[mask] # Try adding each unselected element for i in range(n): if mask & (1 << i): continue new_mask = mask | (1 << i) cost = compute_cost(mask, num_selected, i) # Problem-specific dp[new_mask] = min(dp[new_mask], dp[mask] + cost) return dp[(1 << n) - 1] def compute_cost(mask: int, num_selected: int, new_element: int) -> int: """Placeholder: compute cost of adding new_element.""" return 1 # Replace with actual logic # Efficient subset enumerationdef enumerate_subsets_of_mask(mask: int) -> list[int]: """Enumerate all subsets of a given mask (including empty set).""" subsets = [] subset = mask while subset: subsets.append(subset) subset = (subset - 1) & mask subsets.append(0) # Include empty subset return subsets # Example: count subsets with propertydef count_subsets_with_sum(nums: list[int], target: int) -> int: """Count subsets that sum to exactly target.""" n = len(nums) count = 0 for mask in range(1 << n): subset_sum = sum(nums[i] for i in range(n) if mask & (1 << i)) if subset_sum == target: count += 1 return countBitmask DP is a powerful technique that transforms intractable combinatorial problems into manageable ones by representing subsets as integers. Let's consolidate the key concepts:
You now understand Bitmask DP—a technique that makes exponential search tractable through clever state representation. The next page explores DP on Trees, where the structure of tree graphs defines the DP state and transitions, enabling efficient solutions for hierarchical optimization problems.