In computer science, dynamic programming is an algorithmic technique that tames combinatorial search by recognizing and exploiting overlapping subproblems: each subproblem is solved once and its result reused. It's the secret weapon behind efficient solutions to shortest paths, sequence alignment, knapsack problems, and query optimization.
The insight that join ordering exhibits optimal substructure—the optimal plan for tables {A,B,C} builds upon optimal plans for subsets like {A,B} and {C}—is one of the most important discoveries in database systems. The System R optimizer (IBM, 1979) pioneered the use of dynamic programming for join ordering, and its approach remains the foundation of most cost-based optimizers today.
This page takes you deep into how dynamic programming makes optimal join ordering tractable, shrinking a search over a factorial number of join orderings down to O(3ⁿ) work, and to polynomial work for common query-graph shapes such as chains.
By the end of this page, you will understand: (1) The principle of optimal substructure in join ordering, (2) The System R dynamic programming algorithm, (3) How memoization tables store intermediate results, (4) Complexity analysis and space requirements, (5) Extensions for interesting orders and physical properties, and (6) When dynamic programming becomes impractical.
Dynamic programming works when a problem has optimal substructure: the optimal solution to the whole problem is composed of optimal solutions to its subproblems.
The join ordering insight:
Consider finding the optimal plan for joining tables {A, B, C}. Any final join must combine two disjoint subsets:
{A, B, C} = {A, B} ⋈ {C}
= {A, C} ⋈ {B}
= {A} ⋈ {B, C}
= {B, C} ⋈ {A}
= {B} ⋈ {A, C}
= {C} ⋈ {A, B}
Critical observation: For the plan {A, B} ⋈ {C} to be optimal overall, the subplan for {A, B} must itself be optimal among all ways to join A and B.
Why? If there were a better way to join A and B (call it {A, B}'), then {A, B}' ⋈ {C} would be cheaper than {A, B} ⋈ {C}, contradicting our assumption that the original plan was optimal.
Formal statement:
Let OPT(S) denote the cost of the optimal plan for joining the set of tables S. Then:
OPT(S) = min over all ways to split S into two non-empty, disjoint subsets S₁ and S₂ (S₁ ∪ S₂ = S, S₁ ∩ S₂ = ∅) of:
OPT(S₁) + OPT(S₂) + JoinCost(S₁, S₂)
The optimal cost for S is the minimum over all ways to partition S into two non-empty subsets, of the cost to optimally join each subset plus the cost to join them together.
This property—that optimal solutions contain optimal subsolutions—is Bellman's Principle of Optimality (1957). It's the mathematical foundation that makes dynamic programming work, applicable from shortest paths to edit distance to query optimization.
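To make the recurrence concrete, here is a minimal memoized (top-down) sketch in C++. The names optCost, scanCost, and joinCost are illustrative assumptions, not taken from System R; subsets are represented as bitmasks, a convention the memoization-table section below expands on.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>
#include <algorithm>
#include <limits>

// Top-down, cost-only version of the recurrence OPT(S).
// scanCost[i] is the cheapest single-table cost for table i;
// joinCost(S1, S2) is a caller-supplied estimate for joining two intermediate results.
double optCost(uint32_t S,
               const std::vector<double>& scanCost,
               double (*joinCost)(uint32_t, uint32_t),
               std::unordered_map<uint32_t, double>& memo) {
    // Base case: a single table is just its cheapest access path.
    if ((S & (S - 1)) == 0) {
        int i = __builtin_ctz(S);                 // index of the lone table
        return scanCost[i];
    }
    auto it = memo.find(S);
    if (it != memo.end()) return it->second;      // reuse a previously solved subproblem

    double best = std::numeric_limits<double>::infinity();
    // Try every split of S into two non-empty, disjoint halves S1 and S2.
    for (uint32_t S1 = (S - 1) & S; S1 > 0; S1 = (S1 - 1) & S) {
        uint32_t S2 = S ^ S1;
        double c = optCost(S1, scanCost, joinCost, memo)
                 + optCost(S2, scanCost, joinCost, memo)
                 + joinCost(S1, S2);
        best = std::min(best, c);
    }
    memo[S] = best;
    return best;
}
```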
The System R optimizer (Selinger et al., 1979) introduced the dynamic programming approach to join ordering. Here's the algorithm:
Phase 1: Single-table access plans
For each table R in the query, enumerate all access methods:
Store the best plan(s) for each table in a table keyed by {R}.
Phase 2: Build optimal plans bottom-up
```
// Dynamic Programming for Join Ordering (System R Style)
// Input:  Set of tables T = {T₁, T₂, ..., Tₙ}
// Output: Optimal plan for joining all tables

function OptimalJoinPlan(T):
    // Step 1: Initialize single-table plans
    for each table Tᵢ in T:
        for each access method A for Tᵢ:
            plan = CreateAccessPlan(Tᵢ, A)
            cost = ComputeCost(plan)
            UpdateBestPlan({Tᵢ}, plan, cost)

    // Step 2: Build plans for increasingly larger subsets
    for size = 2 to |T|:
        for each subset S of T where |S| = size:
            for each non-empty proper subset S₁ of S:
                S₂ = S - S₁
                if S₂ is empty: continue

                // Get best plans for subproblems
                leftPlan = GetBestPlan(S₁)
                rightPlan = GetBestPlan(S₂)

                // Try all join methods
                for each joinMethod in {NL, SMJ, HJ, ...}:
                    plan = CreateJoinPlan(leftPlan, rightPlan, joinMethod)
                    cost = Cost(leftPlan) + Cost(rightPlan) + JoinCost(joinMethod, S₁, S₂)
                    UpdateBestPlan(S, plan, cost)

    // Step 3: Return best plan for all tables
    return GetBestPlan(T)
```
Example walkthrough:
Joining tables {A, B, C}:
Iteration 1 (size = 1):
Best({A}) = SeqScan(A), cost = 100
Best({B}) = IdxScan(B), cost = 50
Best({C}) = SeqScan(C), cost = 200
Iteration 2 (size = 2):
Consider {A, B}:
Plan: Best({A}) NLJ Best({B}) = NLJ(SeqScan(A), IdxScan(B)), cost = 100 + 50 + 500 = 650
Plan: Best({A}) HJ Best({B}) = HJ(...), cost = 100 + 50 + 150 = 300
→ Best({A,B}) = HJ(...), cost = 300
Consider {A, C}:
Plan: HJ(...), cost = 400
→ Best({A,C}) = HJ(...), cost = 400
Consider {B, C}:
Plan: HJ(...), cost = 350
→ Best({B,C}) = HJ(...), cost = 350
Iteration 3 (size = 3):
Consider {A, B, C}:
Plan: Best({A,B}) ⋈ Best({C}) = HJ(...) ⋈ SeqScan(C)
cost = 300 + 200 + 180 = 680
Plan: Best({A,C}) ⋈ Best({B}) = HJ(...) ⋈ IdxScan(B)
cost = 400 + 50 + 120 = 570
Plan: Best({A}) ⋈ Best({B,C}) = SeqScan(A) ⋈ HJ(...)
cost = 100 + 350 + 200 = 650
→ Best({A,B,C}) = Best({A,C}) ⋈ Best({B}), cost = 570
Final answer: Join A with C first (hash join), then join the result with B.
The heart of the dynamic programming approach is the memoization table (often called the DP table or plan dictionary)—a data structure that stores optimal subplans.
Table structure:
The DP table maps subsets of tables to their optimal plans:
Key: S (a subset of tables, often represented as a bitmask)
Value: {
best_plan: the optimal plan for joining S,
best_cost: the cost of that plan,
output_cardinality: rows produced,
physical_properties: ordering, partitioning, etc.
}
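As a concrete illustration, a DP-table entry might be modeled like the C++ sketch below; the field and type names are assumptions for exposition, not taken from any particular system.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

struct PlanNode;                              // physical plan tree (definition elided)

// One memoization-table entry: the best plan found so far for a subset of tables.
struct DpEntry {
    std::shared_ptr<PlanNode> best_plan;      // optimal plan for joining this subset
    double best_cost = 0.0;                   // estimated cost of that plan
    double output_cardinality = 0.0;          // estimated rows produced
    std::vector<std::string> sort_order;      // physical property: output ordering, if any
};

// The DP table itself: subset bitmask -> best entry for that subset.
using DpTable = std::unordered_map<uint32_t, DpEntry>;
```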
Efficient subset representation:
For n tables, subsets are often represented as n-bit integers (bitmasks):
Tables: A, B, C, D (4 tables)
{A} → 0001 = 1
{B} → 0010 = 2
{C} → 0100 = 4
{D} → 1000 = 8
{A, B} → 0011 = 3
{A, C} → 0101 = 5
{A, B, C, D} → 1111 = 15
This allows:
Membership test: (S & (1 << i)) != 0 checks whether table i is in S
Union: S₁ | S₂
Set difference: S₁ & ~S₂
Iteration over subsets:
To enumerate all subsets of a set S, we use bit tricks:
```c
// Enumerate all non-empty proper subsets of S, paired with their complements
void enumerateSubsets(int S) {
    // (subset - 1) & S yields the next smaller subset of S;
    // starting from (S - 1) & S skips S itself, and the loop stops before 0
    for (int subset = (S - 1) & S; subset > 0; subset = (subset - 1) & S) {
        int complement = S ^ subset;   // S₂ = S - S₁, never empty inside this loop
        // Process partition (subset, complement)
        processPartition(subset, complement);
    }
}

// Example: S = 0111 (binary) = {A, B, C}
// Subsets visited: 110, 101, 100, 011, 010, 001
// Paired with complements: (110, 001), (101, 010), (100, 011),
//                          (011, 100), (010, 101), (001, 110)
// i.e. ({B,C}, {A}), ({A,C}, {B}), ({C}, {A,B}) and their mirror images
```
The expression (subset - 1) & S produces the next smaller subset of S. This classic bit-manipulation technique enables iterating through all 2ⁿ subsets of an n-element set efficiently, which is essential for dynamic programming over subsets.
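Putting the subset enumeration and the memoization table together, here is a compact bottom-up, cost-only sketch of the DP loop in C++; plan construction is omitted, and scanCost and joinCost are assumed inputs rather than parts of any real optimizer's API.

```cpp
#include <cstdint>
#include <vector>
#include <algorithm>
#include <limits>

// Bottom-up join-ordering DP over bitmask subsets.
// scanCost[i] is the best single-table cost for table i;
// joinCost(S1, S2) estimates the cost of joining the two intermediate results.
double optimalJoinCost(const std::vector<double>& scanCost,
                       double (*joinCost)(uint32_t, uint32_t)) {
    const int n = static_cast<int>(scanCost.size());
    const uint32_t full = (1u << n) - 1;
    const double INF = std::numeric_limits<double>::infinity();
    std::vector<double> best(1u << n, INF);

    // Size-1 subsets: single-table access plans.
    for (int i = 0; i < n; ++i)
        best[1u << i] = scanCost[i];

    // Larger subsets in increasing numeric order; every proper subset of S
    // is numerically smaller than S, so its entry is already final.
    for (uint32_t S = 1; S <= full; ++S) {
        if ((S & (S - 1)) == 0) continue;              // skip singletons
        for (uint32_t S1 = (S - 1) & S; S1 > 0; S1 = (S1 - 1) & S) {
            uint32_t S2 = S ^ S1;
            best[S] = std::min(best[S], best[S1] + best[S2] + joinCost(S1, S2));
        }
    }
    return best[full];
}
```
Each unordered partition is visited twice here, once as (S₁, S₂) and once mirrored; a production implementation would typically halve that work and also record which split and join method produced the minimum so the plan tree can be reconstructed.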
Let's analyze the time and space complexity of the dynamic programming algorithm.
Number of subproblems:
For n tables, there are 2ⁿ possible subsets. We solve the optimization problem for each subset.
Work per subproblem:
For a subset S of size k, we consider all partitions into non-empty subsets. The number of such partitions is:
(Number of non-empty proper subsets) / 2 = (2ᵏ - 2) / 2 = 2ᵏ⁻¹ - 1
For each partition, we try multiple join methods (constant factor j).
Total time complexity:
Summing over all subsets:
| Component | Formula | Explanation |
|---|---|---|
| Subsets of size k | C(n,k) | n choose k ways to pick k tables |
| Partitions per subset | 2^(k-1) - 1 | Ways to split k elements into 2 groups |
| Join methods | j | Constant (NL, HJ, SMJ, etc.) |
| Total | ∑ C(n,k) × 2^(k-1) × j ≈ O(3ⁿ × j) | Sum over all k from 2 to n |
Derivation of O(3ⁿ):
Using the binomial identity:
∑(k=0 to n) C(n,k) × 2ᵏ = (1+2)ⁿ = 3ⁿ
Our sum, ∑(k=2 to n) C(n,k) × (2ᵏ⁻¹ - 1) × j, is bounded above by (j/2) × 3ⁿ, yielding O(3ⁿ) time complexity when the number of join methods j is treated as a constant.
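As a quick sanity check on the counting argument, the short C++ program below brute-force counts the ordered (S₁, S₂) splits examined across all subsets and prints the total next to 3ⁿ; it is purely illustrative.

```cpp
#include <cstdint>
#include <cstdio>
#include <cmath>

int main() {
    for (int n = 2; n <= 12; ++n) {
        const uint32_t full = (1u << n) - 1;
        long long splits = 0;
        for (uint32_t S = 1; S <= full; ++S)
            for (uint32_t S1 = (S - 1) & S; S1 > 0; S1 = (S1 - 1) & S)
                ++splits;                       // one ordered (S1, S \ S1) pair
        // Exact count: 3^n - 2^(n+1) + 1 ordered splits, i.e. Θ(3^n)
        std::printf("n=%2d  splits=%10lld  3^n=%10.0f\n", n, splits, std::pow(3.0, n));
    }
    return 0;
}
```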
Space complexity:
We store one entry per subset: O(2ⁿ) entries.
Each entry contains a plan (O(n) for tree structure) and metadata: O(n × 2ⁿ) total.
Practical limits:
| Tables | Subsets (2ⁿ) | Work (3ⁿ) |
|---|---|---|
| 5 | 32 | 243 |
| 10 | 1,024 | 59,049 |
| 15 | 32,768 | 14.3 million |
| 20 | ~1 million | ~3.5 billion |
| 25 | ~33 million | ~850 billion |
Observations:
While O(3ⁿ) is exponential, it's vastly better than the naive O(n!) for exploring all join orderings. For 10 tables, 3^10 ≈ 59,000 vs 10! = 3.6 million. DP achieves optimality with manageable cost for typical query sizes.
The basic DP algorithm keeps one best plan per subset. But this can miss globally optimal plans that leverage sorted outputs. Interesting orders extend DP to track multiple plans per subset.
The problem:
Consider:
SELECT * FROM A JOIN B ON A.x = B.x ORDER BY A.x;
If we keep only the cheapest plan for {A,B} (the hash join at cost 100), we miss that a sort-merge join (cost 130) is globally better because its output is already sorted on A.x, avoiding the final sort.
Solution: Track plans per interesting order
Instead of:
DP[{A,B}] = <best_plan, cost>
We maintain:
DP[{A,B}][no_order] = <HJ plan, cost=100>
DP[{A,B}][ordered_by_A.x] = <SMJ plan, cost=130>
DP[{A,B}][ordered_by_B.y] = <another SMJ plan, cost=140>
Identifying interesting orders:
An order is "interesting" if it benefits a downstream operator:
Modified DP with interesting orders:
for each subset S:
for each interesting order O (including "no order"):
for each partition (S₁, S₂) of S:
for each order O₁ compatible with O for S₁:
for each order O₂ compatible with O for S₂:
leftPlan = DP[S₁][O₁]
rightPlan = DP[S₂][O₂]
for each join method:
resultOrder = ComputeOutputOrder(join_method, O₁, O₂)
if resultOrder matches O or no sorting needed:
cost = Cost(leftPlan) + Cost(rightPlan) + JoinCost
UpdateDP(S, resultOrder, plan, cost)
Complexity impact:
If there are k interesting orders, the DP table keeps up to k + 1 plans per subset (one per order plus "no order"), so table size grows to O(k × 2ⁿ) and the work per partition grows by a corresponding factor.
Typically k is small (< 10 for most queries), so overhead is manageable.
A plan with order O₁ can still be pruned if another plan produces the same order at lower cost. We only keep the Pareto-optimal frontier: plans that are best for at least one requirement. Plans dominated in all properties are discarded.
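A minimal sketch of that pruning rule in C++, assuming each candidate carries only its cost and output ordering; the names are illustrative, and for simplicity two orders are treated as equivalent only when identical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct Candidate {
    double cost;
    std::vector<std::string> order;   // output sort columns; empty means "no order"
};

// Keep only the Pareto-optimal frontier: a candidate survives unless some kept
// plan produces the same output order at lower or equal cost.
void prune(std::vector<Candidate>& kept, const Candidate& c) {
    for (const Candidate& k : kept)
        if (k.order == c.order && k.cost <= c.cost)
            return;                            // c is dominated: drop it
    // c survives; remove any kept plan it now dominates
    kept.erase(std::remove_if(kept.begin(), kept.end(),
                              [&](const Candidate& k) {
                                  return k.order == c.order && k.cost > c.cost;
                              }),
               kept.end());
    kept.push_back(c);
}
```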
Interesting orders are a special case of physical properties—characteristics of plan outputs beyond logical equivalence. Modern optimizers generalize DP to handle multiple property types:
Property Types:
| Property | Description | Optimization Impact |
|---|---|---|
| Ordering | Columns output is sorted by | Avoids sorts for ORDER BY, merge joins |
| Partitioning | How data is distributed across nodes | Enables collocated joins in parallel systems |
| Replication | Whether data is fully replicated | Enables broadcast joins |
| Compression | Whether output remains compressed | Reduces memory and I/O |
| Grouping | Columns output is grouped by | Enables streaming aggregation |
| Uniqueness | Columns that form a key | Optimizes distinct, semi-joins |
The requirement and provider model:
The Cascades/Volcano optimizer framework formalizes this:
Requirements: Properties a parent operator needs from its child
Providers: Properties an operator's output provides
Enforcers: Operators that add missing properties
Optimization as requirement propagation:
Plan for ORDER BY A.x:
Require: output sorted by A.x
Option 1: Find plan already sorted by A.x
→ Use as-is
Option 2: Find cheapest unsorted plan + Sort enforcer
→ Add explicit sort
Pick cheaper option
This generalizes beautifully to any number of property types.
Properties can subsume each other. Sorting by (A, B) subsumes the requirement for sorting by (A): a plan sorted by (A, B) satisfies any request for ordering on (A) alone. Optimizers exploit this to avoid keeping redundant plan copies.
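The C++ sketch below illustrates both the enforcer choice and prefix subsumption under simple assumptions: costs are scalars, the only property is a sort order, and addSort is a hypothetical helper that wraps a plan in an explicit Sort enforcer.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct Plan {
    double cost;
    std::vector<std::string> order;   // sort order the plan's output provides
};

// Hypothetical enforcer: wrap a plan in an explicit Sort on the required columns.
Plan addSort(const Plan& child, const std::vector<std::string>& required,
             double sortCost) {
    return Plan{child.cost + sortCost, required};
}

// A plan whose order starts with the required columns satisfies the requirement
// (sorting by (A, B) subsumes a request for (A)).
bool satisfies(const Plan& p, const std::vector<std::string>& required) {
    if (required.size() > p.order.size()) return false;
    return std::equal(required.begin(), required.end(), p.order.begin());
}

// Pick the cheaper of: a plan that already provides the order, or the cheapest
// plan of any order plus a Sort enforcer on top.
Plan satisfyRequirement(const Plan& cheapestProviding, const Plan& cheapestOverall,
                        const std::vector<std::string>& required, double sortCost) {
    Plan enforced = addSort(cheapestOverall, required, sortCost);
    return (satisfies(cheapestProviding, required) &&
            cheapestProviding.cost <= enforced.cost) ? cheapestProviding : enforced;
}
```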
A significant optimization reduces the search space by avoiding Cartesian products (cross-joins without join predicates). In most queries, cross products are extremely expensive and rarely optimal.
The connected subgraph restriction:
Instead of considering all 2ⁿ subsets, only consider subsets where the tables are connected in the query's join graph:
Query: A JOIN B ON A.x = B.x
JOIN C ON B.y = C.y
JOIN D ON C.z = D.z
Join Graph:
A ─── B ─── C ─── D
Connected subsets (keep):
{A}, {B}, {C}, {D}
{A, B}, {B, C}, {C, D}
{A, B, C}, {B, C, D}
{A, B, C, D}
Disconnected subsets (skip):
{A, C}, {A, D}, {B, D}
{A, C, D} (no edge A-C or A-D)
etc.
Benefits:
Query graph shapes and complexity:
| Graph Shape | Connected Subsets | Complexity |
|---|---|---|
| Chain (A-B-C-D) | O(n²) | Polynomial |
| Star (all connect to center) | O(2^(n-1)) | Exponential but smaller |
| Clique (all connected) | O(2ⁿ) | Full exponential |
| Cycle | O(n²) | Polynomial, slightly above a chain |
For well-structured queries (chains, stars), optimization is dramatically faster.
Implementation detail:
Enumerating connected subsets requires tracking the join graph. During DP:
for each connected subset S ordered by size:
for each way to split S into S₁ and S₂ where:
- S₁ and S₂ are both connected
- S₁ and S₂ share a join edge (the joining predicate)
combine plans for S₁ and S₂
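One common building block is a connectivity test over bitmask subsets. The C++ sketch below assumes an adjacency array adj[i] whose bits mark the tables that table i shares a join predicate with; it is one straightforward way to implement the check, not the only one.

```cpp
#include <cstdint>
#include <vector>

// Returns true if the tables in subset S form a connected subgraph of the join
// graph. adj[i] is a bitmask of the tables adjacent to table i.
bool isConnected(uint32_t S, const std::vector<uint32_t>& adj) {
    if (S == 0) return false;
    uint32_t start = S & (~S + 1u);                 // lowest set bit as the seed
    uint32_t reached = start;
    uint32_t frontier = start;
    while (frontier != 0) {
        uint32_t next = 0;
        for (uint32_t f = frontier; f != 0; f &= f - 1) {
            int i = __builtin_ctz(f);               // a table on the frontier
            next |= adj[i] & S;                     // its neighbors inside S
        }
        frontier = next & ~reached;                 // newly reached tables
        reached |= frontier;
    }
    return reached == S;                            // did the walk cover all of S?
}
```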
Sometimes cross products are unavoidable or even optimal—e.g., joining a tiny lookup table where broadcasting is cheap. Smart optimizers check if pruned cross products might actually win and evaluate them if potentially beneficial.
When query size exceeds DP's practical limits, optimizers employ alternative strategies:
1. Greedy Heuristics
Build the plan greedily, always making the locally optimal choice:
remaining = all tables
while remaining has more than one entry:
    find the pair (R, S) in remaining with the minimum estimated join cost
    create plan: R ⋈ S
    remove R and S from remaining; add the join result as a new entry
Fast (O(n³) with the naive rescan above, or O(n² log n) with a priority queue over candidate pairs), but with no optimality guarantee.
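A minimal C++ sketch of this greedy pairing, assuming a caller-supplied pairCost estimate over bitmask subsets; it returns only the total estimated cost, not a plan tree.

```cpp
#include <cstdint>
#include <vector>
#include <limits>

// Greedy join ordering: repeatedly merge the pair of intermediate results with
// the cheapest estimated join until one result remains.
double greedyJoinCost(int n, double (*pairCost)(uint32_t, uint32_t)) {
    std::vector<uint32_t> remaining;
    for (int i = 0; i < n; ++i) remaining.push_back(1u << i);  // start with single tables

    double total = 0.0;
    while (remaining.size() > 1) {
        size_t bi = 0, bj = 1;
        double best = std::numeric_limits<double>::infinity();
        // Scan all pairs for the locally cheapest join.
        for (size_t i = 0; i < remaining.size(); ++i)
            for (size_t j = i + 1; j < remaining.size(); ++j) {
                double c = pairCost(remaining[i], remaining[j]);
                if (c < best) { best = c; bi = i; bj = j; }
            }
        total += best;
        uint32_t merged = remaining[bi] | remaining[bj];
        remaining.erase(remaining.begin() + bj);   // erase the larger index first
        remaining.erase(remaining.begin() + bi);
        remaining.push_back(merged);
    }
    return total;
}
```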
2. Randomized Search
Genetic algorithms, simulated annealing, or random restarts:
current_plan = random initial plan
best_plan = current_plan
for iterations in budget:
neighbor = apply random transformation to current_plan
if Cost(neighbor) < Cost(current_plan):
current_plan = neighbor
else:
current_plan = neighbor with probability P (annealing)
best_plan = min(best_plan, current_plan)
No optimality guarantee, but often finds good plans for large queries.
3. Iterative Improvement
Start with heuristic plan, improve locally:
current = greedy_plan()
while improving:
for each transformation:
if Cost(transform(current)) < Cost(current):
current = transform(current)
4. Limited Expansion
Apply DP to small subqueries, heuristics for overall structure:
for each block of 5-10 tables:
optimal_subplan[block] = DP_optimize(block)
overall_plan = greedy_combine(optimal_subplan[])
| Tables | Recommended Approach | Optimality | Typical Time |
|---|---|---|---|
| ≤ 6 | Full DP | Guaranteed optimal | < 1 ms |
| 7-12 | DP with pruning | Optimal for considered space | < 100 ms |
| 13-20 | DP + greedy fallback | Near-optimal | < 1 s |
| > 20 | GEQO / Heuristics | Good (no guarantee) | Configurable |
PostgreSQL: Uses DP for small queries, GEQO (genetic) for large. SQL Server: Uses DP with sophisticated pruning and timeouts. Oracle: Uses DP with branch-and-bound and extensive heuristics. Commercial systems often use proprietary extensions that blend approaches.
We've explored the algorithmic heart of cost-based optimization. Let's consolidate the key insights:
Join ordering has optimal substructure, so the optimal plan for a set of tables is built from optimal plans for its subsets.
The System R algorithm builds plans bottom-up over subsets, memoizing the best plan (plus cost, cardinality, and physical properties) per subset.
The approach runs in O(3ⁿ) time and O(2ⁿ) space, exponential but far better than enumerating all n! orderings.
Interesting orders and other physical properties require keeping several Pareto-optimal plans per subset.
Restricting enumeration to connected subgraphs of the join graph shrinks the search space dramatically for chain- and star-shaped queries.
For very large queries, optimizers fall back to greedy, randomized, or hybrid strategies.
What's next:
With dynamic programming covering join ordering, a crucial structural decision remains: Should join plans be left-deep trees (linear chains) or bushy trees (balanced structures)? The next page explores Bushy vs Left-Deep Trees—the trade-offs between plan shape, parallelism, and optimization complexity.
You now understand how dynamic programming makes optimal join ordering tractable. This elegant algorithm, recognizing optimal substructure and exploiting memoization, turns an apparently intractable search into an O(3ⁿ) computation that is entirely practical for typical query sizes, and it sits at the core of every serious query optimizer.