When you execute a multi-table join in most database systems, the query plan almost always takes a specific shape: a left-deep tree. This isn't an accident or an optimization limitation—it's a deliberate architectural choice that aligns query plan structure with the mechanics of efficient query execution.
Left-deep plans dominate because they enable pipelining: intermediate results flow directly from one join to the next without being materialized to disk. This single property often outweighs any theoretical benefits of bushy (balanced) trees, making left-deep the default choice for the vast majority of queries.
Understanding left-deep plans is essential for understanding how databases actually execute queries, why certain query patterns perform well, and when to consider alternatives.
By the end of this page, you will understand what left-deep plans are, why they enable pipelining, their relationship to different join algorithms, when they're optimal, and when alternative tree shapes should be considered. You'll develop practical intuition for reasoning about query execution.
A left-deep plan (also called a left-deep tree, linear tree, or pipeline tree) is a join tree where every right child is a base table, and all intermediate results appear on the left spine of the tree.
Visual representation for a 5-table join (A, B, C, D, E):
⋈₄
/ \
⋈₃ E
/ \
⋈₂ D
/ \
⋈₁ C
/ \
A B
Reading bottom-up: ⋈₁ joins A with B, ⋈₂ joins that result with C, ⋈₃ adds D, and ⋈₄ adds E.
At every step, one input is an intermediate result (from all previous joins) and the other input is a base table accessed directly.
Key structural properties:
The 'outer' relation is always composite — The left (outer) input to each join contains all tables joined so far
The 'inner' relation is always a base table — The right (inner) input is accessed from storage for each probe
Sequential data flow — Tuples flow from bottom to top along the left spine
Single active intermediate — Only one intermediate result exists at any time
Join order = table sequence — The left-to-right reading of base tables defines the join order
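Because the tree shape is fixed, a left-deep plan is fully determined by the order of its base tables. A minimal Python sketch of this idea (here `join` is a placeholder for any join operator, not a real API):

# A left-deep plan is just an ordered table sequence: fold the join
# operator over the list, always keeping the running intermediate as
# the left (outer) input. `join` is a placeholder, not a real API.
def left_deep_plan(tables, join):
    plan = tables[0]
    for table in tables[1:]:
        plan = join(plan, table)  # intermediate on the left, base table on the right
    return plan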
In join nomenclature, 'outer' (left) and 'inner' (right) traditionally refer to nested-loop join semantics: the outer relation is scanned once, and for each outer tuple, the inner relation is probed. Left-deep tree structure aligns with nested-loop execution, even when other join algorithms are used.
The fundamental advantage of left-deep plans is that they enable pipelining—a mode of execution where tuples flow continuously through the query plan without intermediate materialization.
How pipelining works:
In a pipelined execution model (the iterator model or Volcano model):
Each operator implements open(), next(), and close() methods
The root calls next() on its child, which calls next() on its child, and so on down the tree
Each next() call returns a single tuple
No intermediate results are materialized to memory or disk—a tuple produced by one join immediately becomes input to the next.
// Left-deep plan: (((A ⋈ B) ⋈ C) ⋈ D)
// Pipelined execution - no intermediate materialization

class PipelinedNLJoin:
    left: Operator            // Outer input: pipelined intermediate result
    right: Table              // Inner input: base table
    currentOuterTuple: Tuple  // State carried across next() calls

    method next():
        while true:
            if currentOuterTuple is null:
                currentOuterTuple = left.next()   // Pull the next outer tuple
                if currentOuterTuple is null:
                    return null                   // Outer exhausted: join complete
                right.rewind()                    // Restart inner scan for this outer tuple
            innerTuple = right.next()
            if innerTuple is null:
                currentOuterTuple = null          // Inner exhausted: advance to next outer
            else if joinCondition(currentOuterTuple, innerTuple):
                return combine(currentOuterTuple, innerTuple)

// Execution for the full plan:
// root.next() → join3.next() → join2.next() → join1.next() → A.next()
// Results bubble up immediately without buffering

| Aspect | Pipelining (Left-Deep) | Materialization (Bushy) |
|---|---|---|
| Memory usage | O(1) for intermediate tuples | O(result size) for each subtree |
| Disk I/O | None for intermediates | Potential spilling to disk |
| First result latency | Fast (as soon as first match) | Delayed (wait for subtree completion) |
| Parallelism opportunity | Limited within plan | Independent subtrees parallelizable |
| Blocking operations | Minimal | Each bushy node is a block |
Pipelining's key benefit is memory efficiency. In a left-deep plan, you never need to hold a complete intermediate result in memory—each intermediate tuple is processed and discarded before the next is generated. For large joins where intermediates could be gigabytes, this is the difference between completion and out-of-memory failure.
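To make this concrete, here is a minimal sketch of a pipelined left-deep plan built from Python generators. The tables, keys, and predicates are invented for illustration:

def scan(rows):
    # Base-table scan: yields one tuple (dict) at a time
    yield from rows

def nl_join(outer, inner_rows, condition):
    # Nested-loop join: rescan the inner base table for each outer tuple
    for o in outer:
        for i in inner_rows:
            if condition(o, i):
                yield {**o, **i}  # combined tuple flows upward immediately

A = [{"a_id": 1}, {"a_id": 2}]
B = [{"a_id": 1, "b": "x"}, {"a_id": 2, "b": "y"}]
C = [{"b": "x", "c": 10}]

# Left-deep plan ((A ⋈ B) ⋈ C): each join's outer input is the join below it
plan = nl_join(nl_join(scan(A), B, lambda o, i: o["a_id"] == i["a_id"]),
               C, lambda o, i: o["b"] == i["b"])
for tup in plan:  # tuples stream out one at a time; nothing is buffered
    print(tup)    # {'a_id': 1, 'b': 'x', 'c': 10}

Because each operator holds only the tuple it is currently processing, memory use stays constant no matter how large the inputs grow.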
Left-deep plans interact differently with the three major join algorithms. Understanding these interactions explains why left-deep trees favor certain execution strategies:
Index nested-loop joins are the poster child for left-deep plans.
Consider the query:
SELECT * FROM Orders o
JOIN Customers c ON o.customer_id = c.id
JOIN Products p ON o.product_id = p.id
JOIN Suppliers s ON p.supplier_id = s.id
With left-deep and index nested-loop joins, the executor scans Orders once; for each order it probes the Customers index, then the Products index, then the Suppliers index.
Total cost: |Orders| × (1 + 1 + 1 + 1) index probes = O(|Orders|)
This linear scaling is only possible because every inner input is a base table with an index on its join column, so each probe is a constant-time lookup rather than a scan.
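A toy sketch of this access pattern, with each inner table modeled as a Python dict standing in for a unique index (all data is invented):

# Each dict plays the role of an index on the join column: one O(1) probe
# per lookup instead of a scan.
customers = {1: {"name": "Ada"}}
products  = {7: {"supplier_id": 3}}
suppliers = {3: {"region": "EU"}}
orders    = [{"customer_id": 1, "product_id": 7}]

for o in orders:                          # single pass over Orders
    c = customers.get(o["customer_id"])   # probe 1
    p = products.get(o["product_id"])     # probe 2
    s = suppliers.get(p["supplier_id"]) if p else None  # probe 3
    if c and p and s:
        print(o, c, p, s)                 # total work stays O(|Orders|)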
For hash joins in left-deep plans, the right (inner) relation becomes the build side. If intermediate results grow large while base tables remain small, this is ideal. But if base tables are larger than intermediates, performance can degrade—the smaller relation should be the build side.
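A compact sketch of one left-deep hash-join step under that assumption (the base table on the right is small enough to hash in memory; names are illustrative):

def hash_join(outer, inner_rows, outer_key, inner_key):
    # Build phase: hash the right-hand base table on its join key
    build = {}
    for row in inner_rows:
        build.setdefault(row[inner_key], []).append(row)
    # Probe phase: stream the pipelined intermediate through the hash table
    for o in outer:
        for match in build.get(o[outer_key], []):
            yield {**o, **match}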
Right-deep plans are the mirror image of left-deep: every left child is a base table, and intermediates accumulate on the right spine.
Visual representation:
⋈₄
/ \
A ⋈₃
/ \
B ⋈₂
/ \
C ⋈₁
/ \
D E
Here, D joins E first, then C joins that result, and so on. Base tables are on the left; intermediates accumulate on the right.
Why right-deep might be preferred:
Right-deep plans shine with hash join cascades: each base table on the left spine (A, B, C, D) is built into an in-memory hash table, and the deepest relation (E) streams through the entire chain of probes.
If all base tables fit in memory as hash tables, this approach builds every hash table exactly once, then pipelines a single pass of the probe stream through all joins without materializing any intermediate result; see the sketch below.
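A minimal sketch of such a cascade, mirroring the tree above and assuming every base table fits in memory at once (tables and the shared key "k" are invented):

def build(rows, key):
    ht = {}
    for r in rows:
        ht.setdefault(r[key], []).append(r)
    return ht

def probe(stream, ht, key):
    for t in stream:
        for m in ht.get(t[key], []):
            yield {**t, **m}

A = [{"k": 1, "a": "A1"}]
B = [{"k": 1, "b": "B1"}]
C = [{"k": 1, "c": "C1"}]
D = [{"k": 1, "d": "D1"}]
E = [{"k": 1, "e": "E1"}]

# Build phase: one resident hash table per base table
ht_a, ht_b, ht_c, ht_d = build(A, "k"), build(B, "k"), build(C, "k"), build(D, "k")

# Probe phase: E streams through the whole cascade in one pipelined pass
result = probe(probe(probe(probe(iter(E), ht_d, "k"), ht_c, "k"), ht_b, "k"), ht_a, "k")
for row in result:
    print(row)  # {'k': 1, 'e': 'E1', 'd': 'D1', 'c': 'C1', 'b': 'B1', 'a': 'A1'}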
When right-deep beats left-deep:
| Characteristic | Left-Deep | Right-Deep |
|---|---|---|
| Pipelining | Natural (outer side) | Reversed (inner side) |
| Memory during execution | One intermediate at a time | All base table hash tables |
| Index utilization | Excellent (inner indexed) | Poor (inner is an intermediate) |
| Hash join efficiency | Good if right is smaller | Good if bases are smaller |
| Optimizer default | Usually yes | Rarely considered |
Some systems implement symmetric hash joins that can pipeline both inputs. This blurs the distinction between left-deep and right-deep for hash joins, but the distinction remains important for nested-loop and merge joins.
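As a sketch of the idea (not any particular system's implementation), a symmetric hash join maintains a hash table per input; each arriving tuple is inserted into its own side's table and probed against the other side's:

from itertools import zip_longest

def symmetric_hash_join(left_iter, right_iter, left_key, right_key):
    left_ht, right_ht = {}, {}
    # Alternate between inputs; a real engine is driven by data arrival
    for l, r in zip_longest(left_iter, right_iter):
        if l is not None:
            left_ht.setdefault(l[left_key], []).append(l)
            for m in right_ht.get(l[left_key], []):  # probe the other side
                yield {**l, **m}
        if r is not None:
            right_ht.setdefault(r[right_key], []).append(r)
            for m in left_ht.get(r[right_key], []):
                yield {**m, **r}

Each matching pair is emitted exactly once, when the later of its two tuples arrives, so results stream out without either input blocking the other.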
Bushy plans (also called bushy trees) allow intermediate results on both sides of a join. This enables independent subtrees to execute in parallel, potentially reducing overall latency.
Bushy plan example for 8 tables:
⋈
/ \
⋈ ⋈
/ \ / \
⋈ ⋈ ⋈ ⋈
/\ /\ /\ /\
A B C D E F G H
This balanced tree has depth log₂(8) = 3 instead of linear depth 7.
Advantages of bushy plans:
Independent subtrees can execute in parallel
Pairing selective joins early keeps intermediate results small
Plan depth shrinks from linear to logarithmic, shortening the critical path
Bushy plans have a fundamental disadvantage: each internal subtree must materialize its result before the parent join can proceed. If these intermediates are large, the materialization cost can overwhelm any parallelism benefits. This is why bushy plans are usually beneficial only when subtrees are highly selective.
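A minimal illustration of both properties, parallel subtrees and forced materialization, using Python threads (tables and predicates are invented):

from concurrent.futures import ThreadPoolExecutor

def join_materialized(left, right, cond):
    # Bushy joins materialize: the full result is built before returning
    return [{**l, **r} for l in left for r in right if cond(l, r)]

A = [{"k1": 1}]; B = [{"k1": 1, "k2": 2}]
C = [{"k3": 2}]; D = [{"k3": 2, "k2": 2}]

with ThreadPoolExecutor() as pool:
    ab = pool.submit(join_materialized, A, B, lambda l, r: l["k1"] == r["k1"])
    cd = pool.submit(join_materialized, C, D, lambda l, r: l["k3"] == r["k3"])
    # The parent join cannot start until BOTH subtrees have materialized
    top = join_materialized(ab.result(), cd.result(),
                            lambda l, r: l["k2"] == r["k2"])
print(top)  # [{'k1': 1, 'k2': 2, 'k3': 2}]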
Bushy plans in practice:
Most query optimizers default to left-deep enumeration but may consider bushy plans in specific cases, such as star schema queries, highly parallel execution environments, and queries with independent, selective subqueries.
Distributed query engines like Spark prefer bushy plans because they maximize cluster utilization—multiple stages can execute in parallel across nodes, even at the cost of shuffle operations between stages.
Beyond pure left-deep, right-deep, and bushy, some systems consider zig-zag or other hybrid tree shapes that combine properties of multiple approaches.
Zig-zag tree example:
⋈
/ \
⋈ E ← Right placement
/ \
D ⋈ ← Left placement
/ \
⋈ C ← Right placement
/ \
A B
This alternating pattern doesn't fit cleanly into left-deep or right-deep categories. It might arise when index availability or relative input sizes differ from level to level, so the optimizer swaps which side of each join serves as the inner input. The following table summarizes the factors that favor each tree shape:
| Factor | Favors Left-Deep | Favors Right-Deep/Bushy |
|---|---|---|
| Available indexes | On base tables (inner access) | None; hash joins dominate |
| Memory availability | Limited (pipelining preserves memory) | Abundant (can hold hash tables) |
| Execution model | Single-threaded, iterator-based | Parallel, stage-based |
| Intermediate sizes | Intermediate > base tables | Base tables > intermediates |
| Latency requirements | First-row-fast needed | Throughput more important |
| Join algorithms | Nested-loop, index NL | Hash join cascades |
Modern optimizers don't rigidly enforce tree shapes. They evaluate cost and choose accordingly. 'Left-deep restriction' often means 'enumerate left-deep first; consider alternatives if cost suggests benefit.' The shape is a means to an end (efficient execution), not an end in itself.
Understanding left-deep plans helps query writers structure their SQL for optimal execution:
Unlike inner joins, outer joins are not freely reorderable: a chain of LEFT JOINs over A, B, and C can only be evaluated in orders that preserve null-extension semantics. This can force suboptimal join orderings. Consider whether you truly need an OUTER JOIN, or whether an INNER JOIN with explicit null handling is cleaner.
Reading EXPLAIN plans for left-deep structure:
In most EXPLAIN outputs, left-deep plans appear as nested indentation:
→ Nested Loop
→ Nested Loop
→ Nested Loop
→ Seq Scan on A
→ Index Scan on B
→ Index Scan on C
→ Index Scan on D
Each nested loop's first child is another nested loop (the accumulated intermediate), and the second child is a base table scan. This nesting pattern is the visual signature of left-deep execution.
Signs of non-left-deep execution: a join node whose second (inner) child is itself another join (a bushy or right-deep subtree), or a hash join whose build input is an intermediate result rather than a base table scan.
Despite their advantages, left-deep plans aren't always optimal. Recognizing when to consider alternatives is an advanced skill:
| Scenario | Why Left-Deep Struggles | Better Alternative |
|---|---|---|
| Large base tables, small intermediates | Repeated base table scans | Right-deep with hash cascade |
| Highly parallel environment | Sequential left-spine bottleneck | Bushy for parallel subtrees |
| Multiple selective subqueries | Linear processing misses parallelism | Bushy joining subquery results |
| No indexes on join columns | Index NL impossible; NL is O(n²) | Hash join (either depth) |
| Distributed execution | Network for each step | Bushy to minimize shuffles |
Case study: Analytical query with aggregations
Consider:
SELECT region, SUM(sales)
FROM (
SELECT region, amount as sales FROM NorthData
UNION ALL
SELECT region, amount FROM SouthData
UNION ALL
SELECT region, amount FROM EastData
UNION ALL
SELECT region, amount FROM WestData
) combined
JOIN Regions ON combined.region = Regions.id
GROUP BY region;
Here, a bushy plan that processes each regional subquery in parallel, then unions and joins, is far more efficient than a left-deep plan that processes regions sequentially. Modern optimizers recognize UNION ALL as parallelizable and create appropriate bushy structures.
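A sketch of that shape, with the four UNION ALL branches running as independent parallel subtrees (the table names mirror the query above, but the data and region names are invented):

from concurrent.futures import ThreadPoolExecutor
from collections import defaultdict

north = [{"region": 1, "amount": 10}]
south = [{"region": 1, "amount": 5}]
east  = [{"region": 2, "amount": 7}]
west  = [{"region": 2, "amount": 3}]
regions = {1: "East", 2: "West"}  # Regions table as id -> name

def scan(rows):  # stand-in for each branch's scan and projection
    return [{"region": r["region"], "sales": r["amount"]} for r in rows]

with ThreadPoolExecutor() as pool:
    branches = pool.map(scan, [north, south, east, west])  # parallel subtrees
    combined = [row for branch in branches for row in branch]  # UNION ALL

totals = defaultdict(int)
for row in combined:
    if row["region"] in regions:  # join with Regions
        totals[row["region"]] += row["sales"]
print(dict(totals))  # {1: 15, 2: 10}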
Star schema joins (one fact table with multiple dimension tables) are a common case where left-deep isn't ideal. Specialized star join optimization first filters all dimension tables, then joins their results with the fact table—a partially bushy approach. Many commercial DBs have specific optimizations for star queries.
We've explored left-deep plans—the dominant tree shape in query optimization. Let's consolidate the key insights:
Left-deep trees pipeline intermediate results with constant memory per tuple, avoiding materialization
The inner input of every join is a base table, which is what makes index nested-loop execution so effective
Right-deep trees suit hash-join cascades when all base tables fit in memory
Bushy trees trade materialization cost for parallelism across independent subtrees
Module complete:
Congratulations! You've completed the Join Ordering module. You now understand:
The exponential search space of join orderings
Dynamic programming and heuristic approaches to exploring it
How left-deep, right-deep, and bushy plan shapes execute, and when each is preferable
This knowledge is fundamental to understanding query optimization in any relational database system. Join ordering decisions directly affect real-world query performance by orders of magnitude, making this among the most impactful topics in database internals.
You now have a comprehensive understanding of join ordering—from the exponential search space through dynamic programming and heuristics to the practical execution of left-deep plans. This knowledge enables you to reason about query performance, understand optimizer behavior, and diagnose join ordering issues in production systems.