When you execute a multi-table join in most database systems, the query plan almost always takes a specific shape: a left-deep tree. This isn't an accident or an optimization limitation—it's a deliberate architectural choice that aligns query plan structure with the mechanics of efficient query execution.
Left-deep plans dominate because they enable pipelining: intermediate results flow directly from one join to the next without being materialized to disk. This single property often outweighs any theoretical benefits of bushy (balanced) trees, making left-deep the default choice for the vast majority of queries.
Understanding left-deep plans is essential for understanding how databases actually execute queries, why certain query patterns perform well, and when to consider alternatives.
By the end of this page, you will understand what left-deep plans are, why they enable pipelining, their relationship to different join algorithms, when they're optimal, and when alternative tree shapes should be considered. You'll develop practical intuition for reasoning about query execution.
A left-deep plan (also called a left-deep tree, linear tree, or pipeline tree) is a join tree where every right child is a base table, and all intermediate results appear on the left spine of the tree.
Visual representation for a 5-table join (A, B, C, D, E):
⋈₄
/ \
⋈₃ E
/ \
⋈₂ D
/ \
⋈₁ C
/ \
A B
Reading bottom-up: ⋈₁ joins A with B, ⋈₂ joins that result with C, ⋈₃ adds D, and ⋈₄ adds E.
At every step, one input is an intermediate result (from all previous joins) and the other input is a base table accessed directly.
Key structural properties:
The 'outer' relation is always composite — The left (outer) input to each join contains all tables joined so far
The 'inner' relation is always a base table — The right (inner) input is accessed from storage for each probe
Sequential data flow — Tuples flow from bottom to top along the left spine
Single active intermediate — Only one intermediate result exists at any time
Join order = table sequence — The left-to-right reading of base tables defines the join order
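Because the tree shape is fixed, a left-deep plan is fully determined by the order of its base tables. A minimal Python sketch of this idea (here `join` is a placeholder for any join operator, not a real API):

# A left-deep plan is just an ordered table sequence: fold the join
# operator over the list, always keeping the running intermediate as
# the left (outer) input. `join` is a placeholder, not a real API.
def left_deep_plan(tables, join):
    plan = tables[0]
    for table in tables[1:]:
        plan = join(plan, table)  # intermediate on the left, base table on the right
    return plan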
In join nomenclature, 'outer' (left) and 'inner' (right) traditionally refer to nested-loop join semantics: the outer relation is scanned once, and for each outer tuple, the inner relation is probed. Left-deep tree structure aligns with nested-loop execution, even when other join algorithms are used.
The fundamental advantage of left-deep plans is that they enable pipelining—a mode of execution where tuples flow continuously through the query plan without intermediate materialization.
How pipelining works:
In a pipelined execution model (the iterator model or Volcano model):
Each operator implements open(), next(), and close() methods
The root calls next() on its child, which calls next() on its child, and so on down the tree
Each next() call returns a single tuple
No intermediate results are materialized to memory or disk—a tuple produced by one join immediately becomes input to the next.
// Left-deep plan: (((A ⋈ B) ⋈ C) ⋈ D)
// Pipelined execution - no intermediate materialization

class PipelinedNLJoin:
    left: Operator            // Outer input: pipelined intermediate result
    right: Table              // Inner input: base table
    currentOuterTuple: Tuple  // State carried across next() calls

    method next():
        while true:
            if currentOuterTuple is null:
                currentOuterTuple = left.next()   // Pull the next outer tuple
                if currentOuterTuple is null:
                    return null                   // Outer exhausted: join complete
                right.rewind()                    // Restart inner scan for this outer tuple
            innerTuple = right.next()
            if innerTuple is null:
                currentOuterTuple = null          // Inner exhausted: advance to next outer
            else if joinCondition(currentOuterTuple, innerTuple):
                return combine(currentOuterTuple, innerTuple)

// Execution for the full plan:
// root.next() → join3.next() → join2.next() → join1.next() → A.next()
// Results bubble up immediately without buffering

| Aspect | Pipelining (Left-Deep) | Materialization (Bushy) |
|---|---|---|
| Memory usage | O(1) for intermediate tuples | O(result size) for each subtree |
| Disk I/O | None for intermediates | Potential spilling to disk |
| First result latency | Fast (as soon as first match) | Delayed (wait for subtree completion) |
| Parallelism opportunity | Limited within plan | Independent subtrees parallelizable |
| Blocking operations | Minimal | Each bushy node is a block |
Pipelining's key benefit is memory efficiency. In a left-deep plan, you never need to hold a complete intermediate result in memory—each intermediate tuple is processed and discarded before the next is generated. For large joins where intermediates could be gigabytes, this is the difference between completion and out-of-memory failure.
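To make this concrete, here is a minimal sketch of a pipelined left-deep plan built from Python generators. The tables, keys, and predicates are invented for illustration:

def scan(rows):
    # Base-table scan: yields one tuple (dict) at a time
    yield from rows

def nl_join(outer, inner_rows, condition):
    # Nested-loop join: rescan the inner base table for each outer tuple
    for o in outer:
        for i in inner_rows:
            if condition(o, i):
                yield {**o, **i}  # combined tuple flows upward immediately

A = [{"a_id": 1}, {"a_id": 2}]
B = [{"a_id": 1, "b": "x"}, {"a_id": 2, "b": "y"}]
C = [{"b": "x", "c": 10}]

# Left-deep plan ((A ⋈ B) ⋈ C): each join's outer input is the join below it
plan = nl_join(nl_join(scan(A), B, lambda o, i: o["a_id"] == i["a_id"]),
               C, lambda o, i: o["b"] == i["b"])
for tup in plan:  # tuples stream out one at a time; nothing is buffered
    print(tup)    # {'a_id': 1, 'b': 'x', 'c': 10}

Because each operator holds only the tuple it is currently processing, memory use stays constant no matter how large the inputs grow.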
Left-deep plans interact differently with the three major join algorithms. Understanding these interactions explains why left-deep trees favor certain execution strategies:
Index nested-loop joins are the poster child for left-deep plans.
Consider the query:
SELECT * FROM Orders o
JOIN Customers c ON o.customer_id = c.id
JOIN Products p ON o.product_id = p.id
JOIN Suppliers s ON p.supplier_id = s.id
With left-deep and index nested-loop joins, the executor scans Orders once; for each order it probes the Customers index, then the Products index, then the Suppliers index.
Total cost: |Orders| × (1 + 1 + 1 + 1) index probes = O(|Orders|)
This linear scaling is only possible because every inner input is a base table with an index on its join column, so each probe is a constant-time lookup rather than a scan.
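A toy sketch of this access pattern, with each inner table modeled as a Python dict standing in for a unique index (all data is invented):

# Each dict plays the role of an index on the join column: one O(1) probe
# per lookup instead of a scan.
customers = {1: {"name": "Ada"}}
products  = {7: {"supplier_id": 3}}
suppliers = {3: {"region": "EU"}}
orders    = [{"customer_id": 1, "product_id": 7}]

for o in orders:                          # single pass over Orders
    c = customers.get(o["customer_id"])   # probe 1
    p = products.get(o["product_id"])     # probe 2
    s = suppliers.get(p["supplier_id"]) if p else None  # probe 3
    if c and p and s:
        print(o, c, p, s)                 # total work stays O(|Orders|)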
For hash joins in left-deep plans, the right (inner) relation becomes the build side. If intermediate results grow large while base tables remain small, this is ideal. But if base tables are larger than intermediates, performance can degrade—the smaller relation should be the build side.
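A compact sketch of one left-deep hash-join step under that assumption (the base table on the right is small enough to hash in memory; names are illustrative):

def hash_join(outer, inner_rows, outer_key, inner_key):
    # Build phase: hash the right-hand base table on its join key
    build = {}
    for row in inner_rows:
        build.setdefault(row[inner_key], []).append(row)
    # Probe phase: stream the pipelined intermediate through the hash table
    for o in outer:
        for match in build.get(o[outer_key], []):
            yield {**o, **match}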
Right-deep plans are the mirror image of left-deep: every left child is a base table, and intermediates accumulate on the right spine.
Visual representation:
⋈₄
/ \
A ⋈₃
/ \
B ⋈₂
/ \
C ⋈₁
/ \
D E
Here, D joins E first, then C joins that result, and so on. Base tables are on the left; intermediates accumulate on the right.
Why right-deep might be preferred:
Right-deep plans shine with hash join cascades: each base table on the left spine (A, B, C, D) is built into an in-memory hash table, and the deepest relation (E) streams through the entire chain of probes.
If all base tables fit in memory as hash tables, this approach builds every hash table exactly once, then pipelines a single pass of the probe stream through all joins without materializing any intermediate result; see the sketch below.
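A minimal sketch of such a cascade, mirroring the tree above and assuming every base table fits in memory at once (tables and the shared key "k" are invented):

def build(rows, key):
    ht = {}
    for r in rows:
        ht.setdefault(r[key], []).append(r)
    return ht

def probe(stream, ht, key):
    for t in stream:
        for m in ht.get(t[key], []):
            yield {**t, **m}

A = [{"k": 1, "a": "A1"}]
B = [{"k": 1, "b": "B1"}]
C = [{"k": 1, "c": "C1"}]
D = [{"k": 1, "d": "D1"}]
E = [{"k": 1, "e": "E1"}]

# Build phase: one resident hash table per base table
ht_a, ht_b, ht_c, ht_d = build(A, "k"), build(B, "k"), build(C, "k"), build(D, "k")

# Probe phase: E streams through the whole cascade in one pipelined pass
result = probe(probe(probe(probe(iter(E), ht_d, "k"), ht_c, "k"), ht_b, "k"), ht_a, "k")
for row in result:
    print(row)  # {'k': 1, 'e': 'E1', 'd': 'D1', 'c': 'C1', 'b': 'B1', 'a': 'A1'}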
When right-deep beats left-deep:
| Characteristic | Left-Deep | Right-Deep |
|---|---|---|
| Pipelining | Natural (outer side) | Reversed (inner side) |
| Memory during execution | One intermediate at a time | All base table hash tables |
| Index utilization | Excellent (inner indexed) | Poor (inner is an intermediate) |
| Hash join efficiency | Good if right is smaller | Good if bases are smaller |
| Optimizer default | Usually yes | Rarely considered |
Some systems implement symmetric hash joins that can pipeline both inputs. This blurs the distinction between left-deep and right-deep for hash joins, but the distinction remains important for nested-loop and merge joins.
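As a sketch of the idea (not any particular system's implementation), a symmetric hash join maintains a hash table per input; each arriving tuple is inserted into its own side's table and probed against the other side's:

from itertools import zip_longest

def symmetric_hash_join(left_iter, right_iter, left_key, right_key):
    left_ht, right_ht = {}, {}
    # Alternate between inputs; a real engine is driven by data arrival
    for l, r in zip_longest(left_iter, right_iter):
        if l is not None:
            left_ht.setdefault(l[left_key], []).append(l)
            for m in right_ht.get(l[left_key], []):  # probe the other side
                yield {**l, **m}
        if r is not None:
            right_ht.setdefault(r[right_key], []).append(r)
            for m in left_ht.get(r[right_key], []):
                yield {**m, **r}

Each matching pair is emitted exactly once, when the later of its two tuples arrives, so results stream out without either input blocking the other.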
Bushy plans (also called bushy trees) allow intermediate results on both sides of a join. This enables independent subtrees to execute in parallel, potentially reducing overall latency.
Bushy plan example for 8 tables:
⋈
/ \
⋈ ⋈
/ \ / \
⋈ ⋈ ⋈ ⋈
/\ /\ /\ /\
A B C D E F G H
This balanced tree has depth log₂(8) = 3 instead of linear depth 7.
Advantages of bushy plans:
Independent subtrees can execute in parallel
Pairing selective joins early keeps intermediate results small
Plan depth shrinks from linear to logarithmic, shortening the critical path
Bushy plans have a fundamental disadvantage: each internal subtree must materialize its result before the parent join can proceed. If these intermediates are large, the materialization cost can overwhelm any parallelism benefits. This is why bushy plans are usually beneficial only when subtrees are highly selective.
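A minimal illustration of both properties, parallel subtrees and forced materialization, using Python threads (tables and predicates are invented):

from concurrent.futures import ThreadPoolExecutor

def join_materialized(left, right, cond):
    # Bushy joins materialize: the full result is built before returning
    return [{**l, **r} for l in left for r in right if cond(l, r)]

A = [{"k1": 1}]; B = [{"k1": 1, "k2": 2}]
C = [{"k3": 2}]; D = [{"k3": 2, "k2": 2}]

with ThreadPoolExecutor() as pool:
    ab = pool.submit(join_materialized, A, B, lambda l, r: l["k1"] == r["k1"])
    cd = pool.submit(join_materialized, C, D, lambda l, r: l["k3"] == r["k3"])
    # The parent join cannot start until BOTH subtrees have materialized
    top = join_materialized(ab.result(), cd.result(),
                            lambda l, r: l["k2"] == r["k2"])
print(top)  # [{'k1': 1, 'k2': 2, 'k3': 2}]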
Bushy plans in practice:
Most query optimizers default to left-deep enumeration but may consider bushy plans in specific cases, such as star schema queries, highly parallel execution environments, and queries with independent, selective subqueries.
Distributed query engines like Spark prefer bushy plans because they maximize cluster utilization—multiple stages can execute in parallel across nodes, even at the cost of shuffle operations between stages.
Beyond pure left-deep, right-deep, and bushy, some systems consider zig-zag or other hybrid tree shapes that combine properties of multiple approaches.
Zig-zag tree example:
⋈
/ \
⋈ E ← Right placement
/ \
D ⋈ ← Left placement
/ \
⋈ C ← Right placement
/ \
A B
This alternating pattern doesn't fit cleanly into left-deep or right-deep categories. It might arise when index availability or relative input sizes differ from level to level, so the optimizer swaps which side of each join serves as the inner input. The following table summarizes the factors that favor each tree shape:
| Factor | Favors Left-Deep | Favors Right-Deep/Bushy |
|---|---|---|
| Available indexes | On base tables (inner access) | None; hash joins dominate |
| Memory availability | Limited (pipelining preserves memory) | Abundant (can hold hash tables) |
| Execution model | Single-threaded, iterator-based | Parallel, stage-based |
| Intermediate sizes | Intermediate > base tables | Base tables > intermediates |
| Latency requirements | First-row-fast needed | Throughput more important |
| Join algorithms | Nested-loop, index NL | Hash join cascades |
Modern optimizers don't rigidly enforce tree shapes. They evaluate cost and choose accordingly. 'Left-deep restriction' often means 'enumerate left-deep first; consider alternatives if cost suggests benefit.' The shape is a means to an end (efficient execution), not an end in itself.
Understanding left-deep plans helps query writers structure their SQL for optimal execution:
Unlike inner joins, outer joins are not freely reorderable: a chain of LEFT JOINs over A, B, and C can only be evaluated in orders that preserve null-extension semantics. This can force suboptimal join orderings. Consider whether you truly need an OUTER JOIN, or whether an INNER JOIN with explicit null handling is cleaner.
Reading EXPLAIN plans for left-deep structure:
In most EXPLAIN outputs, left-deep plans appear as nested indentation:
→ Nested Loop
→ Nested Loop
→ Nested Loop
→ Seq Scan on A
→ Index Scan on B
→ Index Scan on C
→ Index Scan on D
Each nested loop's first child is another nested loop (the accumulated intermediate), and the second child is a base table scan. This nesting pattern is the visual signature of left-deep execution.
Signs of non-left-deep execution: a join node whose second (inner) child is itself another join (a bushy or right-deep subtree), or a hash join whose build input is an intermediate result rather than a base table scan.
Despite their advantages, left-deep plans aren't always optimal. Recognizing when to consider alternatives is an advanced skill:
| Scenario | Why Left-Deep Struggles | Better Alternative |
|---|---|---|
| Large base tables, small intermediates | Repeated base table scans | Right-deep with hash cascade |
| Highly parallel environment | Sequential left-spine bottleneck | Bushy for parallel subtrees |
| Multiple selective subqueries | Linear processing misses parallelism | Bushy joining subquery results |
| No indexes on join columns | Index NL impossible; NL is O(n²) | Hash join (either depth) |
| Distributed execution | Network for each step | Bushy to minimize shuffles |
Case study: Analytical query with aggregations
Consider:
SELECT region, SUM(sales)
FROM (
SELECT region, amount as sales FROM NorthData
UNION ALL
SELECT region, amount FROM SouthData
UNION ALL
SELECT region, amount FROM EastData
UNION ALL
SELECT region, amount FROM WestData
) combined
JOIN Regions ON combined.region = Regions.id
GROUP BY region;
Here, a bushy plan that processes each regional subquery in parallel, then unions and joins, is far more efficient than a left-deep plan that processes regions sequentially. Modern optimizers recognize UNION ALL as parallelizable and create appropriate bushy structures.
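A sketch of that shape, with the four UNION ALL branches running as independent parallel subtrees (the table names mirror the query above, but the data and region names are invented):

from concurrent.futures import ThreadPoolExecutor
from collections import defaultdict

north = [{"region": 1, "amount": 10}]
south = [{"region": 1, "amount": 5}]
east  = [{"region": 2, "amount": 7}]
west  = [{"region": 2, "amount": 3}]
regions = {1: "East", 2: "West"}  # Regions table as id -> name

def scan(rows):  # stand-in for each branch's scan and projection
    return [{"region": r["region"], "sales": r["amount"]} for r in rows]

with ThreadPoolExecutor() as pool:
    branches = pool.map(scan, [north, south, east, west])  # parallel subtrees
    combined = [row for branch in branches for row in branch]  # UNION ALL

totals = defaultdict(int)
for row in combined:
    if row["region"] in regions:  # join with Regions
        totals[row["region"]] += row["sales"]
print(dict(totals))  # {1: 15, 2: 10}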
Star schema joins (one fact table with multiple dimension tables) are a common case where left-deep isn't ideal. Specialized star join optimization first filters all dimension tables, then joins their results with the fact table—a partially bushy approach. Many commercial DBs have specific optimizations for star queries.
We've explored left-deep plans—the dominant tree shape in query optimization. Let's consolidate the key insights:
Left-deep trees pipeline intermediate results with constant memory per tuple, avoiding materialization
The inner input of every join is a base table, which is what makes index nested-loop execution so effective
Right-deep trees suit hash-join cascades when all base tables fit in memory
Bushy trees trade materialization cost for parallelism across independent subtrees
Module complete:
Congratulations! You've completed the Join Ordering module. You now understand:
The exponential search space of join orderings
Dynamic programming and heuristic approaches to exploring it
How left-deep, right-deep, and bushy plan shapes execute, and when each is preferable
This knowledge is fundamental to understanding query optimization in any relational database system. Join ordering decisions directly affect real-world query performance by orders of magnitude, making this among the most impactful topics in database internals.
You now have a comprehensive understanding of join ordering—from the exponential search space through dynamic programming and heuristics to the practical execution of left-deep plans. This knowledge enables you to reason about query performance, understand optimizer behavior, and diagnose join ordering issues in production systems.