When joining multiple tables, the optimizer doesn't just choose which tables to join first—it also decides the shape of the join tree. Should all joins be chained linearly, forming a left-deep or right-deep tree? Or should intermediate results be combined in a balanced, bushy structure?
This seemingly abstract structural decision has profound practical implications for intermediate result sizes, memory use, pipelining, and parallelism.
This page explores the taxonomy of join tree shapes, their properties, and how optimizers choose between them—or explore all simultaneously.
By the end of this page, you will understand: (1) The definition of left-deep, right-deep, and bushy join trees, (2) The number of trees of each type, (3) Execution characteristics and trade-offs, (4) When each shape is advantageous, (5) How optimizers restrict or explore different tree shapes, and (6) Modern approaches that consider all shapes adaptively.
A join tree is a binary tree where leaf nodes are base tables and internal nodes are join operations. For n tables, there are n-1 joins and many possible tree structures.
Left-Deep Trees:
In a left-deep tree, all right children are base relations (leaf nodes):
⋈
/ \
⋈ D
/ \
⋈ C
/ \
A B
Execution order: ((A ⋈ B) ⋈ C) ⋈ D
The running join result accumulates along the left spine; each join combines it with a fresh base table.
Right-Deep Trees:
Symmetrically, all left children are base relations:
⋈
/ \
A ⋈
/ \
B ⋈
/ \
C D
Execution order: A ⋈ (B ⋈ (C ⋈ D))
Bushy Trees:
Both children of an internal node can be intermediate results:
⋈
/ \
⋈ ⋈
/ \ / \
A B C D
Execution order: (A ⋈ B) ⋈ (C ⋈ D)
Bushy trees are more general—they include left-deep and right-deep as special cases.
| Tree Type | Definition | Visual Pattern | Key Property |
|---|---|---|---|
| Left-Deep | Right children are always leaves | Leans left (chain) | Sequential/pipeline friendly |
| Right-Deep | Left children are always leaves | Leans right (chain) | Build-side ready for hash |
| Bushy | Either child can be intermediate | Balanced binary tree | Enables parallelism |
| Zig-Zag | Neither strictly left nor right deep | Irregular shape | Intermediate case |
The number of possible join trees varies dramatically by shape:
Left-Deep Trees:
For n tables, we choose which table goes first, which goes second, etc.—a permutation:
Number of left-deep trees = n!
For 10 tables: 10! = 3,628,800
All Binary Trees (Bushy):
The number of different binary tree shapes with n leaves is the (n-1)th Catalan number:
Cₙ₋₁ = (2(n-1))! / (n! × (n-1)!)
For each tree shape, we can assign tables to leaves in n! ways:
Number of bushy trees = Cₙ₋₁ × n! = (2n-2)! / (n-1)!
For 10 tables: C₉ × 10! = 4862 × 3,628,800 ≈ 17.6 billion trees!
| Tables (n) | Left-Deep (n!) | Bushy ((2n-2)!/(n-1)!) | Ratio |
|---|---|---|---|
| 3 | 6 | 12 | 2× |
| 4 | 24 | 120 | 5× |
| 5 | 120 | 1,680 | 14× |
| 6 | 720 | 30,240 | 42× |
| 8 | 40,320 | 17,297,280 | 429× |
| 10 | 3,628,800 | 17,643,225,600 | 4,862× |
| 12 | 479,001,600 | 2.8 × 10¹³ | 58,786× |
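These counts are easy to sanity-check; here is a minimal Python sketch of both formulas (the function names are illustrative):

```python
from math import factorial

def left_deep_count(n: int) -> int:
    # One left-deep tree per permutation of the n tables: n!
    return factorial(n)

def bushy_count(n: int) -> int:
    # Catalan(n-1) tree shapes times n! leaf labelings = (2n-2)! / (n-1)!
    return factorial(2 * n - 2) // factorial(n - 1)

for n in (3, 4, 5, 6, 8, 10, 12):
    print(n, left_deep_count(n), bushy_count(n),
          bushy_count(n) // left_deep_count(n))  # ratio = Catalan(n-1)
```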
Optimizer implications:
If optimization explores only left-deep trees, the search space is O(n!), already challenging but manageable with DP.
If exploring all bushy trees, the search space explodes—17 billion possibilities for just 10 tables.
This is why many optimizers restrict to left-deep trees by default.
Catalan numbers (1, 1, 2, 5, 14, 42, 132, ...) appear throughout combinatorics: binary trees, parenthesizations, non-crossing partitions. They grow approximately as 4^n / (n^(3/2) × √π), so exhaustive bushy enumeration becomes infeasible beyond ~10-12 tables without pruning.
Left-deep trees have execution properties that make them particularly suitable for traditional, single-threaded, disk-based database systems:
Property 1: Pipelining
In a left-deep tree with nested loop or hash joins, intermediate results can flow directly to the next join without full materialization:
((A ⋈ B) ⋈ C) ⋈ D
Execution:
1. For each tuple from A:
2. Find matching tuples in B (using index or hash)
3. For each match, find matching tuples in C
4. For each match, find matching tuples in D
5. Output result
No intermediate result is ever fully materialized to disk (unless memory overflows). This is streaming execution.
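Here is a minimal sketch of that streaming behavior using Python generators; the dict-based `hash_probe` stands in for an index or hash lookup, and the table layouts are invented for illustration:

```python
def hash_probe(table, key):
    # Build a dict "index" on key once; return a probe function.
    index = {}
    for row in table:
        index.setdefault(row[key], []).append(row)
    return lambda outer_row: index.get(outer_row[key], [])

def join(outer, probe):
    # Pipelined join: each outer tuple probes the inner side and
    # matches are emitted immediately -- nothing is materialized.
    for row in outer:
        for match in probe(row):
            yield {**row, **match}

A = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}]
B = [{"id": 1, "b": "p"}]
C = [{"id": 1, "c": "q"}]

# Left-deep pipeline: (A ⋈ B) ⋈ C, streaming end to end.
pipeline = join(join(iter(A), hash_probe(B, "id")), hash_probe(C, "id"))
print(list(pipeline))  # [{'id': 1, 'a': 'x', 'b': 'p', 'c': 'q'}]
```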
Property 2: Index Nested Loop Friendly
With a left-deep tree and index nested loop joins:
((A ⋈ B) ⋈ C) ⋈ D
idx idx idx
- For each tuple flowing through:
- Probe index on B → probe index on C → probe index on D
- Optimal when indexes exist on right-side tables
- Memory requirement: O(1) per join level (just input buffers)
System R (1979) restricted to left-deep trees partly due to limited memory and the prevalence of nested loop joins with indexes. As systems gained memory and hash joins became common, this restriction relaxed—but left-deep remains preferred for many OLTP workloads.
Despite larger search space and materialization needs, bushy trees offer significant advantages in specific scenarios:
Advantage 1: Smaller Intermediate Results
Consider four tables A, B, C, and D whose joins produce the intermediate sizes shown below:
Left-deep plan:
((A ⋈ B) ⋈ C) ⋈ D
Intermediate sizes:
A ⋈ B = 1,000 rows
(A ⋈ B) ⋈ C = 500,000 rows (nearly a cross product if C shares no selective predicate with the tables joined so far!)
final = ... depends on D
Bushy plan:
(A ⋈ B) ⋈ (C ⋈ D)
Intermediate sizes:
A ⋈ B = 1,000 rows
C ⋈ D = 100 rows
final = 1,000 × 100 × selectivity ≈ 500 rows (e.g., with selectivity ≈ 0.005)
The bushy plan independently reduces both branches before the final join, avoiding the massive intermediate.
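A back-of-envelope comparison using total intermediate rows as a crude cost proxy (numbers taken from the example above):

```python
# Sum of intermediate result sizes produced before the final join:
left_deep = 1_000 + 500_000   # A⋈B, then the massive (A⋈B)⋈C
bushy = 1_000 + 100           # A⋈B and C⋈D, each reduced independently
print(left_deep, bushy)       # 501000 vs 1100 -- roughly a 450x difference
```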
Advantage 2: Parallelism Within Query
In a bushy tree, independent branches can execute in parallel:
⋈ (wait for both)
/ \
⋈ ⋈
/| |\
A B C D
Worker 1 Worker 2
Computes A⋈B Computes C⋈D
↓ ↓
Wait for both...
↓
Final join
With 8 tables and a perfectly balanced bushy tree (4 leaves per branch), speedup can approach 2× for that stage. More branches enable more parallelism.
Advantage 3: Hash Join Build Side Optimization
Hash joins perform best when the build side (inner relation) is small. Bushy trees allow computing small intermediates before joining:
Bushy: a small intermediate (D ⋈ E) serves as the build side for the join with the larger (A ⋈ B ⋈ C)
→ Small hash table, efficient probing
Left-deep: Must build on base tables only
→ If all base tables are large, hash tables are huge
Right-deep trees are less common but have their place:
Right-Deep Structure:
⋈
/ \
A ⋈
/ \
B ⋈
/ \
C D
Execution: A ⋈ (B ⋈ (C ⋈ D))
Execution with hash joins:
In a right-deep tree with hash joins (building on each join's left input), every build side is a base table: hash tables are built on A, B, and C, and tuples from D then probe through the whole chain.
Advantages of right-deep for hash joins: all hash tables are built on base tables, so they can be built independently (even concurrently), and probe tuples stream through the entire chain without materializing intermediates.
When right-deep helps: star-schema queries where one large fact table probes several small dimension tables whose hash tables fit in memory.
The trade-off:
Right-deep requires building all hash tables before any output is produced—fully blocking on build phases. This increases memory pressure and latency to first result.
Zig-Zag and Mixed Trees:
Some plans are neither purely left-deep nor right-deep:
⋈
/ \
⋈ C
/ \
A ⋈
/ \
B D
(A ⋈ (B ⋈ D)) ⋈ C
These "zig-zag" plans combine properties of both. Modern optimizers explore these when beneficial.
Right-deep trees can implement "hash join pipelines" where probe tuples flow through multiple hash tables in sequence. This is efficient for analytical workloads with one large fact table (probe side) joining many small dimensions (build sides).
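A minimal sketch of such a pipeline, assuming each dimension's hash table fits in memory (the table and key names are invented for illustration):

```python
def build(table, key):
    # Build phase: one in-memory hash table per dimension (blocking).
    index = {}
    for row in table:
        index.setdefault(row[key], []).append(row)
    return index

def probe_chain(fact, dims):
    # Probe phase: each fact tuple flows through every hash table in turn.
    for row in fact:
        partial = [row]
        for index, key in dims:
            partial = [{**r, **m} for r in partial
                       for m in index.get(r[key], [])]
        yield from partial

dim_region = [{"region_id": 1, "region": "east"}]
dim_channel = [{"channel_id": 7, "channel": "retail"}]
fact_sales = [{"region_id": 1, "channel_id": 7, "amount": 42}]

dims = [(build(dim_region, "region_id"), "region_id"),
        (build(dim_channel, "channel_id"), "channel_id")]
print(list(probe_chain(fact_sales, dims)))  # one fully joined row
```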
Optimizers use several strategies to control which tree shapes are considered:
Strategy 1: Restrict to Left-Deep
Only generate left-deep plans during enumeration:
for each permutation P of tables:
plan = ((P[0] ⋈ P[1]) ⋈ P[2]) ⋈ ... ⋈ P[n-1]
cost = evaluate(plan)
update best if cheaper
Search space: O(n!) plans, kept tractable for moderate n by dynamic programming over subsets.
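A runnable version of Strategy 1 under a toy cost model (the uniform selectivity and table sizes below are assumptions, not part of the original example):

```python
from itertools import permutations

SELECTIVITY = 0.01  # assumed uniform join selectivity

def size_and_cost(plan, sizes):
    # Toy cost model: an intermediate's size is |L| * |R| * selectivity;
    # plan cost is the total size of all intermediates produced.
    if isinstance(plan, str):
        return sizes[plan], 0.0
    ls, lc = size_and_cost(plan[0], sizes)
    rs, rc = size_and_cost(plan[1], sizes)
    out = ls * rs * SELECTIVITY
    return out, lc + rc + out

def best_left_deep(tables, sizes):
    best_plan, best_cost = None, float("inf")
    for perm in permutations(tables):
        plan = perm[0]
        for t in perm[1:]:
            plan = (plan, t)  # right child is always a base table
        _, cost = size_and_cost(plan, sizes)
        if cost < best_cost:
            best_plan, best_cost = plan, cost
    return best_plan, best_cost

sizes = {"A": 10, "B": 1000, "C": 50, "D": 5}
print(best_left_deep(list(sizes), sizes))
```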
Strategy 2: Bushy with Pruning
Explore bushy trees but aggressively prune:
for each subset S in increasing size:
for each partition (S₁, S₂) of S:
if estimate(S₁ ⋈ S₂) > current_best × threshold:
skip (prune)
else:
evaluate and update
Pruning makes bushy exploration tractable for moderate query sizes.
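A self-contained sketch of Strategy 2 as subset dynamic programming with threshold pruning (same toy cost model as the previous sketch; `prune_factor` is an invented knob):

```python
from itertools import combinations

SELECTIVITY = 0.01  # assumed uniform join selectivity

def proper_splits(subset):
    # Yield every non-empty proper subset (each split is seen from both sides).
    items = sorted(subset)
    for k in range(1, len(items)):
        for combo in combinations(items, k):
            yield frozenset(combo)

def best_bushy(sizes, prune_factor=10.0):
    # best[S] = (plan, output_size, cost) for every non-empty table subset S.
    best = {frozenset([t]): (t, sizes[t], 0.0) for t in sizes}
    tables = list(sizes)
    for k in range(2, len(tables) + 1):
        for subset in map(frozenset, combinations(tables, k)):
            champion = None
            for left in proper_splits(subset):
                lp, ls, lc = best[left]
                rp, rs, rc = best[subset - left]
                out = ls * rs * SELECTIVITY
                cost = lc + rc + out
                # Prune: skip splits far costlier than the current champion.
                if champion is not None and cost > champion[2] * prune_factor:
                    continue
                if champion is None or cost < champion[2]:
                    champion = ((lp, rp), out, cost)
            best[subset] = champion
    return best[frozenset(tables)]

print(best_bushy({"A": 10, "B": 1000, "C": 50, "D": 5}))
```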
Strategy 3: Shape Heuristics
Use query characteristics to select shape:
if query is star schema:
prefer bushy (dimension joins first)
else if query has chain structure:
prefer left-deep following chain
else if parallel execution:
prefer bushy for parallelism
else:
default to left-deep
| System | Default Behavior | Bushy Support | Control Mechanism |
|---|---|---|---|
| PostgreSQL | Left-deep preferred | Considers some bushy | GEQO for large queries |
| MySQL | Left-deep (historically) | Recent versions explore more | Optimizer hints |
| SQL Server | Explores bushy | Full support | Search levels and timeouts |
| Oracle | Flexible | Full support | Optimizer mode settings |
| Spark SQL | Bushy by default | Native parallel | Adaptive query execution |
Many systems allow hints to force tree shape. Oracle's /*+ ORDERED */ hint enforces left-deep following FROM clause order. SQL Server's OPTION (FORCE ORDER) does similarly. Use sparingly—the optimizer usually knows best.
The relationship between tree shape and parallelism is crucial for modern analytics systems:
Intra-Query Parallelism with Bushy Trees:
⋈ (parallel merge)
/ \
⋈ ⋈
/ \ / \
A B C D
[Worker Pool 1] [Worker Pool 2]
Processes A⋈B Processes C⋈D
↓ ↓
Result 1 Result 2
↓
Final Join
Bushy structure naturally exposes independent work units. A perfectly balanced bushy tree of 2^k leaves can achieve up to k levels of binary parallelism.
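A schematic of the two independent branches running concurrently, using a thread pool as a stand-in for worker pools (all data and key names are illustrative; in CPython the GIL limits true CPU parallelism, and a real engine parallelizes at the operator level):

```python
from concurrent.futures import ThreadPoolExecutor

def hash_join(left, right, key):
    # Simple in-memory hash join standing in for a real operator.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

def run_bushy(A, B, C, D):
    with ThreadPoolExecutor(max_workers=2) as pool:
        left = pool.submit(hash_join, A, B, "k")    # worker pool 1: A⋈B
        right = pool.submit(hash_join, C, D, "k")   # worker pool 2: C⋈D
        # The final join waits for both branches.
        return hash_join(left.result(), right.result(), "k")

A = [{"k": 1, "a": 1}]; B = [{"k": 1, "b": 2}]
C = [{"k": 1, "c": 3}]; D = [{"k": 1, "d": 4}]
print(run_bushy(A, B, C, D))  # one fully joined row
```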
Pipeline Parallelism with Left-Deep:
Left-deep trees enable a different form of parallelism:
((A ⋈ B) ⋈ C) ⋈ D
Stage 1 Stage 2 Stage 3
A ⋈ B → _ ⋈ C → _ ⋈ D
↘ tuple flow ↘
Tuples pipeline through stages;
each stage can use multiple workers for its operation.
This is pipeline parallelism: stages operate concurrently on different tuples.
Distributed Systems Considerations:
| Aspect | Left-Deep | Bushy |
|---|---|---|
| Data movement | Sequential: result of join n ships to node(s) for join n+1 | May ship multiple intermediates in parallel |
| Parallelism within query | Limited to worker parallelism per stage | Multiple independent stages run simultaneously |
| Network traffic | Potentially lower (one intermediate at a time) | Potentially higher (multiple intermediates) |
| Load balancing | Easier (single flow) | More complex (multiple flows) |
| Fault tolerance | Simpler checkpointing | More complex state management |
Modern MPP systems (Spark, Presto, Snowflake):
These systems aggressively use bushy plans: their execution model splits a query into independent stages scheduled across the cluster, and bushy trees align naturally with that model, enabling full cluster utilization.
The cost model adapts:
In parallel systems, the cost model accounts for data movement between workers, the degree of parallelism available at each stage, and synchronization points, not just I/O and CPU.
Parallel plans introduce 'exchange' operators that repartition or broadcast data. These appear at plan boundaries—where data must move between parallel workers. Exchange costs are major factors in parallel plan optimization.
Given the trade-offs, how do real optimizers choose tree shapes? Here's a practical decision framework:
Consider Left-Deep When:
- Indexes exist on the join columns of the inner tables (index nested loop pipelines)
- The workload is single-node OLTP with modest join counts
- Memory is constrained and streaming, pipelined execution is desirable
Consider Bushy When:
- Cardinality estimates predict huge intermediates that independent branches could shrink first
- The system executes queries in parallel (multi-core or MPP)
- The query has a star or snowflake shape with several small dimension joins
Adaptive approaches:
Modern optimizers don't commit to one shape dogmatically:
- Costing considers both: enumerate left-deep AND promising bushy plans; let cost decide
- Bushy sampling: for large queries, sample random bushy plans and keep the best
- Heuristic triggers: switch to bushy exploration when cardinality estimates suggest left-deep will have huge intermediates
- Runtime adaptation: start with left-deep; if intermediate materialization is needed anyway (memory overflow), consider restructuring
Example decision process:
1. Estimate cardinalities for all joins
2. Compute left-deep plan cost (quick since O(n!) is tractable)
3. If left-deep has acceptable cost → use it
4. If intermediates are huge → explore bushy alternatives
5. If parallel execution → weight bushy for parallelism benefits
6. Compare and select
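The same process as a schematic function; the thresholds and inputs are invented for illustration:

```python
def choose_shape(left_deep_cost, max_intermediate_rows, parallel_workers,
                 cost_budget=1e6, intermediate_limit=1e7):
    # Steps 1-2 (estimate cardinalities, cost the left-deep plan) happen upstream.
    if left_deep_cost <= cost_budget and max_intermediate_rows <= intermediate_limit:
        return "left-deep"  # step 3: acceptable cost, keep it simple
    shapes = ["left-deep", "bushy"]  # step 4: huge intermediates -> explore bushy
    if parallel_workers > 1:
        shapes.append("bushy, weighted for parallelism")  # step 5
    return "compare and select: " + ", ".join(shapes)  # step 6
```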
For single-node OLTP: left-deep is usually fine. For analytics on data warehouses or MPP systems: bushy trees often win significantly. When in doubt, let the optimizer decide—but understand WHY it chose what it did when analyzing slow queries.
We've explored the structural dimension of query optimization. The key insights: tree shape is a decision distinct from join order; left-deep trees favor pipelining, index nested loops, and a tractable search space; bushy trees trade a vastly larger search space for smaller intermediates and intra-query parallelism; and modern optimizers mix shapes adaptively rather than committing to one.
What's next:
With tree shapes understood, the final piece of cost-based optimization is the search strategy itself. The next page explores Search Strategies—how optimizers balance thoroughness with speed, employing techniques from greedy algorithms to genetic optimization to find excellent plans within time budgets.
You now understand the taxonomy of join tree shapes and their trade-offs. This structural dimension—distinct from join ordering and algorithm selection—critically affects execution performance. Recognizing when each shape excels helps explain optimizer behavior and guides performance tuning.