Loading content...
A single CTE is powerful. Multiple CTEs are transformative. When you combine several CTEs in one query, you unlock the ability to build data transformation pipelines—sequences of operations that progressively refine raw data into precisely the result you need.
This approach mirrors how data engineers think about ETL pipelines: extract base data, transform through multiple stages, load into the final form. With multiple CTEs, this entire pipeline can exist within a single SQL statement, executed atomically, optimized holistically, and maintained as a coherent unit.
By the end of this page, you will master the syntax for defining multiple CTEs, understand dependency ordering and reference rules, learn techniques for managing complex CTE chains, and develop patterns for building sophisticated analytical queries that would be nearly impossible with traditional subqueries.
The syntax for multiple CTEs extends naturally from single CTE syntax. CTEs are defined sequentially within a single WITH clause, separated by commas.
1234567891011121314151617181920212223242526272829303132
-- Multiple CTE Syntax PatternWITH -- First CTE first_cte AS ( SELECT column1, column2 FROM table1 WHERE condition1 ), -- Second CTE (can reference first_cte) second_cte AS ( SELECT column1, calculated_field FROM first_cte WHERE condition2 ), -- Third CTE (can reference first_cte and second_cte) third_cte AS ( SELECT f.column1, s.calculated_field, additional_data FROM first_cte f INNER JOIN second_cte s ON f.column1 = s.column1 ) -- Note: No comma after the last CTE -- Main query can reference any CTESELECT *FROM third_cteORDER BY column1;The most common errors with multiple CTEs: 1) Missing comma between CTEs, 2) Extra comma after the last CTE before the main query, 3) Using WITH before each CTE instead of just once at the start. Remember: one WITH, multiple CTEs separated by commas.
| Rule | Correct | Incorrect |
|---|---|---|
| WITH keyword | Once at the start | WITH before each CTE |
| CTE separator | Comma between CTEs | Semicolon or nothing |
| After last CTE | No comma | Comma before main query |
| Reference order | Later can reference earlier | Earlier referencing later |
| Main query | Immediately follows last CTE | Separated by semicolon |
When you have multiple CTEs, they form a dependency graph. Understanding this graph is essential for designing effective queries. CTEs can only reference CTEs defined before them in the WITH clause.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101
-- PATTERN 1: Linear Chain (Pipeline)-- Each CTE depends only on the immediately preceding CTE-- A → B → C → D WITH raw_data AS ( SELECT * FROM source_table ), cleaned_data AS ( -- Depends on: raw_data SELECT * FROM raw_data WHERE valid = TRUE ), enriched_data AS ( -- Depends on: cleaned_data SELECT cd.*, ref.label FROM cleaned_data cd JOIN reference ref ON cd.type_id = ref.id ), aggregated_data AS ( -- Depends on: enriched_data SELECT label, COUNT(*), SUM(amount) FROM enriched_data GROUP BY label )SELECT * FROM aggregated_data; -- PATTERN 2: Diamond (Multiple paths converge)-- A-- / \-- B C-- \ /-- D WITH base_transactions AS ( SELECT transaction_id, customer_id, amount, category FROM transactions WHERE transaction_date >= CURRENT_DATE - INTERVAL '30 days' ), customer_totals AS ( -- Depends on: base_transactions SELECT customer_id, SUM(amount) as customer_total FROM base_transactions GROUP BY customer_id ), category_totals AS ( -- Depends on: base_transactions (parallel branch) SELECT category, SUM(amount) as category_total FROM base_transactions GROUP BY category ), combined_analysis AS ( -- Depends on: base_transactions, customer_totals, category_totals SELECT bt.transaction_id, bt.amount, ct.customer_total, cat.category_total, bt.amount / ct.customer_total as pct_of_customer, bt.amount / cat.category_total as pct_of_category FROM base_transactions bt JOIN customer_totals ct USING (customer_id) JOIN category_totals cat USING (category) )SELECT * FROM combined_analysis; -- PATTERN 3: Fan-Out (One source, multiple consumers)-- A-- / | \-- B C D WITH all_orders AS ( SELECT order_id, customer_id, order_date, amount, region FROM orders WHERE status = 'completed' ), orders_by_customer AS ( SELECT customer_id, COUNT(*) as orders, SUM(amount) as revenue FROM all_orders GROUP BY customer_id ), orders_by_region AS ( SELECT region, COUNT(*) as orders, SUM(amount) as revenue FROM all_orders GROUP BY region ), orders_by_month AS ( SELECT DATE_TRUNC('month', order_date) as month, COUNT(*) as orders, SUM(amount) as revenue FROM all_orders GROUP BY DATE_TRUNC('month', order_date) )-- Main query might use any or all of theseSELECT 'by_customer' as metric, COUNT(*) as records FROM orders_by_customerUNION ALLSELECT 'by_region', COUNT(*) FROM orders_by_regionUNION ALLSELECT 'by_month', COUNT(*) FROM orders_by_month;CTEs can be referenced multiple times within the same query—from later CTEs, from the main query, or from subqueries within either. This flexibility enables powerful patterns.
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283
-- REFERENCE PATTERN 1: Self-comparison-- Compare aggregates from the same CTEWITH monthly_sales AS ( SELECT DATE_TRUNC('month', sale_date) as month, SUM(amount) as revenue FROM sales WHERE sale_date >= CURRENT_DATE - INTERVAL '2 years' GROUP BY DATE_TRUNC('month', sale_date))SELECT current.month, current.revenue as current_revenue, prior.revenue as prior_month_revenue, yoy.revenue as same_month_last_year, current.revenue - prior.revenue as mom_change, current.revenue - yoy.revenue as yoy_changeFROM monthly_sales currentLEFT JOIN monthly_sales prior ON current.month = prior.month + INTERVAL '1 month'LEFT JOIN monthly_sales yoy ON current.month = yoy.month + INTERVAL '1 year'ORDER BY current.month; -- REFERENCE PATTERN 2: Cross-reference between CTEsWITH products AS ( SELECT product_id, product_name, category_id FROM product_catalog WHERE status = 'active' ), product_sales AS ( SELECT product_id, SUM(quantity) as units_sold, SUM(revenue) as revenue FROM order_items WHERE order_date >= CURRENT_DATE - INTERVAL '1 year' GROUP BY product_id ), category_averages AS ( -- References both previous CTEs SELECT p.category_id, AVG(ps.units_sold) as avg_units, AVG(ps.revenue) as avg_revenue FROM products p LEFT JOIN product_sales ps USING (product_id) GROUP BY p.category_id )-- Main query references all threeSELECT p.product_name, ps.units_sold, ps.revenue, ca.avg_units as category_avg_units, ps.units_sold - ca.avg_units as units_vs_category_avg, CASE WHEN ps.units_sold > ca.avg_units * 1.5 THEN 'Star Performer' WHEN ps.units_sold < ca.avg_units * 0.5 THEN 'Underperformer' ELSE 'Average' END as performance_tierFROM products pLEFT JOIN product_sales ps USING (product_id)LEFT JOIN category_averages ca USING (category_id)ORDER BY ps.revenue DESC NULLS LAST; -- REFERENCE PATTERN 3: CTE in subqueryWITH active_customers AS ( SELECT customer_id, customer_name, region FROM customers WHERE status = 'active' AND last_purchase_date >= CURRENT_DATE - INTERVAL '90 days')SELECT region, (SELECT COUNT(*) FROM active_customers ac WHERE ac.region = regions.region) as active_count, (SELECT SUM(amount) FROM orders o WHERE o.customer_id IN ( SELECT customer_id FROM active_customers ac WHERE ac.region = regions.region )) as region_revenueFROM (SELECT DISTINCT region FROM active_customers) regions;When a CTE is referenced multiple times, the database may materialize it (compute once, store temporarily) rather than re-execute. This can dramatically improve performance when the CTE contains expensive operations like aggregations or complex joins. However, behavior varies by database—check your execution plan.
Multiple CTEs enable you to build complete data pipelines within a single query. Each CTE represents a stage in the pipeline: extraction, cleansing, transformation, enrichment, aggregation, and presentation.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115
-- COMPLETE DATA PIPELINE: Cohort Analysis-- Business Question: How do customer cohorts perform over time? WITH -- STAGE 1: EXTRACTION -- Extract raw order data with customer information raw_orders AS ( SELECT o.order_id, o.customer_id, o.order_date, o.amount, c.registration_date, c.acquisition_source FROM orders o INNER JOIN customers c ON o.customer_id = c.customer_id WHERE o.status = 'completed' AND o.order_date >= '2023-01-01' ), -- STAGE 2: CLEANSING -- Remove invalid records, handle edge cases clean_orders AS ( SELECT * FROM raw_orders WHERE amount > 0 AND registration_date <= order_date -- Registration before first order AND registration_date >= '2020-01-01' -- Reasonable date range ), -- STAGE 3: ENRICHMENT -- Add derived fields needed for analysis enriched_orders AS ( SELECT *, DATE_TRUNC('month', registration_date) as cohort_month, DATE_TRUNC('month', order_date) as order_month, -- Calculate months since registration (EXTRACT(YEAR FROM order_date) - EXTRACT(YEAR FROM registration_date)) * 12 + (EXTRACT(MONTH FROM order_date) - EXTRACT(MONTH FROM registration_date)) as months_since_registration FROM clean_orders ), -- STAGE 4: AGGREGATION (Level 1) -- Aggregate to customer-month level customer_month_metrics AS ( SELECT customer_id, cohort_month, order_month, months_since_registration, COUNT(DISTINCT order_id) as orders, SUM(amount) as revenue FROM enriched_orders GROUP BY customer_id, cohort_month, order_month, months_since_registration ), -- STAGE 5: AGGREGATION (Level 2) -- Aggregate to cohort-period level for cohort analysis cohort_analysis AS ( SELECT cohort_month, months_since_registration as period, COUNT(DISTINCT customer_id) as active_customers, SUM(orders) as total_orders, SUM(revenue) as total_revenue, AVG(revenue) as avg_revenue_per_customer FROM customer_month_metrics GROUP BY cohort_month, months_since_registration ), -- STAGE 6: ENRICHMENT (Level 2) -- Add cohort size for retention calculation cohort_sizes AS ( SELECT cohort_month, COUNT(DISTINCT customer_id) as cohort_size FROM clean_orders GROUP BY cohort_month ), -- STAGE 7: FINAL TRANSFORMATION -- Calculate retention rates and format for presentation cohort_retention AS ( SELECT ca.cohort_month, cs.cohort_size, ca.period, ca.active_customers, ca.total_revenue, ROUND(ca.active_customers::numeric / cs.cohort_size * 100, 1) as retention_rate, ROUND(ca.total_revenue / ca.active_customers, 2) as revenue_per_active FROM cohort_analysis ca INNER JOIN cohort_sizes cs USING (cohort_month) ) -- PRESENTATION LAYERSELECT TO_CHAR(cohort_month, 'YYYY-MM') as cohort, cohort_size as initial_customers, period as months_after_signup, active_customers, retention_rate || '%' as retention, total_revenue, revenue_per_active as arpcFROM cohort_retentionWHERE period <= 12 -- First year onlyORDER BY cohort_month, period;| Stage | Purpose | Typical Operations |
|---|---|---|
| Extraction | Gather raw data from source tables | SELECT, JOIN source tables |
| Cleansing | Remove invalid/incomplete records | WHERE clauses, NULL handling |
| Enrichment | Add calculated fields, lookups | CASE, date calculations, JOINs |
| Aggregation | Summarize to required granularity | GROUP BY, aggregate functions |
| Transformation | Reshape for final needs | Pivoting, calculations on aggregates |
| Presentation | Format for output/consumption | Formatting, final ordering |
As queries grow to include many CTEs, complexity management becomes crucial. Without discipline, multi-CTE queries can become as hard to maintain as the nested subqueries they replace.
stage_description or description_granularity123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778
-- WELL-ORGANIZED Multi-CTE Query WITH -- ═══════════════════════════════════════════════ -- BASE DATA CTEs -- ═══════════════════════════════════════════════ -- All active orders in reporting period -- Depends on: (none - base table) base_orders AS ( SELECT order_id, customer_id, order_date, amount FROM orders WHERE status = 'completed' AND order_date >= CURRENT_DATE - INTERVAL '1 year' ), -- All active customers -- Depends on: (none - base table) base_customers AS ( SELECT customer_id, customer_name, segment, region FROM customers WHERE status = 'active' ), -- ═══════════════════════════════════════════════ -- CUSTOMER METRICS CTEs -- ═══════════════════════════════════════════════ -- Customer purchase summaries -- Depends on: base_orders customer_order_metrics AS ( SELECT customer_id, COUNT(*) as order_count, SUM(amount) as total_spent, MIN(order_date) as first_order, MAX(order_date) as last_order FROM base_orders GROUP BY customer_id ), -- Customer categorization -- Depends on: customer_order_metrics customer_segments AS ( SELECT customer_id, order_count, total_spent, CASE WHEN total_spent >= 10000 THEN 'VIP' WHEN total_spent >= 1000 THEN 'Regular' ELSE 'Occasional' END as spend_tier FROM customer_order_metrics ), -- ═══════════════════════════════════════════════ -- REPORT CTEs -- ═══════════════════════════════════════════════ -- Final joined report -- Depends on: base_customers, customer_segments customer_report AS ( SELECT bc.customer_name, bc.segment, bc.region, cs.order_count, cs.total_spent, cs.spend_tier FROM base_customers bc LEFT JOIN customer_segments cs USING (customer_id) ) -- Main QuerySELECT *FROM customer_reportORDER BY total_spent DESC NULLS LAST;If your multi-CTE query exceeds 150-200 lines, or requires more than 10 CTEs, consider: 1) Creating permanent views for commonly-used CTEs, 2) Splitting into multiple queries with temp tables, 3) Using stored procedures to encapsulate stages. CTEs are powerful but not infinitely scalable.
CTEs can be designed as parallel (independent of each other) or sequential (each depending on the previous). Understanding the difference helps you write more efficient queries and enables database optimizers to work more effectively.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384
-- PARALLEL CTEs: Independent branchesWITH -- These three CTEs are independent (parallel) sales_by_region AS ( SELECT region, SUM(amount) as regional_total FROM sales GROUP BY region ), sales_by_category AS ( SELECT category, SUM(amount) as category_total FROM sales GROUP BY category ), sales_by_quarter AS ( SELECT DATE_TRUNC('quarter', sale_date) as quarter, SUM(amount) as quarterly_total FROM sales GROUP BY DATE_TRUNC('quarter', sale_date) )-- Main query combines parallel branchesSELECT r.region, r.regional_total, (SELECT SUM(category_total) FROM sales_by_category) as all_categories, (SELECT MAX(quarterly_total) FROM sales_by_quarter) as peak_quarterFROM sales_by_region r; -- SEQUENTIAL CTEs: Pipeline transformationWITH -- Step 1 (no dependency) raw_transactions AS ( SELECT * FROM transactions WHERE date >= '2024-01-01' ), -- Step 2 (depends on Step 1) valid_transactions AS ( SELECT * FROM raw_transactions WHERE amount > 0 AND status = 'completed' ), -- Step 3 (depends on Step 2) enriched_transactions AS ( SELECT vt.*, c.customer_tier FROM valid_transactions vt JOIN customers c USING (customer_id) ), -- Step 4 (depends on Step 3) summarized_transactions AS ( SELECT customer_tier, SUM(amount) as total, COUNT(*) as count FROM enriched_transactions GROUP BY customer_tier )SELECT * FROM summarized_transactions; -- HYBRID: Parallel branches that convergeWITH -- Parallel branch A orders_last_month AS ( SELECT customer_id, SUM(amount) as monthly_total FROM orders WHERE order_date >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month') AND order_date < DATE_TRUNC('month', CURRENT_DATE) GROUP BY customer_id ), -- Parallel branch B orders_same_month_last_year AS ( SELECT customer_id, SUM(amount) as yoy_total FROM orders WHERE order_date >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL '13 months') AND order_date < DATE_TRUNC('month', CURRENT_DATE - INTERVAL '12 months') GROUP BY customer_id ), -- Convergence point: combines both parallel branches customer_comparison AS ( SELECT COALESCE(lm.customer_id, ly.customer_id) as customer_id, COALESCE(lm.monthly_total, 0) as this_month, COALESCE(ly.yoy_total, 0) as same_month_last_year FROM orders_last_month lm FULL OUTER JOIN orders_same_month_last_year ly USING (customer_id) )SELECT customer_id, this_month, same_month_last_year, this_month - same_month_last_year as yoy_changeFROM customer_comparisonWHERE this_month > 0 OR same_month_last_year > 0;Multiple CTEs introduce performance considerations that don't exist with simpler queries. Understanding these helps you write efficient multi-CTE statements.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263
-- TECHNIQUE 1: Minimize CTE width (columns)-- Bad: Wide CTE that downstream only uses 2 columnsWITH wide_cte AS ( SELECT * FROM large_table -- 50 columns)SELECT id, name FROM wide_cte; -- Good: Narrow CTE with only needed columnsWITH narrow_cte AS ( SELECT id, name FROM large_table)SELECT id, name FROM narrow_cte; -- TECHNIQUE 2: Push filters into earliest possible CTE-- Bad: Late filteringWITH all_orders AS ( SELECT * FROM orders),enriched AS ( SELECT ao.*, c.name FROM all_orders ao JOIN customers c USING (customer_id))SELECT * FROM enriched WHERE order_date >= '2024-01-01'; -- Good: Early filteringWITH recent_orders AS ( SELECT * FROM orders WHERE order_date >= '2024-01-01' -- Filter early),enriched AS ( SELECT ro.*, c.name FROM recent_orders ro JOIN customers c USING (customer_id))SELECT * FROM enriched; -- TECHNIQUE 3: Control materialization (PostgreSQL 12+)WITH -- Don't materialize simple filters active_users AS NOT MATERIALIZED ( SELECT user_id FROM users WHERE active = true ), -- Do materialize expensive aggregations used multiple times user_stats AS MATERIALIZED ( SELECT user_id, COUNT(*) as actions, SUM(duration) as total_time FROM user_activity WHERE activity_date >= CURRENT_DATE - 30 GROUP BY user_id )SELECT ...; -- TECHNIQUE 4: Avoid unnecessary CTE chains-- Bad: Trivial CTEs that add overheadWITH cte1 AS (SELECT * FROM table1), cte2 AS (SELECT * FROM cte1 WHERE x > 0), cte3 AS (SELECT * FROM cte2)SELECT * FROM cte3; -- Good: Combine trivial stepsWITH filtered_table AS ( SELECT * FROM table1 WHERE x > 0)SELECT * FROM filtered_table;Don't assume that adding or removing CTEs improves performance. Always compare execution plans (EXPLAIN ANALYZE) before and after changes. The optimizer may produce unexpected results, and behavior varies significantly between database systems.
We've explored the power of combining multiple CTEs into sophisticated query structures. Let's consolidate the key insights:
What's Next:
Having mastered multi-CTE composition, we're now ready for the most powerful CTE feature: Recursive CTEs. The next page explores how CTEs can reference themselves to traverse hierarchies, generate series, and solve problems that are impossible with standard SQL.
You now understand how to compose multiple CTEs, manage their dependencies, build data transformation pipelines, and optimize multi-CTE query performance. You're ready to explore recursive CTEs.