Loading content...
You now understand both CTEs and traditional subqueries. But when should you use each? This isn't just an academic question—it affects readability, maintainability, performance, and even whether certain queries are possible at all.
The answer isn't always "use CTEs." Subqueries remain the right choice in many scenarios. Understanding the trade-offs enables you to choose the optimal approach for each situation, writing SQL that is not just correct but elegant and efficient.
By the end of this page, you will understand the fundamental differences between CTEs and subqueries, compare their performance characteristics, know when each approach excels, and have a practical decision framework for choosing between them in your daily SQL work.
CTEs and subqueries solve similar problems—embedding one query's results into another—but they differ fundamentally in structure, scope, and capabilities.
| Aspect | CTE (WITH Clause) | Subquery |
|---|---|---|
| Definition location | Before the main query | Inline, embedded in the query |
| Naming | Always named | Usually anonymous (aliases optional) |
| Reusability | Can be referenced multiple times | Must be duplicated for reuse |
| Self-reference | Allowed with RECURSIVE | Not possible |
| Reading order | Top-down (definition then usage) | Inside-out (nested context) |
| Maximum nesting | Flat structure (unlimited CTEs) | Deeply nested (practical limits) |
| Scope | Visible throughout statement | Visible only in immediate context |
| SQL Standard | SQL:1999 and later | Original SQL (pre-1999) |
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162
-- The SAME logical query, two different structures -- SUBQUERY APPROACH: Nested, inside-out readingSELECT c.customer_name, customer_totals.total_amountFROM customers cINNER JOIN ( -- Must read this first to understand the outer query SELECT customer_id, SUM(amount) as total_amount FROM ( -- Must read this first to understand the middle query SELECT order_id, customer_id, amount FROM orders WHERE status = 'completed' AND order_date >= CURRENT_DATE - INTERVAL '1 year' ) AS completed_orders GROUP BY customer_id HAVING SUM(amount) > ( -- And this is a scalar subquery inside HAVING SELECT AVG(total) FROM ( SELECT customer_id, SUM(amount) as total FROM orders WHERE status = 'completed' GROUP BY customer_id ) AS all_totals )) AS customer_totals ON c.customer_id = customer_totals.customer_id; -- CTE APPROACH: Flat, top-down readingWITH completed_orders AS ( -- Step 1: Define completed orders SELECT order_id, customer_id, amount FROM orders WHERE status = 'completed' AND order_date >= CURRENT_DATE - INTERVAL '1 year' ), customer_totals AS ( -- Step 2: Aggregate by customer SELECT customer_id, SUM(amount) as total_amount FROM completed_orders GROUP BY customer_id ), average_total AS ( -- Step 3: Calculate overall average SELECT AVG(total_amount) as avg_amount FROM customer_totals ) -- Step 4: Final resultSELECT c.customer_name, ct.total_amountFROM customers cINNER JOIN customer_totals ct ON c.customer_id = ct.customer_idCROSS JOIN average_totalWHERE ct.total_amount > average_total.avg_amount;Readability is subjective but measurable. Code is readable when its purpose is clear without extensive study, its structure matches its logic, and modifications are straightforward.
For simple, single-use subqueries (1-2 levels of nesting), inline subqueries often read better—they keep related logic together. The readability advantage of CTEs grows dramatically as query complexity increases. The threshold is typically around 2-3 levels of nesting or when the same logic appears twice.
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061
-- SUBQUERY WINS: Simple, single-use, low nesting-- Reading "WHERE id IN (SELECT...)" is natural and clearSELECT product_name, priceFROM productsWHERE category_id IN ( SELECT category_id FROM categories WHERE department = 'Electronics'); -- No CTE needed - it would add verbosity without value-- This CTE version is longer but not clearer:WITH electronics_categories AS ( SELECT category_id FROM categories WHERE department = 'Electronics')SELECT product_name, priceFROM productsWHERE category_id IN (SELECT category_id FROM electronics_categories); -- CTE WINS: Complex, multi-use, deep nesting-- Compare the subquery monster:SELECT d.department_name, (SELECT COUNT(*) FROM employees e WHERE e.department_id = d.department_id AND e.salary > (SELECT AVG(salary) FROM employees WHERE department_id = d.department_id)) as above_avg_countFROM departments dWHERE d.department_id IN ( SELECT department_id FROM employees GROUP BY department_id HAVING AVG(salary) > (SELECT AVG(salary) FROM employees)); -- Versus the CTE clarity:WITH dept_avg_salaries AS ( SELECT department_id, AVG(salary) as dept_avg FROM employees GROUP BY department_id ), company_avg_salary AS ( SELECT AVG(salary) as company_avg FROM employees ), above_avg_departments AS ( SELECT department_id FROM dept_avg_salaries, company_avg_salary WHERE dept_avg > company_avg ), above_dept_avg_employees AS ( SELECT e.department_id, COUNT(*) as count FROM employees e JOIN dept_avg_salaries das ON e.department_id = das.department_id WHERE e.salary > das.dept_avg GROUP BY e.department_id )SELECT d.department_name, COALESCE(aae.count, 0) as above_avg_countFROM departments dJOIN above_avg_departments aad ON d.department_id = aad.department_idLEFT JOIN above_dept_avg_employees aae ON d.department_id = aae.department_id;A common misconception is that CTEs are inherently slower or faster than subqueries. The truth is more nuanced: performance depends on how the query optimizer handles each construct, which varies by database and query structure.
| Database | Default Behavior | Performance Notes |
|---|---|---|
| PostgreSQL 11 and earlier | Always materialized | CTEs could be significantly slower due to optimization barrier |
| PostgreSQL 12+ | Optimizer chooses | Much better; inlines single-reference CTEs by default |
| MySQL 8.0+ | Generally inlined | Non-recursive CTEs merged into main query; minimal overhead |
| SQL Server | Optimizer chooses | May spool to tempdb; performance varies widely |
| Oracle | Generally inlined | /*+ MATERIALIZE */ hint forces materialization when beneficial |
| SQLite | Always inlined | CTEs are pure syntax sugar; no materialization |
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869
-- SCENARIO 1: Single-reference CTE-- Usually IDENTICAL performance to subquery-- Optimizer inlines the CTE WITH active_orders AS ( SELECT * FROM orders WHERE status = 'active')SELECT * FROM active_orders WHERE amount > 100; -- Equivalent subquery (optimizer likely produces same plan):SELECT * FROM ( SELECT * FROM orders WHERE status = 'active') AS active_orders WHERE amount > 100; -- SCENARIO 2: Multi-reference CTE-- CTE MAY be faster (compute once, reuse)-- Depends on whether optimizer materializes WITH monthly_totals AS ( SELECT DATE_TRUNC('month', order_date) as month, SUM(amount) as total FROM orders WHERE status = 'completed' GROUP BY DATE_TRUNC('month', order_date))SELECT curr.month, curr.total, prev.total as prev_monthFROM monthly_totals currLEFT JOIN monthly_totals prev ON curr.month = prev.month + INTERVAL '1 month'; -- With subquery, the aggregation would execute TWICE:SELECT curr.month, curr.total, prev.total as prev_monthFROM (SELECT DATE_TRUNC('month', order_date) as month, SUM(amount) as total FROM orders WHERE status = 'completed' GROUP BY DATE_TRUNC('month', order_date)) currLEFT JOIN (SELECT DATE_TRUNC('month', order_date) as month, SUM(amount) as total FROM orders WHERE status = 'completed' GROUP BY DATE_TRUNC('month', order_date)) prev ON curr.month = prev.month + INTERVAL '1 month'; -- SCENARIO 3: Predicate pushdown impact-- CTE materialization can PREVENT predicate pushdown -- CTE version (may materialize ALL active orders first)WITH active_orders AS ( SELECT * FROM orders WHERE status = 'active')SELECT * FROM active_orders WHERE amount > 1000000; -- Very selective filter -- Subquery version (optimizer can push filter into subquery)SELECT * FROM (SELECT * FROM orders WHERE status = 'active') aoWHERE ao.amount > 1000000; -- More likely to combine conditions -- PostgreSQL 12+ hint for explicit control:WITH active_orders AS NOT MATERIALIZED ( SELECT * FROM orders WHERE status = 'active')SELECT * FROM active_orders WHERE amount > 1000000;Never assume CTE or subquery is faster without measuring. Use EXPLAIN ANALYZE (PostgreSQL), EXPLAIN (MySQL), SET STATISTICS IO ON (SQL Server), or EXPLAIN PLAN (Oracle) to compare actual execution. Query optimizer behavior changes with data distribution, statistics, and database updates.
Some things are only possible with CTEs. Others can be done by either approach. Understanding these capabilities defines hard boundaries for your choice.
| Capability | CTE | Subquery | Notes |
|---|---|---|---|
| Recursion | ✅ Yes | ❌ No | CTEs are the ONLY way to do recursion in standard SQL |
| Self-reference | ✅ Yes | ❌ No | Required for hierarchies, graphs, series generation |
| Multiple references | ✅ Native | ⚠️ Copy-paste | Subqueries must be duplicated; CTEs reference by name |
| DML statements (INSERT/UPDATE/DELETE) | ✅ Yes | ✅ Yes | Both work in FROM clause or with data-modifying CTEs |
| Correlated use | ⚠️ Limited | ✅ Yes | Correlated subqueries reference outer query; CTEs cannot |
| Scalar context | ⚠️ Awkward | ✅ Natural | Scalar subqueries in SELECT list are more natural |
| EXISTS/NOT EXISTS | ⚠️ Possible | ✅ Natural | Subqueries are the idiomatic pattern for existence checks |
| LATERAL joins | ✅ Yes | ✅ Yes | Both can be used with LATERAL |
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576777879
-- CAPABILITY: Recursion (CTE only)-- Impossible with subqueriesWITH RECURSIVE hierarchy AS ( SELECT id, name, parent_id, 0 as depth FROM nodes WHERE parent_id IS NULL UNION ALL SELECT n.id, n.name, n.parent_id, h.depth + 1 FROM nodes n JOIN hierarchy h ON n.parent_id = h.id)SELECT * FROM hierarchy; -- CAPABILITY: Multiple references (CTE advantage)-- CTE: Define once, use many timesWITH order_stats AS ( SELECT DATE_TRUNC('day', order_date) as day, COUNT(*) as orders, SUM(amount) as revenue FROM orders GROUP BY DATE_TRUNC('day', order_date))SELECT day, orders, revenue, AVG(orders) OVER (ORDER BY day ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) as avg_7day_orders, (SELECT AVG(revenue) FROM order_stats) as overall_avg_revenue, revenue - (SELECT AVG(revenue) FROM order_stats) as revenue_vs_avgFROM order_statsORDER BY day; -- CAPABILITY: Correlated subquery (Subquery advantage)-- Each row's subquery references outer rowSELECT p.product_name, p.price, (SELECT COUNT(*) FROM order_items oi WHERE oi.product_id = p.product_id) as times_ordered, -- Correlated! (SELECT MAX(order_date) FROM orders o JOIN order_items oi ON o.order_id = oi.order_id WHERE oi.product_id = p.product_id) as last_ordered -- Correlated!FROM products p; -- CTE cannot be correlated in the same way:-- This WON'T work:/*WITH product_order_count AS ( SELECT COUNT(*) as cnt FROM order_items WHERE product_id = p.product_id -- ERROR: p is not visible here!)SELECT p.product_name, (SELECT cnt FROM product_order_count)FROM products p;*/ -- CAPABILITY: EXISTS pattern (Subquery more natural)-- Checking existence is idiomatic with subqueriesSELECT c.customer_nameFROM customers cWHERE EXISTS ( SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id AND o.amount > 10000); -- CTE version is more verbose:WITH high_value_customer_ids AS ( SELECT DISTINCT customer_id FROM orders WHERE amount > 10000)SELECT c.customer_nameFROM customers cWHERE c.customer_id IN (SELECT customer_id FROM high_value_customer_ids);Code maintainability—the ease of understanding, modifying, and debugging—often matters more than marginal performance differences. Consider how each approach affects long-term code health.
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162
-- MAINTENANCE SCENARIO: "Change the date range from 1 year to 6 months" -- SUBQUERY VERSION: Must find and update EVERY occurrenceSELECT ...FROM (SELECT ... FROM orders WHERE order_date >= CURRENT_DATE - INTERVAL '1 year') aJOIN (SELECT ... FROM orders WHERE order_date >= CURRENT_DATE - INTERVAL '1 year' ...) bWHERE product_id IN ( SELECT product_id FROM order_items WHERE order_id IN ( SELECT order_id FROM orders WHERE order_date >= CURRENT_DATE - INTERVAL '1 year' -- Another one! ));-- Risk: Miss one occurrence and have inconsistent logic -- CTE VERSION: Update ONE placeWITH date_range AS ( -- SINGLE SOURCE OF TRUTH SELECT CURRENT_DATE - INTERVAL '6 months' AS start_date -- Changed here only! ), relevant_orders AS ( SELECT * FROM orders, date_range WHERE order_date >= date_range.start_date ), order_products AS ( SELECT DISTINCT product_id FROM order_items WHERE order_id IN (SELECT order_id FROM relevant_orders) )SELECT ...FROM relevant_orders aJOIN relevant_orders b ON ...WHERE product_id IN (SELECT product_id FROM order_products); -- DEBUGGING SCENARIO: "Why are some customers missing?" -- CTE VERSION: Easy to debug step by stepWITH step1_active_customers AS ( SELECT * FROM customers WHERE status = 'active' ), step2_with_orders AS ( SELECT c.* FROM step1_active_customers c WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id) ), step3_high_value AS ( SELECT c.* FROM step2_with_orders c JOIN (SELECT customer_id, SUM(amount) as total FROM orders GROUP BY customer_id) totals ON c.customer_id = totals.customer_id WHERE totals.total > 1000 )-- Debug: Change final SELECT to check each step-- SELECT COUNT(*) FROM step1_active_customers; -- 5000-- SELECT COUNT(*) FROM step2_with_orders; -- 3200 (1800 never ordered)-- SELECT COUNT(*) FROM step3_high_value; -- 450 (most < $1000)SELECT * FROM step3_high_value;Based on everything we've covered, here's a practical decision framework for choosing between CTEs and subqueries.
Use CTE when: you need recursion, multiple references, complex transformations, or team collaboration. Use Subquery when: it's simple and single-use, you need correlation, you're checking existence, or you're in a scalar context.
DECISION FLOWCHART: CTE vs Subquery START │ ▼┌─────────────────────────────────────┐│ Do you need recursion? ││ (hierarchies, graphs, series) │└─────────────────────────────────────┘ │ YES → USE CTE (only option) │ NO ▼┌─────────────────────────────────────┐│ Will you reference the same result ││ multiple times? │└─────────────────────────────────────┘ │ YES → USE CTE (avoid duplication) │ NO ▼┌─────────────────────────────────────┐│ Is nesting deeper than 2 levels? │└─────────────────────────────────────┘ │ YES → USE CTE (improve readability) │ NO ▼┌─────────────────────────────────────┐│ Is this a correlated subquery? ││ (references outer query) │└─────────────────────────────────────┘ │ YES → USE SUBQUERY (CTE cannot correlate) │ NO ▼┌─────────────────────────────────────┐│ Is this an EXISTS check? │└─────────────────────────────────────┘ │ YES → USE SUBQUERY (more idiomatic) │ NO ▼┌─────────────────────────────────────┐│ Is this a simple scalar subquery? ││ (single value in SELECT list) │└─────────────────────────────────────┘ │ YES → USE SUBQUERY (natural fit) │ NO ▼┌─────────────────────────────────────┐│ Will multiple people maintain this? │└─────────────────────────────────────┘ │ YES → USE CTE (better collaboration) │ NO ▼┌─────────────────────────────────────┐│ Is the logic complex enough to ││ benefit from a descriptive name? │└─────────────────────────────────────┘ │ YES → USE CTE │ NO → USE SUBQUERY (simpler is fine)| Use CTE When | Use Subquery When |
|---|---|
| Recursion is needed | Simple, single-use derived table |
| Referenced 2+ times | Correlated reference to outer query |
| Complex multi-step transformation | EXISTS / NOT EXISTS checks |
| Nesting would exceed 2 levels | Scalar value in SELECT list |
| Team needs to maintain query | IN (SELECT ...) with simple filter |
| Debugging intermediate steps | Query fits easily on one screen |
| Named intermediate results add clarity | CTE would just add boilerplate |
Let's look at real-world scenarios where the choice between CTE and subquery is clear, applying our decision framework.
Scenario: Find products currently in stock
This is a simple existence check—exactly where subqueries shine.
12345678910111213141516171819202122232425
-- SUBQUERY: Correct choice-- Simple, single-use, existence check, correlatedSELECT p.product_name, p.priceFROM products pWHERE EXISTS ( SELECT 1 FROM inventory i WHERE i.product_id = p.product_id AND i.quantity > 0); -- CTE alternative: More verbose, no added benefitWITH in_stock_products AS ( SELECT DISTINCT product_id FROM inventory WHERE quantity > 0)SELECT p.product_name, p.priceFROM products pWHERE p.product_id IN (SELECT product_id FROM in_stock_products); -- Why subquery wins:-- ✓ Simpler (7 lines vs 10)-- ✓ More direct expression of intent-- ✓ EXISTS pattern is idiomatic SQL-- ✓ Optimizer handles EXISTS efficientlyWe've completed our comprehensive exploration of Common Table Expressions. Let's consolidate everything from this final comparison page:
Module Complete: Common Table Expressions
You've now mastered CTEs comprehensively:
With this knowledge, you can write SQL that is not just correct, but elegant, maintainable, and powerful. CTEs transform how you approach complex data problems, enabling query designs that would be impractical—or impossible—with subqueries alone.
Congratulations! You've completed the Common Table Expressions module. You now possess the skills to write sophisticated CTEs, from basic named subqueries through recursive hierarchical traversals, and the judgment to choose the optimal query structure for any situation.