Loading learning content...
If you've ever stared at a deeply nested subquery, struggling to understand what it does, you've experienced one of SQL's most significant readability challenges. Subqueries nested three, four, or five levels deep become nearly impossible to maintain—each inner query's context trapped within layers of parentheses, each modification requiring careful navigation through a maze of brackets.
Common Table Expressions (CTEs) solve this problem elegantly. Introduced in SQL:1999 and now supported by every major database system, CTEs allow you to define temporary, named result sets that exist only for the duration of a single query. They transform convoluted, deeply-nested SQL into clean, readable, top-down logic flows.
By the end of this page, you will understand the complete syntax of the WITH clause, how CTE execution works internally, the scoping rules that govern CTE visibility, and the fundamental patterns that make CTEs indispensable in production SQL. You'll be equipped to refactor complex nested queries into elegant, maintainable CTEs.
Before diving into syntax, let's understand the pain point that motivated CTEs' creation. Consider a business requirement: Find customers whose total orders exceed the average order value, grouped by their region's performance tier.
Without CTEs, this query becomes a nightmare of nested subqueries:
12345678910111213141516171819202122232425262728293031323334353637383940414243444546
-- The pre-CTE approach: Deeply nested, hard to followSELECT c.customer_name, c.region, customer_totals.total_amountFROM customers cINNER JOIN ( SELECT customer_id, SUM(amount) as total_amount FROM orders WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR) GROUP BY customer_id HAVING SUM(amount) > ( SELECT AVG(order_total) FROM ( SELECT customer_id, SUM(amount) as order_total FROM orders WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR) GROUP BY customer_id ) AS customer_averages )) AS customer_totals ON c.customer_id = customer_totals.customer_idWHERE c.region IN ( SELECT region FROM ( SELECT region, SUM(amount) as region_total FROM customers INNER JOIN orders ON customers.customer_id = orders.customer_id GROUP BY region HAVING SUM(amount) > ( SELECT AVG(region_sum) FROM ( SELECT SUM(amount) as region_sum FROM customers INNER JOIN orders ON customers.customer_id = orders.customer_id GROUP BY region ) AS region_averages ) ) AS top_regions)ORDER BY customer_totals.total_amount DESC;This query is 45 lines of code with 4 levels of nesting. If the business logic changes—say, the date range needs to be configurable—you must update it in multiple places deep within the query structure. Finding the right closing parenthesis becomes a debugging exercise in itself.
The fundamental problems with nested subqueries:
Readability degrades exponentially — Each nesting level adds cognitive load. Understanding the innermost query requires holding all outer contexts in memory.
Logic is inverted — You write the innermost operation first, but read from outside-in. The execution order and reading order are misaligned.
Duplication is inevitable — Similar subqueries (like date filtering) must be repeated at each level, violating DRY principles.
Debugging is treacherous — Testing individual components requires extracting and reconstructing each subquery independently.
Modification is error-prone — A single misplaced parenthesis can break the query or silently change its semantics.
The WITH clause introduces Common Table Expressions—temporary named result sets that you define before the main query. Think of CTEs as temporary views that exist only for the duration of a single SQL statement.
Here's the fundamental syntax:
1234567891011
-- Basic CTE Syntax StructureWITH cte_name [(column_alias_1, column_alias_2, ...)] AS ( -- CTE query definition SELECT column1, column2, ... FROM some_table WHERE conditions)-- Main query that references the CTESELECT *FROM cte_nameWHERE additional_conditions;Let's break down each component of the syntax:
1. The WITH keyword
The WITH keyword signals the beginning of CTE definitions. It must appear before the main query (SELECT, INSERT, UPDATE, DELETE, or MERGE).
2. CTE name
Every CTE requires a unique name within its scope. This name follows standard identifier rules and becomes a temporary table-like reference for the duration of the statement.
3. Optional column aliases
You can explicitly name the columns returned by the CTE. If omitted, columns inherit names from the SELECT clause inside the CTE.
4. The AS keyword
Separates the CTE name and column list from the actual query definition.
5. CTE query (the subquery)
Enclosed in parentheses, this is any valid SELECT statement. It defines the result set that the CTE name represents.
6. Main query
Follows all CTE definitions and can reference any defined CTE as if it were a table.
When the CTE's SELECT uses expressions or functions, explicit column aliases become essential. Compare: WITH stats AS (SELECT COUNT(*)) (no column name) vs WITH stats(total_count) AS (SELECT COUNT(*)) (clear column name). The explicit form is always clearer and more maintainable.
Now let's explore the complete syntax patterns you'll encounter and use in production systems. Each variation serves specific purposes.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384
-- Pattern 1: Basic CTE without column aliases-- Column names are inherited from the SELECT clauseWITH active_customers AS ( SELECT customer_id, customer_name, email, created_at FROM customers WHERE status = 'active' AND last_login >= CURRENT_DATE - INTERVAL '30 days')SELECT customer_name, emailFROM active_customersORDER BY created_at DESC; -- Pattern 2: CTE with explicit column aliases-- Useful when SELECT uses expressionsWITH customer_metrics ( customer_id, full_name, total_orders, average_order_value, customer_since) AS ( SELECT c.customer_id, CONCAT(c.first_name, ' ', c.last_name), COUNT(o.order_id), AVG(o.amount), c.created_at FROM customers c LEFT JOIN orders o ON c.customer_id = o.customer_id GROUP BY c.customer_id, c.first_name, c.last_name, c.created_at)SELECT full_name, total_orders, ROUND(average_order_value, 2) as avg_orderFROM customer_metricsWHERE total_orders > 5; -- Pattern 3: CTE used with different statement types-- CTEs work with INSERT, UPDATE, DELETE, not just SELECT -- With INSERTWITH high_value_orders AS ( SELECT order_id, customer_id, amount FROM orders WHERE amount > 1000 AND processed = FALSE)INSERT INTO priority_queue (order_id, customer_id, amount, priority)SELECT order_id, customer_id, amount, 'HIGH'FROM high_value_orders; -- With UPDATEWITH inactive_customers AS ( SELECT customer_id FROM customers WHERE last_login < CURRENT_DATE - INTERVAL '1 year')UPDATE customersSET status = 'inactive'WHERE customer_id IN (SELECT customer_id FROM inactive_customers); -- With DELETEWITH duplicate_entries AS ( SELECT id FROM ( SELECT id, ROW_NUMBER() OVER ( PARTITION BY email ORDER BY created_at DESC ) as rn FROM subscribers ) ranked WHERE rn > 1)DELETE FROM subscribersWHERE id IN (SELECT id FROM duplicate_entries);| Element | Required | Purpose | Example |
|---|---|---|---|
| WITH | Yes | Initiates CTE declaration block | WITH cte_name AS (...) |
| CTE Name | Yes | Identifier for referencing the CTE | active_users, sales_summary |
| Column List | No | Explicit column naming | (id, name, total) |
| AS | Yes | Separates name from query | AS (SELECT ...) |
| Parentheses | Yes | Enclose the CTE query | (SELECT * FROM ...) |
| Main Query | Yes | Query that uses the CTE(s) | SELECT * FROM cte_name |
Understanding how databases execute CTEs is crucial for writing performant queries. The execution model varies significantly between database systems and even between different queries within the same system.
Two Fundamental Approaches:
123456789101112131415161718192021222324252627282930313233343536373839404142434445
-- Example: How execution model affects performance -- Scenario: Find orders from customers in active regions-- The CTE is referenced ONCE WITH active_regions AS ( SELECT region_id, region_name FROM regions WHERE status = 'active')SELECT o.order_id, o.amount, ar.region_nameFROM orders oINNER JOIN customers c ON o.customer_id = c.customer_idINNER JOIN active_regions ar ON c.region_id = ar.region_idWHERE o.order_date >= '2024-01-01'; -- With INLINE EXPANSION (PostgreSQL default for single-reference):-- Optimizer can push the date filter and join conditions together-- May use index on regions.status effectively-- Equivalent to writing the subquery directly in the join -- Now consider: CTE referenced MULTIPLE TIMES WITH monthly_totals AS ( SELECT DATE_TRUNC('month', order_date) as month, SUM(amount) as total FROM orders GROUP BY DATE_TRUNC('month', order_date))SELECT curr.month, curr.total as current_month, prev.total as previous_month, curr.total - COALESCE(prev.total, 0) as growthFROM monthly_totals currLEFT JOIN monthly_totals prev ON curr.month = prev.month + INTERVAL '1 month'; -- With MATERIALIZATION (often chosen for multi-reference):-- The aggregation happens ONCE-- Result is stored in temp memory/disk-- Both self-join references read from the stored result-- Much more efficient than computing aggregation twicePostgreSQL 12+ offers MATERIALIZED and NOT MATERIALIZED hints. SQL Server may materialize CTEs in spools. MySQL 8.0+ generally inlines non-recursive CTEs. Oracle traditionally materialized with the /*+ MATERIALIZE */ hint. Always check your database's execution plan to understand actual behavior.
| Database | Default Behavior | Control Mechanism |
|---|---|---|
| PostgreSQL 12+ | Optimizer chooses | MATERIALIZED / NOT MATERIALIZED keywords |
| PostgreSQL <12 | Always materialized | No control available |
| MySQL 8.0+ | Generally inlined | Optimizer decides; limited control |
| SQL Server | Optimizer chooses | May spool to tempdb automatically |
| Oracle | Generally inlined | /*+ MATERIALIZE / or /+ INLINE */ hints |
| SQLite | Always inlined | No materialization option |
CTEs follow specific scoping rules that determine where they can be referenced. Understanding these rules prevents frustrating errors and enables powerful query patterns.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778
-- RULE 1: CTEs are visible only within their statementWITH temp_data AS ( SELECT * FROM customers WHERE active = true)SELECT * FROM temp_data; -- Works: within same statement SELECT * FROM temp_data; -- ERROR: temp_data doesn't exist here -- RULE 2: Later CTEs can reference earlier CTEsWITH base_orders AS ( SELECT order_id, customer_id, amount FROM orders WHERE amount > 100 ), enriched_orders AS ( -- This CTE can reference base_orders SELECT bo.order_id, bo.amount, c.customer_name FROM base_orders bo -- Reference to earlier CTE INNER JOIN customers c ON bo.customer_id = c.customer_id )SELECT * FROM enriched_orders; -- RULE 3: Earlier CTEs CANNOT reference later CTEsWITH first_cte AS ( SELECT * FROM second_cte -- ERROR: second_cte not yet defined ), second_cte AS ( SELECT * FROM some_table )SELECT * FROM first_cte; -- RULE 4: A CTE cannot reference itself (unless RECURSIVE)WITH order_summary AS ( SELECT customer_id, SUM(amount) as total, COUNT(*) as order_count FROM orders GROUP BY customer_id -- Cannot write: UNION SELECT * FROM order_summary -- This requires RECURSIVE keyword)SELECT * FROM order_summary; -- RULE 5: CTEs can be referenced multiple times in the main queryWITH active_users AS ( SELECT user_id, username, department_id FROM users WHERE status = 'active')SELECT d.department_name, (SELECT COUNT(*) FROM active_users au WHERE au.department_id = d.department_id) as user_count, (SELECT STRING_AGG(username, ', ') FROM active_users au WHERE au.department_id = d.department_id) as usernamesFROM departments dWHERE d.department_id IN (SELECT department_id FROM active_users); -- RULE 6: CTEs in subqueries have their own scopeSELECT *FROM ( -- This is a separate statement; outer WITH is not visible WITH inner_cte AS (SELECT 1 as val) SELECT * FROM inner_cte) AS subquery; -- The outer query cannot reference inner_cteA frequent error is trying to use a CTE across multiple statements. Remember: WITH cte AS (...) SELECT * FROM cte; SELECT * FROM cte; — the second SELECT fails because the CTE expired after the first statement ended.
Let's revisit the nightmare query from Section 1 and transform it into clean, readable CTEs. This transformation demonstrates the dramatic improvement in clarity that CTEs provide.
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374
-- BEFORE: The deeply nested query from Section 1-- (45 lines, 4 levels of nesting, repeated logic) -- AFTER: The same logic using CTEs-- Clear, top-down flow, each concept named and isolated WITH -- Step 1: Define the date filter (single source of truth) date_filter AS ( SELECT CURRENT_DATE - INTERVAL '1 year' AS start_date, CURRENT_DATE AS end_date ), -- Step 2: Calculate each customer's total orders customer_order_totals AS ( SELECT customer_id, SUM(amount) AS total_amount FROM orders, date_filter WHERE order_date >= date_filter.start_date GROUP BY customer_id ), -- Step 3: Calculate the average customer total avg_customer_total AS ( SELECT AVG(total_amount) AS avg_amount FROM customer_order_totals ), -- Step 4: Identify above-average customers above_average_customers AS ( SELECT cot.customer_id, cot.total_amount FROM customer_order_totals cot, avg_customer_total WHERE cot.total_amount > avg_customer_total.avg_amount ), -- Step 5: Calculate each region's total region_totals AS ( SELECT c.region, SUM(cot.total_amount) AS region_total FROM customers c INNER JOIN customer_order_totals cot ON c.customer_id = cot.customer_id GROUP BY c.region ), -- Step 6: Calculate average region total avg_region_total AS ( SELECT AVG(region_total) AS avg_amount FROM region_totals ), -- Step 7: Identify top-performing regions top_regions AS ( SELECT region FROM region_totals, avg_region_total WHERE region_total > avg_region_total.avg_amount ) -- Final query: Combine all the piecesSELECT c.customer_name, c.region, aac.total_amountFROM customers cINNER JOIN above_average_customers aac ON c.customer_id = aac.customer_idINNER JOIN top_regions tr ON c.region = tr.regionORDER BY aac.total_amount DESC;When refactoring nested subqueries to CTEs: 1) Identify the innermost subquery, 2) Give it a meaningful name, 3) Extract it as a CTE, 4) Replace the subquery with the CTE name, 5) Repeat outward until all nesting is flattened. Each step is testable.
Like any powerful feature, CTEs can be misused. Following these best practices ensures your CTEs enhance rather than hinder your SQL.
active_premium_customers beats temp1 or cte1high_value_orders describes intent; orders_filtered_by_amount describes implementationbase_, enriched_, final_ prefixes to show data flow1234567891011121314151617181920212223242526272829303132
-- BAD: Meaningless namesWITH t1 AS (SELECT ...), t2 AS (SELECT ... FROM t1), cte AS (SELECT ... FROM t2)SELECT * FROM cte; -- GOOD: Intention-revealing namesWITH raw_transactions AS (SELECT ...), validated_transactions AS (SELECT ... FROM raw_transactions), transaction_summary AS (SELECT ... FROM validated_transactions)SELECT * FROM transaction_summary; -- BETTER: Names that tell a storyWITH all_orders_last_quarter AS ( SELECT * FROM orders WHERE order_date >= DATE_TRUNC('quarter', CURRENT_DATE - INTERVAL '3 months') ), orders_with_customer_data AS ( SELECT o.*, c.customer_tier, c.region FROM all_orders_last_quarter o JOIN customers c USING (customer_id) ), premium_customer_orders AS ( SELECT * FROM orders_with_customer_data WHERE customer_tier = 'premium' )SELECT region, COUNT(*) as premium_orders, SUM(amount) as premium_revenueFROM premium_customer_ordersGROUP BY region;WITH active AS (SELECT * FROM users WHERE active) adds verbosity without benefitCTEs add verbosity. For a single, simple subquery, the inline version may be clearer: WHERE id IN (SELECT id FROM other) doesn't need a CTE. Reserve CTEs for queries with genuine complexity, reuse, or need for step-by-step clarity.
We've established a comprehensive understanding of CTE syntax and fundamentals. Let's consolidate the key takeaways:
WITH name AS (query), they exist only for the duration of one statementWhat's Next:
Now that you understand the fundamental syntax of CTEs, the next page explores Named Subqueries in depth—how to think about CTEs as modular query components, when to extract logic into CTEs versus keep it inline, and patterns for building complex queries from simple, composable parts.
You now understand the complete syntax of the WITH clause, how CTEs execute internally, the scoping rules that govern visibility, and best practices for effective CTE usage. You're ready to explore advanced CTE patterns.