Loading content...
Consider this analytical challenge: "Show each product's sales alongside the average sales of all products in the same price tier, where price tiers are dynamically calculated as quartiles."
This query requires:
No single table contains this data. You need to create an intermediate result set—a temporary table-like structure computed on the fly. This is where table subqueries shine.
A table subquery (also called a derived table or inline view) returns multiple rows that can be treated as a virtual table. It's the most powerful form of subquery, enabling complex multi-step transformations within a single SQL statement.
By the end of this page, you will master table subqueries—understanding how to create derived tables, use them in JOINs, apply set comparison operators (IN, ANY, ALL), and leverage them for complex analytical patterns. You'll learn when table subqueries outperform alternatives and when CTEs might be preferred.
A table subquery is a subquery that returns a complete result set—zero, one, or many rows with one or more columns. Unlike scalar and row subqueries, table subqueries don't have cardinality restrictions.
Formal Definition:
A table subquery is a SELECT statement whose result forms a relation (table) that can be used in contexts expecting a table or set: the FROM clause, set operators, IN operator, EXISTS checks, ANY/ALL comparisons, and set functions.
Key Insight:
In relational terms, every SELECT query produces a relation—a table. Table subqueries make this explicit, allowing you to embed one relation's construction inside another query.
123456789101112131415161718
-- Table subquery in FROM clause (Derived Table)-- Creates a virtual table that can be joined or filtered SELECT dept_summary.department_name, dept_summary.avg_salaryFROM ( SELECT d.department_name, AVG(e.salary) AS avg_salary, COUNT(*) AS employee_count FROM departments d JOIN employees e ON d.department_id = e.department_id GROUP BY d.department_name) AS dept_summary -- Alias requiredWHERE dept_summary.avg_salary > 50000; -- The subquery produces a table with columns:-- | department_name | avg_salary | employee_count |-- This table is then filtered by the outer WHERE clauseA subquery in the FROM clause is called a 'derived table' in MySQL and SQL Server, and an 'inline view' in Oracle. All terms describe the same concept: a dynamically computed table-like result set that exists only for the duration of the query.
The most powerful use of table subqueries is in the FROM clause, creating a derived table that acts as a data source for the outer query.
Syntax Requirements:
Basic Pattern:
SELECT columns
FROM (subquery) AS alias
[WHERE ...]
[JOIN ...]
[GROUP BY ...]
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849
-- Example 1: Simple derived table for aggregation filtering-- Find departments where average salary exceeds company average SELECT *FROM ( SELECT department_id, department_name, AVG(salary) AS avg_salary FROM departments d JOIN employees e USING (department_id) GROUP BY department_id, department_name) AS dept_avgsWHERE avg_salary > (SELECT AVG(salary) FROM employees); -- Example 2: Derived table for row numbering (pre-window function technique)SELECT ranked.*, (SELECT COUNT(*) FROM sales s2 WHERE s2.total > ranked.total) + 1 AS rankFROM ( SELECT salesperson, SUM(amount) AS total FROM sales GROUP BY salesperson) AS ranked; -- Example 3: Multiple derived tables joined togetherSELECT monthly.month_name, monthly.total_orders, prev.total_orders AS prev_month_orders, monthly.total_orders - COALESCE(prev.total_orders, 0) AS growthFROM ( SELECT MONTH(order_date) AS month_num, MONTHNAME(order_date) AS month_name, COUNT(*) AS total_orders FROM orders WHERE YEAR(order_date) = 2024 GROUP BY MONTH(order_date), MONTHNAME(order_date)) AS monthlyLEFT JOIN ( SELECT MONTH(order_date) + 1 AS next_month, COUNT(*) AS total_orders FROM orders WHERE YEAR(order_date) = 2024 GROUP BY MONTH(order_date)) AS prev ON monthly.month_num = prev.next_month;Always alias computed columns in your derived table (e.g., AVG(salary) AS avg_salary). Without aliases, some databases generate names like 'AVG(salary)' or '?column?', making the outer query reference awkward or impossible. Explicit aliases ensure clarity and portability.
Derived tables become especially powerful when joined with physical tables or other derived tables. This pattern enables complex analytical queries that would otherwise require temporary tables or multiple queries.
Common Patterns:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253
-- Pattern 1: Add aggregated context to each row-- Show each employee with their department's average salary SELECT e.employee_name, e.salary, dept_avg.avg_salary AS department_average, e.salary - dept_avg.avg_salary AS diff_from_dept_avgFROM employees eJOIN ( SELECT department_id, AVG(salary) AS avg_salary FROM employees GROUP BY department_id) AS dept_avg ON e.department_id = dept_avg.department_id; -- Pattern 2: Filter base table using aggregated criteria-- Show employees in departments with above-average headcount SELECT e.*FROM employees eJOIN ( SELECT department_id FROM employees GROUP BY department_id HAVING COUNT(*) > ( SELECT AVG(dept_count) FROM ( SELECT COUNT(*) AS dept_count FROM employees GROUP BY department_id ) AS counts )) AS large_depts ON e.department_id = large_depts.department_id; -- Pattern 3: Ranked subset selection-- Show top 3 products per category by revenue SELECT p.category_id, p.product_name, ranked.revenueFROM ( SELECT product_id, SUM(quantity * price) AS revenue, ROW_NUMBER() OVER ( PARTITION BY category_id ORDER BY SUM(quantity * price) DESC ) AS rank FROM order_items GROUP BY product_id, category_id) AS rankedJOIN products p ON ranked.product_id = p.product_idWHERE ranked.rank <= 3;| Pattern | Purpose | Example Use Case |
|---|---|---|
| Base + Aggregate | Add summary data to each row | Employee salary vs department average |
| Filter by Aggregate | Limit rows based on group properties | Employees in large departments only |
| Self-Join with Aggregate | Compare row to its own group | Products ranked within category |
| Multi-Level Aggregate | Aggregate of aggregates | Average of department averages |
| Pivot/Reshape | Transform data structure | Rows to columns transformation |
The IN operator tests whether a value exists within the result set of a table subquery. It's one of the most common uses of table subqueries.
Syntax:
WHERE column IN (SELECT column FROM table WHERE ...)
The subquery must return a single column (or be used with row tuples for multi-column IN).
1234567891011121314151617181920212223242526272829303132333435
-- Basic IN: find employees in specific departmentsSELECT employee_name, salaryFROM employeesWHERE department_id IN ( SELECT department_id FROM departments WHERE location = 'New York'); -- IN with computed valuesSELECT product_name, priceFROM productsWHERE category_id IN ( SELECT category_id FROM categories WHERE parent_category_id = ( SELECT category_id FROM categories WHERE name = 'Electronics' )); -- NOT IN: find products never orderedSELECT product_nameFROM productsWHERE product_id NOT IN ( SELECT DISTINCT product_id FROM order_items WHERE product_id IS NOT NULL -- Critical for NOT IN!); -- IN with expressionSELECT * FROM ordersWHERE YEAR(order_date) IN ( SELECT fiscal_year FROM active_fiscal_years);If a NOT IN subquery returns ANY NULL value, the entire NOT IN evaluates to UNKNOWN for all outer rows, returning zero results. This is because x NOT IN (1, 2, NULL) becomes x<>1 AND x<>2 AND x<>NULL, and anything compared to NULL yields UNKNOWN. Always filter NULLs in NOT IN subqueries or use NOT EXISTS instead.
1234567891011121314151617
-- DANGEROUS: NOT IN with potential NULLSELECT * FROM employeesWHERE department_id NOT IN (SELECT department_id FROM inactive_depts);-- Returns NO ROWS if inactive_depts has any NULL department_id! -- SAFE: Filter NULLs explicitlySELECT * FROM employeesWHERE department_id NOT IN ( SELECT department_id FROM inactive_depts WHERE department_id IS NOT NULL); -- SAFER: Use NOT EXISTS (handles NULL correctly)SELECT e.* FROM employees eWHERE NOT EXISTS ( SELECT 1 FROM inactive_depts i WHERE i.department_id = e.department_id);ANY (also written as SOME) and ALL are powerful operators for comparing a value against all values returned by a table subquery.
Semantics:
Where operator can be =, <>, <, >, <=, or >=.
| Expression | Equivalent Logic | Description |
|---|---|---|
x = ANY (set) | x IN set | x equals at least one element |
x <> ANY (set) | TRUE if set has element ≠ x | x differs from at least one element |
x > ANY (set) | x > MIN(set) | x greater than smallest element |
x < ANY (set) | x < MAX(set) | x less than largest element |
x = ALL (set) | x equals every element (rare) | All elements identical to x |
x <> ALL (set) | x NOT IN set | x differs from every element |
x > ALL (set) | x > MAX(set) | x greater than every element |
x < ALL (set) | x < MIN(set) | x less than every element |
1234567891011121314151617181920212223242526272829303132333435363738
-- ANY Examples: -- Salary greater than ANY engineering salary (> minimum)SELECT employee_name, salaryFROM employeesWHERE salary > ANY ( SELECT salary FROM employees WHERE department = 'Engineering');-- Equivalent: WHERE salary > (SELECT MIN(salary) FROM employees WHERE department = 'Engineering') -- Different from at least one manager's departmentSELECT * FROM employeesWHERE department_id <> ANY ( SELECT department_id FROM employees WHERE is_manager = TRUE); -- ALL Examples: -- Salary greater than ALL engineering salaries (> maximum)SELECT employee_name, salaryFROM employeesWHERE salary > ALL ( SELECT salary FROM employees WHERE department = 'Engineering');-- Equivalent: WHERE salary > (SELECT MAX(salary) FROM employees WHERE department = 'Engineering') -- Order total exceeds all previous orders from same customerSELECT o.*FROM orders oWHERE o.total_amount > ALL ( SELECT total_amount FROM orders WHERE customer_id = o.customer_id AND order_id < o.order_id);For ALL with an empty subquery, the result is TRUE (vacuously true—there are no elements to violate the condition). For ANY with an empty subquery, the result is FALSE (no element satisfies the condition). This can cause unexpected behavior if your subquery might return zero rows.
12345678910111213141516
-- ALL with empty subquery is TRUESELECT * FROM productsWHERE price > ALL (SELECT price FROM products WHERE 1 = 0);-- Returns ALL products! (empty subquery → vacuously true) -- ANY with empty subquery is FALSESELECT * FROM products WHERE price > ANY (SELECT price FROM products WHERE 1 = 0);-- Returns NO products (empty subquery → false) -- Guard against empty subquery if unexpected:SELECT * FROM productsWHERE price > ALL ( SELECT price FROM competing_products WHERE active = TRUE)AND EXISTS (SELECT 1 FROM competing_products WHERE active = TRUE);EXISTS tests whether a subquery returns any rows at all. It's often the most efficient way to check for related data existence.
Key Characteristics:
123456789101112131415161718192021222324252627282930313233
-- Basic EXISTS: find customers with ordersSELECT customer_name, emailFROM customers cWHERE EXISTS ( SELECT 1 -- Column selection doesn't matter FROM orders o WHERE o.customer_id = c.customer_id -- Correlated to outer query); -- NOT EXISTS: find customers without ordersSELECT customer_name, emailFROM customers cWHERE NOT EXISTS ( SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id); -- EXISTS with additional conditionsSELECT product_nameFROM products pWHERE EXISTS ( SELECT 1 FROM order_items oi JOIN orders o ON oi.order_id = o.order_id WHERE oi.product_id = p.product_id AND o.order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 30 DAY)); -- EXISTS for anti-join (orphan detection)SELECT d.*FROM departments dWHERE NOT EXISTS ( SELECT 1 FROM employees e WHERE e.department_id = d.department_id);In EXISTS subqueries, the SELECT list doesn't affect the result—only row existence matters. Using SELECT 1 (or SELECT NULL) is conventional to signal intent: 'I only care if rows exist, not their values.' Modern optimizers treat SELECT * and SELECT 1 identically in EXISTS context.
Table subqueries can be nested multiple levels deep, with each level building on the results of inner levels. While powerful, deep nesting can impact readability and sometimes performance.
Multi-Level Example:
1234567891011121314151617181920212223242526272829303132333435363738394041
-- Three-level nesting: employees in departments with above-average -- headcount where headcount average is computed from active departments SELECT e.employee_name, e.department_idFROM employees eWHERE e.department_id IN ( -- Level 2: departments with above-average headcount SELECT department_id FROM ( -- Level 3: compute headcount per department SELECT department_id, COUNT(*) AS headcount FROM employees GROUP BY department_id ) AS dept_counts WHERE headcount > ( -- Level 3b: compute average headcount SELECT AVG(headcount) FROM ( SELECT COUNT(*) AS headcount FROM employees GROUP BY department_id ) AS counts )); -- Equivalent with CTEs (often clearer):WITH dept_counts AS ( SELECT department_id, COUNT(*) AS headcount FROM employees GROUP BY department_id),avg_headcount AS ( SELECT AVG(headcount) AS avg_count FROM dept_counts)SELECT e.employee_name, e.department_idFROM employees eWHERE e.department_id IN ( SELECT department_id FROM dept_counts, avg_headcount WHERE headcount > avg_count);When nesting exceeds two levels, Common Table Expressions (CTEs, covered in Chapter 16) often provide better readability. CTEs allow naming intermediate results, making the query's logical flow clearer. Performance is typically equivalent, as optimizers inline CTEs in most cases.
Subquery Chaining Pattern:
Another approach is chaining derived tables in the FROM clause rather than nesting in WHERE:
123456789101112131415161718192021222324252627
-- Chained derived tables: step-by-step transformationSELECT final.*FROM ( -- Step 3: Add ranking within category SELECT augmented.*, ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY score DESC) AS rank FROM ( -- Step 2: Compute score SELECT base.*, (revenue * 0.6 + margin * 0.4) AS score FROM ( -- Step 1: Base metrics SELECT p.product_id, p.product_name, p.category_id, SUM(oi.quantity * oi.price) AS revenue, SUM(oi.quantity * (oi.price - p.cost)) AS margin FROM products p JOIN order_items oi ON p.product_id = oi.product_id GROUP BY p.product_id, p.product_name, p.category_id ) AS base ) AS augmented) AS finalWHERE rank <= 5; -- Top 5 per categoryStandard derived tables can't reference columns from preceding tables in the same FROM clause. LATERAL subqueries remove this restriction, enabling per-row derived table computation.
Standard Subquery vs. LATERAL:
-- Standard: derived table is independent
SELECT * FROM t1, (SELECT * FROM t2 WHERE t2.y = ... ) AS sub
-- Cannot reference t1 columns in subquery!
-- LATERAL: derived table can reference preceding tables
SELECT * FROM t1, LATERAL (SELECT * FROM t2 WHERE t2.x = t1.x) AS sub
-- Can reference t1.x inside LATERAL subquery
12345678910111213141516171819202122232425262728293031323334
-- LATERAL: Top N per group (PostgreSQL, MySQL 8.0.14+, SQL Server with CROSS/OUTER APPLY) -- PostgreSQL/MySQL syntax:SELECT d.department_name, top_earners.*FROM departments d,LATERAL ( SELECT e.employee_name, e.salary FROM employees e WHERE e.department_id = d.department_id -- References d! ORDER BY e.salary DESC LIMIT 3) AS top_earners; -- SQL Server equivalent using CROSS APPLY (effectively LATERAL):SELECT d.department_name, top_earners.*FROM departments dCROSS APPLY ( SELECT TOP 3 e.employee_name, e.salary FROM employees e WHERE e.department_id = d.department_id ORDER BY e.salary DESC) AS top_earners; -- OUTER APPLY: like LEFT JOIN, includes parent rows even if subquery returns nothingSELECT d.department_name, e.employee_name, e.salaryFROM departments dOUTER APPLY ( SELECT TOP 1 employee_name, salary FROM employees WHERE department_id = d.department_id ORDER BY hire_date ASC) AS e; -- First hire per department, including empty departments| Database | Syntax | Availability |
|---|---|---|
| PostgreSQL | LATERAL | Since 9.3 (2013) |
| MySQL | LATERAL | Since 8.0.14 (2019) |
| SQL Server | CROSS APPLY / OUTER APPLY | Since 2005 |
| Oracle | LATERAL or CROSS APPLY | Since 12c (2013) |
| SQLite | Not supported | N/A |
LATERAL excels at Top-N-per-group queries, unnesting arrays/JSON, and table-valued function invocations. Before LATERAL, these patterns required correlated subqueries (in SELECT) or complex window function workarounds.
Table subquery performance depends on how the optimizer handles them. Understanding optimizer behavior helps you write efficient queries.
Optimizer Strategies:
| Pattern | Optimization | Performance Impact |
|---|---|---|
| IN (subquery) | Semi-join or hash lookup | Usually efficient with indexes |
| NOT IN (subquery) | Anti-join | Good, but watch for NULL issues |
| EXISTS (correlated) | Semi-join or nested loop | Efficient with correlation index |
| Derived table | Inline or materialize | Depends on size and reuse |
| ANY/ALL | Converted to MIN/MAX | Very efficient after conversion |
1234567891011121314151617181920212223242526272829303132333435363738394041
-- Tip 1: Index the correlation columnsCREATE INDEX idx_orders_customer ON orders(customer_id); -- This makes EXISTS efficient:SELECT * FROM customers cWHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id); -- Tip 2: Limit derived table size when possibleSELECT d.*, stats.order_countFROM departments dJOIN ( SELECT department_id, COUNT(*) as order_count FROM orders WHERE order_date >= '2024-01-01' -- Filter early in derived table GROUP BY department_id) AS stats ON d.department_id = stats.department_id; -- Tip 3: Use EXISTS instead of COUNT for existence checks-- Less efficient:SELECT * FROM products pWHERE (SELECT COUNT(*) FROM order_items oi WHERE oi.product_id = p.product_id) > 0; -- More efficient:SELECT * FROM products pWHERE EXISTS (SELECT 1 FROM order_items oi WHERE oi.product_id = p.product_id); -- Tip 4: Avoid redundant subquery execution-- Inefficient (subquery repeated):SELECT *FROM employeesWHERE salary > (SELECT AVG(salary) FROM employees) AND department_id IN (SELECT department_id FROM employees WHERE salary > (SELECT AVG(salary) FROM employees)); -- Better: single subquery or CTEWITH avg_sal AS (SELECT AVG(salary) AS avg FROM employees)SELECT * FROM employees, avg_salWHERE salary > avg AND department_id IN (SELECT department_id FROM employees, avg_sal WHERE salary > avg);Always examine execution plans for complex table subqueries. Look for 'Materialize' (subquery computed once and reused) vs 'Dependent Subquery' (per-row evaluation). Watch for full table scans in IN subqueries—they often indicate missing indexes.
Table subqueries unlock SQL's full compositional power. Let's consolidate the essential concepts:
What's Next:
We've explored scalar, row, and table subqueries. But where you place a subquery affects its behavior and semantics. The next page provides a comprehensive guide to subquery placement—understanding how subqueries behave differently in SELECT, FROM, WHERE, HAVING, and other contexts, and choosing the optimal location for your logic.
You now command the full power of table subqueries—derived tables for complex transformations, set operators for membership testing, and existence checks for correlation. Combined with scalar and row subqueries, you have a complete toolkit for nested query composition.