Loading content...
Consider this query challenge: "Find all employees who work in the same department AND have the same job title as employee John Smith."
With scalar subqueries, you'd need two separate conditions:
WHERE department_id = (SELECT department_id FROM employees WHERE name = 'John Smith')
AND job_title = (SELECT job_title FROM employees WHERE name = 'John Smith')
This works, but it's verbose and inefficient—you're querying John Smith's record twice. What if you could fetch both values in one subquery and compare them as a unit?
This is precisely what row subqueries enable. A row subquery returns one row with multiple columns, allowing you to compare tuples (ordered sets of values) directly. It's a powerful pattern for matching composite identities, replicating records, and expressing complex relationships concisely.
By the end of this page, you will understand row subqueries—how they differ from scalar subqueries, where they can be used, the syntax for row constructors and comparisons, database support variations, and practical patterns for multi-column matching.
A row subquery is a subquery that returns exactly one row containing multiple columns. The result is a composite value—a tuple—that can be compared against another tuple of the same structure.
Formal Definition:
A row subquery is a SELECT statement that, upon execution, yields a result containing at most one row with two or more columns. The columns form a tuple that can be used in row comparisons. If the query returns zero rows, the comparison evaluates to NULL/UNKNOWN. If it returns more than one row, a runtime error occurs.
Terminology Note:
In relational theory, a tuple is an ordered collection of attribute values—essentially a row. Row subqueries produce tuples, and row comparisons match tuples against tuples.
1234567891011121314151617181920
-- Row subquery: returns ONE row with MULTIPLE columns-- This subquery returns (department_id, job_title) for John Smith SELECT employee_name, department_id, job_titleFROM employeesWHERE (department_id, job_title) = ( SELECT department_id, job_title FROM employees WHERE employee_name = 'John Smith'); -- Breaking it down:-- 1. The subquery returns one row: (10, 'Engineer')-- 2. The outer WHERE compares each employee's (dept_id, job_title) tuple-- against the subquery's tuple-- 3. Only employees with BOTH values matching are returned -- This is equivalent to:-- WHERE department_id = 10 AND job_title = 'Engineer'-- But derived dynamically from John Smith's recordA row subquery retrieves multiple columns in one query execution, while multiple scalar subqueries each execute separately. Row subqueries are more efficient and semantically clearer when columns come from the same row. They guarantee the values are from a single, consistent record.
To compare a row subquery result, you need to express the outer query's values as a row constructor (also called a row value expression or tuple expression).
Syntax:
A row constructor groups multiple values into a tuple using parentheses:
(column1, column2, column3) -- Three-element tuple
For explicit clarity, some databases support the ROW keyword:
ROW(column1, column2, column3) -- Explicit row constructor
Note: The ROW keyword is optional in many databases (MySQL, PostgreSQL) but required in others (SQL Server doesn't support standard row comparison syntax at all).
123456789101112131415161718192021222324
-- Implicit row constructor (most common)SELECT * FROM employeesWHERE (department_id, job_title) = (SELECT department_id, job_title FROM employees WHERE id = 1); -- Explicit ROW keyword (PostgreSQL, MySQL)SELECT * FROM employeesWHERE ROW(department_id, job_title) = (SELECT department_id, job_title FROM employees WHERE id = 1); -- Row constructor with literalsSELECT * FROM employeesWHERE (department_id, job_title) = (10, 'Engineer'); -- Row constructor in SELECT (creating composite values - advanced usage)SELECT employee_name, (department_id, job_title) AS dept_job_tupleFROM employees; -- Note: output format varies by database -- Row constructor with expressionsSELECT * FROM productsWHERE (category_id, YEAR(created_date)) = ( SELECT category_id, YEAR(created_date) FROM products WHERE product_id = 100);| Database | Implicit Tuple | ROW Keyword | Row Comparison Support |
|---|---|---|---|
| MySQL | ✅ Supported | ✅ Optional | Full support (=, <>, <, >, <=, >=) |
| PostgreSQL | ✅ Supported | ✅ Optional | Full support with rich operators |
| Oracle | ✅ Supported | ❌ Not used | Limited (= comparison only) |
| SQL Server | ❌ Not supported | ❌ Not supported | No direct row comparison; use AND conditions |
| SQLite | ✅ Supported | ❌ Not used | Limited equality comparison |
Row comparisons extend scalar comparison operators to tuples. Understanding how these operators work with multiple values is essential for correct query semantics.
Equality (=) Comparison:
Two tuples are equal if and only if ALL corresponding elements are equal:
(a1, a2, a3) = (b1, b2, b3)
↔ a1 = b1 AND a2 = b2 AND a3 = b3
123456789101112131415
-- Row equality: all columns must matchSELECT * FROM employeesWHERE (department_id, job_title, location_id) = ( SELECT department_id, job_title, location_id FROM employees WHERE employee_id = 101); -- Equivalent expanded form:SELECT * FROM employeesWHERE department_id = (SELECT department_id FROM employees WHERE employee_id = 101) AND job_title = (SELECT job_title FROM employees WHERE employee_id = 101) AND location_id = (SELECT location_id FROM employees WHERE employee_id = 101); -- The row form is preferred: cleaner, executes one subquery instead of threeInequality (<>) Comparison:
Two tuples are unequal if ANY corresponding element differs:
(a1, a2) <> (b1, b2)
↔ a1 <> b1 OR a2 <> b2
12345678910
-- Row inequality: at least one column must differSELECT * FROM employees eWHERE (e.department_id, e.job_title) <> ( SELECT department_id, job_title FROM employees WHERE employee_id = 101); -- Returns all employees who differ in department OR job title OR both-- from employee 101Ordering Comparisons (<, >, <=, >=):
Row ordering uses lexicographic (dictionary) order—columns are compared left to right, with earlier columns taking precedence:
(a1, a2) < (b1, b2)
↔ (a1 < b1) OR (a1 = b1 AND a2 < b2)
This is like alphabetical sorting: 'AA' < 'AB' < 'B'. The first column is most significant.
1234567891011121314151617
-- Row ordering: lexicographic comparison-- (10, 'A') < (10, 'B') → TRUE (first elements equal, second compared)-- (10, 'Z') < (20, 'A') → TRUE (first element determines order) SELECT * FROM employeesWHERE (department_id, hire_date) < (10, '2020-01-01'); -- Returns:-- 1. All employees in departments < 10 (ANY hire date)-- 2. All employees in department 10 hired BEFORE 2020-01-01 -- Useful for composite ordering like pagination:SELECT * FROM ordersWHERE (order_date, order_id) > ('2024-01-01', 1000)ORDER BY order_date, order_idLIMIT 100;-- This efficiently fetches the next page after the given cursorLexicographic comparison requires all corresponding elements to be comparable. Mixing incompatible types (e.g., comparing (INT, VARCHAR) against (INT, DATE)) causes errors. Ensure tuple columns have matching or compatible types.
NULL values introduce complexity in row comparisons, following SQL's three-valued logic. The behavior can be counterintuitive.
Equality with NULL:
If any corresponding element involves NULL, the comparison may yield UNKNOWN rather than TRUE or FALSE:
(10, NULL) = (10, 'A') → UNKNOWN (NULL = 'A' is unknown)
(10, NULL) = (10, NULL) → UNKNOWN (NULL = NULL is unknown)
(NULL, 'A') = (10, 'A') → UNKNOWN (NULL = 10 is unknown)
123456789101112131415161718192021222324
-- Create test dataCREATE TABLE test_rows ( id INT, col1 INT, col2 VARCHAR(10)); INSERT INTO test_rows VALUES (1, 10, 'A');INSERT INTO test_rows VALUES (2, 10, NULL);INSERT INTO test_rows VALUES (3, NULL, 'A');INSERT INTO test_rows VALUES (4, NULL, NULL); -- Query: Find rows matching (10, 'A')SELECT * FROM test_rows WHERE (col1, col2) = (10, 'A');-- Returns: only row 1-- Rows 2, 3, 4 involve NULL → comparison is UNKNOWN → not returned -- Query: Find rows NOT matching (10, 'A') SELECT * FROM test_rows WHERE (col1, col2) <> (10, 'A');-- Returns: ONLY rows where we're CERTAIN they differ-- Row 2: (10, NULL) <> (10, 'A') → 10=10 but NULL<>'A' is unknown → UNKNOWN-- Row 3: (NULL, 'A') <> (10, 'A') → NULL<>10 is unknown → UNKNOWN-- Row 4: (NULL, NULL) <> (10, 'A') → UNKNOWN-- No rows returned! (surprising but correct per SQL semantics)IS DISTINCT FROM (NULL-safe comparison):
Some databases offer NULL-safe comparison operators that treat NULLs as regular values:
123456789101112131415161718192021
-- PostgreSQL: IS DISTINCT FROM (NULL-safe inequality)SELECT * FROM test_rowsWHERE (col1, col2) IS DISTINCT FROM (10, 'A');-- Returns rows 2, 3, 4 (treats NULL as a regular value) -- PostgreSQL: IS NOT DISTINCT FROM (NULL-safe equality)SELECT * FROM test_rowsWHERE (col1, col2) IS NOT DISTINCT FROM (10, NULL);-- Returns row 2 (NULL matches NULL) -- MySQL: NULL-safe equality operator <=>SELECT * FROM test_rowsWHERE (col1, col2) <=> (10, NULL);-- Note: MySQL's <=> only works on scalars, not row comparisons directly -- Workaround for databases without tuple NULL-safe operators:SELECT * FROM test_rowsWHERE (col1 = 10 OR (col1 IS NULL AND 10 IS NULL)) AND (col2 = 'A' OR (col2 IS NULL AND 'A' IS NULL));When any tuple element can be NULL, row comparisons may silently exclude rows you expect to match or include. Either filter out NULLs before comparison, use NULL-safe operators where available, or expand to explicit AND/OR conditions with IS NULL checks.
Like scalar subqueries, row subqueries must return at most one row. If multiple rows are returned, a runtime error occurs.
Techniques for Guaranteeing Single Row:
1. Filter by Primary/Unique Key:
123456789101112131415
-- Guaranteed single row: primary key filterSELECT * FROM employeesWHERE (department_id, job_title) = ( SELECT department_id, job_title FROM employees WHERE employee_id = 101 -- PK: exactly 0 or 1 row); -- Guaranteed single row: unique constraint filterSELECT * FROM products WHERE (category_id, brand_id) = ( SELECT category_id, brand_id FROM products WHERE sku = 'PROD-001' -- SKU is unique);2. Aggregate Functions to Collapse:
1234567891011121314151617
-- Use aggregates to produce one row-- Find employees matching the department with highest budgetSELECT * FROM employeesWHERE (department_id, location_id) = ( SELECT department_id, location_id FROM departments WHERE budget = (SELECT MAX(budget) FROM departments) LIMIT 1 -- In case of ties, pick one); -- Using MIN/MAX for deterministic selectionSELECT * FROM employeesWHERE (department_id, job_title) = ( SELECT MIN(department_id), MIN(job_title) -- Aggregate → one row FROM employees WHERE hire_date = (SELECT MIN(hire_date) FROM employees));3. LIMIT 1 with ORDER BY:
123456789101112131415161718
-- Explicit LIMIT 1 with deterministic orderingSELECT * FROM employeesWHERE (department_id, job_title) = ( SELECT department_id, job_title FROM employees WHERE salary > 100000 ORDER BY hire_date ASC -- Earliest high earner LIMIT 1); -- Top performer's attributesSELECT * FROM employeesWHERE (department_id, job_title, office_id) = ( SELECT department_id, job_title, office_id FROM employees ORDER BY performance_score DESC FETCH FIRST 1 ROW ONLY);If your row subquery returns multiple rows, you'll see errors like 'Subquery returns more than 1 row' (MySQL) or 'more than one row returned by a subquery used as an expression' (PostgreSQL). Before deployment, test your subquery standalone to verify it never exceeds one row for any possible data state.
Row subqueries excel in scenarios involving composite identity matching, record copying, and multi-attribute relationships.
Use Case 1: Finding Related Records by Composite Key
123456789101112
-- Find all orders with the same (customer_id, shipping_address) as order #12345SELECT order_id, order_date, total_amountFROM ordersWHERE (customer_id, shipping_address_id) = ( SELECT customer_id, shipping_address_id FROM orders WHERE order_id = 12345)AND order_id <> 12345; -- Exclude the reference order itself -- This finds orders shipped to the same customer/address combination-- Useful for detecting patterns or grouping related ordersUse Case 2: Matching Configuration Tuples
1234567891011121314151617
-- Find products with identical configuration to a reference productSELECT product_name, skuFROM productsWHERE (category_id, brand_id, size_id, color_id) = ( SELECT category_id, brand_id, size_id, color_id FROM products WHERE sku = 'REF-PRODUCT-001'); -- Find users with matching (role, department, access_level)SELECT user_name, emailFROM usersWHERE (role_id, department_id, access_level) = ( SELECT role_id, department_id, access_level FROM users WHERE user_id = (SELECT manager_id FROM users WHERE user_id = 456));Use Case 3: Pagination with Composite Cursor
12345678910111213141516171819202122
-- Cursor-based pagination using row comparison-- Fetch next page of results after a known (timestamp, id) position SELECT event_id, event_type, created_at, user_idFROM eventsWHERE (created_at, event_id) > ('2024-01-15 14:30:00', 50000)ORDER BY created_at, event_idLIMIT 50; -- Row comparison handles the edge case where multiple events -- have the same timestamp by using event_id as tiebreaker-- More efficient than OFFSET for deep pagination -- Previous page (reverse direction)SELECT * FROM ( SELECT event_id, event_type, created_at, user_id FROM events WHERE (created_at, event_id) < ('2024-01-15 14:30:00', 50000) ORDER BY created_at DESC, event_id DESC LIMIT 50) AS prev_pageORDER BY created_at, event_id;Use Case 4: Record Cloning/Comparison
123456789101112131415161718
-- Find exact duplicate records (all columns match)SELECT a.*FROM products aWHERE EXISTS ( SELECT 1 FROM products b WHERE b.product_id <> a.product_id AND (b.name, b.category_id, b.brand_id, b.price) = (a.name, a.category_id, a.brand_id, a.price)); -- Audit: find changes from previous versionSELECT current.*FROM employee_history currentJOIN employee_history previous ON current.employee_id = previous.employee_id AND current.version = previous.version + 1WHERE (current.salary, current.department_id, current.job_title) <> (previous.salary, previous.department_id, previous.job_title);While a row subquery used with = must return exactly one row, IN allows matching against a set of rows. This combines the power of row comparison with set membership testing.
Syntax:
WHERE (col1, col2) IN (SELECT colA, colB FROM table WHERE ...)
This returns TRUE if the tuple (col1, col2) matches ANY row in the subquery result.
123456789101112131415161718192021222324252627
-- Find employees in any of the high-performance department/role combinationsSELECT employee_name, salaryFROM employeesWHERE (department_id, job_title) IN ( SELECT department_id, job_title FROM high_performer_profiles WHERE avg_rating > 4.5); -- Find products that match any (category, brand) combination of bestsellersSELECT product_name, priceFROM productsWHERE (category_id, brand_id) IN ( SELECT category_id, brand_id FROM products WHERE units_sold > 10000); -- NOT IN with row tuples: find orphan combinationsSELECT department_id, job_titleFROM salary_gradesWHERE (department_id, job_title) NOT IN ( SELECT DISTINCT department_id, job_title FROM employees WHERE department_id IS NOT NULL AND job_title IS NOT NULL -- Crucial for NOT IN!);The NULL trap with NOT IN also applies to row comparisons. If any element in any subquery tuple is NULL, NOT IN may return no rows. Either filter NULLs in both the tuple and subquery, or use NOT EXISTS which handles NULLs correctly.
123456789101112131415
-- NOT IN with potential NULLs: DANGEROUSSELECT * FROM ordersWHERE (customer_id, product_id) NOT IN ( SELECT customer_id, product_id FROM returns -- If any customer_id or product_id is NULL, this fails silently); -- NOT EXISTS equivalent: NULL-SAFESELECT * FROM orders oWHERE NOT EXISTS ( SELECT 1 FROM returns r WHERE r.customer_id = o.customer_id AND r.product_id = o.product_id);-- Returns correct results even with NULL valuesRow subquery support varies significantly across database systems. Understanding these differences is crucial for writing portable SQL.
MySQL/MariaDB:
123456789101112131415
-- MySQL: Full row comparison support-- All comparison operators work with tuples SELECT * FROM employeesWHERE (department_id, salary) > (10, 50000); -- Lexicographic -- MySQL-specific: ROW() syntax optional but supportedSELECT * FROM employees WHERE ROW(department_id, job_title) = ROW(10, 'Engineer'); -- MySQL optimizes row IN wellSELECT * FROM ordersWHERE (customer_id, product_id) IN ( SELECT customer_id, product_id FROM wishlist);PostgreSQL:
12345678910111213141516
-- PostgreSQL: Excellent row/composite type support-- Most flexible implementation SELECT * FROM employeesWHERE (department_id, job_title) = (10, 'Engineer'); -- Sophisticated NULL handlingSELECT * FROM employeesWHERE (department_id, job_title) IS DISTINCT FROM (10, 'Engineer'); -- Can compare ROW values directlySELECT * FROM t1WHERE ROW(a, b, c) = ROW(1, 2, 3); -- PostgreSQL allows row expressions in more contextsSELECT (department_id, hire_date) FROM employees; -- Returns compositeSQL Server:
SQL Server does not support standard row comparison syntax. You must use expanded AND/OR conditions:
12345678910111213141516171819
-- SQL Server: NO direct row comparison support-- This DOES NOT WORK in SQL Server:-- SELECT * FROM emp WHERE (dept_id, job) = (SELECT dept_id, job FROM emp WHERE id = 1) -- Must expand to individual comparisons:SELECT e.*FROM employees eCROSS APPLY ( SELECT department_id AS ref_dept, job_title AS ref_job FROM employees WHERE employee_id = 1) AS refWHERE e.department_id = ref.ref_dept AND e.job_title = ref.ref_job; -- Or use multiple scalar subqueries:SELECT * FROM employeesWHERE department_id = (SELECT department_id FROM employees WHERE employee_id = 1) AND job_title = (SELECT job_title FROM employees WHERE employee_id = 1);| Feature | MySQL | PostgreSQL | Oracle | SQL Server |
|---|---|---|---|---|
| Tuple equality (=) | ✅ | ✅ | ✅ | ❌ |
| Tuple inequality (<>) | ✅ | ✅ | ❌ | ❌ |
| Tuple ordering (<, >) | ✅ | ✅ | ❌ | ❌ |
| IS DISTINCT FROM | ❌ | ✅ | ❌ | ❌ |
| Tuple IN subquery | ✅ | ✅ | ✅ | ❌ |
| ROW keyword | Optional | Optional | Not used | N/A |
Row subqueries can be efficient or problematic depending on how they're used and optimized.
Advantages of Row Subqueries:
Optimization Behavior:
1234567891011121314151617181920212223242526
-- Row IN subqueries: often optimized to semi-joins-- PostgreSQL execution plan typically shows:-- Hash Semi Join-- Hash Cond: ((employees.department_id = subquery.department_id) -- AND (employees.job_title = subquery.job_title)) EXPLAIN ANALYZESELECT * FROM employeesWHERE (department_id, job_title) IN ( SELECT department_id, job_title FROM target_profiles); -- Composite index can help both sides:CREATE INDEX idx_emp_dept_job ON employees(department_id, job_title);CREATE INDEX idx_profile_dept_job ON target_profiles(department_id, job_title); -- Row comparison for pagination: efficient with proper indexing-- This query can use the composite index for seekingCREATE INDEX idx_events_ts_id ON events(created_at, event_id); SELECT * FROM eventsWHERE (created_at, event_id) > ('2024-01-15 14:30:00', 50000)ORDER BY created_at, event_idLIMIT 50;When using row comparisons for filtering or ordering, create composite indexes matching the column order in your tuple. A (col1, col2) index efficiently supports WHERE (col1, col2) > (val1, val2) through B-tree range scanning.
| Pattern | Performance | Optimization Tips |
|---|---|---|
| Row = (non-correlated subquery) | Excellent | Subquery cached; index the lookup columns |
| Row IN (small result set) | Good | Semi-join optimization; index both sides |
| Row IN (large result set) | Variable | May scan; consider JOIN or hash strategy |
| Row comparison for pagination | Excellent | Composite index; avoids OFFSET overhead |
| Row comparison with functions | Poor | Index not usable; materialize values if possible |
Row subqueries extend SQL's expressive power to multi-column comparisons. Let's consolidate the key concepts:
What's Next:
Scalar subqueries return one value; row subqueries return one row. But many problems require working with sets of rows—multiple rows that feed into set operations or derived table processing. The next page explores table subqueries, which return full result sets for use in FROM clauses, set operators, and multi-row comparisons.
You now understand row subqueries and tuple comparisons—a powerful tool for matching composite identities, implementing efficient pagination, and expressing multi-column conditions concisely. This knowledge prepares you for the full power of table subqueries.