In technical interviews, the ability to write complex SQL queries separates candidates who merely understand database concepts from those who can apply them under pressure. Complex queries aren't just about knowing syntax—they're about understanding how to translate business requirements into efficient, accurate data retrieval.
This page establishes the foundational techniques for constructing sophisticated queries that demonstrate true SQL mastery. We'll explore query architecture patterns, advanced filtering, set operations, and the mental models that enable you to tackle any query challenge with confidence.
By the end of this page, you will understand the architecture of complex queries, master advanced filtering techniques including CASE expressions and conditional logic, apply set operations (UNION, INTERSECT, EXCEPT) strategically, and develop a systematic approach to query construction that enables you to solve novel problems during interviews.
Every complex SQL query follows a logical architecture. Understanding this architecture enables you to decompose intimidating problems into manageable components and construct queries systematically rather than through trial and error.
The SQL Query Processing Order:
While we write queries in a specific syntactic order, the database processes them in a fundamentally different sequence. Understanding this distinction is crucial for writing correct complex queries:
| Processing Order | Clause | Purpose | Writing Order |
|---|---|---|---|
| 1 | FROM / JOIN | Determine data sources and combine tables | 2 |
| 2 | WHERE | Filter rows before grouping | 3 |
| 3 | GROUP BY | Aggregate rows into groups | 4 |
| 4 | HAVING | Filter groups after aggregation | 5 |
| 5 | SELECT | Determine output columns and expressions | 1 |
| 6 | DISTINCT | Remove duplicate rows | 1 (modifier) |
| 7 | ORDER BY | Sort the final result set | 6 |
| 8 | LIMIT / OFFSET | Restrict rows returned | 7 |
A common interview mistake is referencing a SELECT alias in the WHERE clause. This fails because WHERE is processed before SELECT. Understanding processing order prevents subtle errors that can derail your interview query.
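The rule above can be checked directly. The sketch below, using SQLite through Python's `sqlite3` module with an invented `orders` table, filters on a computed value the correct way: by repeating the expression in WHERE rather than referencing the SELECT alias, since WHERE runs before the alias exists.

```python
import sqlite3

# In-memory demo: WHERE runs before SELECT, so the filter must repeat
# the expression rather than reference the alias defined in SELECT.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, tax_rate REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 100.0, 0.1), (2, 500.0, 0.2), (3, 50.0, 0.1)],
)

# Correct: repeat the expression in WHERE (it cannot see the alias `gross`)
rows = conn.execute(
    """
    SELECT id, amount * (1 + tax_rate) AS gross
    FROM orders
    WHERE amount * (1 + tax_rate) > 200   -- NOT: WHERE gross > 200
    ORDER BY id
    """
).fetchall()
conn.close()
```

If the repetition bothers you, a subquery or CTE that computes `gross` first lets the outer WHERE use the name legally.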
The Mental Model for Complex Queries:
Think of query construction as building a pipeline: rows flow out of FROM/JOIN, through the WHERE filter, into GROUP BY buckets, past the HAVING filter, and finally through SELECT, ORDER BY, and LIMIT to produce the result.
This mental model transforms complex problems into step-by-step solutions.
The SELECT clause offers far more power than simple column retrieval. Mastering advanced SELECT techniques enables you to transform, compute, and categorize data directly within your queries, often eliminating the need for application-level post-processing.
CASE Expressions: Conditional Logic in SQL
CASE expressions bring if-then-else logic into SQL, enabling data categorization, conditional aggregation, and dynamic value computation. They are indispensable for complex queries:
```sql
-- Pattern 1: Simple CASE (equality matching)
SELECT
    employee_name,
    department_id,
    CASE department_id
        WHEN 1 THEN 'Engineering'
        WHEN 2 THEN 'Marketing'
        WHEN 3 THEN 'Sales'
        ELSE 'Other'
    END AS department_name
FROM employees;

-- Pattern 2: Searched CASE (complex conditions)
SELECT
    product_name,
    unit_price,
    CASE
        WHEN unit_price < 10 THEN 'Budget'
        WHEN unit_price BETWEEN 10 AND 50 THEN 'Standard'
        WHEN unit_price BETWEEN 50 AND 100 THEN 'Premium'
        ELSE 'Luxury'
    END AS price_tier,
    CASE
        WHEN quantity_in_stock = 0 THEN 'Out of Stock'
        WHEN quantity_in_stock < 10 THEN 'Low Stock'
        WHEN quantity_in_stock < 50 THEN 'Normal'
        ELSE 'Well Stocked'
    END AS stock_status
FROM products;

-- Pattern 3: CASE in aggregation (conditional counting)
SELECT
    department_id,
    COUNT(*) AS total_employees,
    COUNT(CASE WHEN salary > 100000 THEN 1 END) AS high_earners,
    COUNT(CASE WHEN salary <= 100000 THEN 1 END) AS standard_earners,
    SUM(CASE WHEN is_active = true THEN 1 ELSE 0 END) AS active_count,
    AVG(CASE WHEN years_experience > 5 THEN salary END) AS avg_senior_salary
FROM employees
GROUP BY department_id;

-- Pattern 4: CASE for pivoting data
SELECT
    year,
    SUM(CASE WHEN quarter = 1 THEN revenue ELSE 0 END) AS q1_revenue,
    SUM(CASE WHEN quarter = 2 THEN revenue ELSE 0 END) AS q2_revenue,
    SUM(CASE WHEN quarter = 3 THEN revenue ELSE 0 END) AS q3_revenue,
    SUM(CASE WHEN quarter = 4 THEN revenue ELSE 0 END) AS q4_revenue
FROM quarterly_financials
GROUP BY year
ORDER BY year;
```

In conditional aggregations like COUNT(CASE WHEN condition THEN 1 END), CASE returns NULL when the condition is false. Since COUNT ignores NULLs, only matching rows are counted. For SUM-based conditional counts, explicitly return 0 in the ELSE clause to maintain correct totals.
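This NULL-counting behavior is easy to verify. A minimal runnable check, using SQLite via Python's `sqlite3` with an invented `employees` table, shows COUNT and SUM producing the same conditional count:

```python
import sqlite3

# Verify that COUNT ignores the NULLs a CASE expression returns
# when its condition is false (table and data are invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("A", 120000), ("B", 90000), ("C", 150000), ("D", 80000)],
)

row = conn.execute(
    """
    SELECT
        COUNT(*)                                         AS total,
        COUNT(CASE WHEN salary > 100000 THEN 1 END)      AS high_earners,
        SUM(CASE WHEN salary > 100000 THEN 1 ELSE 0 END) AS high_earners_sum
    FROM employees
    """
).fetchone()
conn.close()
```

Both conditional forms count 2 of the 4 rows; only the explicit `ELSE 0` keeps SUM from returning NULL when no row matches.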
COALESCE and NULLIF: Null Handling Mastery
Proper null handling is critical in complex queries. COALESCE returns the first non-null value, while NULLIF returns null when values match:
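Before the SQL patterns, here is a minimal runnable check of the NULLIF divide-by-zero guard, using SQLite via Python's `sqlite3` with an invented `product_sales` table:

```python
import sqlite3

# NULLIF turns 0 into NULL, and division by NULL yields NULL instead
# of an error; COALESCE then supplies a display default.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product_sales (name TEXT, revenue REAL, units INTEGER)")
conn.executemany(
    "INSERT INTO product_sales VALUES (?, ?, ?)",
    [("widget", 100.0, 4), ("gadget", 0.0, 0)],
)

rows = conn.execute(
    """
    SELECT
        name,
        revenue / NULLIF(units, 0)                AS revenue_per_unit,
        COALESCE(revenue / NULLIF(units, 0), 0.0) AS safe_revenue_per_unit
    FROM product_sales
    ORDER BY name
    """
).fetchall()
conn.close()
```

The zero-unit row comes back as NULL (Python `None`) rather than raising a division error, and COALESCE maps it to 0.0.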
```sql
-- COALESCE: First non-null value
SELECT
    customer_name,
    COALESCE(preferred_email, work_email, personal_email) AS contact_email,
    COALESCE(nickname, first_name) AS display_name,
    COALESCE(discount_rate, 0) AS effective_discount
FROM customers;

-- NULLIF: Convert specific values to NULL
-- Commonly used to prevent division by zero
SELECT
    product_name,
    total_revenue,
    total_units_sold,
    total_revenue / NULLIF(total_units_sold, 0) AS revenue_per_unit
FROM product_sales;

-- Combined usage for robust calculations
SELECT
    region,
    SUM(sales_amount) AS total_sales,
    SUM(returns_amount) AS total_returns,
    ROUND(
        100.0 * SUM(returns_amount) / NULLIF(COALESCE(SUM(sales_amount), 0), 0),
        2
    ) AS return_percentage
FROM regional_sales
GROUP BY region;
```

Set operations combine the results of two or more queries based on mathematical set theory. They are powerful tools for comparing datasets, merging results from different tables, and solving problems that would otherwise require complex joins or subqueries.
Understanding Set Operations:
| Operation | Description | Duplicates | Use Case |
|---|---|---|---|
| UNION | Combines results, removes duplicates | Removed | Merging similar data from multiple sources |
| UNION ALL | Combines results, keeps duplicates | Kept | Merging when duplicates are meaningful |
| INTERSECT | Returns only rows in both results | Removed | Finding common elements between sets |
| EXCEPT / MINUS | Returns rows in first but not second | Removed | Finding differences between sets |
All set operations require that each query produce the same number of columns with compatible data types. Column names are taken from the first query. Sort operations (ORDER BY) apply to the final combined result and must appear only at the end.
```sql
-- UNION: Merge customer contacts from different regions
-- (removes rows that are identical across all columns; here the region
--  literal differs, so the same customer can appear in both halves)
SELECT customer_id, email, 'North' AS region
FROM customers_north
UNION
SELECT customer_id, email, 'South' AS region
FROM customers_south;

-- UNION ALL: Complete sales log with all transactions
-- (preserves duplicates for accurate totals)
SELECT order_id, product_id, quantity, 'Online' AS channel
FROM online_orders
UNION ALL
SELECT order_id, product_id, quantity, 'Retail' AS channel
FROM retail_orders;

-- INTERSECT: Find customers who appear in both tables
SELECT customer_id, email
FROM newsletter_subscribers
INTERSECT
SELECT customer_id, email
FROM active_purchasers;

-- EXCEPT: Find subscribers who haven't made a purchase
SELECT customer_id, email
FROM newsletter_subscribers
EXCEPT
SELECT customer_id, email
FROM active_purchasers;

-- Complex example: Customer segmentation using set operations
-- Find VIP customers (high purchases) who are also influencers (referrals)
SELECT customer_id, 'VIP Influencer' AS segment
FROM (
    SELECT customer_id FROM customers WHERE total_purchases > 10000
    INTERSECT
    SELECT referrer_id AS customer_id
    FROM referrals
    GROUP BY referrer_id
    HAVING COUNT(*) >= 5
) AS vip_influencers

UNION

-- Find VIPs who are not influencers
SELECT customer_id, 'VIP Customer' AS segment
FROM (
    SELECT customer_id FROM customers WHERE total_purchases > 10000
    EXCEPT
    SELECT referrer_id AS customer_id
    FROM referrals
    GROUP BY referrer_id
    HAVING COUNT(*) >= 5
) AS vip_only

ORDER BY segment, customer_id;
```

UNION (without ALL) and INTERSECT require duplicate elimination, which involves sorting or hashing the entire result set. For large datasets, UNION ALL is significantly faster when duplicates don't matter or cannot exist due to data constraints.
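The duplicate-handling differences among the four operators can be seen on a tiny dataset. A runnable sketch using SQLite via Python's `sqlite3` (tables and IDs invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE newsletter_subscribers (customer_id INTEGER);
    CREATE TABLE active_purchasers (customer_id INTEGER);
    INSERT INTO newsletter_subscribers VALUES (1), (2), (3);
    INSERT INTO active_purchasers VALUES (2), (3), (4);
""")

def ids(query):
    """Run a single-column query and return the values as a sorted list."""
    return sorted(r[0] for r in conn.execute(query).fetchall())

sub = "SELECT customer_id FROM newsletter_subscribers"
buy = "SELECT customer_id FROM active_purchasers"

union_rows      = ids(f"{sub} UNION {buy}")      # dedupes the overlap
union_all_rows  = ids(f"{sub} UNION ALL {buy}")  # keeps 2 and 3 twice
in_both         = ids(f"{sub} INTERSECT {buy}")  # common elements only
subscribed_only = ids(f"{sub} EXCEPT {buy}")     # first set minus second
conn.close()
```

Note that UNION collapses the overlapping IDs while UNION ALL returns all six rows; this is exactly the duplicate-elimination cost the performance note above describes.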
Beyond simple WHERE clauses, complex queries require sophisticated filtering patterns. These techniques—combining multiple conditions, using subqueries for dynamic filtering, and leveraging pattern matching—form the backbone of interview-level SQL.
Compound Conditions with AND, OR, NOT:
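Before the full patterns, a two-line check that AND binds more tightly than OR, evaluated as bare boolean expressions in SQLite via Python's `sqlite3`:

```python
import sqlite3

# AND binds tighter than OR, so a filter written without parentheses
# can silently change meaning.
conn = sqlite3.connect(":memory:")

# Without parentheses: parsed as 1 OR (0 AND 0) -> 1
no_parens = conn.execute("SELECT 1 OR 0 AND 0").fetchone()[0]

# With parentheses: (1 OR 0) AND 0 -> 0
with_parens = conn.execute("SELECT (1 OR 0) AND 0").fetchone()[0]
conn.close()
```

The same expression yields opposite answers, which is why the WHERE clauses below parenthesize every OR group before ANDing on additional filters.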
```sql
-- Complex compound conditions
-- Key insight: Use parentheses to control evaluation order.
-- AND binds tighter than OR, so the OR group must be wrapped in
-- parentheses or the exclusion filters apply only to the second branch.
SELECT *
FROM orders
WHERE
    -- High-value orders OR urgent orders from VIPs
    (
        (total_amount > 1000 AND customer_tier = 'Standard')
        OR (priority = 'Urgent' AND customer_tier = 'VIP')
    )
    -- But exclude cancelled and test orders
    AND status NOT IN ('Cancelled', 'Test')
    AND order_date >= CURRENT_DATE - INTERVAL '30 days';

-- Using NOT with IN, EXISTS, BETWEEN
SELECT employee_id, name, department
FROM employees
WHERE department NOT IN ('Testing', 'Temporary')
    AND hire_date NOT BETWEEN '2020-01-01' AND '2020-12-31'
    AND manager_id IS NOT NULL;
```

Pattern Matching with LIKE and Regular Expressions:
String pattern matching enables flexible text filtering. Understanding both LIKE patterns and regex (database-specific) unlocks powerful search capabilities:
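The LIKE wildcard and ESCAPE behavior is easy to confirm on sample data. A runnable sketch using SQLite via Python's `sqlite3` (the `documents` rows are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (content TEXT)")
conn.executemany(
    "INSERT INTO documents VALUES (?)",
    [("save 50% today",), ("save 500 today",), ("no discount",)],
)

# Unescaped: '%50%' means "anything, 50, anything" -> matches '50%' AND '500'
loose = conn.execute(
    "SELECT content FROM documents WHERE content LIKE '%50%' ORDER BY content"
).fetchall()

# Escaped: '\%' matches a literal percent sign only
strict = conn.execute(
    r"SELECT content FROM documents WHERE content LIKE '%50\%%' ESCAPE '\' "
    "ORDER BY content"
).fetchall()
conn.close()
```

Only the escaped pattern distinguishes a literal `50%` from any number starting with 50.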
```sql
-- LIKE patterns: % matches any sequence, _ matches a single character
SELECT *
FROM products
WHERE
    -- Starts with 'Pro'
    product_name LIKE 'Pro%'
    -- Contains 'Enterprise' anywhere
    OR product_name LIKE '%Enterprise%'
    -- Exactly 3 characters, then a hyphen, then anything
    OR product_code LIKE '___-%';

-- Case-insensitive matching (ILIKE in PostgreSQL)
SELECT * FROM customers
WHERE email ILIKE '%@gmail.com';

-- Escape special characters in LIKE
SELECT * FROM documents
WHERE content LIKE '%50\%%' ESCAPE '\'; -- Find '50%' literally

-- PostgreSQL regex matching
SELECT * FROM products
WHERE product_code ~ '^[A-Z]{3}-[0-9]{4}$'; -- Pattern: ABC-1234

-- SIMILAR TO (SQL standard regex-like)
SELECT * FROM emails
WHERE address SIMILAR TO '[a-z]+@[a-z]+\.(com|org|net)';
```

The IN Operator with Subqueries:
Dynamic filtering using subqueries within IN clauses is a foundational technique for complex queries:
```sql
-- Basic IN with subquery
SELECT *
FROM products
WHERE category_id IN (
    SELECT category_id
    FROM categories
    WHERE is_active = true
);

-- NOT IN with subquery (watch for the NULL gotcha!)
-- If the subquery returns any NULL, NOT IN returns an empty set
SELECT *
FROM customers
WHERE customer_id NOT IN (
    SELECT customer_id
    FROM orders
    WHERE customer_id IS NOT NULL -- Prevent NULL in results
        AND order_date >= CURRENT_DATE - INTERVAL '1 year'
);

-- Multiple columns with IN (row constructors)
SELECT *
FROM inventory
WHERE (product_id, warehouse_id) IN (
    SELECT product_id, warehouse_id
    FROM reorder_requests
    WHERE status = 'Pending'
);

-- ANY/SOME and ALL for comparisons
SELECT *
FROM products
WHERE price > ALL (
    SELECT price FROM products WHERE category = 'Basic'
);

SELECT *
FROM employees
WHERE salary >= SOME (
    SELECT min_salary FROM salary_bands WHERE level = 'Senior'
);
```

If a subquery in NOT IN returns any NULL value, the entire NOT IN condition evaluates to UNKNOWN, returning no rows. Always filter NULLs from NOT IN subqueries, or prefer NOT EXISTS which handles NULLs correctly.
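This gotcha is worth seeing fail. A runnable demonstration using SQLite via Python's `sqlite3` (tables invented), contrasting NOT IN against the NULL-safe NOT EXISTS:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER);
    CREATE TABLE orders (customer_id INTEGER);
    INSERT INTO customers VALUES (1), (2), (3);
    INSERT INTO orders VALUES (1), (NULL);  -- the NULL poisons NOT IN
""")

# NULL in the subquery makes every NOT IN comparison UNKNOWN -> zero rows
broken = conn.execute("""
    SELECT customer_id FROM customers
    WHERE customer_id NOT IN (SELECT customer_id FROM orders)
""").fetchall()

# NOT EXISTS is NULL-safe and returns the customers with no matching order
correct = conn.execute("""
    SELECT customer_id FROM customers c
    WHERE NOT EXISTS (
        SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id
    )
    ORDER BY customer_id
""").fetchall()
conn.close()
```

Customers 2 and 3 have no orders, yet the NOT IN version returns nothing at all; NOT EXISTS finds them.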
Aggregation transforms detailed row data into summary information. Mastering aggregation—including GROUP BY, HAVING, and aggregate function nuances—is essential for business intelligence queries that frequently appear in interviews.
Core Aggregate Functions:
| Function | Purpose | NULL Handling | Common Gotcha |
|---|---|---|---|
| COUNT(*) | Count all rows | Counts rows with NULLs | None |
| COUNT(column) | Count non-NULL values | Ignores NULLs | Returns 0 for all-NULL |
| COUNT(DISTINCT col) | Count unique non-NULL values | Ignores NULLs | Performance on large sets |
| SUM(column) | Sum all values | Ignores NULLs | Returns NULL if all NULL |
| AVG(column) | Average of values | Ignores NULLs | May return more decimals than expected |
| MIN(column) | Minimum value | Ignores NULLs | Works with strings (alphabetical) |
| MAX(column) | Maximum value | Ignores NULLs | Works with strings (alphabetical) |
```sql
-- Multi-level aggregation with GROUP BY
SELECT
    region,
    product_category,
    COUNT(*) AS order_count,
    COUNT(DISTINCT customer_id) AS unique_customers,
    SUM(order_total) AS total_revenue,
    AVG(order_total) AS avg_order_value,
    MIN(order_date) AS first_order,
    MAX(order_date) AS last_order
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
GROUP BY region, product_category
ORDER BY region, total_revenue DESC;

-- HAVING: Filter after aggregation
SELECT
    customer_id,
    COUNT(*) AS order_count,
    SUM(order_total) AS lifetime_value
FROM orders
WHERE order_status = 'Completed'
GROUP BY customer_id
HAVING COUNT(*) >= 5            -- At least 5 orders
    AND SUM(order_total) > 1000 -- Spending over $1000
ORDER BY lifetime_value DESC;

-- Combining WHERE and HAVING correctly
-- WHERE filters rows BEFORE grouping
-- HAVING filters groups AFTER aggregation
SELECT
    department,
    AVG(salary) AS avg_salary,
    COUNT(*) AS employee_count
FROM employees
WHERE hire_date >= '2020-01-01' -- Only recent hires (row filter)
GROUP BY department
HAVING COUNT(*) >= 3            -- Only depts with 3+ employees (group filter)
ORDER BY avg_salary DESC;

-- GROUPING SETS for multiple aggregation levels
SELECT
    COALESCE(region, 'All Regions') AS region,
    COALESCE(product_category, 'All Categories') AS category,
    SUM(revenue) AS total_revenue
FROM sales
GROUP BY GROUPING SETS (
    (region, product_category), -- Detail level
    (region),                   -- Region subtotal
    (product_category),         -- Category subtotal
    ()                          -- Grand total
)
ORDER BY region NULLS FIRST, category NULLS FIRST;

-- ROLLUP for hierarchical aggregation
SELECT
    year,
    quarter,
    month,
    SUM(sales) AS total_sales
FROM monthly_sales
GROUP BY ROLLUP(year, quarter, month)
ORDER BY year, quarter, month;

-- CUBE for all possible combinations
SELECT
    region,
    product_type,
    SUM(units_sold) AS total_units
FROM sales
GROUP BY CUBE(region, product_type);
```

A very common interview question asks for 'top N items per group' (e.g., top 3 products per category). This cannot be solved with simple GROUP BY alone; it requires window functions with ROW_NUMBER() or correlated subqueries. We'll cover this pattern in the Window Functions page.
In interviews, you won't have time to experiment randomly. A systematic approach to query construction helps you arrive at correct solutions efficiently and communicate your thought process clearly.
The SPIDER Framework for Query Construction:
- Scope: identify the output columns and result shape the question asks for
- Pinpoint: locate the tables that contain that data
- Identify: work out the join conditions linking those tables
- Determine: list the row-level and group-level filters
- Express: write the aggregations and computed expressions
- Refine: finish with HAVING, ORDER BY, and LIMIT
Worked Example: SPIDER in Action
Interview Question: "Find the top 5 customers by total spending in 2023, showing their name, total orders, and average order value. Only include customers with at least 3 completed orders."
```sql
/* SPIDER Analysis:
   S - Scope: customer name, total orders, avg order value, top 5 by spending
   P - Pinpoint: customers table, orders table
   I - Identify: customers.id = orders.customer_id
   D - Determine: year 2023, status = 'Completed', at least 3 orders
   E - Express: GROUP BY customer, COUNT, AVG, SUM for filtering/sorting
   R - Refine: HAVING for minimum orders, LIMIT for top 5
*/

SELECT
    c.customer_name,
    COUNT(o.order_id) AS total_orders,
    ROUND(AVG(o.order_total), 2) AS avg_order_value,
    SUM(o.order_total) AS total_spending -- For sorting
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_status = 'Completed'
    AND EXTRACT(YEAR FROM o.order_date) = 2023
GROUP BY c.customer_id, c.customer_name
HAVING COUNT(o.order_id) >= 3
ORDER BY total_spending DESC
LIMIT 5;
```

In interviews, verbalize your SPIDER analysis before writing SQL. This demonstrates structured thinking, helps catch misunderstandings early, and shows the interviewer your problem-solving approach—which matters as much as the final answer.
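The worked example can be exercised end to end. The sketch below ports it to SQLite via Python's `sqlite3`, swapping `EXTRACT(YEAR FROM ...)` for SQLite's `strftime('%Y', ...)`; the sample customers and orders are invented to trip each filter:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, customer_name TEXT);
    CREATE TABLE orders (
        order_id INTEGER, customer_id INTEGER,
        order_total REAL, order_status TEXT, order_date TEXT
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES
        (1, 1, 100, 'Completed', '2023-01-10'),
        (2, 1, 200, 'Completed', '2023-03-05'),
        (3, 1, 300, 'Completed', '2023-07-20'),
        (4, 2, 900, 'Completed', '2023-02-14'), -- only 1 order: cut by HAVING
        (5, 1,  50, 'Cancelled', '2023-08-01'); -- wrong status: cut by WHERE
""")

rows = conn.execute("""
    SELECT c.customer_name,
           COUNT(o.order_id)            AS total_orders,
           ROUND(AVG(o.order_total), 2) AS avg_order_value,
           SUM(o.order_total)           AS total_spending
    FROM customers c
    JOIN orders o ON c.customer_id = o.customer_id
    WHERE o.order_status = 'Completed'
      AND strftime('%Y', o.order_date) = '2023'
    GROUP BY c.customer_id, c.customer_name
    HAVING COUNT(o.order_id) >= 3
    ORDER BY total_spending DESC
    LIMIT 5
""").fetchall()
conn.close()
```

Only Ada survives: the cancelled order is removed by WHERE before grouping, and Grace's single (large) order is removed by HAVING after it.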
Certain query patterns appear repeatedly in interviews. Recognizing these patterns enables rapid solution formulation. Here are the foundational patterns that form the basis of most complex queries:
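One way to keep the two "second highest" idioms straight is to run them against the same tiny table. A runnable cross-check using SQLite via Python's `sqlite3` (salaries invented, with a deliberate tie at the top):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?)", [(90,), (120,), (120,), (100,)]
)

# Idiom 1: the MAX strictly below the overall MAX
second_a = conn.execute("""
    SELECT MAX(salary) FROM employees
    WHERE salary < (SELECT MAX(salary) FROM employees)
""").fetchone()[0]

# Idiom 2: DISTINCT + ORDER BY + OFFSET (skip the single highest value)
second_b = conn.execute("""
    SELECT DISTINCT salary FROM employees
    ORDER BY salary DESC
    LIMIT 1 OFFSET 1
""").fetchone()[0]
conn.close()
```

Both idioms agree even with the duplicated top salary, because one filters below MAX and the other deduplicates before skipping.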
```sql
-- Simple top N (overall)
SELECT * FROM products
ORDER BY sales_count DESC
LIMIT 10;

-- Top N with ties (PostgreSQL)
SELECT * FROM products
ORDER BY sales_count DESC
FETCH FIRST 10 ROWS WITH TIES;

-- Second highest (common interview question)
SELECT MAX(salary) AS second_highest
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);

-- Nth highest using LIMIT OFFSET
SELECT DISTINCT salary
FROM employees
ORDER BY salary DESC
LIMIT 1 OFFSET 4; -- 5th highest (OFFSET skips the top 4 distinct values)
```

We've established the foundational techniques for constructing complex SQL queries: the building blocks on which all advanced query patterns rest.
Key Takeaways:
- SQL is processed in FROM, WHERE, GROUP BY, HAVING, SELECT, ORDER BY, LIMIT order, which is why a SELECT alias cannot be referenced in WHERE.
- CASE expressions enable categorization, conditional aggregation, and pivoting directly in SQL; COUNT ignores the NULLs CASE produces for non-matching rows.
- COALESCE and NULLIF make null handling and division-by-zero guards explicit.
- UNION ALL outperforms UNION when duplicates are impossible or acceptable; NOT IN fails silently on NULLs, so filter them or prefer NOT EXISTS.
- WHERE filters rows before grouping; HAVING filters groups after aggregation.
- The SPIDER framework turns vague requirements into a systematic construction process.
What's Next:
With these foundations in place, we'll explore Multi-Table Joins in the next page—mastering the various join types, understanding their performance characteristics, and learning patterns for combining data from complex relational schemas.
You now have a solid foundation in complex SQL query construction. These techniques—CASE expressions, set operations, advanced filtering, and systematic construction—form the vocabulary for all subsequent SQL mastery.