Loading learning content...
SQL provides two filtering mechanisms: WHERE and HAVING. On the surface, they seem similar—both eliminate data from query results. But this surface similarity masks a fundamental architectural difference that every database professional must understand deeply.
Confusing WHERE and HAVING leads to three categories of problems:
This page eliminates any confusion by examining both clauses from every angle: timing, operands, performance implications, and canonical use cases.
By the end of this page, you will have crystal-clear mental models for WHERE and HAVING. You'll know exactly when to use each, understand performance implications of placement, and be able to craft complex queries that leverage both clauses optimally.
The fundamental difference between WHERE and HAVING is what they filter:
This isn't a minor implementation detail—it reflects fundamentally different operations on different data structures. Let's visualize this:
Key observations from this flow:
Between WHERE and HAVING, a fundamental transformation occurs: the data changes from many rows to few groups. After GROUP BY, each 'row' in your conceptual data stream represents an entire group's summary. HAVING filters at this coarser granularity.
Understanding SQL's logical execution order is essential for mastering WHERE vs HAVING. SQL statements are not processed in the order you write them. Instead, they follow a strict logical sequence:
| Order | Clause | Purpose | Filter Type Available |
|---|---|---|---|
| 1 | FROM (+ JOINs) | Identify source tables and combine them | — |
| 2 | WHERE | Filter individual rows from source | Row-level conditions |
| 3 | GROUP BY | Organize remaining rows into groups | — |
| 4 | Aggregate Functions | Compute summaries per group | — |
| 5 | HAVING | Filter groups based on aggregates | Aggregate conditions |
| 6 | SELECT | Choose and compute output columns | — |
| 7 | DISTINCT | Remove duplicate result rows | — |
| 8 | ORDER BY | Sort final result set | — |
| 9 | LIMIT/OFFSET | Restrict number of rows returned | — |
Critical implications of this order:
This is the logical execution order—how results must behave conceptually. The actual physical execution may differ (query optimizers reorder operations for efficiency). But results must be equivalent to this logical order. Understanding the logical model tells you what's valid SQL.
The timing difference translates directly into what expressions are valid in each clause:
WHERE status = 'active'WHERE UPPER(name) = 'JOHN'WHERE price * quantity > 100WHERE id IN (SELECT ...)WHERE name LIKE 'A%'Aggregate functions (COUNT, SUM, AVG, MIN, MAX, etc.) — They don't exist yet. Column aliases from SELECT — SELECT hasn't run yet.
HAVING COUNT(*) > 5HAVING SUM(x) > 1000HAVING AVG(price) > MIN(price) * 2HAVING SUM(x) > (SELECT AVG(y) FROM ...)Non-grouped, non-aggregated columns — Ambiguous (multiple values per group). SELECT aliases in standard SQL — SELECT hasn't run yet.
The grey area in HAVING:
Interestingly, HAVING can technically filter on GROUP BY columns without aggregates:
SELECT department, COUNT(*)
FROM employees
GROUP BY department
HAVING department = 'Engineering'; -- Valid but unusual
This works because department has a single value per group. However, this is almost always better written with WHERE:
SELECT department, COUNT(*)
FROM employees
WHERE department = 'Engineering' -- Better: filters earlier
GROUP BY department;
The WHERE version is more efficient—it eliminates non-Engineering rows before grouping.
Let's consolidate the key differences in a comprehensive comparison table:
| Aspect | WHERE | HAVING |
|---|---|---|
| Filters what? | Individual rows | Groups (after aggregation) |
| Execution phase | Step 2 (before GROUP BY) | Step 5 (after aggregates) |
| Can use aggregates? | ❌ No — don't exist yet | ✅ Yes — primary purpose |
| Can use raw columns? | ✅ Yes — all columns available | ⚠️ Only if in GROUP BY |
| Affects aggregates? | ✅ Yes — excluded rows don't contribute | ❌ No — aggregates already computed |
| Works without GROUP BY? | ✅ Yes — standard row filtering | ⚠️ Yes, but treats table as one group |
| Performance impact | Filters early = less data to process | Filters late = all groups computed first |
| Use case | Row-level conditions on source data | Aggregate thresholds on grouped data |
Ask yourself: 'Does my condition involve an aggregate function?' If yes → HAVING. If no → WHERE. This simple rule handles 99% of cases correctly.
A critical but often overlooked point: WHERE changes what gets aggregated. Because WHERE filters rows before grouping, excluded rows don't contribute to any group's aggregate.
Demonstration:
Consider this sales table:
| sale_id | product | amount | year |
|---|---|---|---|
| 1 | Widget | 100 | 2023 |
| 2 | Widget | 150 | 2023 |
| 3 | Widget | 200 | 2024 |
| 4 | Gadget | 300 | 2023 |
| 5 | Gadget | 250 | 2024 |
1234
-- Aggregate all sales by productSELECT product, SUM(amount) AS total_salesFROM salesGROUP BY product;Result:
| product | total_sales |
|---|---|
| Widget | 450 |
| Gadget | 550 |
All 5 rows contribute to their respective groups.
Key insight:
The WHERE clause didn't just remove rows from output—it changed the aggregate values themselves. Widget's total dropped from 450 to 250 because the 2024 sale (200) was excluded before the SUM() was computed.
This is fundamentally different from HAVING, which would filter the group after SUM() was calculated.
Filtering with WHERE year = 2023 gives '2023 sales by product.' A HAVING on year (if possible) would give 'all sales for products active in 2023'—a completely different meaning. The placement of your filter fundamentally changes query semantics.
The placement of filters has significant performance implications. The general principle: filter as early as possible.
Why WHERE is often more efficient:
Anti-pattern to avoid:
1234567891011
-- ❌ INEFFICIENT: Using HAVING for row-level filterSELECT region, SUM(sales) AS total_salesFROM sales_dataGROUP BY regionHAVING region = 'North'; -- This could be in WHERE! -- ✓ EFFICIENT: Using WHERE for row-level filterSELECT region, SUM(sales) AS total_salesFROM sales_dataWHERE region = 'North' -- Filters early, reduces grouping workGROUP BY region;Both queries produce identical results, but the WHERE version:
region to avoid scanning the entire tableFor a table with millions of rows and hundreds of regions, this difference can be orders of magnitude.
Modern query optimizers sometimes recognize that a HAVING condition on a GROUP BY column can be pushed down to WHERE. But this optimization isn't guaranteed. Writing correct, efficient SQL from the start is better than hoping the optimizer fixes your mistakes.
The most powerful analytical queries use both WHERE and HAVING, each serving its proper role. WHERE handles row-level business rules; HAVING enforces aggregate thresholds.
Example: Customer Segmentation Analysis
Find high-value customers in the last year who made at least 5 purchases with average order value over $200:
1234567891011121314151617181920
SELECT customer_id, COUNT(*) AS order_count, SUM(order_total) AS total_spent, AVG(order_total) AS avg_order_valueFROM orders-- WHERE: Row-level filters (applied before grouping)WHERE order_date >= CURRENT_DATE - INTERVAL '1 year' -- Only recent orders AND status = 'completed' -- Only successful orders AND store_region = 'East Coast' -- Specific region-- GROUP BY: Form customer groupsGROUP BY customer_id-- HAVING: Aggregate thresholds (applied after grouping)HAVING COUNT(*) >= 5 -- At least 5 orders AND AVG(order_total) > 200 -- Average order > $200-- ORDER BY: Sort for presentationORDER BY total_spent DESCLIMIT 100;Execution walkthrough:
orders tablecustomer_idMoving any WHERE condition to HAVING would change the query's meaning or efficiency. The date, status, and region filters define which orders count toward aggregates. The HAVING conditions define which customers qualify based on their aggregate behavior. Both are essential; neither can substitute for the other.
Let's examine the most common errors when using WHERE and HAVING, with clear explanations of why they occur and how to fix them:
1234567
-- ❌ ERROR: Invalid use of aggregate function in WHERESELECT department, AVG(salary)FROM employeesWHERE AVG(salary) > 50000 -- Cannot use aggregate here!GROUP BY department; -- ERROR: "aggregate functions are not allowed in WHERE"Why it fails: WHERE executes before GROUP BY. No groups exist yet, so AVG(salary) can't be computed.
Fix: Move the condition to HAVING:
12345
-- ✓ CORRECT: Use HAVING for aggregate conditionsSELECT department, AVG(salary) AS avg_salaryFROM employeesGROUP BY departmentHAVING AVG(salary) > 50000;The WHERE and HAVING clauses serve distinct, complementary purposes in SQL. Mastering their differences is essential for writing correct, efficient queries.
What's next:
With a solid understanding of HAVING vs WHERE, we'll explore the types of conditions you can express in HAVING—from simple comparisons to complex multi-aggregate expressions that enable sophisticated data analysis.
You now have a clear mental model distinguishing WHERE from HAVING. You understand their execution timing, valid references, performance implications, and how they cooperate in complex queries. The next page explores the rich variety of aggregate conditions possible with HAVING.