Loading learning content...
SQL is declarative—you describe what you want, not how to get it. But beneath this abstraction lies a precise logical sequence that determines what's valid, what's accessible, and what behaves correctly.
Understanding this execution order is essential for mastering HAVING. It explains:
This page provides the complete picture of SQL's logical execution order, with special attention to HAVING's position and its consequences.
This page covers: the complete SQL logical execution order, what 'scope' each clause has access to, how data transforms at each stage, HAVING's specific position and implications, the difference between logical and physical execution, and practical examples demonstrating execution order effects.
SQL clauses execute in a specific logical sequence, regardless of how they're written in your query. This order determines what each clause can 'see' and reference.
The standard logical execution order:
| Step | Clause | Description | Data State After |
|---|---|---|---|
| 1 | FROM | Identify source table(s) | All rows from source table(s) |
| 2 | JOIN | Combine tables based on join conditions | Combined rows from all joined tables |
| 3 | WHERE | Filter individual rows | Subset of rows meeting conditions |
| 4 | GROUP BY | Organize rows into groups | Groups of rows, each group shares grouping key |
| 5 | Aggregates | Compute summary values per group | Each group has computed aggregate values |
| 6 | HAVING | Filter groups based on aggregate conditions | Subset of groups meeting conditions |
| 7 | SELECT | Choose/compute output columns and aliases | Result columns defined |
| 8 | DISTINCT | Remove duplicate result rows | Unique result rows |
| 9 | ORDER BY | Sort result set | Sorted result set |
| 10 | LIMIT/OFFSET | Restrict number of rows returned | Final result set |
Visualizing the transformation flow:
SQL syntax writes clauses as: SELECT → FROM → WHERE → GROUP BY → HAVING → ORDER BY. But execution order is: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY. This disconnect is a common source of confusion.
Each clause can only reference what exists at its execution stage. This 'scope' determines valid expressions:
Scope rules by clause:
| Clause | Can Access | Cannot Access |
|---|---|---|
| FROM/JOIN | Table names, ON conditions using table columns | Everything else (WHERE, GROUP BY haven't run) |
| WHERE | All columns from FROM/JOIN, subqueries | Aggregates, GROUP BY columns (not yet formed), SELECT aliases |
| GROUP BY | All columns from non-filtered rows, expressions | Aggregates (computed after), SELECT aliases |
| HAVING | Aggregates, GROUP BY columns, subqueries | Non-grouped non-aggregated columns, SELECT aliases* |
| SELECT | Aggregates, GROUP BY columns, expressions | Columns not in GROUP BY (unless aggregated) |
| ORDER BY | SELECT columns/aliases, GROUP BY columns, aggregates | Non-selected, non-grouped columns in most databases |
*Some databases (MySQL, SQL Server) allow SELECT aliases in HAVING as an extension.
Why these rules exist:
The rules aren't arbitrary—they follow from the data state at each stage:
12345678910111213141516171819202122232425262728
-- ❌ WHERE can't use aggregate (doesn't exist at step 3)SELECT department, AVG(salary)FROM employeesWHERE AVG(salary) > 50000 -- ERROR!GROUP BY department; -- ❌ HAVING can't use non-grouped column (ambiguous at step 6)SELECT department, MAX(salary)FROM employeesGROUP BY departmentHAVING employee_name = 'John'; -- ERROR! employee_name not in GROUP BY -- ❌ SELECT can't include non-grouped, non-aggregated columnSELECT department, employee_name, AVG(salary) -- ERROR! employee_nameFROM employeesGROUP BY department; -- ⚠️ HAVING using SELECT alias (non-standard, works in some databases)SELECT department, AVG(salary) AS avg_salFROM employeesGROUP BY departmentHAVING avg_sal > 50000; -- May work in MySQL, errors in PostgreSQL -- ✓ ORDER BY can use SELECT aliases (exception: runs after SELECT)SELECT department, AVG(salary) AS avg_salFROM employeesGROUP BY departmentORDER BY avg_sal DESC; -- Works because ORDER BY runs after SELECTHAVING occupies a critical position: after GROUP BY and aggregation, but before SELECT. This placement has specific implications:
What exists when HAVING runs:
Practical demonstration: Trace through a query
Let's trace execution step by step:
12345678
-- The QuerySELECT department, COUNT(*) AS emp_count, AVG(salary) AS avg_salaryFROM employeesWHERE hire_date > '2020-01-01'GROUP BY departmentHAVING COUNT(*) >= 5 AND AVG(salary) > 60000ORDER BY avg_salary DESCLIMIT 10;| Step | Clause | What Happens | Data Example |
|---|---|---|---|
| 1 | FROM | Load employees table | 1000 rows, all columns |
| 2 | WHERE | Keep hire_date > 2020-01-01 | 400 rows remain |
| 3 | GROUP BY | Group by department | 15 groups (e.g., Engineering: 50 rows, Sales: 30 rows...) |
| 4 | Aggregates | Compute COUNT(*), AVG(salary) per group | Each group now has: {dept, count, avg} |
| 5 | HAVING | Keep groups where COUNT(*)≥5 AND AVG>60000 | 8 groups remain |
| 6 | SELECT | Output department, emp_count, avg_salary columns | 8 rows with 3 columns |
| 7 | ORDER BY | Sort by avg_salary DESC | 8 sorted rows |
| 8 | LIMIT | Return top 10 (only 8 exist) | 8 final rows |
At HAVING's execution stage, the 400 rows that passed WHERE have been collapsed into 15 groups. HAVING doesn't see the original 400 rows—it sees 15 groups, each with computed aggregate values. It then reduces 15 to 8.
An important distinction: the logical execution order describes what the results must behave like. The physical execution order is what the database actually does internally—which may differ significantly.
Why they can differ:
Query optimizers reorder operations for efficiency while guaranteeing logically equivalent results. Common optimizations:
Example: HAVING condition pushed down
Some optimizers can push HAVING conditions on GROUP BY columns down to WHERE:
123456789101112131415161718
-- Original querySELECT region, SUM(sales)FROM ordersGROUP BY regionHAVING region = 'East';-- Logical execution: Group ALL regions, then filter -- Optimized internal execution (equivalent results):SELECT region, SUM(sales)FROM ordersWHERE region = 'East' -- Pushed down!GROUP BY region;-- Physical execution: Filter FIRST, then group only 'East' -- The results are identical, but optimized version:-- 1. Can use index on 'region'-- 2. Groups 1 region instead of many-- 3. Uses less memory for groupingWhile optimizers may fix inefficient queries, they don't always succeed. Write correct, efficient SQL from the start. Understanding logical execution order helps you place conditions correctly without hoping the optimizer will compensate.
Why logical order still matters:
Even though physical execution may differ:
Let's explore specific consequences of HAVING's position in execution order:
Effect 1: WHERE changes what HAVING sees
Because WHERE filters rows before grouping, it changes the aggregate values that HAVING evaluates.
12345678910111213141516171819
-- Query A: No WHERE filterSELECT department, AVG(salary) AS avg_salaryFROM employeesGROUP BY departmentHAVING AVG(salary) > 70000;-- Returns: departments where overall average > $70,000 -- Query B: WITH WHERE filterSELECT department, AVG(salary) AS avg_salaryFROM employeesWHERE experience_years > 5 -- Only count senior employees!GROUP BY departmentHAVING AVG(salary) > 70000;-- Returns: departments where senior employees' average > $70,000-- HAVING sees DIFFERENT aggregate values because WHERE filtered first -- These queries answer DIFFERENT questions!-- A: "Which departments have high average salary?"-- B: "Which departments have high average salary among seniors?"Effect 2: HAVING can't undo WHERE
Once WHERE filters out rows, they're gone. HAVING cannot bring them back.
12345678
-- This pattern is WRONG if you want all employees consideredSELECT department, AVG(salary) AS avg_salaryFROM employeesWHERE department = 'Engineering' -- Only Engineering rows remainGROUP BY departmentHAVING department IN ('Engineering', 'Sales', 'Marketing');-- HAVING condition on department is pointless here!-- Only 'Engineering' exists after WHERE. Sales/Marketing already filtered out.Effect 3: Aggregates are fixed when HAVING runs
HAVING cannot change aggregate values—only decide whether to keep or discard groups based on those values.
12345678910111213
-- HAVING filters groups but doesn't recalculate aggregatesSELECT department, COUNT(*) AS emp_count, AVG(salary) AS avg_salaryFROM employeesGROUP BY departmentHAVING COUNT(*) >= 5; -- The avg_salary shown for each surviving group is the SAME as it would be-- without HAVING. HAVING just removes groups; it doesn't change aggregates.-- Groups with COUNT < 5 are removed, but the remaining groups' -- COUNT and AVG values are unchanged.Think of it this way: By the time HAVING runs, every group has its aggregates fully computed. HAVING is simply a gatekeeper that checks these pre-computed values. It's like a bouncer checking IDs at the door—they don't change anyone's age, just decide who gets in.
One of the most frequently encountered execution order issues: can HAVING reference column aliases defined in SELECT?
Standard SQL answer: No. HAVING executes at step 6; SELECT at step 7. Aliases don't exist yet.
Practical reality: It depends on your database.
| Database | Alias in HAVING? | Notes |
|---|---|---|
| PostgreSQL | ❌ No | Strictly follows standard SQL |
| MySQL | ✅ Yes | Extension: allows SELECT aliases in HAVING |
| SQL Server | ❌ No | Standard behavior; use full expression |
| Oracle | ❌ No | Standard behavior |
| SQLite | ✅ Yes | Permissive; allows aliases |
| MariaDB | ✅ Yes | MySQL-compatible extension |
123456789101112131415161718192021
-- ⚠️ NON-PORTABLE: Works in MySQL/SQLite, fails in PostgreSQL/SQL ServerSELECT department, COUNT(*) AS emp_countFROM employeesGROUP BY departmentHAVING emp_count >= 5; -- Using SELECT alias -- ✓ PORTABLE: Repeat the aggregate expressionSELECT department, COUNT(*) AS emp_countFROM employeesGROUP BY departmentHAVING COUNT(*) >= 5; -- Using the expression directly -- ✓ PORTABLE: Use a CTE to create intermediate table with aliasesWITH dept_counts AS ( SELECT department, COUNT(*) AS emp_count FROM employees GROUP BY department)SELECT department, emp_countFROM dept_countsWHERE emp_count >= 5; -- Now aliased column exists!For maximum portability, repeat the aggregate expression in HAVING rather than relying on aliases. Yes, it's a bit verbose, but it works everywhere. For complex expressions, use CTEs to pre-compute values, then filter with simple WHERE.
Unlike HAVING, ORDER BY can reference SELECT aliases in most databases. This is because ORDER BY executes after SELECT.
Why this matters:
It creates an asymmetry that confuses developers. The same alias that fails in HAVING works fine in ORDER BY.
1234567
-- HAVING fails, ORDER BY succeeds (in standard-compliant databases)SELECT department, AVG(salary) AS avg_salaryFROM employeesGROUP BY department-- HAVING avg_salary > 50000; -- ❌ Would fail in PostgreSQLHAVING AVG(salary) > 50000 -- ✓ Works: repeat expressionORDER BY avg_salary DESC; -- ✓ Works: alias available after SELECTORDER BY can also access more:
In some databases, ORDER BY can reference columns from the source tables even if they're not in SELECT (though this is implementation-dependent):
12345678910111213
-- Some databases allow ordering by non-selected columns-- when not using aggregationSELECT employee_id, nameFROM employeesORDER BY hire_date; -- hire_date not in SELECT but may work -- With GROUP BY, only grouped/aggregated columns availableSELECT department, COUNT(*)FROM employeesGROUP BY departmentORDER BY department; -- ✓ GROUP BY column-- ORDER BY employee_name; -- ❌ Not grouped, not aggregated-- ORDER BY MAX(salary); -- ✓ Aggregate (computed during grouping)HAVING runs at step 6, before SELECT (step 7), so aliases don't exist yet. ORDER BY runs at step 9, after SELECT, so aliases have been created. This is why the same alias works in ORDER BY but not HAVING.
Here's a comprehensive reference for SQL execution order, summarizing what each clause can access and what it produces:
| Order | Clause | Input | Output | Can Reference |
|---|---|---|---|---|
| 1 | FROM | Table names | All rows from table(s) | Table columns, literals |
| 2 | JOIN | Two+ tables | Combined rows | Columns from all joined tables |
| 3 | WHERE | Joined rows | Filtered rows | All table columns, subqueries, NO aggregates |
| 4 | GROUP BY | Filtered rows | Groups of rows | All columns, expressions, NOT aliases |
| 5 | Aggregates | Groups | Groups with aggregates | Columns within each group |
| 6 | HAVING | Groups with aggregates | Filtered groups | GROUP BY cols, aggregates, subqueries, NOT aliases |
| 7 | SELECT | Filtered groups | Result columns | GROUP BY cols, aggregates, expressions |
| 8 | DISTINCT | Result rows | Unique rows | Result columns |
| 9 | ORDER BY | Unique rows | Sorted rows | SELECT cols/aliases, GROUP BY cols, aggregates |
| 10 | LIMIT | Sorted rows | Limited rows | Integer count, offset |
FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY. WHERE filters rows, HAVING filters groups. Aggregates exist between GROUP BY and HAVING. ORDER BY is special—it runs after SELECT so can use aliases.
Understanding execution order leads to better queries. Here are key practical implications:
Implication 1: Filter early when possible
Since WHERE runs before GROUP BY, filtering early reduces the data that needs to be grouped and aggregated:
12345678910
-- ✓ GOOD: WHERE filters before groupingSELECT department, AVG(salary)FROM employeesWHERE hire_date > '2020-01-01' -- 60% of rows eliminated earlyGROUP BY departmentHAVING AVG(salary) > 50000; -- ✗ BAD: All rows grouped, then filter (if it were possible)-- Hypothetically, if all data were grouped first, then filtered by date-- You'd aggregate more data than necessaryImplication 2: Use HAVING only for aggregates
If a condition can go in WHERE, put it there. HAVING should be reserved for conditions that require aggregates:
1234567891011
-- ✗ WORKS but INEFFICIENTSELECT department, COUNT(*)FROM employeesGROUP BY departmentHAVING department IN ('Engineering', 'Sales'); -- Could be WHERE! -- ✓ EFFICIENT: Same result, better performanceSELECT department, COUNT(*)FROM employeesWHERE department IN ('Engineering', 'Sales') -- Filters firstGROUP BY department;Implication 3: Repeat aggregate expressions in HAVING for portability
Don't rely on SELECT aliases in HAVING. Repeat the aggregate or use CTEs:
123456789101112131415161718
-- ✓ PORTABLE: Repeat aggregate in HAVINGSELECT department, SUM(CASE WHEN status='active' THEN salary ELSE 0 END) AS active_payrollFROM employeesGROUP BY departmentHAVING SUM(CASE WHEN status='active' THEN salary ELSE 0 END) > 100000; -- ✓ CLEANER: Use CTE for complex expressionsWITH dept_payroll AS ( SELECT department, SUM(CASE WHEN status='active' THEN salary ELSE 0 END) AS active_payroll FROM employees GROUP BY department)SELECT * FROM dept_payrollWHERE active_payroll > 100000;Implication 4: Think in stages when debugging
When queries don't behave as expected, trace through the execution stages:
This systematic approach usually reveals where expectations diverge from reality.
SQL's logical execution order underpins correct query writing. Understanding HAVING's position in this order is essential for masterful SQL. Let's consolidate:
Congratulations! You've completed the HAVING Clause module. You now understand: why HAVING exists (filtering groups), how it differs from WHERE, what aggregate conditions you can express, how to build complex multi-condition filters, and where HAVING fits in SQL's execution order. These skills enable sophisticated data analysis with SQL's aggregation and grouping capabilities.
Module recap:
With these foundations, you're ready to tackle the next modules on window functions—which provide even more powerful analytical capabilities building on your aggregation knowledge.