Having Clause - Learning Module

Aggregate Conditions

The Language of Aggregate Constraints

The HAVING clause unlocks its true power through aggregate conditions—expressions that evaluate summarized data to determine which groups survive. These conditions can be surprisingly sophisticated, combining multiple aggregates, using comparison operators, and expressing complex business logic.

On this page, we systematically explore what's possible in HAVING conditions. You'll learn how to express conditions on counts, sums, averages, minimums, maximums, and combinations thereof. You'll see how to compare aggregate values against constants, other aggregates, subqueries, and computed expressions.

By the end, you'll be able to express virtually any aggregate constraint your business logic requires.

What You Will Learn

This page comprehensively covers aggregate conditions: the standard aggregate functions available, comparison operators, combining conditions with logical operators, comparing aggregates to each other, using subqueries in HAVING, and expressing complex business rules as aggregate constraints.

Standard Aggregate Functions in HAVING

SQL provides a core set of aggregate functions that work identically in HAVING as they do in SELECT. Let's examine each with HAVING-specific examples:

Standard Aggregate Functions
Function	Description	NULL Handling	Common HAVING Use
`COUNT(*)`	Counts all rows in each group	Counts every row, including rows that contain NULLs	Minimum group size thresholds
`COUNT(column)`	Counts non-NULL values in column	Ignores NULL values	Data completeness requirements
`COUNT(DISTINCT column)`	Counts unique non-NULL values	Ignores NULL values	Variety/diversity thresholds
`SUM(column)`	Totals numeric values	Ignores NULL values	Revenue/quantity thresholds
`AVG(column)`	Calculates mean of values	Ignores NULL values	Average value constraints
`MIN(column)`	Finds smallest value	Ignores NULL values	Lower bound constraints
`MAX(column)`	Finds largest value	Ignores NULL values	Upper bound constraints

Examples of each in HAVING:

aggregate_examples.sql
-- COUNT(*): Groups with at least 10 members
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department
HAVING COUNT(*) >= 10;
 
-- COUNT(column): Groups where at least 5 employees have phone numbers
SELECT department, COUNT(phone) AS with_phone
FROM employees
GROUP BY department
HAVING COUNT(phone) >= 5;
 
-- COUNT(DISTINCT): Departments with at least 3 unique job titles
SELECT department, COUNT(DISTINCT job_title) AS unique_roles
FROM employees
GROUP BY department
HAVING COUNT(DISTINCT job_title) >= 3;
 
-- SUM: Product categories with total sales over $100,000
SELECT category, SUM(sales_amount) AS total_sales
FROM products
GROUP BY category
HAVING SUM(sales_amount) > 100000;
 
-- AVG: Departments with average salary above $75,000
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 75000;
 
-- MIN: Suppliers where the cheapest product is still over $50
SELECT supplier, MIN(unit_price) AS min_price
FROM products
GROUP BY supplier
HAVING MIN(unit_price) > 50;
 
-- MAX: Categories where the most expensive item is under $1000
SELECT category, MAX(unit_price) AS max_price
FROM products
GROUP BY category
HAVING MAX(unit_price) < 1000;

COUNT(*) vs COUNT(column)

Use COUNT(*) when you want to count rows regardless of NULLs—'how many records are in this group?' Use COUNT(column) when you specifically need to count non-NULL values—'how many records have this data?' The distinction matters for data quality analysis.
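
The difference is easiest to see on a small data set. The following is a minimal sketch that runs the SQL through Python's built-in sqlite3 module; the table contents are hypothetical sample data, not part of the original examples.

```python
import sqlite3

# Hypothetical sample data: three departments, some rows missing a phone number.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ana", "Sales", "555-0101"),
        ("Ben", "Sales", None),
        ("Cy", "Sales", "555-0103"),
        ("Dee", "IT", "555-0201"),
        ("Eli", "IT", "555-0202"),
        ("Fay", "Legal", None),
    ],
)

# COUNT(*) counts every row; COUNT(phone) skips rows where phone is NULL.
rows = conn.execute(
    """
    SELECT department, COUNT(*) AS total_rows, COUNT(phone) AS with_phone
    FROM employees
    GROUP BY department
    HAVING COUNT(phone) < COUNT(*)   -- keep only groups with missing phones
    ORDER BY department
    """
).fetchall()

for department, total_rows, with_phone in rows:
    print(department, total_rows, with_phone)
```

IT is filtered out because both counts agree there; Legal and Sales remain because at least one phone value is NULL.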

Comparison Operators in HAVING

HAVING conditions use the same comparison operators as WHERE, but applied to aggregate values. Let's review the complete set:

Comparison Operators
Operator	Meaning	Example
`=`	Equal to	`HAVING COUNT(*) = 5`
`<>` or `!=`	Not equal to	`HAVING AVG(score) <> 0`
`<`	Less than	`HAVING SUM(quantity) < 100`
`<=`	Less than or equal	`HAVING MAX(price) <= 50`
`>`	Greater than	`HAVING COUNT(*) > 10`
`>=`	Greater than or equal	`HAVING MIN(rating) >= 4`
`BETWEEN`	Within inclusive range	`HAVING AVG(age) BETWEEN 25 AND 35`
`IN`	Matches any in list	`HAVING COUNT(*) IN (5, 10, 15)`
`IS NULL`	Aggregate is NULL	`HAVING SUM(value) IS NULL`
`IS NOT NULL`	Aggregate is not NULL	`HAVING AVG(score) IS NOT NULL`

Important notes on aggregate NULL handling:

• SUM/AVG of all NULLs returns NULL: if a group has only NULL values in the aggregated column, SUM() and AVG() return NULL, not 0.
• COUNT(*) is never NULL: it returns 0 for an empty input and at least 1 for any group produced by GROUP BY, because a group exists only if it contains rows.
• COUNT(column) may return 0: this happens when every value in the column is NULL for that group.
• MIN/MAX of all NULLs returns NULL: no minimum or maximum exists for an empty set of values.

null_aggregate_examples.sql
-- Find groups where no commission data exists (all NULL)
SELECT department, SUM(commission) AS total_commission
FROM employees
GROUP BY department
HAVING SUM(commission) IS NULL;
 
-- Find groups where at least some commission data exists
SELECT department, SUM(commission) AS total_commission
FROM employees
GROUP BY department
HAVING SUM(commission) IS NOT NULL;
 
-- Using COALESCE to treat NULL aggregates as zero
SELECT department, COALESCE(SUM(commission), 0) AS total_commission
FROM employees
GROUP BY department
HAVING COALESCE(SUM(commission), 0) > 10000;
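
To make the NULL rules above concrete, here is a small sketch using Python's sqlite3 module with hypothetical data: one department has a single commission value, the other has none. It also shows that a plain comparison against a NULL aggregate drops the group, while COALESCE keeps it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (department TEXT, commission REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Sales", 500.0), ("Sales", None), ("IT", None), ("IT", None)],
)

# Per-group view of how each aggregate treats NULL.
summary = conn.execute(
    """
    SELECT department,
           COUNT(*),
           COUNT(commission),
           SUM(commission),
           COALESCE(SUM(commission), 0)
    FROM employees
    GROUP BY department
    ORDER BY department
    """
).fetchall()


def departments(having_clause):
    """Return departments passing a HAVING condition (fixed demo strings only)."""
    sql = (
        "SELECT department FROM employees GROUP BY department HAVING "
        + having_clause
        + " ORDER BY department"
    )
    return [row[0] for row in conn.execute(sql)]


all_null = departments("SUM(commission) IS NULL")
# NULL < 1000 is not true, so IT silently disappears here ...
below_plain = departments("SUM(commission) < 1000")
# ... unless the NULL is converted to 0 first.
below_coalesced = departments("COALESCE(SUM(commission), 0) < 1000")
```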

Combining Conditions with Logical Operators

Complex HAVING conditions combine multiple aggregate constraints using logical operators: AND, OR, and NOT. These work exactly as in WHERE, but each operand is typically an aggregate expression.

Operator precedence: NOT has highest precedence, then AND, then OR. Use parentheses to clarify intent.

AND requires both conditions to be true. Groups must satisfy all criteria.

having_and.sql
-- Departments with 5+ employees AND average salary > $60,000
SELECT 
    department,
    COUNT(*) AS emp_count,
    AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING COUNT(*) >= 5 
   AND AVG(salary) > 60000;
 
-- Products with high volume AND high revenue
SELECT 
    product_id,
    SUM(quantity) AS total_units,
    SUM(quantity * unit_price) AS total_revenue
FROM order_items
GROUP BY product_id
HAVING SUM(quantity) > 1000 
   AND SUM(quantity * unit_price) > 50000;
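
OR and NOT follow the same pattern as AND, and the precedence rule matters as soon as they are mixed. The sketch below, run through Python's sqlite3 module on hypothetical salary data, compares an unparenthesized condition with its parenthesized counterpart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (department TEXT, salary REAL)")
sample = (
    [("A", 50000.0)] * 6                  # large team, modest pay
    + [("B", 95000.0), ("B", 85000.0)]    # small team, high pay
    + [("C", 40000.0)] * 2                # small team, low pay
)
conn.executemany("INSERT INTO employees VALUES (?, ?)", sample)


def departments(having_clause):
    """Return departments passing a HAVING condition (fixed demo strings only)."""
    sql = (
        "SELECT department FROM employees GROUP BY department HAVING "
        + having_clause
        + " ORDER BY department"
    )
    return [row[0] for row in conn.execute(sql)]


# OR: a group passes if either condition holds.
either = departments("COUNT(*) >= 5 OR AVG(salary) > 80000")

# AND binds tighter than OR, so this reads as: big OR (well paid AND min >= 60000).
implicit = departments(
    "COUNT(*) >= 5 OR AVG(salary) > 80000 AND MIN(salary) >= 60000"
)

# Parentheses change the meaning: (big OR well paid) AND min >= 60000.
explicit = departments(
    "(COUNT(*) >= 5 OR AVG(salary) > 80000) AND MIN(salary) >= 60000"
)

# NOT inverts the condition: groups with fewer than 5 rows.
negated = departments("NOT (COUNT(*) >= 5)")
```

Department A passes the unparenthesized version on size alone, but fails once the minimum-salary test applies to the whole OR expression.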

Comparing Aggregates to Each Other

HAVING isn't limited to comparing aggregates against constants. You can compare aggregates to other aggregates, enabling sophisticated analytical conditions.

Pattern: Aggregate compared to expression of aggregates

aggregate_comparison.sql
-- 1. Spread analysis: groups where max is at least 2x the min
SELECT 
    category,
    MIN(unit_price) AS min_price,
    MAX(unit_price) AS max_price
FROM products
GROUP BY category
HAVING MAX(unit_price) >= 2 * MIN(unit_price);
 
-- 2. Data quality: groups where count of filled values < count of all rows
--    (indicates groups with missing data)
SELECT 
    department,
    COUNT(*) AS total_employees,
    COUNT(phone) AS with_phone,
    COUNT(*) - COUNT(phone) AS missing_phone
FROM employees
GROUP BY department
HAVING COUNT(phone) < COUNT(*);
 
-- 3. Skew check: groups where the max is more than 3x the average
SELECT 
    category,
    AVG(unit_price) AS avg_price,
    MAX(unit_price) AS max_price
FROM products
GROUP BY category
HAVING MAX(unit_price) > 3 * AVG(unit_price);  -- Outlier detection
 
-- 4. Volume concentration: products where the single largest order line is > 20% of total units
SELECT 
    product_id,
    SUM(quantity) AS total_units,
    MAX(quantity) AS max_single_order
FROM order_items
GROUP BY product_id
HAVING MAX(quantity) > 0.2 * SUM(quantity);

Powerful Pattern: Ratios and Percentages

Comparing aggregates enables ratio-based filtering: 'groups where X is Y% of Z.' This is essential for penetration analysis, completion rates, concentration metrics, and many business KPIs.

More examples of inter-aggregate comparisons:

more_aggregate_comparisons.sql
-- Customers whose average order is less than half of their largest order
-- (a rough proxy for declining value; comparing against the value of the
-- actual first order would need a window function or a subquery)
SELECT 
    customer_id,
    MIN(order_date) AS first_order_date,
    AVG(order_total) AS avg_order,
    MAX(order_total) AS max_order
FROM orders
GROUP BY customer_id
HAVING AVG(order_total) < MAX(order_total) * 0.5;  -- Avg < half of max
 
-- Suppliers with high variety: distinct categories are at least 10% of the product count
SELECT 
    supplier_id,
    COUNT(*) AS total_products,
    COUNT(DISTINCT category) AS distinct_categories
FROM products
GROUP BY supplier_id
HAVING COUNT(DISTINCT category) >= 0.1 * COUNT(*);
 
-- Time-based: months where the max sale is more than double the average
-- (DATE_TRUNC as written is PostgreSQL syntax)
SELECT 
    DATE_TRUNC('month', sale_date) AS month,
    AVG(sale_amount) AS avg_sale,
    MAX(sale_amount) AS max_sale
FROM sales
GROUP BY DATE_TRUNC('month', sale_date)
HAVING MAX(sale_amount) > 2 * AVG(sale_amount);
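
One detail behind ratio conditions like those above is numeric typing: multiplying by a decimal constant keeps the comparison in floating point, while dividing two integer aggregates truncates in many engines. The following sketch uses Python's sqlite3 module and hypothetical order data to show both forms; SQLite is used here only as a convenient engine that performs integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_items (product_id INTEGER, quantity INTEGER)")
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?)",
    [
        (1, 10), (1, 2), (1, 3),          # product 1: largest order is 10 of 15 units
        (2, 5), (2, 5), (2, 5), (2, 5),   # product 2: largest order is 5 of 20 units
    ],
)

# Multiplying by 0.5 keeps the comparison in floating point.
concentrated = conn.execute(
    """
    SELECT product_id
    FROM order_items
    GROUP BY product_id
    HAVING MAX(quantity) > 0.5 * SUM(quantity)
    """
).fetchall()

# Pitfall: integer / integer truncates (10 / 15 = 0), so no group passes.
truncated = conn.execute(
    """
    SELECT product_id
    FROM order_items
    GROUP BY product_id
    HAVING MAX(quantity) / SUM(quantity) > 0.5
    """
).fetchall()
```

Writing the ratio as a multiplication, or multiplying one side by 1.0 before dividing, avoids the truncation.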

Arithmetic Expressions in HAVING

HAVING conditions can include arithmetic operations on aggregates, enabling computed thresholds and derived metrics.

arithmetic_having.sql
-- Profit margin: (revenue - cost) / revenue > 30%
SELECT 
    product_id,
    SUM(quantity * unit_price) AS revenue,
    SUM(quantity * unit_cost) AS cost,
    (SUM(quantity * unit_price) - SUM(quantity * unit_cost)) / 
        NULLIF(SUM(quantity * unit_price), 0) AS margin
FROM order_items
GROUP BY product_id
HAVING (SUM(quantity * unit_price) - SUM(quantity * unit_cost)) / 
       NULLIF(SUM(quantity * unit_price), 0) > 0.3;
 
-- Average order value greater than $50 after $10 discount
SELECT 
    customer_id,
    AVG(order_total - 10) AS avg_discounted_value
FROM orders
GROUP BY customer_id
HAVING AVG(order_total - 10) > 50;
 
-- Groups where the range (max - min), a rough proxy for spread, exceeds a threshold
SELECT 
    region,
    MAX(temperature) - MIN(temperature) AS temp_range
FROM weather_readings
GROUP BY region
HAVING MAX(temperature) - MIN(temperature) > 30;
 
-- Manual mean: total spent / order count > $100
-- (equals AVG(order_total) when no order_total is NULL)
SELECT 
    customer_id,
    SUM(order_total) AS total_spent,
    COUNT(*) AS order_count,
    SUM(order_total) / COUNT(*) AS avg_order_value
FROM orders
GROUP BY customer_id
HAVING SUM(order_total) / COUNT(*) > 100;

Watch for Division by Zero

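When dividing aggregates, wrap the denominator in NULLIF(denominator, 0) to avoid division-by-zero errors. NULLIF turns a zero into NULL, so the quotient becomes NULL instead of raising an error. A comparison against NULL is never true, so HAVING silently drops those groups; decide whether that is the behavior you want or whether such groups need special handling.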
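
The effect of NULLIF on filtering can be sketched as follows, using Python's sqlite3 module and hypothetical order data. One caveat about the engine: SQLite itself returns NULL for division by zero rather than raising an error, so the guard matters most on databases that do raise; the filtering behavior shown is the same either way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_items "
    "(product_id INTEGER, quantity INTEGER, unit_price REAL, unit_cost REAL)"
)
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?, ?)",
    [
        (1, 10, 10.0, 6.0),   # revenue 100, cost 60: margin 0.4
        (2, 5, 0.0, 5.0),     # free samples: revenue 0, margin undefined
        (3, 10, 10.0, 9.0),   # revenue 100, cost 90: margin 0.1
    ],
)

# Groups with zero revenue get a NULL margin and fail the comparison.
profitable = conn.execute(
    """
    SELECT product_id
    FROM order_items
    GROUP BY product_id
    HAVING (SUM(quantity * unit_price) - SUM(quantity * unit_cost))
           / NULLIF(SUM(quantity * unit_price), 0) > 0.3
    ORDER BY product_id
    """
).fetchall()

# The dropped groups can be surfaced explicitly when they need attention.
undefined_margin = conn.execute(
    """
    SELECT product_id
    FROM order_items
    GROUP BY product_id
    HAVING NULLIF(SUM(quantity * unit_price), 0) IS NULL
    """
).fetchall()
```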

Subqueries in HAVING Conditions

HAVING conditions can reference scalar subqueries—queries that return a single value. This enables dynamic thresholds based on data rather than hardcoded constants.

Pattern: Compare to overall aggregate

subquery_having.sql
-- Departments with above-average employee count
SELECT 
    department,
    COUNT(*) AS emp_count
FROM employees
GROUP BY department
HAVING COUNT(*) > (
    SELECT AVG(dept_count)
    FROM (
        SELECT COUNT(*) AS dept_count
        FROM employees
        GROUP BY department
    ) subq
);
 
-- Categories with total sales above the average category total
SELECT 
    category,
    SUM(sale_amount) AS total_sales
FROM sales
GROUP BY category
HAVING SUM(sale_amount) > (
    SELECT AVG(cat_total)
    FROM (
        SELECT SUM(sale_amount) AS cat_total
        FROM sales
        GROUP BY category
    ) subq
);
 
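-- Products with order count above the 90th percentile (the top 10%)
-- (PERCENTILE_CONT ... WITHIN GROUP is PostgreSQL/Oracle-style syntax; not every database supports it)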
SELECT 
    product_id,
    COUNT(*) AS order_count
FROM order_items
GROUP BY product_id
HAVING COUNT(*) > (
    SELECT PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY cnt)
    FROM (
        SELECT COUNT(*) AS cnt
        FROM order_items
        GROUP BY product_id
    ) subq
);
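
A quick way to check a dynamic threshold is to compute the scalar subquery on its own and compare it with the groups that survive. This sketch does that with Python's sqlite3 module on hypothetical department sizes (4, 2, and 1 employees).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT)")
sizes = {"Sales": 4, "IT": 2, "Legal": 1}   # hypothetical head counts
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(f"{dept}-{i}", dept) for dept, n in sizes.items() for i in range(n)],
)

# The scalar subquery on its own: average group size across departments.
threshold = conn.execute(
    """
    SELECT AVG(dept_count)
    FROM (SELECT COUNT(*) AS dept_count FROM employees GROUP BY department)
    """
).fetchone()[0]

# The same subquery used as a dynamic threshold inside HAVING.
above_average = conn.execute(
    """
    SELECT department, COUNT(*) AS emp_count
    FROM employees
    GROUP BY department
    HAVING COUNT(*) > (
        SELECT AVG(dept_count)
        FROM (SELECT COUNT(*) AS dept_count FROM employees GROUP BY department)
    )
    """
).fetchall()
```

The threshold is 7 / 3, so only the four-person department passes; adding or removing rows moves the threshold without any change to the query.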

Pattern: Reference another table's aggregate

cross_table_subquery.sql
-- Customers whose average order exceeds company-wide average
SELECT 
    customer_id,
    AVG(order_total) AS customer_avg
FROM orders
GROUP BY customer_id
HAVING AVG(order_total) > (
    SELECT AVG(order_total)
    FROM orders
);
 
-- Regions meeting the corporate sales target (stored in the corporate_targets table)
SELECT 
    region,
    SUM(sales_amount) AS regional_sales
FROM sales
GROUP BY region
HAVING SUM(sales_amount) >= (
    SELECT target_value 
    FROM corporate_targets 
    WHERE target_name = 'regional_sales_minimum'
);
 
-- Products with rating above the category average
SELECT 
    p.product_id,
    p.category,
    AVG(r.rating) AS product_rating
FROM products p
JOIN reviews r ON p.product_id = r.product_id
GROUP BY p.product_id, p.category
HAVING AVG(r.rating) > (
    SELECT AVG(r2.rating)
    FROM products p2
    JOIN reviews r2 ON p2.product_id = r2.product_id
    WHERE p2.category = p.category
);

Correlated Subqueries in HAVING

The last example uses a correlated subquery: the subquery references a grouping column of the outer query (p.category). Conceptually, the subquery is evaluated once per group, comparing each product's average rating with the average rating across its category. This is powerful, but it can be expensive on large tables.
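
The per-group evaluation can be traced on a tiny data set. The sketch below runs the same query shape through Python's sqlite3 module; the products and ratings are hypothetical, and the category averages are computed separately so the comparison is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (product_id INTEGER, category TEXT);
    CREATE TABLE reviews (product_id INTEGER, rating REAL);
    INSERT INTO products VALUES (1, 'audio'), (2, 'audio'), (3, 'video');
    INSERT INTO reviews VALUES (1, 5), (1, 5), (2, 2), (2, 2), (3, 4);
    """
)

# Reference values: the average rating over all reviews in each category.
category_avg = conn.execute(
    """
    SELECT p.category, AVG(r.rating)
    FROM products p
    JOIN reviews r ON p.product_id = r.product_id
    GROUP BY p.category
    ORDER BY p.category
    """
).fetchall()

# Correlated subquery: each product group is compared with its own category.
above_category = conn.execute(
    """
    SELECT p.product_id, p.category, AVG(r.rating) AS product_rating
    FROM products p
    JOIN reviews r ON p.product_id = r.product_id
    GROUP BY p.product_id, p.category
    HAVING AVG(r.rating) > (
        SELECT AVG(r2.rating)
        FROM products p2
        JOIN reviews r2 ON p2.product_id = r2.product_id
        WHERE p2.category = p.category
    )
    ORDER BY p.product_id
    """
).fetchall()
```

Product 3 is the only item in its category, so its rating equals the category average and the strict comparison excludes it.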

Conditional Aggregates with CASE in HAVING

The CASE expression inside aggregate functions enables conditional aggregation—counting or summing only values meeting certain criteria. These conditional aggregates can then be used in HAVING.

conditional_aggregates.sql
-- Departments where over 50% of employees are senior (salary > 80000)
SELECT 
    department,
    COUNT(*) AS total,
    COUNT(CASE WHEN salary > 80000 THEN 1 END) AS senior_count
FROM employees
GROUP BY department
HAVING COUNT(CASE WHEN salary > 80000 THEN 1 END) > COUNT(*) * 0.5;
 
-- Alternative using SUM for the same logic
SELECT 
    department,
    COUNT(*) AS total,
    SUM(CASE WHEN salary > 80000 THEN 1 ELSE 0 END) AS senior_count
FROM employees
GROUP BY department
HAVING SUM(CASE WHEN salary > 80000 THEN 1 ELSE 0 END) > COUNT(*) * 0.5;
 
-- Categories where more items are on sale than at full price
SELECT 
    category,
    COUNT(CASE WHEN discount > 0 THEN 1 END) AS on_sale,
    COUNT(CASE WHEN discount = 0 THEN 1 END) AS full_price
FROM products
GROUP BY category
HAVING COUNT(CASE WHEN discount > 0 THEN 1 END) > 
       COUNT(CASE WHEN discount = 0 THEN 1 END);
 
-- Customers with at least 3 high-value orders (> $500)
SELECT 
    customer_id,
    COUNT(CASE WHEN order_total > 500 THEN 1 END) AS high_value_orders
FROM orders
GROUP BY customer_id
HAVING COUNT(CASE WHEN order_total > 500 THEN 1 END) >= 3;
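
The COUNT and SUM forms are interchangeable only if the ELSE branch is handled correctly: COUNT needs the non-matching rows to be NULL, while SUM needs them to be 0. A short sketch with Python's sqlite3 module and hypothetical orders shows the two correct forms and the common mix-up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, order_total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (1, 600.0), (1, 700.0), (1, 800.0), (1, 50.0),                 # 3 high-value orders
        (2, 600.0), (2, 100.0), (2, 100.0), (2, 100.0), (2, 100.0),    # 1 high-value order
    ],
)


def customers(having_clause):
    """Return customer ids passing a HAVING condition (fixed demo strings only)."""
    sql = (
        "SELECT customer_id FROM orders GROUP BY customer_id HAVING "
        + having_clause
        + " ORDER BY customer_id"
    )
    return [row[0] for row in conn.execute(sql)]


# Correct: CASE without ELSE yields NULL, which COUNT ignores.
count_form = customers("COUNT(CASE WHEN order_total > 500 THEN 1 END) >= 3")

# Correct: SUM adds 1 for matches and 0 otherwise.
sum_form = customers("SUM(CASE WHEN order_total > 500 THEN 1 ELSE 0 END) >= 3")

# Mix-up: ELSE 0 is non-NULL, so COUNT counts every row in the group.
mixed_up = customers("COUNT(CASE WHEN order_total > 500 THEN 1 ELSE 0 END) >= 3")
```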

Using the FILTER clause (SQL standard; supported in PostgreSQL and, since version 3.30, in SQLite):

Some databases support the FILTER clause as a cleaner alternative to CASE in aggregates:

filter_clause.sql
-- PostgreSQL: More readable conditional aggregates with FILTER
SELECT 
    department,
    COUNT(*) AS total,
    COUNT(*) FILTER (WHERE salary > 80000) AS senior_count
FROM employees
GROUP BY department
HAVING COUNT(*) FILTER (WHERE salary > 80000) > COUNT(*) * 0.5;
 
-- Multiple filtered aggregates
SELECT 
    year,
    SUM(revenue) FILTER (WHERE quarter = 1) AS q1_revenue,
    SUM(revenue) FILTER (WHERE quarter = 2) AS q2_revenue,
    SUM(revenue) FILTER (WHERE quarter = 3) AS q3_revenue,
    SUM(revenue) FILTER (WHERE quarter = 4) AS q4_revenue
FROM quarterly_sales
GROUP BY year
HAVING SUM(revenue) FILTER (WHERE quarter = 4) > 
       SUM(revenue) FILTER (WHERE quarter = 1);  -- Q4 > Q1 growth

FILTER vs CASE

FILTER is more readable than CASE for conditional aggregation. However, CASE works everywhere (including MySQL, SQL Server, and Oracle). Use FILTER where it is supported, such as PostgreSQL, for clarity; use CASE when you need portability.
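
The equivalence of the two forms can be checked directly. The sketch below uses Python's sqlite3 module with hypothetical salary data; it assumes the bundled SQLite is version 3.30 or newer for the FILTER query and skips that query otherwise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (department TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [
        ("Eng", 90000.0), ("Eng", 85000.0), ("Eng", 60000.0),   # 2 of 3 senior
        ("Ops", 90000.0), ("Ops", 50000.0),                     # exactly half senior
    ],
)

case_sql = """
    SELECT department, COUNT(*), COUNT(CASE WHEN salary > 80000 THEN 1 END)
    FROM employees
    GROUP BY department
    HAVING COUNT(CASE WHEN salary > 80000 THEN 1 END) > COUNT(*) * 0.5
"""

filter_sql = """
    SELECT department, COUNT(*), COUNT(*) FILTER (WHERE salary > 80000)
    FROM employees
    GROUP BY department
    HAVING COUNT(*) FILTER (WHERE salary > 80000) > COUNT(*) * 0.5
"""

case_rows = conn.execute(case_sql).fetchall()

# FILTER on aggregates needs SQLite 3.30+; older builds skip the comparison.
filter_rows = None
if sqlite3.sqlite_version_info >= (3, 30, 0):
    filter_rows = conn.execute(filter_sql).fetchall()
```

Ops is excluded in both forms because exactly half is not "over 50%".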

Real-World Business Scenarios

Let's see how aggregate conditions translate complex business rules into HAVING clauses:

Business Rule: Identify at-risk customers who have ordered at least three times, have been inactive for 90+ days, and whose average order over the last 180 days is below 70% of their lifetime average.

churn_risk.sql
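-- Date arithmetic such as CURRENT_DATE - 90 works as written in PostgreSQL;
-- other databases may need their own interval syntax.
SELECT 
    customer_id,
    COUNT(*) AS total_orders,
    MAX(order_date) AS last_order,
    AVG(order_total) AS avg_order,
    AVG(CASE WHEN order_date > CURRENT_DATE - 180 THEN order_total END) AS recent_avg
FROM orders
GROUP BY customer_id
HAVING 
    COUNT(*) >= 3                                    -- Was active (3+ orders)
    AND MAX(order_date) < CURRENT_DATE - 90          -- No orders in 90 days
    AND AVG(CASE WHEN order_date > CURRENT_DATE - 180 THEN order_total END) 
        < AVG(order_total) * 0.7;                    -- Recent avg < 70% of lifetime avg
-- Note: customers with no orders in the last 180 days have a NULL recent_avg,
-- so the last comparison is not true and they are excluded.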
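
To see which customers the rule actually selects, the sketch below adapts the query to SQLite via Python's sqlite3 module. Two things are assumptions made for the demonstration: the reference date is passed in as a fixed parameter (`:today`) instead of CURRENT_DATE so the result is reproducible, and SQLite's `date()` function replaces the date subtraction. The order data is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer_id INTEGER, order_date TEXT, order_total REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        # Customer 1: lapsed 90+ days ago, recent order well below lifetime average.
        (1, "2023-01-10", 200.0), (1, "2023-06-10", 200.0), (1, "2024-02-15", 100.0),
        # Customer 2: nothing in the last 180 days, so the recent average is NULL.
        (2, "2023-01-01", 100.0), (2, "2023-02-01", 100.0), (2, "2023-03-01", 100.0),
        # Customer 3: still active (ordered within the last 90 days).
        (3, "2024-01-20", 100.0), (3, "2024-05-01", 100.0), (3, "2024-06-20", 100.0),
        # Customer 4: only two orders, below the activity threshold.
        (4, "2023-05-01", 300.0), (4, "2024-03-01", 50.0),
    ],
)

at_risk = conn.execute(
    """
    SELECT customer_id,
           AVG(CASE WHEN order_date > date(:today, '-180 days')
                    THEN order_total END) AS recent_avg
    FROM orders
    GROUP BY customer_id
    HAVING COUNT(*) >= 3
       AND MAX(order_date) < date(:today, '-90 days')
       AND AVG(CASE WHEN order_date > date(:today, '-180 days')
                    THEN order_total END) < AVG(order_total) * 0.7
    ORDER BY customer_id
    """,
    {"today": "2024-07-01"},
).fetchall()
```

Customer 2 illustrates the NULL case: with no orders in the 180-day window there is no recent average to compare, so the group is dropped even though the customer has clearly lapsed. If such customers should count as at risk, the condition needs an explicit `OR ... IS NULL` branch.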

Summary: Aggregate Conditions

Aggregate conditions in HAVING enable sophisticated group filtering based on computed summaries. Let's consolidate what we've covered:

Key Takeaways

• All standard aggregates work in HAVING: COUNT, SUM, AVG, MIN, MAX, and variants like DISTINCT
• Full operator support: =, <>, <, <=, >, >=, BETWEEN, IN, IS NULL, IS NOT NULL
• Logical combinations: AND, OR, NOT with parentheses for precedence
• Aggregate-to-aggregate comparisons: enable ratio, spread, and concentration analysis
• Arithmetic expressions: computed thresholds and derived metrics
• Subqueries for dynamic thresholds: compare to overall averages, corporate targets, or related data
• Conditional aggregates: CASE or FILTER for subset-based conditions
• NULL handling matters: understand when aggregates return NULL and handle appropriately

What's next:

With individual aggregate conditions mastered, we'll explore combining multiple conditions into complex filters—handling intricate business logic with AND, OR, NOT, and nested expressions for multi-dimensional analysis.

Page Complete

You now command the full vocabulary of aggregate conditions in HAVING. From simple threshold checks to complex ratio analyses with subqueries and conditional expressions, you can translate virtually any aggregate constraint into SQL. Next, we'll tackle combining these into complex multi-condition filters.