Loading content...
NULL is SQL's representation of missing, unknown, or inapplicable data. In aggregation, NULL doesn't behave like a normal value—it's neither zero nor empty string nor any other concrete value. This creates both challenges and opportunities.
Across the aggregate functions we've studied, NULL handling follows consistent patterns, yet the implications differ based on context. Understanding these patterns—and knowing when to override them—is essential for producing accurate aggregation results.
By the end of this page, you will have a consolidated understanding of NULL handling across all standard aggregates, master techniques for controlling NULL behavior, understand the three-valued logic implications for aggregation, and know when to treat NULL as zero versus when to exclude it.
All standard SQL aggregate functions (except COUNT(*)) follow the same fundamental rule:
NULL values are excluded from aggregation.
This means:
COUNT(column))The only exception is COUNT(*), which counts rows regardless of what values they contain.
| Function | NULL Handling | Empty/All-NULL Result | Key Implication |
|---|---|---|---|
| COUNT(*) | Counts all rows including NULLs | 0 | Only aggregate that never returns NULL |
| COUNT(column) | Excludes NULL values | 0 | COUNT(*) − COUNT(col) = NULL count |
| COUNT(DISTINCT col) | Excludes NULL values | 0 | NULLs don't count as distinct |
| SUM(column) | Excludes NULL values | NULL | NULL ≠ 0 in sums |
| AVG(column) | Excludes NULLs from both sum and count | NULL | Can differ from SUM/COUNT(*) |
| MIN(column) | Excludes NULL values | NULL | NULL isn't 'smallest' |
| MAX(column) | Excludes NULL values | NULL | NULL isn't 'largest' |
1234567891011121314151617
-- Sample data for demonstrations-- values: 10, NULL, 30, NULL, 50 -- All aggregates in actionSELECT COUNT(*) AS all_rows, -- 5 (counts all rows) COUNT(value) AS non_null, -- 3 (excludes NULLs) SUM(value) AS total, -- 90 (10+30+50) AVG(value) AS average, -- 30 (90/3, not 90/5) MIN(value) AS minimum, -- 10 MAX(value) AS maximum -- 50FROM sample_values; -- Demonstrating COUNT(*) - COUNT(column) = NULL countSELECT COUNT(*) - COUNT(value) AS null_countFROM sample_values; -- Result: 2NULL means 'unknown.' Adding an unknown value to 90 gives an unknown total—not 90. The smallest of (10, unknown, 30) is still 10 because we know 10 exists; we just don't know what the unknown value is. SQL's NULL-exclusion preserves what we know rather than guessing at what we don't.
A critical distinction exists between an empty result set and a result set containing only NULL values. Understanding both behaviors is essential for robust queries.
Empty result set (no rows):
When no rows match the WHERE clause, aggregates face the mathematical question: what is the sum/average/min/max of nothing?
1234567891011121314151617181920
-- No matching rowsSELECT COUNT(*) AS cnt, -- 0 (zero rows counted) COUNT(value) AS non_null, -- 0 SUM(value) AS total, -- NULL (no values to sum) AVG(value) AS average, -- NULL (undefined) MIN(value) AS minimum, -- NULL (no values exist) MAX(value) AS maximum -- NULL (no values exist)FROM sample_valuesWHERE 1 = 0; -- False condition = empty set -- All-NULL data (rows exist, but all values are NULL)SELECT COUNT(*) AS cnt, -- 5 (if 5 NULL rows) COUNT(value) AS non_null, -- 0 (none are non-NULL) SUM(value) AS total, -- NULL AVG(value) AS average, -- NULL (0 non-NULL values) MIN(value) AS minimum, -- NULL MAX(value) AS maximum -- NULLFROM (SELECT NULL AS value FROM generate_series(1,5)) AS nulls;COUNT() returns 0 for empty sets but returns the actual row count for all-NULL data. This is the key differentiator: COUNT() counts rows, while all other aggregates process values. Empty set = 0 rows. All-NULL set = N rows, all with NULL values.
Practical handling:
Use COALESCE to provide default values when aggregates might return NULL:
12345678910111213141516171819
-- Safe defaults for empty/all-NULL scenariosSELECT COUNT(*) AS row_count, -- Already returns 0, not NULL COALESCE(SUM(amount), 0) AS total, COALESCE(AVG(amount), 0) AS average, COALESCE(MIN(amount), 0) AS minimum, COALESCE(MAX(amount), 0) AS maximumFROM ordersWHERE customer_id = 99999; -- Non-existent customer -- In reports, handle NULL explicitlySELECT department_id, CASE WHEN COUNT(salary) = 0 THEN 'No salary data' ELSE CAST(AVG(salary) AS VARCHAR) END AS avg_salary_displayFROM employeesGROUP BY department_id;While default NULL exclusion is often correct, sometimes you need different behavior. The primary tool for controlling NULL in aggregation is COALESCE—and where you place it matters greatly.
COALESCE inside the aggregate:
When COALESCE is inside the aggregate function, you're converting NULL values before aggregation. This changes which values participate in the calculation.
123456789101112131415161718
-- Data: bonus values are 100, NULL, 200, NULL, 300 -- Standard AVG (excludes NULLs)SELECT AVG(bonus) AS avg_bonus FROM employees;-- Result: (100+200+300)/3 = 200 -- COALESCE inside: Treats NULL as 0SELECT AVG(COALESCE(bonus, 0)) AS avg_bonus_with_zeros FROM employees;-- Result: (100+0+200+0+300)/5 = 120 -- For SUM, COALESCE inside has no effect (NULL + 0 = same sum)SELECT SUM(bonus) AS regular_sum FROM employees; -- 600SELECT SUM(COALESCE(bonus, 0)) AS sum_with_coalesce FROM employees; -- 600 -- For COUNT, COALESCE allows counting NULLs as valuesSELECT COUNT(bonus) AS count_regular FROM employees; -- 3SELECT COUNT(COALESCE(bonus, 0)) AS count_with_coalesce FROM employees; -- 5-- (Equivalent to COUNT(*) if all NULLs convert to non-NULL values)COALESCE outside the aggregate:
When COALESCE wraps the entire aggregate, you're providing a default for when the aggregate result itself is NULL (empty set or all-NULL data).
1234567891011121314151617
-- Empty result set scenarioSELECT SUM(bonus) AS raw_sum FROM employees WHERE 1 = 0;-- Result: NULL (no rows to sum) SELECT COALESCE(SUM(bonus), 0) AS safe_sum FROM employees WHERE 1 = 0;-- Result: 0 (default when aggregate is NULL) -- All-NULL data scenarioSELECT AVG(bonus) AS raw_avg FROM employees WHERE department = 'New Hires'; -- All have NULL bonuses-- Result: NULL SELECT COALESCE(AVG(bonus), 0) AS safe_avg FROM employees WHERE department = 'New Hires';-- Result: 0 (default when no non-NULL values)| Placement | Effect | Use When |
|---|---|---|
| Inside: AGG(COALESCE(col, default)) | Changes which values participate | NULL semantically means the default value |
| Outside: COALESCE(AGG(col), default) | Provides default for NULL result | Empty/all-NULL should return a specific value |
| Both: COALESCE(AGG(COALESCE(...)),default) | Both effects combined | Need both behaviors simultaneously |
Ask yourself: Does NULL mean 'zero' or 'unknown'? If an employee without a recorded bonus truly received no bonus (=$0), use COALESCE inside. If NULL means 'bonus not yet determined,' exclude it from averages—don't treat it as zero.
When aggregating expressions rather than single columns, NULL propagation follows standard SQL rules: any operation involving NULL yields NULL. This can unexpectedly exclude rows from aggregation.
12345678910111213141516171819202122232425
-- Sample data:-- employee | salary | bonus-- Alice | 50000 | 5000-- Bob | 60000 | NULL-- Carol | 55000 | 3000 -- Aggregating an expression with potential NULLSELECT SUM(salary + bonus) AS total_compensationFROM employees;-- Result: (50000+5000) + (55000+3000) = 113000-- Bob excluded! salary + NULL = NULL, and SUM ignores NULL -- What you probably wanted:SELECT SUM(salary + COALESCE(bonus, 0)) AS total_compensationFROM employees;-- Result: 55000 + 60000 + 58000 = 173000-- Bob now included with bonus = 0 -- Understanding the differenceSELECT SUM(salary) AS total_salary, -- 165000 (all included) SUM(bonus) AS total_bonus, -- 8000 (Bob excluded) SUM(salary) + SUM(bonus) AS sum_of_sums, -- 173000 SUM(salary + bonus) AS sum_of_expression -- 113000 (Bob excluded!)FROM employees;SUM(A) + SUM(B) is NOT always equal to SUM(A + B) when NULLs exist! The former sums A and B separately (potentially different rows), while the latter excludes any row where A + B is NULL (i.e., where either A or B is NULL).
12345678910111213141516171819202122
-- Pattern: Safely aggregate expressions with multiple columnsSELECT SUM(COALESCE(price, 0) * COALESCE(quantity, 0)) AS safe_totalFROM order_items; -- Pattern: Conditional aggregation with NULL-generating CASESELECT SUM(CASE WHEN status = 'completed' THEN amount END) AS completed_sumFROM orders;-- CASE returns NULL when condition is false, excluded from SUM-- This is the intended behavior for conditional sums! -- Pattern: Watch for NULL in WHERE vs in aggregationSELECT AVG(salary) FROM employees WHERE bonus IS NOT NULL;-- Different from:SELECT AVG(salary) FROM employees;-- First filters rows first; second filters only for AVG of bonus -- Pattern: NULLIF to force NULL (inverse of COALESCE)SELECT AVG(NULLIF(score, 0)) AS avg_nonzero_scoreFROM test_results;-- Treats 0 as 'no data' by converting it to NULL before aggregationWhen NULL appears in a GROUP BY column, it behaves differently than you might expect: all NULL values form a single group. This is an exception to the usual rule that NULL ≠ NULL.
12345678910111213141516171819
-- Sample data:-- employee | department_id-- Alice | 10-- Bob | 10-- Carol | 20-- David | NULL-- Eve | NULL SELECT department_id, COUNT(*) AS employee_countFROM employeesGROUP BY department_id; -- Results:-- department_id | employee_count-- 10 | 2-- 20 | 1-- NULL | 2 <-- David and Eve grouped together!In equality comparisons, NULL = NULL is UNKNOWN (not TRUE). But in GROUP BY, all NULLs are placed in the same group—as if they're equal. This is a practical necessity; otherwise, each NULL would form its own group.
12345678910111213141516171819202122232425262728293031323334
-- Handling NULL in grouping column for displaySELECT COALESCE(department_id::text, 'Unassigned') AS department, COUNT(*) AS employee_count, AVG(salary) AS avg_salaryFROM employeesGROUP BY department_idORDER BY department_id NULLS LAST; -- Excluding NULL group entirelySELECT department_id, COUNT(*) AS employee_countFROM employeesWHERE department_id IS NOT NULLGROUP BY department_id; -- Treating NULL as a specific value for groupingSELECT COALESCE(department_id, -1) AS dept_group, COUNT(*) AS employee_countFROM employeesGROUP BY COALESCE(department_id, -1); -- Multiple NULL columns in GROUP BYSELECT region, product_category, SUM(sales)FROM sales_dataGROUP BY region, product_category;-- If region=NULL, category='Electronics': forms one group-- If region=NULL, category=NULL: forms another group-- All NULL-NULL combinations group togetherNULL and ORDER BY NULLS FIRST/LAST:
When ordering grouped results, you can control where NULL groups appear:
12345678910111213141516
-- NULL groups at the endSELECT department_id, COUNT(*)FROM employeesGROUP BY department_idORDER BY department_id NULLS LAST; -- NULL groups at the beginningSELECT department_id, COUNT(*)FROM employeesGROUP BY department_idORDER BY department_id NULLS FIRST; -- Database-specific defaults:-- PostgreSQL: NULLS LAST for ASC, NULLS FIRST for DESC-- SQL Server: NULLs treated as smallest value (first in ASC)-- Oracle: NULLS LAST for ASC by defaultWhile standard aggregate NULL handling is consistent across databases, some systems offer additional NULL-related features.
1234567891011121314151617181920
-- FILTER clause excludes NULLs naturally via conditionSELECT COUNT(*) FILTER (WHERE bonus IS NOT NULL) AS with_bonus, AVG(salary) FILTER (WHERE department = 'Engineering') AS eng_avgFROM employees; -- array_agg with NULL handlingSELECT array_agg(name) AS all_names, -- Includes NULLs array_agg(name) FILTER (WHERE name IS NOT NULL) AS non_null_namesFROM employees; -- DISTINCT with NULL in arraysSELECT array_agg(DISTINCT department_id) FROM employees;-- NULLs may appear once in the array -- IS DISTINCT FROM for NULL-safe comparisonSELECT * FROM employees WHERE department_id IS DISTINCT FROM 10;-- Returns rows where dept ≠ 10 OR dept IS NULLLet's consolidate best practices for handling NULL in aggregation into a decision framework.
12345678910111213141516171819202122232425262728293031
-- Pattern: Count NULL valuesSELECT COUNT(*) - COUNT(column) AS null_count FROM table; -- Pattern: Percentage of non-NULLSELECT COUNT(column) * 100.0 / NULLIF(COUNT(*), 0) AS fill_rate_pctFROM table; -- Pattern: Safe aggregation for reportsSELECT COALESCE(SUM(amount), 0) AS total, COALESCE(AVG(amount), 0) AS average, COALESCE(MIN(amount), 0) AS minimum, COALESCE(MAX(amount), 0) AS maximum, COUNT(*) AS row_count, COUNT(amount) AS non_null_countFROM ordersWHERE customer_id = :customer_id; -- Pattern: Conditional aggregation handling NULL outcomesSELECT department_id, AVG(salary) AS avg_all, AVG(CASE WHEN performance = 'high' THEN salary END) AS avg_high_perf, -- avg_high_perf will be NULL for depts with no high performers COALESCE( AVG(CASE WHEN performance = 'high' THEN salary END), AVG(salary) ) AS avg_high_or_overallFROM employeesGROUP BY department_id;NULL handling in aggregation follows consistent patterns but requires careful thought. Let's consolidate the key concepts:
Module Complete:
You've now mastered all five standard aggregate functions—COUNT, SUM, AVG, MIN, and MAX—along with their comprehensive NULL handling behaviors. This foundation prepares you for the next module: GROUP BY, where we'll explore how aggregation works across segmented data.
You now have complete mastery of SQL aggregate functions: understanding how each function processes data, handles NULLs, and integrates with grouping operations. These skills form the foundation for all SQL data analysis and reporting.