Database Management SystemsSQL Joins & Subqueries

Outer Joins: Preserving Unmatched Data

LevelIntermediate

Duration60 mins

TopicSQL Joins & Subqueries

4 / 5

Preserving Unmatched Rows: The Heart of Outer Joins

Why Unmatched Rows Matter

The defining characteristic of outer joins is their ability to preserve unmatched rows. While INNER JOIN excludes rows that don't have corresponding matches in the other table, outer joins guarantee that certain rows—whether matched or not—appear in the output.

This capability isn't a minor technical feature; it fundamentally changes the questions you can answer with SQL. With row preservation, you can:

Find what's missing: Products without orders, customers without activity, employees without assignments
Include zeros in reports: Show regions with $0 sales, months with no transactions, categories with no items
Maintain complete reference data: Ensure all master data entities appear even when no transactions reference them
Detect data quality issues: Find orphaned records, broken relationships, incomplete data

This page explores the mechanics, patterns, and advanced techniques for effectively preserving unmatched rows.

Learning Objectives

By completing this page, you will:\n• Understand the philosophy behind row preservation in outer joins\n• Master the anti-join pattern for finding 'missing' data\n• Handle aggregation correctly with preserved unmatched rows\n• Apply conditional preservation with complex ON clauses\n• Recognize when row preservation is accidentally broken\n• Design queries that correctly maintain preservation chains

The Philosophy of Preservation

To truly master outer joins, it helps to understand why the SQL standard includes this capability—what class of problems it was designed to solve.

The Relational Model's Natural Fit:

Relational databases organize data into multiple normalized tables, with foreign keys connecting them. But not every relationship is mandatory:

A customer might have orders, or might not yet
A product might have reviews, or might be new
An employee might be assigned to projects, or might be between assignments

INNER JOIN assumes relationships must exist. It's perfect for navigating mandatory relationships (every order has a customer). But for optional relationships, INNER JOIN loses data that legitimately exists in isolation.

Relationship Optionality and Join Selection
Relationship Type	Example	Appropriate Join
Mandatory on both sides	Order must have Customer, Customer must exist to have Order	INNER JOIN
Mandatory on one side	Order must have Customer, Customer may have 0+ Orders	LEFT JOIN (Customer LEFT JOIN Order)
Optional on both sides	Student may have 0+ Courses, Course may have 0+ Students	FULL OUTER JOIN or separate queries
Self-referential optional	Employee may have Manager (another Employee)	LEFT JOIN (self-join)

Design Intent Drives Join Choice

The join type you choose should reflect your data model's semantics. If the business says "a customer can exist without orders," then queries about customers should probably use LEFT JOIN to Orders—preserving all customers.

The Query Intent Perspective:

Beyond data model semantics, consider what the query is trying to answer:

Question Intent	Data to Preserve	Join Strategy
"Show customers and their orders"	Customers	LEFT JOIN (Customers → Orders)
"Show orders with customer details"	Orders	LEFT JOIN (Orders → Customers) or INNER
"Find customers who never ordered"	Unmatched customers	LEFT JOIN + WHERE NULL
"Compare Q1 vs Q2 sales by employee"	Both quarters	FULL OUTER JOIN

The "preserve" side is always the one where you need all records regardless of matches.

The Anti-Join Pattern Deep Dive

One of the most powerful applications of preserved unmatched rows is the anti-join pattern—finding rows in one table that have no corresponding rows in another table.

The Pattern:

SELECT A.*
FROM A
LEFT JOIN B ON A.key = B.foreign_key
WHERE B.primary_key IS NULL;

This pattern leverages the fact that unmatched rows have NULL in all columns from the non-preserved table. By filtering to only rows where the joined table's primary key IS NULL, we get exactly the unmatched rows.

anti_join_examples.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
-- Classic anti-join: Customers without orders
SELECT 
    c.customer_id,
    c.customer_name,
    c.email,
    c.registration_date
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;
 
-- variation: Count how many never ordered
SELECT COUNT(*) as never_ordered_count
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;
 
-- Products without reviews (to prompt for reviews)
SELECT 
    p.product_id,
    p.product_name,
    p.launch_date,
    DATEDIFF(CURRENT_DATE, p.launch_date) as days_since_launch
FROM Products p
LEFT JOIN Reviews r ON p.product_id = r.product_id
WHERE r.review_id IS NULL
ORDER BY p.launch_date ASC;  -- Oldest products first
 
-- Employees not assigned to any current project
SELECT 
    e.employee_id,
    e.name,
    e.department,
    e.hire_date
FROM Employees e
LEFT JOIN ProjectAssignments pa 
    ON e.employee_id = pa.employee_id
    AND pa.end_date IS NULL  -- Only current assignments
WHERE pa.assignment_id IS NULL
ORDER BY e.department, e.name;

Anti-Join vs Alternative Approaches:

The same result can be achieved with NOT EXISTS or NOT IN. Understanding the trade-offs helps you choose the best approach:

anti_join_alternatives.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
-- Method 1: LEFT JOIN + IS NULL (Anti-Join)
SELECT c.customer_id, c.customer_name
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;
 
-- Method 2: NOT EXISTS (Correlated Subquery)
SELECT c.customer_id, c.customer_name
FROM Customers c
WHERE NOT EXISTS (
    SELECT 1 
    FROM Orders o 
    WHERE o.customer_id = c.customer_id
);
 
-- Method 3: NOT IN (Subquery)
SELECT c.customer_id, c.customer_name
FROM Customers c
WHERE c.customer_id NOT IN (
    SELECT o.customer_id 
    FROM Orders o
    WHERE o.customer_id IS NOT NULL  -- CRITICAL: handle NULLs!
);
 
-- All three return the same result, but:
-- - LEFT JOIN + IS NULL: Often fastest with good indexes
-- - NOT EXISTS: Clear intent, handles NULLs naturally
-- - NOT IN: Simplest syntax, but dangerous with NULLs

NOT IN NULL Danger

If the NOT IN subquery can return NULL values, the entire NOT IN predicate evaluates to UNKNOWN, and no rows are returned. Always add a WHERE column IS NOT NULL filter, or prefer LEFT JOIN or NOT EXISTS which handle NULLs correctly.

Anti-Join Approach Comparison
Approach	NULL Safety	Performance	Readability
LEFT JOIN + IS NULL	Safe (NULLs in B are fine)	Often optimal	Clear once you know the pattern
NOT EXISTS	Safe (NULLs handled correctly)	Usually good	Clearest intent
NOT IN	DANGEROUS with NULLs	Can be slow	Simplest syntax
EXCEPT	Safe	Varies by database	Clear for key-only queries

Aggregation with Preserved Rows

When combining outer joins with aggregate functions, special attention is needed to handle the NULLs that appear for unmatched rows.

The Problem:

Unmatched rows contribute NULL values to aggregation
SUM(NULL) = NULL, AVG(NULL) = NULL, COUNT(column) ignores NULLs
Reports can show NULL instead of 0 for entities with no activity

The Solution: COALESCE and COUNT Awareness

aggregation_with_outer_join.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
-- PROBLEM: NULL appears for departments with no sales
SELECT 
    d.department_name,
    SUM(s.amount) as total_sales  -- NULL for depts with no sales
FROM Departments d
LEFT JOIN Sales s ON d.department_id = s.department_id
GROUP BY d.department_id, d.department_name;
 
-- SOLUTION: Use COALESCE to convert NULL to 0
SELECT 
    d.department_name,
    COALESCE(SUM(s.amount), 0) as total_sales  -- 0 for depts with no sales
FROM Departments d
LEFT JOIN Sales s ON d.department_id = s.department_id
GROUP BY d.department_id, d.department_name;
 
-- COUNT behavior: COUNT(column) vs COUNT(*)
SELECT 
    d.department_name,
    COUNT(*) as row_count,           -- Counts all rows (1 even for no sales)
    COUNT(s.sale_id) as sales_count  -- Counts non-NULL values (0 for no sales)
FROM Departments d
LEFT JOIN Sales s ON d.department_id = s.department_id
GROUP BY d.department_id, d.department_name;
 
-- For a department with no sales:
-- COUNT(*) = 1 (the preserved row is counted)
-- COUNT(s.sale_id) = 0 (the NULL sale_id is not counted)

COUNT(*) vs COUNT(column)

Use COUNT(joined_table.column) to count actual matches, not COUNT(*). In an outer join, unmatched rows are still rows—COUNT(*) counts them. But COUNT(column) skips NULLs, giving you the true match count.

Multi-Level Aggregation:

When joining to already-aggregated data, be careful about where the NULL handling occurs:

multi_level_aggregation.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
-- Pattern: Join master data to pre-aggregated transaction data
 
-- Step 1: Pre-aggregate the transaction data
WITH SalesSummary AS (
    SELECT 
        product_id,
        SUM(quantity) as total_qty,
        SUM(amount) as total_revenue
    FROM Sales
    WHERE sale_date >= '2024-01-01'
    GROUP BY product_id
)
-- Step 2: LEFT JOIN to preserve all products
SELECT 
    p.product_id,
    p.product_name,
    p.category,
    COALESCE(ss.total_qty, 0) as total_qty,
    COALESCE(ss.total_revenue, 0) as total_revenue
FROM Products p
LEFT JOIN SalesSummary ss ON p.product_id = ss.product_id
ORDER BY total_revenue DESC;
 
-- This approach:
-- 1. Aggregates first (efficient, single pass over Sales)
-- 2. Joins to master data (preserves all products)
-- 3. Handles NULLs at the SELECT level
 
-- VERSUS joining first, then aggregating (often less efficient)
SELECT 
    p.product_id,
    p.product_name,
    p.category,
    COALESCE(SUM(s.quantity), 0) as total_qty,
    COALESCE(SUM(s.amount), 0) as total_revenue
FROM Products p
LEFT JOIN Sales s 
    ON p.product_id = s.product_id
    AND s.sale_date >= '2024-01-01'
GROUP BY p.product_id, p.product_name, p.category
ORDER BY total_revenue DESC;

Aggregation Best Practices

•Wrap sums in COALESCE: COALESCE(SUM(column), 0) ensures 0 instead of NULL
•Use COUNT(column), not COUNT(*): Get true match counts for outer-joined columns
•Consider pre-aggregating: CTE-based pre-aggregation is often cleaner and more efficient
•Test with known edge cases: Verify your query returns correct results for entities with no matches

Conditional Preservation with ON Clauses

Outer joins become even more powerful when you add conditions to the ON clause. This allows conditional matching while still preserving all rows from the primary table.

The Key Insight:

Conditions in the ON clause affect which rows count as matches. Rows that don't meet the condition become "unmatched" and get NULL values—but the left table row is still preserved.

This is fundamentally different from filtering in WHERE, which removes rows entirely.

conditional_preservation.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
-- Scenario: Show all products, with only high-value orders (if any)
 
-- Version 1: Condition in ON (correct for preservation)
SELECT 
    p.product_id,
    p.product_name,
    o.order_id,
    o.total_amount
FROM Products p
LEFT JOIN Orders o 
    ON p.product_id = o.product_id
    AND o.total_amount > 1000;  -- Only join high-value orders
 
-- Result: ALL products appear
-- Products with high-value orders: shows order details
-- Products without high-value orders: shows NULL for order columns
-- Products with only low-value orders: shows NULL (not matched)
 
-- Version 2: Condition in WHERE (breaks preservation)
SELECT 
    p.product_id,
    p.product_name,
    o.order_id,
    o.total_amount
FROM Products p
LEFT JOIN Orders o 
    ON p.product_id = o.product_id
WHERE o.total_amount > 1000;  -- WRONG: filters out products without HV orders
 
-- Result: Only products WITH high-value orders appear
-- Products without any orders: EXCLUDED
-- Products with only low-value orders: EXCLUDED

The ON vs WHERE Trap Revisited

This is the most common bug with outer joins. If you filter on a column from the outer-joined table in WHERE, you accidentally convert your LEFT JOIN to an INNER JOIN. Always ask: "Do I want rows without matches to disappear?" If no, put the condition in ON.

Complex Conditional Scenarios:

You can use any valid expression in the ON clause to precisely control what constitutes a "match":

complex_on_conditions.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
-- Time-based conditional join
-- "Show all employees with their current project (if any)"
SELECT 
    e.employee_id,
    e.name,
    p.project_name,
    a.role
FROM Employees e
LEFT JOIN ProjectAssignments a 
    ON e.employee_id = a.employee_id
    AND a.start_date <= CURRENT_DATE
    AND (a.end_date IS NULL OR a.end_date >= CURRENT_DATE)
LEFT JOIN Projects p 
    ON a.project_id = p.project_id;
 
-- All employees appear; only current assignments are joined
 
-- Range-based conditional join
-- "Show all products with their applicable discount (if any)"
SELECT 
    p.product_id,
    p.product_name,
    p.price,
    d.discount_percentage,
    p.price * (1 - COALESCE(d.discount_percentage, 0) / 100) as final_price
FROM Products p
LEFT JOIN Discounts d
    ON p.category_id = d.category_id
    AND p.price BETWEEN d.min_price AND d.max_price
    AND CURRENT_DATE BETWEEN d.valid_from AND d.valid_to;
 
-- All products appear; discount only when all conditions match
 
-- Status-based conditional join
-- "Show all customers with their open tickets (if any)"
SELECT 
    c.customer_id,
    c.customer_name,
    c.tier,
    COUNT(t.ticket_id) as open_ticket_count
FROM Customers c
LEFT JOIN SupportTickets t
    ON c.customer_id = t.customer_id
    AND t.status IN ('Open', 'Pending', 'Escalated')
GROUP BY c.customer_id, c.customer_name, c.tier;
 
-- All customers appear with their open ticket count (0 if none)

Maintaining Preservation Through Join Chains

When you chain multiple joins, the preservation behavior cascades. Understanding how this works is essential for correct multi-table queries.

The Cascade Principle:

LEFT JOIN A to B: All A rows preserved, with B data where available
LEFT JOIN result to C: All rows from step 1 preserved, with C data where available
And so on...

Each subsequent LEFT JOIN preserves the current result set and optionally adds more data.

preservation_chain.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
-- Full preservation chain: Customers → Orders → Items → Products
SELECT 
    c.customer_id,
    c.customer_name,
    o.order_id,
    o.order_date,
    oi.quantity,
    p.product_name,
    p.price
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
LEFT JOIN OrderItems oi ON o.order_id = oi.order_id
LEFT JOIN Products p ON oi.product_id = p.product_id
ORDER BY c.customer_id, o.order_id, oi.item_id;
 
-- Trace for different customer scenarios:
-- 
-- Customer with orders (ID 1):
-- Row 1: (1, 'Alice', 101, '2024-01-15', 2, 'Widget', 29.99)
-- Row 2: (1, 'Alice', 101, '2024-01-15', 1, 'Gadget', 49.99)
-- Row 3: (1, 'Alice', 102, '2024-02-01', 3, 'Widget', 29.99)
-- 
-- Customer with no orders (ID 2):
-- Row 4: (2, 'Bob', NULL, NULL, NULL, NULL, NULL)
-- 
-- All NULLs cascade because:
-- - Customer 2 → No matching order, so o.* = NULL
-- - NULL o.order_id → No matching order items possible, so oi.* = NULL
-- - NULL oi.product_id → No matching product possible, so p.* = NULL

Breaking the Preservation Chain:

Introducing an INNER JOIN anywhere in the chain will filter out rows that accumulated NULLs up to that point.

broken_chain.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
-- BROKEN: INNER JOIN in the middle breaks preservation
SELECT 
    c.customer_id,
    c.customer_name,
    o.order_id,
    oi.quantity,
    p.product_name
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
INNER JOIN OrderItems oi ON o.order_id = oi.order_id  -- BREAKS THE CHAIN
LEFT JOIN Products p ON oi.product_id = p.product_id;
 
-- Problem: The INNER JOIN requires o.order_id IS NOT NULL
-- This excludes all customers who have no orders!
 
-- Analysis:
-- - Customer with no orders has o.order_id = NULL
-- - INNER JOIN OrderItems requires o.order_id = oi.order_id
-- - NULL = anything is never true, so this row is filtered out
 
-- FIX: Use LEFT JOIN consistently
SELECT 
    c.customer_id,
    c.customer_name,
    o.order_id,
    oi.quantity,
    p.product_name
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
LEFT JOIN OrderItems oi ON o.order_id = oi.order_id
LEFT JOIN Products p ON oi.product_id = p.product_id;

The Preservation Rule

If you need to preserve all rows from table A through a chain of joins, use LEFT JOIN throughout the chain. Any INNER JOIN on a column that might be NULL (due to earlier outer joins) will break preservation.

Detecting and Debugging Broken Preservation

One of the trickiest bugs with outer joins is accidentally breaking preservation. The query runs without error, but returns fewer rows than expected. Here's how to detect and fix these issues.

Symptoms of Broken Preservation:

Query returns fewer rows than the 'left' table has
Entities with no related data don't appear
Adding more joins reduces row count unexpectedly

debugging_preservation.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
-- Debugging technique: Progressive join verification
 
-- Step 1: Baseline - count of entities to preserve
SELECT COUNT(*) as total_customers FROM Customers;
-- Result: 1000 customers
 
-- Step 2: After first LEFT JOIN
SELECT COUNT(DISTINCT c.customer_id) as preserved_customers
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id;
-- Should still be: 1000 (all customers preserved)
 
-- Step 3: After second JOIN
SELECT COUNT(DISTINCT c.customer_id) as preserved_customers
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
LEFT JOIN OrderItems oi ON o.order_id = oi.order_id;
-- Should still be: 1000
 
-- Step 4: After adding filtering condition
SELECT COUNT(DISTINCT c.customer_id) as preserved_customers
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
LEFT JOIN OrderItems oi ON o.order_id = oi.order_id
WHERE o.order_date >= '2024-01-01';  -- PROBLEM!
-- Result: 750 (only customers with orders since Jan 1)
 
-- The WHERE filter on the outer-joined table broke preservation!

Common Causes of Broken Preservation

•WHERE on outer-joined columns: Filtering on columns from the 'right' table of a LEFT JOIN excludes unmatched rows
•INNER JOIN after LEFT JOIN: An INNER JOIN requiring the previously-NULL column breaks the chain
•Non-null assumptions: Code that assumes joined columns are never NULL crashes or behaves unexpectedly
•Implicit type coercion: Some databases may coerce NULL in ways that fail the join condition
•OR conditions with NULL: col = value OR col = other_value excludes NULLs (neither condition matches NULL)

fixes_for_broken_preservation.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
-- FIX 1: Move filter from WHERE to ON
-- BEFORE (broken):
SELECT c.*, o.*
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
WHERE o.order_date >= '2024-01-01';
 
-- AFTER (fixed):
SELECT c.*, o.*
FROM Customers c
LEFT JOIN Orders o 
    ON c.customer_id = o.customer_id
    AND o.order_date >= '2024-01-01';
 
-- FIX 2: Handle NULL in WHERE with OR IS NULL
-- When you DO need WHERE filtering but want to preserve unmatched:
SELECT c.*, o.*
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
WHERE o.order_date >= '2024-01-01' 
   OR o.order_id IS NULL;  -- Explicitly preserve unmatched
 
-- FIX 3: Pre-filter in a CTE before joining
-- Filter the right table first, then LEFT JOIN
WITH RecentOrders AS (
    SELECT *
    FROM Orders
    WHERE order_date >= '2024-01-01'
)
SELECT c.*, ro.*
FROM Customers c
LEFT JOIN RecentOrders ro ON c.customer_id = ro.customer_id;

Preservation Design Patterns

Over time, several design patterns have emerged for effectively using row preservation in complex queries.

Pattern 1: Hub-and-Spoke

Keep your primary 'hub' table first, and LEFT JOIN all related 'spoke' tables to it:

pattern_hub_spoke.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
-- Hub: Customers (always preserved)
-- Spokes: Orders, Reviews, SupportTickets (optional relationships)
SELECT 
    c.customer_id,
    c.customer_name,
    c.registration_date,
    COUNT(DISTINCT o.order_id) as order_count,
    COUNT(DISTINCT r.review_id) as review_count,
    COUNT(DISTINCT t.ticket_id) as ticket_count
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
LEFT JOIN Reviews r ON c.customer_id = r.customer_id
LEFT JOIN SupportTickets t ON c.customer_id = t.customer_id
GROUP BY c.customer_id, c.customer_name, c.registration_date;
 
-- Every customer appears with counts of their related activities
-- All counts default to 0 for customers with no activity in that area

Pattern 2: Generate-and-Join

Generate a complete set of expected entities, then LEFT JOIN actual data:

pattern_generate_join.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
-- Generate complete date range, join actual sales data
WITH DateRange AS (
    SELECT generate_series(
        '2024-01-01'::date,
        '2024-12-31'::date,
        '1 day'::interval
    )::date as report_date
)
SELECT 
    dr.report_date,
    COALESCE(SUM(s.amount), 0) as daily_sales,
    COUNT(s.sale_id) as transaction_count
FROM DateRange dr
LEFT JOIN Sales s ON DATE(s.sale_timestamp) = dr.report_date
GROUP BY dr.report_date
ORDER BY dr.report_date;
 
-- Every day of 2024 appears, even days with $0 sales
-- No gaps in the time series
 
-- Cross-join for all product-region combinations
WITH AllProducts AS (SELECT product_id, product_name FROM Products),
     AllRegions AS (SELECT region_id, region_name FROM Regions)
SELECT 
    p.product_name,
    r.region_name,
    COALESCE(SUM(s.quantity), 0) as total_sold
FROM AllProducts p
CROSS JOIN AllRegions r
LEFT JOIN Sales s 
    ON p.product_id = s.product_id 
    AND r.region_id = s.region_id
GROUP BY p.product_id, p.product_name, r.region_id, r.region_name
ORDER BY p.product_name, r.region_name;
 
-- Shows all product-region combinations, even with 0 sales

Generate-and-Join for Complete Reports

When you need a complete dimensional analysis (all dates, all products, all regions), generate the dimension combinations first with CROSS JOIN, then LEFT JOIN your fact data. This guarantees no gaps in your report.

Summary: Preserving Unmatched Rows

The ability to preserve unmatched rows is what makes outer joins indispensable for real-world SQL. Understanding this capability deeply enables you to answer business questions that INNER JOIN simply cannot address.

Key Takeaways

•Preservation reflects data model semantics—choose LEFT JOIN when relationships are optional and you need complete data from one side.
•Anti-join pattern (LEFT JOIN + WHERE pk IS NULL) is the standard way to find 'missing' relationships.
•COALESCE is essential for aggregation—wrap SUM/AVG in COALESCE to get 0 instead of NULL for unmatched rows.
•Use COUNT(column), not COUNT(*) to count true matches in outer-joined results.
•ON clause conditions preserve rows; WHERE conditions filter them—know which you need.
•Preservation cascades through join chains—any INNER JOIN on a potentially-NULL column breaks the chain.
•Debug progressively—verify preserved row counts at each join stage to detect when preservation breaks.

What's Next: NULL in Results

The final page in this module focuses entirely on NULL values in outer join results—how to interpret them, when they appear, how to handle them in calculations, and the subtle bugs that NULL handling can introduce.

Page Complete

You now have a deep understanding of row preservation in outer joins—the concept that enables finding missing data, creating gap-free reports, and maintaining complete reference data in your queries. These patterns are fundamental to production SQL.

4 / 5

Loading learning content...

Database Management SystemsSQL Joins & Subqueries

Outer Joins: Preserving Unmatched Data

LevelIntermediate

Duration60 mins

TopicSQL Joins & Subqueries

4 / 5

Preserving Unmatched Rows: The Heart of Outer Joins

Why Unmatched Rows Matter

This capability isn't a minor technical feature; it fundamentally changes the questions you can answer with SQL. With row preservation, you can:

Find what's missing: Products without orders, customers without activity, employees without assignments
Include zeros in reports: Show regions with $0 sales, months with no transactions, categories with no items
Maintain complete reference data: Ensure all master data entities appear even when no transactions reference them
Detect data quality issues: Find orphaned records, broken relationships, incomplete data

This page explores the mechanics, patterns, and advanced techniques for effectively preserving unmatched rows.

Learning Objectives

The Philosophy of Preservation

To truly master outer joins, it helps to understand why the SQL standard includes this capability—what class of problems it was designed to solve.

The Relational Model's Natural Fit:

Relational databases organize data into multiple normalized tables, with foreign keys connecting them. But not every relationship is mandatory:

A customer might have orders, or might not yet
A product might have reviews, or might be new
An employee might be assigned to projects, or might be between assignments

Relationship Optionality and Join Selection
Relationship Type	Example	Appropriate Join
Mandatory on both sides	Order must have Customer, Customer must exist to have Order	INNER JOIN
Mandatory on one side	Order must have Customer, Customer may have 0+ Orders	LEFT JOIN (Customer LEFT JOIN Order)
Optional on both sides	Student may have 0+ Courses, Course may have 0+ Students	FULL OUTER JOIN or separate queries
Self-referential optional	Employee may have Manager (another Employee)	LEFT JOIN (self-join)

Design Intent Drives Join Choice

The Query Intent Perspective:

Beyond data model semantics, consider what the query is trying to answer:

Question Intent	Data to Preserve	Join Strategy
"Show customers and their orders"	Customers	LEFT JOIN (Customers → Orders)
"Show orders with customer details"	Orders	LEFT JOIN (Orders → Customers) or INNER
"Find customers who never ordered"	Unmatched customers	LEFT JOIN + WHERE NULL
"Compare Q1 vs Q2 sales by employee"	Both quarters	FULL OUTER JOIN

The "preserve" side is always the one where you need all records regardless of matches.

The Anti-Join Pattern Deep Dive

One of the most powerful applications of preserved unmatched rows is the anti-join pattern—finding rows in one table that have no corresponding rows in another table.

The Pattern:

SELECT A.*
FROM A
LEFT JOIN B ON A.key = B.foreign_key
WHERE B.primary_key IS NULL;

anti_join_examples.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
-- Classic anti-join: Customers without orders
SELECT 
    c.customer_id,
    c.customer_name,
    c.email,
    c.registration_date
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;
 
-- variation: Count how many never ordered
SELECT COUNT(*) as never_ordered_count
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;
 
-- Products without reviews (to prompt for reviews)
SELECT 
    p.product_id,
    p.product_name,
    p.launch_date,
    DATEDIFF(CURRENT_DATE, p.launch_date) as days_since_launch
FROM Products p
LEFT JOIN Reviews r ON p.product_id = r.product_id
WHERE r.review_id IS NULL
ORDER BY p.launch_date ASC;  -- Oldest products first
 
-- Employees not assigned to any current project
SELECT 
    e.employee_id,
    e.name,
    e.department,
    e.hire_date
FROM Employees e
LEFT JOIN ProjectAssignments pa 
    ON e.employee_id = pa.employee_id
    AND pa.end_date IS NULL  -- Only current assignments
WHERE pa.assignment_id IS NULL
ORDER BY e.department, e.name;

Anti-Join vs Alternative Approaches:

The same result can be achieved with NOT EXISTS or NOT IN. Understanding the trade-offs helps you choose the best approach:

anti_join_alternatives.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
-- Method 1: LEFT JOIN + IS NULL (Anti-Join)
SELECT c.customer_id, c.customer_name
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;
 
-- Method 2: NOT EXISTS (Correlated Subquery)
SELECT c.customer_id, c.customer_name
FROM Customers c
WHERE NOT EXISTS (
    SELECT 1 
    FROM Orders o 
    WHERE o.customer_id = c.customer_id
);
 
-- Method 3: NOT IN (Subquery)
SELECT c.customer_id, c.customer_name
FROM Customers c
WHERE c.customer_id NOT IN (
    SELECT o.customer_id 
    FROM Orders o
    WHERE o.customer_id IS NOT NULL  -- CRITICAL: handle NULLs!
);
 
-- All three return the same result, but:
-- - LEFT JOIN + IS NULL: Often fastest with good indexes
-- - NOT EXISTS: Clear intent, handles NULLs naturally
-- - NOT IN: Simplest syntax, but dangerous with NULLs

NOT IN NULL Danger

Anti-Join Approach Comparison
Approach	NULL Safety	Performance	Readability
LEFT JOIN + IS NULL	Safe (NULLs in B are fine)	Often optimal	Clear once you know the pattern
NOT EXISTS	Safe (NULLs handled correctly)	Usually good	Clearest intent
NOT IN	DANGEROUS with NULLs	Can be slow	Simplest syntax
EXCEPT	Safe	Varies by database	Clear for key-only queries

Aggregation with Preserved Rows

When combining outer joins with aggregate functions, special attention is needed to handle the NULLs that appear for unmatched rows.

The Problem:

Unmatched rows contribute NULL values to aggregation
SUM(NULL) = NULL, AVG(NULL) = NULL, COUNT(column) ignores NULLs
Reports can show NULL instead of 0 for entities with no activity

The Solution: COALESCE and COUNT Awareness

aggregation_with_outer_join.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
-- PROBLEM: NULL appears for departments with no sales
SELECT 
    d.department_name,
    SUM(s.amount) as total_sales  -- NULL for depts with no sales
FROM Departments d
LEFT JOIN Sales s ON d.department_id = s.department_id
GROUP BY d.department_id, d.department_name;
 
-- SOLUTION: Use COALESCE to convert NULL to 0
SELECT 
    d.department_name,
    COALESCE(SUM(s.amount), 0) as total_sales  -- 0 for depts with no sales
FROM Departments d
LEFT JOIN Sales s ON d.department_id = s.department_id
GROUP BY d.department_id, d.department_name;
 
-- COUNT behavior: COUNT(column) vs COUNT(*)
SELECT 
    d.department_name,
    COUNT(*) as row_count,           -- Counts all rows (1 even for no sales)
    COUNT(s.sale_id) as sales_count  -- Counts non-NULL values (0 for no sales)
FROM Departments d
LEFT JOIN Sales s ON d.department_id = s.department_id
GROUP BY d.department_id, d.department_name;
 
-- For a department with no sales:
-- COUNT(*) = 1 (the preserved row is counted)
-- COUNT(s.sale_id) = 0 (the NULL sale_id is not counted)

COUNT(*) vs COUNT(column)

Multi-Level Aggregation:

When joining to already-aggregated data, be careful about where the NULL handling occurs:

multi_level_aggregation.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
-- Pattern: Join master data to pre-aggregated transaction data
 
-- Step 1: Pre-aggregate the transaction data
WITH SalesSummary AS (
    SELECT 
        product_id,
        SUM(quantity) as total_qty,
        SUM(amount) as total_revenue
    FROM Sales
    WHERE sale_date >= '2024-01-01'
    GROUP BY product_id
)
-- Step 2: LEFT JOIN to preserve all products
SELECT 
    p.product_id,
    p.product_name,
    p.category,
    COALESCE(ss.total_qty, 0) as total_qty,
    COALESCE(ss.total_revenue, 0) as total_revenue
FROM Products p
LEFT JOIN SalesSummary ss ON p.product_id = ss.product_id
ORDER BY total_revenue DESC;
 
-- This approach:
-- 1. Aggregates first (efficient, single pass over Sales)
-- 2. Joins to master data (preserves all products)
-- 3. Handles NULLs at the SELECT level
 
-- VERSUS joining first, then aggregating (often less efficient)
SELECT 
    p.product_id,
    p.product_name,
    p.category,
    COALESCE(SUM(s.quantity), 0) as total_qty,
    COALESCE(SUM(s.amount), 0) as total_revenue
FROM Products p
LEFT JOIN Sales s 
    ON p.product_id = s.product_id
    AND s.sale_date >= '2024-01-01'
GROUP BY p.product_id, p.product_name, p.category
ORDER BY total_revenue DESC;

Aggregation Best Practices

•Wrap sums in COALESCE: COALESCE(SUM(column), 0) ensures 0 instead of NULL
•Use COUNT(column), not COUNT(*): Get true match counts for outer-joined columns
•Consider pre-aggregating: CTE-based pre-aggregation is often cleaner and more efficient
•Test with known edge cases: Verify your query returns correct results for entities with no matches

Conditional Preservation with ON Clauses

Outer joins become even more powerful when you add conditions to the ON clause. This allows conditional matching while still preserving all rows from the primary table.

The Key Insight:

Conditions in the ON clause affect which rows count as matches. Rows that don't meet the condition become "unmatched" and get NULL values—but the left table row is still preserved.

This is fundamentally different from filtering in WHERE, which removes rows entirely.

conditional_preservation.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
-- Scenario: Show all products, with only high-value orders (if any)
 
-- Version 1: Condition in ON (correct for preservation)
SELECT 
    p.product_id,
    p.product_name,
    o.order_id,
    o.total_amount
FROM Products p
LEFT JOIN Orders o 
    ON p.product_id = o.product_id
    AND o.total_amount > 1000;  -- Only join high-value orders
 
-- Result: ALL products appear
-- Products with high-value orders: shows order details
-- Products without high-value orders: shows NULL for order columns
-- Products with only low-value orders: shows NULL (not matched)
 
-- Version 2: Condition in WHERE (breaks preservation)
SELECT 
    p.product_id,
    p.product_name,
    o.order_id,
    o.total_amount
FROM Products p
LEFT JOIN Orders o 
    ON p.product_id = o.product_id
WHERE o.total_amount > 1000;  -- WRONG: filters out products without HV orders
 
-- Result: Only products WITH high-value orders appear
-- Products without any orders: EXCLUDED
-- Products with only low-value orders: EXCLUDED

The ON vs WHERE Trap Revisited

Complex Conditional Scenarios:

You can use any valid expression in the ON clause to precisely control what constitutes a "match":

complex_on_conditions.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
-- Time-based conditional join
-- "Show all employees with their current project (if any)"
SELECT 
    e.employee_id,
    e.name,
    p.project_name,
    a.role
FROM Employees e
LEFT JOIN ProjectAssignments a 
    ON e.employee_id = a.employee_id
    AND a.start_date <= CURRENT_DATE
    AND (a.end_date IS NULL OR a.end_date >= CURRENT_DATE)
LEFT JOIN Projects p 
    ON a.project_id = p.project_id;
 
-- All employees appear; only current assignments are joined
 
-- Range-based conditional join
-- "Show all products with their applicable discount (if any)"
SELECT 
    p.product_id,
    p.product_name,
    p.price,
    d.discount_percentage,
    p.price * (1 - COALESCE(d.discount_percentage, 0) / 100) as final_price
FROM Products p
LEFT JOIN Discounts d
    ON p.category_id = d.category_id
    AND p.price BETWEEN d.min_price AND d.max_price
    AND CURRENT_DATE BETWEEN d.valid_from AND d.valid_to;
 
-- All products appear; discount only when all conditions match
 
-- Status-based conditional join
-- "Show all customers with their open tickets (if any)"
SELECT 
    c.customer_id,
    c.customer_name,
    c.tier,
    COUNT(t.ticket_id) as open_ticket_count
FROM Customers c
LEFT JOIN SupportTickets t
    ON c.customer_id = t.customer_id
    AND t.status IN ('Open', 'Pending', 'Escalated')
GROUP BY c.customer_id, c.customer_name, c.tier;
 
-- All customers appear with their open ticket count (0 if none)

Maintaining Preservation Through Join Chains

When you chain multiple joins, the preservation behavior cascades. Understanding how this works is essential for correct multi-table queries.

The Cascade Principle:

LEFT JOIN A to B: All A rows preserved, with B data where available
LEFT JOIN result to C: All rows from step 1 preserved, with C data where available
And so on...

Each subsequent LEFT JOIN preserves the current result set and optionally adds more data.

preservation_chain.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
-- Full preservation chain: Customers → Orders → Items → Products
SELECT 
    c.customer_id,
    c.customer_name,
    o.order_id,
    o.order_date,
    oi.quantity,
    p.product_name,
    p.price
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
LEFT JOIN OrderItems oi ON o.order_id = oi.order_id
LEFT JOIN Products p ON oi.product_id = p.product_id
ORDER BY c.customer_id, o.order_id, oi.item_id;
 
-- Trace for different customer scenarios:
-- 
-- Customer with orders (ID 1):
-- Row 1: (1, 'Alice', 101, '2024-01-15', 2, 'Widget', 29.99)
-- Row 2: (1, 'Alice', 101, '2024-01-15', 1, 'Gadget', 49.99)
-- Row 3: (1, 'Alice', 102, '2024-02-01', 3, 'Widget', 29.99)
-- 
-- Customer with no orders (ID 2):
-- Row 4: (2, 'Bob', NULL, NULL, NULL, NULL, NULL)
-- 
-- All NULLs cascade because:
-- - Customer 2 → No matching order, so o.* = NULL
-- - NULL o.order_id → No matching order items possible, so oi.* = NULL
-- - NULL oi.product_id → No matching product possible, so p.* = NULL

Breaking the Preservation Chain:

Introducing an INNER JOIN anywhere in the chain will filter out rows that accumulated NULLs up to that point.

broken_chain.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
-- BROKEN: INNER JOIN in the middle breaks preservation
SELECT 
    c.customer_id,
    c.customer_name,
    o.order_id,
    oi.quantity,
    p.product_name
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
INNER JOIN OrderItems oi ON o.order_id = oi.order_id  -- BREAKS THE CHAIN
LEFT JOIN Products p ON oi.product_id = p.product_id;
 
-- Problem: The INNER JOIN requires o.order_id IS NOT NULL
-- This excludes all customers who have no orders!
 
-- Analysis:
-- - Customer with no orders has o.order_id = NULL
-- - INNER JOIN OrderItems requires o.order_id = oi.order_id
-- - NULL = anything is never true, so this row is filtered out
 
-- FIX: Use LEFT JOIN consistently
SELECT 
    c.customer_id,
    c.customer_name,
    o.order_id,
    oi.quantity,
    p.product_name
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
LEFT JOIN OrderItems oi ON o.order_id = oi.order_id
LEFT JOIN Products p ON oi.product_id = p.product_id;

The Preservation Rule

Detecting and Debugging Broken Preservation

One of the trickiest bugs with outer joins is accidentally breaking preservation. The query runs without error, but returns fewer rows than expected. Here's how to detect and fix these issues.

Symptoms of Broken Preservation:

Query returns fewer rows than the 'left' table has
Entities with no related data don't appear
Adding more joins reduces row count unexpectedly

debugging_preservation.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
-- Debugging technique: Progressive join verification
 
-- Step 1: Baseline - count of entities to preserve
SELECT COUNT(*) as total_customers FROM Customers;
-- Result: 1000 customers
 
-- Step 2: After first LEFT JOIN
SELECT COUNT(DISTINCT c.customer_id) as preserved_customers
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id;
-- Should still be: 1000 (all customers preserved)
 
-- Step 3: After second JOIN
SELECT COUNT(DISTINCT c.customer_id) as preserved_customers
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
LEFT JOIN OrderItems oi ON o.order_id = oi.order_id;
-- Should still be: 1000
 
-- Step 4: After adding filtering condition
SELECT COUNT(DISTINCT c.customer_id) as preserved_customers
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
LEFT JOIN OrderItems oi ON o.order_id = oi.order_id
WHERE o.order_date >= '2024-01-01';  -- PROBLEM!
-- Result: 750 (only customers with orders since Jan 1)
 
-- The WHERE filter on the outer-joined table broke preservation!

Common Causes of Broken Preservation

•WHERE on outer-joined columns: Filtering on columns from the 'right' table of a LEFT JOIN excludes unmatched rows
•INNER JOIN after LEFT JOIN: An INNER JOIN requiring the previously-NULL column breaks the chain
•Non-null assumptions: Code that assumes joined columns are never NULL crashes or behaves unexpectedly
•Implicit type coercion: Some databases may coerce NULL in ways that fail the join condition
•OR conditions with NULL: col = value OR col = other_value excludes NULLs (neither condition matches NULL)

fixes_for_broken_preservation.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
-- FIX 1: Move filter from WHERE to ON
-- BEFORE (broken):
SELECT c.*, o.*
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
WHERE o.order_date >= '2024-01-01';
 
-- AFTER (fixed):
SELECT c.*, o.*
FROM Customers c
LEFT JOIN Orders o 
    ON c.customer_id = o.customer_id
    AND o.order_date >= '2024-01-01';
 
-- FIX 2: Handle NULL in WHERE with OR IS NULL
-- When you DO need WHERE filtering but want to preserve unmatched:
SELECT c.*, o.*
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
WHERE o.order_date >= '2024-01-01' 
   OR o.order_id IS NULL;  -- Explicitly preserve unmatched
 
-- FIX 3: Pre-filter in a CTE before joining
-- Filter the right table first, then LEFT JOIN
WITH RecentOrders AS (
    SELECT *
    FROM Orders
    WHERE order_date >= '2024-01-01'
)
SELECT c.*, ro.*
FROM Customers c
LEFT JOIN RecentOrders ro ON c.customer_id = ro.customer_id;

Preservation Design Patterns

Over time, several design patterns have emerged for effectively using row preservation in complex queries.

Pattern 1: Hub-and-Spoke

Keep your primary 'hub' table first, and LEFT JOIN all related 'spoke' tables to it:

pattern_hub_spoke.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
-- Hub: Customers (always preserved)
-- Spokes: Orders, Reviews, SupportTickets (optional relationships)
SELECT 
    c.customer_id,
    c.customer_name,
    c.registration_date,
    COUNT(DISTINCT o.order_id) as order_count,
    COUNT(DISTINCT r.review_id) as review_count,
    COUNT(DISTINCT t.ticket_id) as ticket_count
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
LEFT JOIN Reviews r ON c.customer_id = r.customer_id
LEFT JOIN SupportTickets t ON c.customer_id = t.customer_id
GROUP BY c.customer_id, c.customer_name, c.registration_date;
 
-- Every customer appears with counts of their related activities
-- All counts default to 0 for customers with no activity in that area

Pattern 2: Generate-and-Join

Generate a complete set of expected entities, then LEFT JOIN actual data:

pattern_generate_join.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
-- Generate complete date range, join actual sales data
WITH DateRange AS (
    SELECT generate_series(
        '2024-01-01'::date,
        '2024-12-31'::date,
        '1 day'::interval
    )::date as report_date
)
SELECT 
    dr.report_date,
    COALESCE(SUM(s.amount), 0) as daily_sales,
    COUNT(s.sale_id) as transaction_count
FROM DateRange dr
LEFT JOIN Sales s ON DATE(s.sale_timestamp) = dr.report_date
GROUP BY dr.report_date
ORDER BY dr.report_date;
 
-- Every day of 2024 appears, even days with $0 sales
-- No gaps in the time series
 
-- Cross-join for all product-region combinations
WITH AllProducts AS (SELECT product_id, product_name FROM Products),
     AllRegions AS (SELECT region_id, region_name FROM Regions)
SELECT 
    p.product_name,
    r.region_name,
    COALESCE(SUM(s.quantity), 0) as total_sold
FROM AllProducts p
CROSS JOIN AllRegions r
LEFT JOIN Sales s 
    ON p.product_id = s.product_id 
    AND r.region_id = s.region_id
GROUP BY p.product_id, p.product_name, r.region_id, r.region_name
ORDER BY p.product_name, r.region_name;
 
-- Shows all product-region combinations, even with 0 sales

Generate-and-Join for Complete Reports

Summary: Preserving Unmatched Rows

Key Takeaways

•Preservation reflects data model semantics—choose LEFT JOIN when relationships are optional and you need complete data from one side.
•Anti-join pattern (LEFT JOIN + WHERE pk IS NULL) is the standard way to find 'missing' relationships.
•COALESCE is essential for aggregation—wrap SUM/AVG in COALESCE to get 0 instead of NULL for unmatched rows.
•Use COUNT(column), not COUNT(*) to count true matches in outer-joined results.
•ON clause conditions preserve rows; WHERE conditions filter them—know which you need.
•Preservation cascades through join chains—any INNER JOIN on a potentially-NULL column breaks the chain.
•Debug progressively—verify preserved row counts at each join stage to detect when preservation breaks.

What's Next: NULL in Results

Page Complete

4 / 5