Two SQL queries can ask the exact same question, return identical results, and yet differ in execution time by orders of magnitude. The difference often lies not in what you query, but how you express the conditions.
The way you write WHERE clauses, function calls, and comparisons directly impacts whether the database can use indexes, apply query shortcuts, or must fall back to expensive operations. A single function call around the wrong column can transform a millisecond index seek into a multi-second full table scan.
This page reveals the hidden rules governing SQL expression performance—knowledge that separates queries that fly from queries that crawl.
By the end of this page, you'll understand sargability (the key property determining index usage), master patterns for function-safe expressions, recognize implicit conversion traps, and learn expression rewriting techniques that unlock optimizer power.
Sargability (Search ARGument ABLE) refers to whether an expression allows the database to efficiently seek through an index. A sargable condition can be evaluated using an index seek; a non-sargable condition requires scanning every row.
The Core Principle:
For an expression to be sargable, the indexed column must appear alone on one side of the comparison, without any functions, operators, or transformations applied to it.
```sql
-- Assume index on: orders.order_date

-- SARGABLE: Column alone on one side
SELECT * FROM orders WHERE order_date = '2024-01-15';
SELECT * FROM orders WHERE order_date > '2024-01-01';
SELECT * FROM orders WHERE order_date BETWEEN '2024-01-01' AND '2024-03-31';
-- Execution: Index seek directly to matching values

-- NON-SARGABLE: Function applied to indexed column
SELECT * FROM orders WHERE YEAR(order_date) = 2024;
SELECT * FROM orders WHERE DATE(order_date) = '2024-01-15';
SELECT * FROM orders WHERE UPPER(customer_name) = 'SMITH';
SELECT * FROM orders WHERE order_total + tax > 1000;
-- Execution: Must scan EVERY row, apply function, then compare
```

Why Functions Break Sargability:
An index on order_date is organized by the raw column value. When you apply YEAR(order_date), the database must read every row, compute YEAR() on each order_date value, and only then compare the result to 2024. The index knows where '2024-01-01' is located, but it has no idea where YEAR(order_date) = 2024 begins—the function transforms the value into something the index wasn't built to navigate.
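A quick way to confirm whether a rewrite restored index usage is to compare execution plans. The sketch below uses PostgreSQL's EXPLAIN and EXTRACT syntax; the index name and the exact plan wording are illustrative and will vary by database and by your data.

```sql
-- Non-sargable: the function hides the column from the index
EXPLAIN SELECT * FROM orders
WHERE EXTRACT(YEAR FROM order_date) = 2024;
-- Typical plan: Seq Scan on orders (every row read, filter applied afterwards)

-- Sargable rewrite: the bare column lets the planner seek the index
EXPLAIN SELECT * FROM orders
WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01';
-- Typical plan: Index Scan using idx_orders_order_date (range seek)
```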
| Expression Pattern | Sargable? | Index Usage |
|---|---|---|
| column = value | ✅ Yes | Index seek (equality match) |
| column > value | ✅ Yes | Index range scan |
| column BETWEEN a AND b | ✅ Yes | Index range scan |
| column LIKE 'prefix%' | ✅ Yes | Index range scan |
| FUNCTION(column) = value | ❌ No | Full scan required |
| column LIKE '%suffix' | ❌ No | Full scan required |
| column + 1 = value | ❌ No | Full scan required |
| column1 = column2 | ⚠️ Sometimes | Depends on optimizer |
Having an index doesn't guarantee it will be used. A non-sargable expression on an indexed column forces a full scan despite the index existing. The expression must be written to 'unlock' index access.
Many non-sargable expressions can be rewritten into sargable equivalents without changing the result. The goal is to move the transformation to the value side, leaving the column bare.
Pattern 1: Date/Time Functions → Range Comparisons:
```sql
-- Non-sargable: Function on column
SELECT * FROM orders WHERE YEAR(order_date) = 2024;

-- Sargable: Range comparison
SELECT * FROM orders
WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01';

-- Non-sargable: Extract month
SELECT * FROM orders WHERE MONTH(order_date) = 3;

-- Sargable: Range for a specific year's March (all years would need a union of ranges)
SELECT * FROM orders
WHERE order_date >= '2024-03-01' AND order_date < '2024-04-01';

-- Non-sargable: Date part extraction
SELECT * FROM events WHERE DATE(start_timestamp) = '2024-05-15';

-- Sargable: Full day range
SELECT * FROM events
WHERE start_timestamp >= '2024-05-15 00:00:00'
  AND start_timestamp < '2024-05-16 00:00:00';
```

Pattern 2: String Functions → Pattern Matching or Data Cleanup:
```sql
-- Non-sargable: Case conversion on query
SELECT * FROM customers WHERE UPPER(email) = 'JOHN@EXAMPLE.COM';

-- Better: Store normalized data, query directly
-- (Ensure email is stored as lowercase during INSERT/UPDATE)
SELECT * FROM customers WHERE email = 'john@example.com';

-- Alternative: Functional index (database support varies)
CREATE INDEX idx_customers_email_upper ON customers(UPPER(email));
-- Now UPPER(email) = 'JOHN@EXAMPLE.COM' uses the index

-- Non-sargable: Substring extraction
SELECT * FROM products WHERE SUBSTRING(sku, 1, 3) = 'ABC';

-- Sargable: LIKE with prefix
SELECT * FROM products WHERE sku LIKE 'ABC%';

-- Non-sargable: LIKE with leading wildcard
SELECT * FROM products WHERE product_name LIKE '%phone%';

-- Partial solution: Full-text search index
CREATE FULLTEXT INDEX idx_product_name ON products(product_name);
SELECT * FROM products WHERE MATCH(product_name) AGAINST('phone');
```

Pattern 3: Arithmetic on Columns → Arithmetic on Values:
```sql
-- Non-sargable: Arithmetic on column
SELECT * FROM orders WHERE total * 1.1 > 100;

-- Sargable: Solve for column algebraically
SELECT * FROM orders WHERE total > 100 / 1.1;  -- total > 90.91

-- Non-sargable: Addition on column
SELECT * FROM products WHERE price + shipping > 50;

-- Sargable: Rearrange
SELECT * FROM products WHERE price > 50 - shipping;
-- Note: If shipping varies per row, this doesn't help
-- In that case, consider a computed column or materialized value

-- Non-sargable: Date arithmetic on column
SELECT * FROM subscriptions WHERE created_date + INTERVAL '30 days' < NOW();

-- Sargable: Arithmetic on value
SELECT * FROM subscriptions WHERE created_date < NOW() - INTERVAL '30 days';
```

Treat the comparison as an equation and 'solve for the column.' If FUNCTION(column) = value, can you express it as column = INVERSE_FUNCTION(value)? This moves the function to the value side where it's computed once, not per row.
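As a small additional sketch of the "solve for the column" idea: when the stored data has a known, consistent form, the transformation can be applied to the constant instead of the column. The table and column here are hypothetical, and the rewrite assumes country_code is always stored in uppercase.

```sql
-- Non-sargable: function applied to the indexed column, evaluated per row
SELECT * FROM customers WHERE LOWER(country_code) = 'us';

-- Sargable: the function moves to the constant and is folded once at parse time
SELECT * FROM customers WHERE country_code = UPPER('us');  -- i.e. country_code = 'US'
```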
Implicit type conversion occurs when comparing different data types. The database converts one type to match the other—but which type converts to which can break sargability unexpectedly.
The Conversion Hierarchy:
Databases have type precedence rules. When types mismatch, the lower-precedence type converts to the higher-precedence type. If this converts the column, sargability is lost.
```sql
-- Scenario: phone_number column is VARCHAR(20), indexed

-- Non-sargable: Numeric comparison causes column conversion
SELECT * FROM customers WHERE phone_number = 5551234567;
-- Database converts: CAST(phone_number AS BIGINT) = 5551234567
-- The CAST on the column breaks sargability (and may error on non-numeric data)

-- Sargable: String comparison, no conversion needed
SELECT * FROM customers WHERE phone_number = '5551234567';
-- Direct string comparison uses index

-- Scenario: order_id is INT, parameter is string

-- Usually sargable: String converts to INT (column type preserved)
SELECT * FROM orders WHERE order_id = '12345';
-- Becomes: order_id = CAST('12345' AS INT), i.e. order_id = 12345
-- Column not wrapped, index usable

-- Scenario: created_at is TIMESTAMP, parameter is DATE

-- Behavior varies by database:
SELECT * FROM events WHERE created_at = '2024-01-15';
-- PostgreSQL: '2024-01-15' becomes '2024-01-15 00:00:00', works as expected
-- Some databases: May apply DATE() to created_at, breaking sargability
```

Common Conversion Traps:
| Column Type | Value Type | What Happens | Sargable? |
|---|---|---|---|
| VARCHAR | INT | VARCHAR → INT (column converted) | ❌ No |
| INT | VARCHAR | VARCHAR → INT (value converted) | ✅ Yes |
| DATETIME | DATE | DATE → DATETIME (value converted) | ✅ Usually |
| VARCHAR(10) | VARCHAR(100) | May truncate or pad | ⚠️ Depends |
| CHAR(10) | VARCHAR | CHAR padded, comparison affected | ⚠️ Tricky |
| DECIMAL | FLOAT | Precision differences | ✅ Usually |
Safe Practices:
```sql
-- Always match parameter type to column type:

-- Column: customer_id INT
WHERE customer_id = 12345        -- Good: INT literal
WHERE customer_id = '12345'      -- Usually OK: string to int
WHERE customer_id = :param       -- Ensure param is INT in application

-- Column: email VARCHAR(255)
WHERE email = 'user@example.com' -- Good: String literal

-- Column: price DECIMAL(10,2)
WHERE price = 99.99              -- Good: Numeric literal

-- For prepared statements, use explicit bindings:
-- Java:    ps.setInt(1, customerId);
-- Python:  cursor.execute(sql, (customer_id,))
-- Node.js: connection.query(sql, [customerId])
```

Comparing a VARCHAR column to a numeric value is one of the most common sargability destroyers. The database converts the VARCHAR column to a number for each row, making index usage impossible. Always quote string values: WHERE phone = '555123', not WHERE phone = 555123.
When you must filter on a function result, function-based indexes (also called expression indexes) provide a solution. These indexes pre-compute the function value for each row.
Creating Function-Based Indexes:
```sql
-- PostgreSQL: Expression indexes
CREATE INDEX idx_orders_year ON orders (EXTRACT(YEAR FROM order_date));
-- Now this is sargable:
SELECT * FROM orders WHERE EXTRACT(YEAR FROM order_date) = 2024;

-- Case-insensitive search
CREATE INDEX idx_customers_email_lower ON customers (LOWER(email));
SELECT * FROM customers WHERE LOWER(email) = 'john@example.com';

-- MySQL: Generated columns + index
ALTER TABLE customers ADD email_lower VARCHAR(255)
  GENERATED ALWAYS AS (LOWER(email)) STORED;
CREATE INDEX idx_email_lower ON customers(email_lower);

-- SQL Server: Computed columns + index
ALTER TABLE customers ADD email_lower AS LOWER(email);
CREATE INDEX idx_email_lower ON customers(email_lower);

-- Oracle: Function-based indexes
CREATE INDEX idx_cust_email_upper ON customers(UPPER(email));
```

When to Use Function Indexes:

Function indexes make sense when a query must filter on a function result that can't be rewritten into a sargable form (for example, case-insensitive lookups where you can't normalize the stored data), and when the query runs often enough to justify the extra storage and write overhead described below.
Cost Considerations:
```sql
-- Function indexes have trade-offs:

-- Cost 1: Storage space (duplicates computed values)
-- Cost 2: INSERT/UPDATE overhead (must compute function)
-- Cost 3: Maintenance (additional index to rebuild)

-- Example sizing:
-- Table: 10 million rows, email VARCHAR(255)
-- Function index on LOWER(email): ~300 MB additional storage
-- INSERT overhead: ~0.1ms per insert for function computation

-- Benefit calculation:
-- Queries using LOWER(email): 10,000/day
-- Without index: 2 seconds each (full scan) = 5.5 hours/day
-- With index: 5 ms each = 50 seconds/day
-- Clear win for high-frequency queries
```

Function indexes only work when the query expression exactly matches the indexed expression. CREATE INDEX ON (LOWER(email)) helps LOWER(email) = 'x' but NOT TRIM(LOWER(email)) = 'x' or email = UPPER('x'). Match the expression precisely.
OR expressions present special optimization challenges. While AND conditions narrow results (making indexes more useful), OR conditions widen results and can prevent index usage entirely.
OR on Same Column (Usually Optimizable):
```sql
-- OR on same column often converts to IN
SELECT * FROM orders
WHERE status = 'pending'
   OR status = 'processing'
   OR status = 'shipped';

-- Equivalent and often faster:
SELECT * FROM orders
WHERE status IN ('pending', 'processing', 'shipped');

-- Optimizer typically handles this, but explicit IN is cleaner
-- Both can use index on 'status' via multiple seeks or a range scan
```

OR on Different Columns (Problematic):
```sql
-- OR on different columns breaks simple index usage
SELECT * FROM customers
WHERE email = 'john@example.com' OR phone = '555-1234';

-- With separate indexes on email and phone:
-- Option 1: Full scan (can't use either index cleanly)
-- Option 2: Index union (scan both indexes, combine results)

-- Force index union (if optimizer doesn't choose it):
-- PostgreSQL
SELECT * FROM customers WHERE email = 'john@example.com'
UNION
SELECT * FROM customers WHERE phone = '555-1234';

-- This uses the index on email for the first query, the index on phone for the second
-- UNION automatically removes duplicates

-- For large result sets, the union can be faster than a full scan
```

OR with AND Mixed (Precedence Issues):
```sql
-- AND has higher precedence than OR
-- This expression:
WHERE status = 'active' AND region = 'West' OR region = 'East'

-- Is parsed as:
WHERE (status = 'active' AND region = 'West') OR region = 'East'
-- Returns: Active customers in West, OR ANY customer in East

-- Probably intended:
WHERE status = 'active' AND (region = 'West' OR region = 'East')
-- Returns: Active customers in West or East

-- Cleaner still with IN:
WHERE status = 'active' AND region IN ('West', 'East')

-- For complex OR conditions, consider UNION:
SELECT * FROM customers WHERE status = 'active' AND region = 'West'
UNION ALL
SELECT * FROM customers WHERE status = 'active' AND region = 'East';
```

| Pattern | Problem | Solution |
|---|---|---|
| OR on same column | Verbose | Convert to IN clause |
| OR on different indexed columns | Can't use single index | UNION of separate queries |
| OR with NULL check | NULL handling complexity | COALESCE or rewrite logic |
| OR in JOIN condition | Complex join handling | Restructure as separate joins with UNION |
| Nested OR/AND | Precedence confusion | Use explicit parentheses |
When converting OR to UNION, use UNION ALL if you're certain queries don't produce duplicates. UNION ALL is faster because it doesn't need to deduplicate. UNION removes duplicates (like the OR semantic), but with sort/hash overhead.
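To make the trade-off concrete, here is a sketch using the earlier examples; the specific rows are hypothetical, and the point is whether one row can satisfy both branches.

```sql
-- A customer whose email AND phone both match satisfies the original OR once,
-- but appears in both branches of the rewrite, so duplicates must be removed:
SELECT * FROM customers WHERE email = 'john@example.com'
UNION                      -- deduplicates (matches OR semantics, costs a sort/hash)
SELECT * FROM customers WHERE phone = '555-1234';

-- A row can't be in both 'West' and 'East', so the branches are disjoint
-- and UNION ALL is safe (and cheaper, with no deduplication step):
SELECT * FROM customers WHERE status = 'active' AND region = 'West'
UNION ALL
SELECT * FROM customers WHERE status = 'active' AND region = 'East';
```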
NULL handling in SQL follows three-valued logic (TRUE, FALSE, UNKNOWN) and can create both correctness and performance issues. Understanding NULL behavior prevents subtle bugs and enables better optimization.
NULL in Comparisons:
```sql
-- NULL is not equal to anything, including itself
SELECT * FROM customers WHERE discount = NULL;   -- Returns 0 rows always!
SELECT * FROM customers WHERE discount <> NULL;  -- Also returns 0 rows!

-- Correct NULL checks:
SELECT * FROM customers WHERE discount IS NULL;      -- Matches NULL values
SELECT * FROM customers WHERE discount IS NOT NULL;  -- Excludes NULL values

-- NULL with equality: UNKNOWN result fails WHERE
-- WHERE NULL = NULL → UNKNOWN → row excluded
-- WHERE NULL <> 1  → UNKNOWN → row excluded

-- Safe equality with COALESCE:
SELECT * FROM orders
WHERE COALESCE(discount_code, '') = COALESCE(:param, '');
-- Treats NULL as empty string for comparison
```

NULL-Safe Equality Operators:
Some databases provide NULL-safe comparison operators:
```sql
-- MySQL: NULL-safe equality (<=>)
SELECT * FROM customers WHERE discount <=> :param;
-- Returns TRUE if both NULL, or both equal non-NULL

-- PostgreSQL: IS NOT DISTINCT FROM
SELECT * FROM customers WHERE discount IS NOT DISTINCT FROM :param;
-- Same behavior: NULL matches NULL

-- SQL Standard: IS [NOT] DISTINCT FROM
SELECT * FROM orders WHERE status IS NOT DISTINCT FROM 'pending';

-- Without NULL-safe operators, use the verbose form:
SELECT * FROM customers
WHERE (discount = :param)
   OR (discount IS NULL AND :param IS NULL);
```

NULL and Indexes:
```sql
-- PostgreSQL, MySQL: NULLs included in B-tree indexes
CREATE INDEX idx_discount ON orders(discount);
SELECT * FROM orders WHERE discount IS NULL;  -- Uses index

-- Oracle: NULLs NOT in B-tree indexes (single-column)
-- This query cannot use a single-column index:
SELECT * FROM orders WHERE discount IS NULL;  -- Full scan

-- Oracle workaround: Composite index includes NULLs
CREATE INDEX idx_discount ON orders(discount, 0);
-- Now NULL discount values are indexed

-- Partial index for IS NOT NULL (PostgreSQL):
CREATE INDEX idx_orders_discount_nonnull ON orders(discount)
  WHERE discount IS NOT NULL;
-- Smaller index, only for non-NULL queries
```

Performance Impact of COALESCE/NVL:
```sql
-- COALESCE on column breaks sargability:
SELECT * FROM orders WHERE COALESCE(discount, 0) > 0;
-- Cannot use index on discount (function applied)

-- Sargable alternative:
SELECT * FROM orders WHERE discount > 0;
-- NULL values naturally excluded (NULL > 0 is UNKNOWN)

-- If you need NULLs treated as a specific value, consider:
-- 1. Store the default value instead of NULL
-- 2. Use a computed column: discount_effective AS COALESCE(discount, 0)
-- 3. Use a partial index or expression index
```

COUNT(*) counts all rows. COUNT(column) counts non-NULL values only. This semantic difference affects results when NULLs exist. For counting rows, use COUNT(*). For counting values present, use COUNT(column).
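A short illustration of the difference, assuming an orders table where some rows have a NULL discount (the counts shown are hypothetical):

```sql
SELECT COUNT(*)        AS all_rows,         -- counts every row, NULL or not
       COUNT(discount) AS discounted_rows   -- counts only rows with a non-NULL discount
FROM orders;
-- If 3 of 10 orders have a NULL discount: all_rows = 10, discounted_rows = 7
```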
The optimizer simplifies many expressions, but complex logic can hide opportunities. Manual simplification makes code clearer and helps the optimizer.
Constant Folding:
The optimizer pre-computes expressions involving only constants:
```sql
-- These are computed once at parse time:
WHERE created_at > CURRENT_DATE - INTERVAL '30 days'
-- Computed once: '2024-01-15' (example)

WHERE price > 100 * 1.1 + 5
-- Computed once: 115

-- But watch out for volatile functions:
WHERE created_at > CURRENT_TIMESTAMP - INTERVAL '1 hour'
-- CURRENT_TIMESTAMP may be evaluated once or per row
-- Behavior varies by database - check documentation
```

Redundant Condition Elimination:
```sql
-- Optimizer removes redundant conditions:
WHERE age > 18 AND age >= 10
-- Simplified to: WHERE age > 18 (> 18 implies >= 10)

WHERE status = 'active' AND status = 'active'
-- Simplified to: WHERE status = 'active'

-- But may not simplify complex redundancy:
WHERE category_id = 5 AND category_name = 'Electronics'
-- Both kept even if category_name is functionally dependent on category_id
-- Manual simplification helps if you know the dependency

-- Impossible conditions can short-circuit:
WHERE status = 'active' AND status = 'inactive'
-- Optimizer may detect this returns 0 rows (contradiction)
```

CASE Expression Optimization:
```sql
-- CASE in WHERE can prevent index usage:
SELECT * FROM orders
WHERE CASE
    WHEN :param_type = 'id' THEN order_id = :param_value
    WHEN :param_type = 'status' THEN status = :param_value
END;
-- This is one statement with a complex predicate - hard to optimize

-- Better: Use application logic to build dynamic SQL
-- Or use OR with static conditions:
SELECT * FROM orders
WHERE (:param_type = 'id' AND order_id = :param_value)
   OR (:param_type = 'status' AND status = :param_value);
-- Still not great, but more explicit

-- Best: Separate queries per type in the application
```

```javascript
if (paramType === 'id') {
  query = 'SELECT * FROM orders WHERE order_id = ?';
} else {
  query = 'SELECT * FROM orders WHERE status = ?';
}
```

IN List Optimization:
```sql
-- Small IN lists are fast:
WHERE status IN ('A', 'B', 'C')  -- Typically 3 index seeks or a range scan

-- Very large IN lists can be problematic:
WHERE product_id IN (1, 2, 3, ..., 10000)
-- Parse time increases
-- May exceed parameter limits
-- Consider alternatives:

-- Option 1: Temporary table
CREATE TEMPORARY TABLE filter_ids (id INT PRIMARY KEY);
INSERT INTO filter_ids VALUES (1), (2), ... (10000);
SELECT * FROM products p JOIN filter_ids f ON f.id = p.product_id;

-- Option 2: VALUES clause (PostgreSQL, modern MySQL)
SELECT * FROM products p
WHERE p.product_id IN (
    SELECT * FROM (VALUES (1), (2), (3), ...) AS v(id)
);

-- Option 3: Array parameter (PostgreSQL)
SELECT * FROM products WHERE product_id = ANY(:ids_array);
```

Most databases have practical limits on IN clause size (SQL Server: 2,100 parameters; Oracle: 1,000 values in an IN list). For large value sets, use temporary tables or array parameters. The join approach is often faster than long IN lists anyway.
While optimizers reorder predicates for efficiency, understanding evaluation order helps write clearer queries and avoid errors.
Optimizer Predicate Reordering:
```sql
-- You write:
WHERE department = 'Engineering'
  AND salary > 100000
  AND active = true

-- Optimizer may evaluate in a different order based on:
-- 1. Selectivity: Most selective predicate first (eliminates most rows)
-- 2. Index availability: Indexed predicates preferred
-- 3. Cost: Cheaper predicates evaluated first

-- If 'active' is most selective and indexed, optimizer may use:
-- 1. Index scan on active = true (100 rows)
-- 2. Filter by department = 'Engineering' (50 rows)
-- 3. Filter by salary > 100000 (10 rows)

-- Your written order doesn't constrain the optimizer
```

Short-Circuit Evaluation (Limited in SQL):
```sql
-- Unlike programming languages, SQL doesn't guarantee short-circuit evaluation

-- The guard written first doesn't guarantee the division is skipped:
WHERE quantity <> 0 AND amount / quantity > 10
-- The SQL standard allows evaluating predicates in any order
-- If the division is evaluated before the guard: division-by-zero error possible

-- Safe version using CASE ensures order:
WHERE CASE
    WHEN amount IS NULL THEN FALSE
    WHEN quantity = 0 THEN FALSE
    ELSE amount / quantity > 10
END

-- Or use NULLIF to avoid division by zero:
WHERE amount IS NOT NULL AND amount / NULLIF(quantity, 0) > 10
-- NULLIF(quantity, 0) returns NULL if quantity is 0
-- NULL / anything = NULL, and NULL > 10 = UNKNOWN = excluded
```

Filter Pushdown in CTEs and Subqueries:
```sql
-- Optimizer may push filters into CTEs/subqueries

WITH expensive_calculation AS (
    SELECT customer_id,
           SUM(total) AS lifetime_value,
           COUNT(*) AS order_count
    FROM orders
    GROUP BY customer_id
)
SELECT * FROM expensive_calculation
WHERE customer_id = 12345;

-- Smart optimizer pushes the filter into the CTE:
-- Only aggregates for customer 12345

-- But materialization prevents pushdown:
WITH expensive_calculation AS MATERIALIZED (  -- PostgreSQL
    SELECT customer_id, ...
)
SELECT * FROM expensive_calculation
WHERE customer_id = 12345;
-- CTE computed for ALL customers first, then filtered
```

| Practice | Why It Helps |
|---|---|
| Put indexed columns in predicates | Enables index usage |
| Avoid functions on columns | Preserves sargability |
| Use IS NOT NULL explicitly | Documents intent, helps optimizer |
| Separate expensive predicates | May enable parallel evaluation |
| Match column types in comparisons | Avoids implicit conversion |
| Use CASE for order-dependent logic | Guarantees evaluation order |
The order you write predicates is for human readability. Put the most important or restrictive conditions first for clarity, but know the optimizer will reorder based on its cost analysis. Write for humans; let the optimizer write for the database.
How you write SQL expressions directly impacts query performance. The patterns and principles in this page apply to every query you write. Let's consolidate the key insights:

- Keep indexed columns bare: move functions and arithmetic to the value side of the comparison so conditions stay sargable.
- Rewrite date/time extractions as range comparisons and substring checks as prefix LIKE patterns.
- Match parameter types to column types; implicit conversion of the column disables index usage.
- When a function filter can't be rewritten, use a function-based (expression) index that exactly matches the query expression.
- Convert OR on the same column to IN, and consider UNION for OR across different indexed columns.
- Use IS NULL / IS NOT NULL (or NULL-safe operators) instead of = NULL, and avoid wrapping indexed columns in COALESCE.
- Write predicates for readability; the optimizer chooses the evaluation order, so use CASE or NULLIF when order matters for correctness.
Module Complete:
You've completed the Query Optimization Tips module. You now understand how to avoid SELECT *, limit result sets efficiently, write performant joins and subqueries, and craft expressions that unlock optimizer power. These techniques, applied consistently, transform slow queries into fast ones and prevent performance problems before they occur.
Congratulations! You've mastered practical query optimization techniques. From column selection to expression crafting, you now have the knowledge to write SQL that performs efficiently at any scale. Apply these patterns in your daily work, and slow queries will become rare exceptions rather than daily frustrations.