Window Functions Basics

Level: Intermediate

Duration: 75 mins

Topic: Window Functions Basics

Window Function Concept

The Analytical Limitation of Traditional SQL

Consider a common business question: "For each sale, show the sale amount alongside the department's total and the sale's percentage of that total."

With traditional SQL, you face an uncomfortable choice. You can use GROUP BY to compute department totals, but then you lose individual sale rows. Or you can keep individual rows, but then you need subqueries or self-joins to bring in aggregate values. Either way, the query becomes convoluted, inefficient, or both.

This limitation persists across countless analytical scenarios:

Ranking items within categories
Computing running totals over time
Comparing each row to group averages
Finding the top N items per group
Calculating moving averages

Window functions solve this fundamental problem. Introduced in SQL:2003 and now supported by all major databases, window functions perform calculations across sets of table rows that are somehow related to the current row—all while preserving the individual row context that GROUP BY destroys.

What You Will Learn

By the end of this page, you will understand what window functions are, why they're necessary, how they fundamentally differ from aggregate functions, and the conceptual model of 'windows' that gives them their name. This foundation is essential before diving into specific syntax and clauses.

The Problem Window Functions Solve

To truly appreciate window functions, we must first understand the limitation they address. Let's examine a concrete scenario that illustrates the gap in traditional SQL.

Scenario: Employee Salary Analysis

Suppose we have an employees table:

emp_id	name	department	salary
1	Alice	Engineering	95000
2	Bob	Engineering	85000
3	Carol	Engineering	90000
4	David	Sales	70000
5	Eve	Sales	75000
6	Frank	HR	60000

The business asks: "For each employee, show their salary, their department's average salary, and how their salary compares to that average."

This requires two things simultaneously:

Row-level detail — Each employee's individual information
Group-level computation — The department average

Traditional SQL forces a choice between these requirements.

group_by_approach.sql
-- GROUP BY gives us department averages...
SELECT department, AVG(salary) as avg_salary
FROM employees
GROUP BY department;
 
-- Result:
-- department  | avg_salary
-- ------------|----------
-- Engineering | 90000
-- Sales       | 72500
-- HR          | 60000
 
-- But we've LOST individual employee rows!
-- We can no longer see Alice, Bob, Carol, etc.

The Collapse Problem

GROUP BY collapses rows into groups. We get one row per department, not one row per employee. The individual employee context is destroyed—exactly what we don't want.
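
For comparison, here is a minimal sketch of the traditional workaround mentioned earlier: compute the averages in a derived table and join them back onto the detail rows. The table definition and column types are assumptions inferred from the sample data above, and the syntax is intended to run on PostgreSQL, MySQL, and SQLite.

```sql
-- Assumed schema, matching the sample employees table above
DROP TABLE IF EXISTS employees;
CREATE TABLE employees (
    emp_id     INTEGER PRIMARY KEY,
    name       VARCHAR(50),
    department VARCHAR(50),
    salary     INTEGER
);

INSERT INTO employees (emp_id, name, department, salary) VALUES
    (1, 'Alice', 'Engineering', 95000),
    (2, 'Bob',   'Engineering', 85000),
    (3, 'Carol', 'Engineering', 90000),
    (4, 'David', 'Sales',       70000),
    (5, 'Eve',   'Sales',       75000),
    (6, 'Frank', 'HR',          60000);

-- Workaround: aggregate in a derived table, then join back to the detail rows
SELECT
    e.name,
    e.department,
    e.salary,
    d.avg_salary,
    e.salary - d.avg_salary AS diff_from_avg
FROM employees e
JOIN (
    SELECT department, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department
) d ON d.department = e.department
ORDER BY e.emp_id;
```

This returns all six employees with their department average, but the employees table is referenced twice and the grouping column has to be repeated in the join condition.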

What is a Window Function?

A window function performs a calculation across a set of table rows that are somehow related to the current row. Unlike aggregate functions with GROUP BY, window functions do not collapse rows—they compute a value for each row based on a "window" of rows related to it.

The formal definition:

A window function computes a value for each row in the query result based on a subset of rows (the window) defined by the OVER clause. The window may include all rows, rows in the same partition, or a frame of rows relative to the current row.

The key insight:

Think of a window function as giving each row "visibility" into other rows around it. Instead of each row being isolated, it can see and perform calculations using values from related rows—without merging with them.

Why 'Window'?

The term 'window' comes from the concept of a sliding window or frame through which each row views a subset of data. Imagine looking through a window that shows you surrounding rows—that's the conceptual model. The window can show your entire partition, or just nearby rows, or even a custom frame.
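
Applied to the employee scenario from earlier, the idea can be sketched as follows. The schema is an assumption inferred from the sample data, and the column aliases dept_avg and diff_from_avg are illustrative names.

```sql
-- Assumed schema, matching the sample employees table above
DROP TABLE IF EXISTS employees;
CREATE TABLE employees (
    emp_id     INTEGER PRIMARY KEY,
    name       VARCHAR(50),
    department VARCHAR(50),
    salary     INTEGER
);

INSERT INTO employees (emp_id, name, department, salary) VALUES
    (1, 'Alice', 'Engineering', 95000),
    (2, 'Bob',   'Engineering', 85000),
    (3, 'Carol', 'Engineering', 90000),
    (4, 'David', 'Sales',       70000),
    (5, 'Eve',   'Sales',       75000),
    (6, 'Frank', 'HR',          60000);

-- Each row keeps its identity and also "sees" its department's average
SELECT
    name,
    department,
    salary,
    AVG(salary) OVER (PARTITION BY department) AS dept_avg,
    salary - AVG(salary) OVER (PARTITION BY department) AS diff_from_avg
FROM employees
ORDER BY emp_id;

-- Values returned (numeric formatting of AVG varies by database):
-- name  | department  | salary | dept_avg | diff_from_avg
-- Alice | Engineering | 95000  | 90000    |  5000
-- Bob   | Engineering | 85000  | 90000    | -5000
-- Carol | Engineering | 90000  | 90000    |     0
-- David | Sales       | 70000  | 72500    | -2500
-- Eve   | Sales       | 75000  | 72500    |  2500
-- Frank | HR          | 60000  | 60000    |     0
```

All six rows survive, and the department average appears beside each one without a join or subquery.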

The anatomy of a window function call:

window_function_anatomy.sql
-- General syntax:
function_name(arguments) OVER (window_specification)
 
-- The window_specification can include:
-- 1. PARTITION BY - defines how to group rows into partitions
-- 2. ORDER BY - defines row ordering within partitions  
-- 3. Frame clause - defines which rows around current row to include
 
-- Full syntax:
function_name(arguments) OVER (
    PARTITION BY column1, column2, ...
    ORDER BY column3, column4, ...
    frame_clause
)
 
-- Examples of increasing complexity:
 
-- Example 1: All rows in entire result set
AVG(salary) OVER ()
 
-- Example 2: All rows in same department
AVG(salary) OVER (PARTITION BY department)
 
-- Example 3: Running total within department
SUM(sales) OVER (PARTITION BY department ORDER BY sale_date)
 
-- Example 4: Moving average over the current row and the 2 rows before it
AVG(price) OVER (ORDER BY date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)

Core Components of Window Functions

•The Function — What calculation to perform (SUM, AVG, ROW_NUMBER, RANK, LAG, etc.). Can be an aggregate function or a specialized window-only function.
•OVER Clause — The defining feature of window functions. Specifies how the 'window' of rows is defined for each calculation.
•PARTITION BY — Optional. Divides rows into groups (partitions). The function calculates independently within each partition.
•ORDER BY — Optional for some functions, required for others. Defines logical ordering within partitions. Critical for running totals and ranking.
•Frame Clause — Optional. Defines a subset (frame) of rows within the partition relative to the current row. Enables moving averages, running totals, etc.

Window Functions vs Aggregate Functions

Understanding the distinction between window functions and aggregate functions is fundamental. They share some functions (like SUM, AVG, COUNT) but operate on fundamentally different paradigms.

The key difference:

Aspect	Aggregate Function (GROUP BY)	Window Function (OVER)
Row Output	One row per group	One row per input row
Row Identity	Lost (rows collapsed)	Preserved
Calculation Scope	Entire group	Window per row
Use Case	Summary reports	Analytical queries

Aggregate Functions (GROUP BY)

•Collapse rows into summary groups
•Return one row per group
•Individual row context is destroyed
•Cannot reference both group aggregate AND individual row values in same output
•Execution: Filter → Group → Aggregate → Filter (HAVING)

Window Functions (OVER)

•Preserve all rows exactly as they are
•Return one value per row
•Individual row context is retained
•Can show individual values alongside group calculations
•Execution: After WHERE, GROUP BY, HAVING → Before ORDER BY

Visual illustration:

Consider three sales rows for the 'North' region:

aggregate_vs_window.sql
-- Sample data:
-- | sale_id | region | amount |
-- |---------|--------|--------|
-- | 1       | North  | 100    |
-- | 2       | North  | 200    |
-- | 3       | North  | 150    |
 
-- AGGREGATE with GROUP BY:
SELECT region, SUM(amount) as total
FROM sales
GROUP BY region;
 
-- Result: 
-- | region | total |
-- |--------|-------|
-- | North  | 450   |
-- 
-- 3 rows → 1 row (collapsed)
-- Individual sales are GONE
 
-- WINDOW FUNCTION:
SELECT 
    sale_id,
    region,
    amount,
    SUM(amount) OVER (PARTITION BY region) as region_total
FROM sales;
 
-- Result:
-- | sale_id | region | amount | region_total |
-- |---------|--------|--------|--------------|
-- | 1       | North  | 100    | 450          |
-- | 2       | North  | 200    | 450          |
-- | 3       | North  | 150    | 450          |
--
-- 3 rows → 3 rows (preserved)
-- Each sale visible WITH its group total

Same Function, Different Behavior

Notice that SUM() is the same function in both cases. The difference is not in the function itself—it's in how it's invoked. With GROUP BY, SUM() is an aggregate function that collapses rows. With OVER, the same SUM() becomes a window function that preserves rows while adding aggregate context to each.
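
The same pattern answers the business question that opened this page: each sale alongside its department total and its percentage of that total. The sketch below uses a hypothetical dept_sales table with made-up figures; the table name, columns, and data are assumptions for illustration only.

```sql
-- Hypothetical table and data, for illustration only
DROP TABLE IF EXISTS dept_sales;
CREATE TABLE dept_sales (
    sale_id    INTEGER PRIMARY KEY,
    department VARCHAR(50),
    amount     INTEGER
);

INSERT INTO dept_sales (sale_id, department, amount) VALUES
    (1, 'Toys',  100),
    (2, 'Toys',  300),
    (3, 'Books',  50),
    (4, 'Books', 150),
    (5, 'Books', 200);

-- Multiplying by 100.0 forces decimal division instead of integer division
SELECT
    sale_id,
    department,
    amount,
    SUM(amount) OVER (PARTITION BY department) AS dept_total,
    ROUND(amount * 100.0 / SUM(amount) OVER (PARTITION BY department), 1) AS pct_of_dept
FROM dept_sales
ORDER BY sale_id;

-- sale_id | department | amount | dept_total | pct_of_dept
-- 1       | Toys       | 100    | 400        | 25.0
-- 2       | Toys       | 300    | 400        | 75.0
-- 3       | Books      | 50     | 400        | 12.5
-- 4       | Books      | 150    | 400        | 37.5
-- 5       | Books      | 200    | 400        | 50.0
```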

Categories of Window Functions

Window functions fall into several categories based on their purpose. Understanding these categories helps you match the right function to your analytical need.

Window Function Categories
Category	Functions	Purpose	ORDER BY Required?
Aggregate	SUM, AVG, COUNT, MIN, MAX	Apply traditional aggregates over window	Optional (affects frame)
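Ranking	ROW_NUMBER, RANK, DENSE_RANK, NTILE	Assign position/rank to rows	Required for meaningful results (enforced by SQL Server and Oracle)
Value/Offset	LAG, LEAD, FIRST_VALUE, LAST_VALUE, NTH_VALUE	Access values from other rows	Required for meaningful results
Distribution	PERCENT_RANK, CUME_DIST, PERCENTILE_CONT, PERCENTILE_DISC	Calculate statistical distributions	Required (PERCENTILE_CONT and PERCENTILE_DISC take it via WITHIN GROUP, and not every database allows them with OVER)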

Category 1: Aggregate Window Functions

These are the familiar aggregate functions (SUM, AVG, COUNT, MIN, MAX) used with OVER instead of GROUP BY. They compute aggregates while preserving individual rows.

aggregate_window_examples.sql
-- Running total
SELECT 
    order_date,
    amount,
    SUM(amount) OVER (ORDER BY order_date) as running_total
FROM orders;
 
-- Percentage of category total
SELECT 
    product_name,
    category,
    sales,
    sales * 100.0 / SUM(sales) OVER (PARTITION BY category) as pct_of_category
FROM products;
 
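-- Moving average over the current row and the 6 rows before it
-- (a true 7-day average only if there is exactly one row per day)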
SELECT
    date,
    price,
    AVG(price) OVER (
        ORDER BY date 
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) as seven_day_avg
FROM stock_prices;

Category 2: Ranking Functions

These assign a numeric rank or position to each row based on ORDER BY criteria. Essential for "top N" queries and competitive rankings.

ranking_window_examples.sql
-- Row number (unique, sequential)
SELECT 
    name,
    score,
    ROW_NUMBER() OVER (ORDER BY score DESC) as position
FROM contestants;
 
-- Rank with gaps (ties get same rank, next rank skipped)
-- The alias is score_rank because RANK is a reserved word in MySQL 8.0+
SELECT 
    name,
    score,
    RANK() OVER (ORDER BY score DESC) as score_rank
FROM contestants;
-- If two people tie for 1st, next is 3rd
 
-- Dense rank (no gaps)
SELECT 
    name,
    score,
    DENSE_RANK() OVER (ORDER BY score DESC) as score_dense_rank
FROM contestants;
-- If two people tie for 1st, next is 2nd
 
-- NTILE (divide into buckets)
SELECT 
    student_name,
    score,
    NTILE(4) OVER (ORDER BY score DESC) as quartile
FROM exam_results;
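
To see how the three numbering functions treat ties, it can help to run them side by side on the same rows. The sketch below assumes a small contestants table with two tied scores; the data is invented for illustration, and the name tie-breaker in ROW_NUMBER is added only to make its output deterministic.

```sql
-- Hypothetical data with a tie for first place
DROP TABLE IF EXISTS contestants;
CREATE TABLE contestants (
    name  VARCHAR(50),
    score INTEGER
);

INSERT INTO contestants (name, score) VALUES
    ('Ana', 95),
    ('Ben', 95),
    ('Cy',  90),
    ('Dee', 85);

SELECT
    name,
    score,
    ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num,          -- always unique
    RANK()       OVER (ORDER BY score DESC)       AS score_rank,       -- ties share a rank, gap follows
    DENSE_RANK() OVER (ORDER BY score DESC)       AS score_dense_rank  -- ties share a rank, no gap
FROM contestants
ORDER BY score DESC, name;

-- name | score | row_num | score_rank | score_dense_rank
-- Ana  | 95    | 1       | 1          | 1
-- Ben  | 95    | 2       | 1          | 1
-- Cy   | 90    | 3       | 3          | 2
-- Dee  | 85    | 4       | 4          | 3
```

Without a tie-breaker in its ORDER BY, ROW_NUMBER still returns unique numbers, but which tied row gets which number is not guaranteed.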

Category 3: Value/Offset Functions

These access values from rows other than the current row—either by relative position (LAG, LEAD) or absolute position (FIRST_VALUE, LAST_VALUE, NTH_VALUE).

value_window_examples.sql
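-- LAG: access previous row's value
-- The third argument (0) is the default returned when no previous row exists,
-- so the first row's revenue_change equals its own revenue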
SELECT 
    date,
    revenue,
    LAG(revenue, 1, 0) OVER (ORDER BY date) as prev_revenue,
    revenue - LAG(revenue, 1, 0) OVER (ORDER BY date) as revenue_change
FROM monthly_revenue;
 
-- LEAD: access next row's value  
SELECT
    flight_id,
    departure_time,
    LEAD(departure_time) OVER (ORDER BY departure_time) as next_departure
FROM flights;
 
-- FIRST_VALUE: value from the first row of the ordered partition
SELECT
    product,
    category,
    price,
    FIRST_VALUE(product) OVER (
        PARTITION BY category 
        ORDER BY price DESC
    ) as most_expensive_in_category
FROM products;
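
LAST_VALUE deserves a separate look, because with ORDER BY and no explicit frame it only sees rows up to the current one. The sketch below uses a hypothetical demo_products table (name and data are assumptions) to contrast the default frame with a frame that spans the whole partition.

```sql
-- Hypothetical table and data, for illustration only
DROP TABLE IF EXISTS demo_products;
CREATE TABLE demo_products (
    product  VARCHAR(50),
    category VARCHAR(50),
    price    INTEGER
);

INSERT INTO demo_products (product, category, price) VALUES
    ('Pen',     'Office',   2),
    ('Stapler', 'Office',  12),
    ('Desk',    'Office', 150);

SELECT
    product,
    price,
    -- Default frame ends at the current row, so this returns the current product
    LAST_VALUE(product) OVER (
        PARTITION BY category
        ORDER BY price DESC
    ) AS last_default_frame,
    -- Explicit frame covers the whole partition, so this returns the cheapest product
    LAST_VALUE(product) OVER (
        PARTITION BY category
        ORDER BY price DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS cheapest_in_category
FROM demo_products
ORDER BY price DESC;

-- product | price | last_default_frame | cheapest_in_category
-- Desk    | 150   | Desk               | Pen
-- Stapler | 12    | Stapler            | Pen
-- Pen     | 2     | Pen                | Pen
```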

Window-Only Functions

ROW_NUMBER, RANK, DENSE_RANK, NTILE, LAG, LEAD, FIRST_VALUE, LAST_VALUE, and NTH_VALUE are window-only functions. They ONLY work with OVER—they cannot be used as aggregate functions with GROUP BY. This is because their semantics depend on row-level positioning that GROUP BY destroys.

The Conceptual Model: How Window Functions Process Data

Understanding how window functions execute helps you reason about their behavior and optimize their use. Here's the conceptual processing model:

Step 1: The query runs normally up through GROUP BY and HAVING

Window functions operate on the result set after WHERE, GROUP BY, and HAVING filters have been applied. They see the post-aggregation rows (if any aggregation occurred).

Step 2: For each row in the result, the database:

Identifies the row's partition (based on PARTITION BY)
Orders rows within the partition (based on ORDER BY)
Defines the frame of rows for calculation (based on frame clause or defaults)
Computes the window function using values in that frame
Associates the computed value with the current row

Step 3: ORDER BY (if present in outer query) sorts final output

Window function calculations are complete before the final ORDER BY clause executes.
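
One consequence of this ordering is that a window function can operate on rows that GROUP BY has already produced. The sketch below assumes a hypothetical region_sales table (name and data are illustrative): the inner SUM is the per-region aggregate, and the outer SUM ... OVER () adds those aggregates up across all result rows.

```sql
-- Hypothetical table and data, for illustration only
DROP TABLE IF EXISTS region_sales;
CREATE TABLE region_sales (
    sale_id INTEGER PRIMARY KEY,
    region  VARCHAR(50),
    amount  INTEGER
);

INSERT INTO region_sales (sale_id, region, amount) VALUES
    (1, 'North', 100),
    (2, 'North', 200),
    (3, 'North', 150),
    (4, 'South',  50),
    (5, 'South', 100);

-- GROUP BY runs first (one row per region); the window then spans those rows
SELECT
    region,
    SUM(amount) AS region_total,
    SUM(SUM(amount)) OVER () AS grand_total,
    ROUND(SUM(amount) * 100.0 / SUM(SUM(amount)) OVER (), 1) AS pct_of_grand_total
FROM region_sales
GROUP BY region
ORDER BY region;

-- region | region_total | grand_total | pct_of_grand_total
-- North  | 450          | 600         | 75.0
-- South  | 150          | 600         | 25.0
```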

Execution Order Implications

Because window functions are evaluated as part of SELECT, after WHERE, GROUP BY, and HAVING have already run, you cannot use window function results in WHERE, GROUP BY, or HAVING clauses. If you need to filter on a window function result, you must wrap the query in a subquery or CTE and filter in the outer query.

filtering_window_results.sql
-- WRONG: Cannot use window function in WHERE
-- (the alias is salary_rank because RANK is a reserved word in MySQL 8.0+)
SELECT name, salary, RANK() OVER (ORDER BY salary DESC) as salary_rank
FROM employees
WHERE RANK() OVER (ORDER BY salary DESC) <= 5; -- ERROR!
 
-- CORRECT: Use subquery or CTE
SELECT * FROM (
    SELECT 
        name, 
        salary, 
        RANK() OVER (ORDER BY salary DESC) as salary_rank
    FROM employees
) ranked
WHERE salary_rank <= 5;
 
-- Or with CTE (cleaner):
WITH ranked_employees AS (
    SELECT 
        name, 
        salary, 
        RANK() OVER (ORDER BY salary DESC) as salary_rank
    FROM employees
)
SELECT * FROM ranked_employees WHERE salary_rank <= 5;
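
The same wrap-then-filter pattern handles "top N per group" once PARTITION BY is added. A minimal sketch, assuming the employees table from the earlier scenario (schema inferred from the sample data) and an illustrative alias salary_pos:

```sql
-- Assumed schema, matching the sample employees table above
DROP TABLE IF EXISTS employees;
CREATE TABLE employees (
    emp_id     INTEGER PRIMARY KEY,
    name       VARCHAR(50),
    department VARCHAR(50),
    salary     INTEGER
);

INSERT INTO employees (emp_id, name, department, salary) VALUES
    (1, 'Alice', 'Engineering', 95000),
    (2, 'Bob',   'Engineering', 85000),
    (3, 'Carol', 'Engineering', 90000),
    (4, 'David', 'Sales',       70000),
    (5, 'Eve',   'Sales',       75000),
    (6, 'Frank', 'HR',          60000);

-- Number employees within each department, highest salary first,
-- then keep the top 2 per department in the outer query
WITH ranked AS (
    SELECT
        name,
        department,
        salary,
        ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_pos
    FROM employees
)
SELECT name, department, salary, salary_pos
FROM ranked
WHERE salary_pos <= 2
ORDER BY department, salary_pos;

-- name  | department  | salary | salary_pos
-- Alice | Engineering | 95000  | 1
-- Carol | Engineering | 90000  | 2
-- Frank | HR          | 60000  | 1
-- Eve   | Sales       | 75000  | 1
-- David | Sales       | 70000  | 2
```

ROW_NUMBER returns exactly N rows per group; swapping in RANK would keep every row tied at the cutoff.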

The "Sliding Window" Mental Model

For each row being processed, imagine a window—a frame—that shows a subset of rows in the partition. The window function looks through this window and computes its value.

Empty OVER(): The window shows ALL rows in the entire result set
OVER (PARTITION BY dept): The window shows all rows in the same department
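OVER (ORDER BY date): The window shows rows from partition start up to the current row, plus any rows that tie with the current row on date (default frame: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
OVER (ORDER BY date ROWS BETWEEN 3 PRECEDING AND 1 FOLLOWING): The window shows up to 5 rows: the 3 rows before the current row, the current row, and the 1 row after it (fewer near the edges of the partition)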
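
The difference between the default frame and an explicit ROWS frame only shows up when the ORDER BY column has ties. The sketch below uses a hypothetical payments table with two rows on the same date; the table, columns, and data are assumptions for illustration.

```sql
-- Hypothetical table and data; two payments share the date 2024-01-02
DROP TABLE IF EXISTS payments;
CREATE TABLE payments (
    pay_id   INTEGER PRIMARY KEY,
    pay_date DATE,
    amount   INTEGER
);

INSERT INTO payments (pay_id, pay_date, amount) VALUES
    (1, '2024-01-01', 10),
    (2, '2024-01-02', 20),
    (3, '2024-01-02', 30),
    (4, '2024-01-03', 40);

SELECT
    pay_id,
    pay_date,
    amount,
    -- Default frame (RANGE): rows tied on pay_date are included together
    SUM(amount) OVER (ORDER BY pay_date) AS range_total,
    -- Explicit ROWS frame with a tie-breaker: strictly one row at a time
    SUM(amount) OVER (
        ORDER BY pay_date, pay_id
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS rows_total
FROM payments
ORDER BY pay_id;

-- pay_id | pay_date   | amount | range_total | rows_total
-- 1      | 2024-01-01 | 10     | 10          | 10
-- 2      | 2024-01-02 | 20     | 60          | 30
-- 3      | 2024-01-02 | 30     | 60          | 60
-- 4      | 2024-01-03 | 40     | 100         | 100
```

Both tied rows get 60 under the default frame because each one's window already includes the other.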

When to Use Window Functions

Window functions shine in analytical scenarios where you need both detail and context. Here are the canonical use cases:

Prime Use Cases for Window Functions

•Ranking and Top-N Queries — Finding the top 5 products per category, ranking employees by performance within departments, percentile calculations.
•Running Totals and Cumulative Calculations — Year-to-date sales, cumulative count of events, running averages over time series.
•Comparisons to Group Aggregates — Showing each sale as percentage of department total, comparing individual performance to team average.
•Row-to-Row Comparisons — Calculating change from previous period, days since last event, comparing to next value in sequence.
•Moving Averages and Sliding Windows — 7-day moving average, 3-month rolling sum, smoothing time series data.
•Deduplication with Row Selection — Selecting the most recent record per entity, keeping only first occurrence of duplicates.
•Gap and Island Analysis — Identifying consecutive sequences, finding missing values, grouping contiguous data.
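
Gap and island analysis is the least obvious entry in this list, so here is a minimal sketch of one common approach: subtracting ROW_NUMBER from a sequential value yields a constant for each consecutive run. The logins table, its integer login_day column, and the data are assumptions for illustration, and the technique as written assumes each day appears at most once.

```sql
-- Hypothetical table: one row per day (as a day number) on which a user logged in
DROP TABLE IF EXISTS logins;
CREATE TABLE logins (
    login_day INTEGER PRIMARY KEY
);

INSERT INTO logins (login_day) VALUES (1), (2), (3), (5), (6), (9);

-- Within a consecutive run, login_day and ROW_NUMBER both grow by 1,
-- so their difference stays constant and identifies the run ("island")
WITH numbered AS (
    SELECT
        login_day,
        login_day - ROW_NUMBER() OVER (ORDER BY login_day) AS island_key
    FROM logins
)
SELECT
    MIN(login_day) AS streak_start,
    MAX(login_day) AS streak_end,
    COUNT(*)       AS streak_length
FROM numbered
GROUP BY island_key
ORDER BY streak_start;

-- streak_start | streak_end | streak_length
-- 1            | 3          | 3
-- 5            | 6          | 2
-- 9            | 9          | 1
```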

Decision Framework: Aggregate vs Window Function

Choosing Between Aggregate and Window Functions
If You Need...	Use	Example
One row per group (summary only)	GROUP BY + Aggregate	Total sales per region
All rows plus group context	Window Function	Each sale with region total
To filter groups	GROUP BY + HAVING	Regions with sales > 1M
To filter on computed position	Window + Subquery	Top 3 per region
Previous/next row values	Window (LAG/LEAD)	Change from last month
Ranking with ties handled	Window (RANK/DENSE_RANK)	Competition standings

Performance Consideration

Window functions can be computationally expensive for large datasets, especially with complex frame specifications. However, they're typically far more efficient than equivalent self-join or correlated subquery approaches. Modern query optimizers handle window functions well, often computing multiple window functions in a single pass through the data.

Database Support and Compatibility

Window functions are part of the SQL:2003 standard and are now supported by all major relational databases. However, there are some differences in specific function availability and syntax:

Window Function Support by Database
Feature	PostgreSQL	MySQL 8.0+	SQL Server	Oracle	SQLite 3.25+
Core window functions	✅ Full	✅ Full	✅ Full	✅ Full	✅ Full
ROW_NUMBER, RANK, DENSE_RANK	✅	✅	✅	✅	✅
LAG, LEAD	✅	✅	✅	✅	✅
FIRST_VALUE, LAST_VALUE	✅	✅	✅	✅	✅
NTH_VALUE	✅	✅	❌	✅	✅
NTILE	✅	✅	✅	✅	✅
Frame clause (ROWS/RANGE)	✅	✅	✅	✅	✅
GROUPS frame	✅ (11+)	❌	❌	✅ (21c+)	✅ (3.28+)
EXCLUDE clause	✅ (11+)	❌	❌	✅ (21c+)	✅ (3.28+)

MySQL Version Caveat

MySQL only added window function support in version 8.0 (released 2018). If you're working with MySQL 5.7 or earlier, window functions are not available—you'll need to use the workarounds (self-joins, correlated subqueries) discussed earlier.

Syntax variations to be aware of:

While the core syntax is standardized, databases have minor variations:

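Named windows: PostgreSQL, MySQL 8.0+, and SQLite (among others) support defining and reusing window definitions with the WINDOW clause
Frame defaults: When ORDER BY is present but no frame clause is given, the standard default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which includes rows that tie with the current row; check your database's documentation before relying on it
NULL handling in ORDER BY: Databases differ in where NULLs sort by default. PostgreSQL and Oracle place them last in ascending order, while MySQL, SQL Server, and SQLite place them first, and not every database accepts NULLS FIRST/LAST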
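
As an illustration of the named-window variation, the sketch below defines one window and reuses it for two functions. It assumes the employees table from the earlier scenario (schema inferred from the sample data); the window name w and the column aliases are illustrative.

```sql
-- Assumed schema, matching the sample employees table above
DROP TABLE IF EXISTS employees;
CREATE TABLE employees (
    emp_id     INTEGER PRIMARY KEY,
    name       VARCHAR(50),
    department VARCHAR(50),
    salary     INTEGER
);

INSERT INTO employees (emp_id, name, department, salary) VALUES
    (1, 'Alice', 'Engineering', 95000),
    (2, 'Bob',   'Engineering', 85000),
    (3, 'Carol', 'Engineering', 90000),
    (4, 'David', 'Sales',       70000),
    (5, 'Eve',   'Sales',       75000),
    (6, 'Frank', 'HR',          60000);

-- The window is defined once in the WINDOW clause and referenced by name
SELECT
    name,
    department,
    salary,
    RANK()      OVER w AS salary_rank,
    SUM(salary) OVER w AS running_salary
FROM employees
WINDOW w AS (PARTITION BY department ORDER BY salary DESC)
ORDER BY department, salary DESC;

-- name  | department  | salary | salary_rank | running_salary
-- Alice | Engineering | 95000  | 1           | 95000
-- Carol | Engineering | 90000  | 2           | 185000
-- Bob   | Engineering | 85000  | 3           | 270000
-- Frank | HR          | 60000  | 1           | 60000
-- Eve   | Sales       | 75000  | 1           | 75000
-- David | Sales       | 70000  | 2           | 145000
```

On databases without the WINDOW clause, the same result can be had by repeating the full OVER (...) specification in each function call.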

Summary: The Window Function Concept

We've established the foundational concepts of window functions. Let's consolidate the key takeaways:

Key Takeaways

•Window functions solve the detail-vs-aggregate dilemma — They let you compute aggregate or positional values while preserving individual row context, something impossible with GROUP BY alone.
•The OVER clause is the defining feature — It specifies the 'window' of rows for each calculation, which can include partitioning, ordering, and frame definitions.
•Window functions preserve all rows — Unlike GROUP BY which collapses rows, window functions add computed values to each row without changing row count.
•Four main categories — Aggregate (SUM, AVG over window), Ranking (ROW_NUMBER, RANK), Value (LAG, LEAD), and Distribution (PERCENT_RANK, CUME_DIST).
•Execution timing matters — Window functions run during SELECT, after WHERE/GROUP BY/HAVING but before ORDER BY. Filtering on window results requires a subquery or CTE.
•Universal database support — All major databases now support window functions (MySQL 8.0+, PostgreSQL, SQL Server, Oracle, SQLite 3.25+).

What's next:

Now that we understand what window functions are and why they exist, we'll dive into the specific clauses that define window behavior. The next page covers the OVER clause in depth—the core syntax element that transforms an ordinary function into a window function.

Page Complete

You now understand the fundamental concept of window functions—one of SQL's most powerful analytical features. You can articulate why they exist, how they differ from aggregate functions, and when to use them. Next, we'll master the OVER clause that brings window functions to life.
