SQL Query Writing for Interviews

Level: Advanced

Duration: 90 mins

Topic: SQL Query Writing

Window Functions

The Power of Analytical Queries

Window functions are among the most powerful features in modern SQL: they compute values across related rows without collapsing the result set. They replace many queries that once required self-joins, correlated subqueries, or procedural code, and are usually both shorter and faster than those alternatives.

In technical interviews, window function mastery is a strong differentiator. Candidates who fluently apply ROW_NUMBER, LAG/LEAD, running totals, and moving averages demonstrate advanced SQL thinking that impresses interviewers and solves real business problems.

What You Will Learn

By the end of this page, you will understand window function anatomy and execution model, master ranking functions (ROW_NUMBER, RANK, DENSE_RANK, NTILE), apply aggregate window functions for running totals and moving averages, use offset functions (LAG, LEAD, FIRST_VALUE, LAST_VALUE) for row comparisons, and construct sophisticated frame specifications for precise calculations.

Window Function Fundamentals

A window function performs a calculation across a set of rows related to the current row, called the window; PARTITION BY, ORDER BY, and the frame clause determine which rows belong to it. Unlike GROUP BY, which collapses each group into one row, a window function keeps every input row and adds the computed value as an extra column.
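
As a minimal sketch of that difference, the query below runs a window SUM over three inline rows. The orders_demo name and its values are made up for illustration and are not part of the schema used elsewhere on this page.

```sql
-- Hypothetical inline data: three orders across two customers
WITH orders_demo(order_id, customer_id, total) AS (
    VALUES (1, 'A', 50), (2, 'A', 30), (3, 'B', 20)
)
-- GROUP BY customer_id would return two rows (A = 80, B = 20).
-- The window version keeps all three rows and repeats the group total.
SELECT
    order_id,
    customer_id,
    total,
    SUM(total) OVER (PARTITION BY customer_id) AS customer_total
FROM orders_demo
ORDER BY order_id;
-- order_id | customer_id | total | customer_total
--        1 | A           |    50 |             80
--        2 | A           |    30 |             80
--        3 | B           |    20 |             20
```

Because every row survives, the row-level value and the group-level value can be compared in the same SELECT, which is what the percentage-of-total queries later in this section rely on.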

The Window Function Anatomy:

function_name(expression) OVER (
    [PARTITION BY partition_expression, ...]
    [ORDER BY sort_expression [ASC|DESC], ...]
    [frame_clause]
)

Window Function Components
Component	Purpose	Optional?	Example
OVER ()	Declares this is a window function	Required	SUM(amount) OVER ()
PARTITION BY	Divides rows into groups for calculation	Yes	PARTITION BY department
ORDER BY	Defines row order within partition	Depends on the function (see note below)	ORDER BY hire_date
Frame Clause	Specifies which rows relative to current	Yes	ROWS BETWEEN ...

ORDER BY in Window Functions

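ORDER BY inside OVER() is independent of the ORDER BY at the end of the query. The window ORDER BY determines which rows count as 'before' and 'after' the current row for the calculation; it does not sort the final output. Ranking and offset functions need it to produce meaningful results (some databases, such as SQL Server, reject them without it), while aggregates switch from a whole-partition result to a cumulative one when it is present.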
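
The sketch below illustrates that independence with made-up inline data (sales_demo is a hypothetical name): the running total is accumulated in sale_day order, while the output is sorted by amount.

```sql
-- Hypothetical inline data: one sale per day
WITH sales_demo(sale_day, amount) AS (
    VALUES (1, 30), (2, 10), (3, 20)
)
SELECT
    sale_day,
    amount,
    -- Window ORDER BY: accumulate in day order
    SUM(amount) OVER (ORDER BY sale_day) AS running_total
FROM sales_demo
-- Query ORDER BY: controls only the presentation order
ORDER BY amount;
-- sale_day | amount | running_total
--        2 |     10 |            40
--        3 |     20 |            60
--        1 |     30 |            30
```

The running_total values look out of order in the output, but each one is still correct for its own sale_day.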

window_basics.sql
-- Basic window function: Add overall total to each row
SELECT 
    order_id,
    customer_id,
    total,
    SUM(total) OVER () AS grand_total,
    ROUND(100.0 * total / SUM(total) OVER (), 2) AS pct_of_total
FROM orders;
 
-- With PARTITION BY: Calculations within groups
SELECT 
    order_id,
    customer_id,
    total,
    SUM(total) OVER (PARTITION BY customer_id) AS customer_total,
    ROUND(
        100.0 * total / SUM(total) OVER (PARTITION BY customer_id), 
        2
    ) AS pct_of_customer_total
FROM orders;
 
-- With ORDER BY: Running calculations
SELECT 
    order_id,
    customer_id,
    order_date,
    total,
    SUM(total) OVER (
        PARTITION BY customer_id 
        ORDER BY order_date
    ) AS running_total
FROM orders
ORDER BY customer_id, order_date;
 
-- Multiple window functions in one query
SELECT 
    employee_id,
    department_id,
    salary,
    AVG(salary) OVER (PARTITION BY department_id) AS dept_avg,
    salary - AVG(salary) OVER (PARTITION BY department_id) AS diff_from_avg,
    MAX(salary) OVER (PARTITION BY department_id) AS dept_max,
    -- 100.0 forces decimal division; integer salaries would otherwise truncate
    ROUND(100.0 * salary / MAX(salary) OVER (PARTITION BY department_id), 2) AS pct_of_max
FROM employees;

Ranking Functions

Ranking functions assign ordinal positions to rows based on ORDER BY criteria. Understanding the differences between ranking functions is crucial for selecting the right one for your use case.

Ranking Functions Comparison
Function	Ties Handling	Result for Values (100, 100, 90, 80)	Use Case
ROW_NUMBER()	Distinct ranks (arbitrary for ties)	1, 2, 3, 4	Top-N per group, pagination
RANK()	Same rank, skips after ties	1, 1, 3, 4	Competition ranking, sparse
DENSE_RANK()	Same rank, no skips	1, 1, 2, 3	Competition ranking, continuous
NTILE(n)	Ignores ties; distributes rows into n near-equal buckets	1, 2, 3, 4 (for NTILE(4)); 1, 1, 2, 2 (for NTILE(2))	Percentiles, quartiles
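
To make the comparison concrete, here is a small sketch that runs the four functions over the same values (100, 100, 90, 80). The scores table and its id column are hypothetical; id is used only as a tiebreaker so that ROW_NUMBER and NTILE give a repeatable result.

```sql
-- Hypothetical inline data matching the values in the table above
WITH scores(id, score) AS (
    VALUES (1, 100), (2, 100), (3, 90), (4, 80)
)
SELECT
    id,
    score,
    ROW_NUMBER() OVER (ORDER BY score DESC, id) AS row_num,
    RANK()       OVER (ORDER BY score DESC)     AS rnk,
    DENSE_RANK() OVER (ORDER BY score DESC)     AS dense_rnk,
    NTILE(2)     OVER (ORDER BY score DESC, id) AS half
FROM scores
ORDER BY id;
-- id | score | row_num | rnk | dense_rnk | half
--  1 |   100 |       1 |   1 |         1 |    1
--  2 |   100 |       2 |   1 |         1 |    1
--  3 |    90 |       3 |   3 |         2 |    2
--  4 |    80 |       4 |   4 |         3 |    2
```

Note that the tiebreaker is added only where ties must be broken. Adding id to the ORDER BY of RANK or DENSE_RANK would remove the ties and make them behave like ROW_NUMBER.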

ranking_functions.sql
-- All ranking functions compared
SELECT 
    employee_id,
    department_id,
    salary,
    ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num,
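    -- Aliases avoid the bare names rank and dense_rank, which are reserved words in some databases (e.g. MySQL 8)
    RANK() OVER (ORDER BY salary DESC) AS salary_rank,
    DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_dense_rank,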
    NTILE(4) OVER (ORDER BY salary DESC) AS quartile
FROM employees;
 
-- ROW_NUMBER for Top-N per group (most common interview pattern!)
WITH ranked_employees AS (
    SELECT 
        employee_id,
        name,
        department_id,
        salary,
        ROW_NUMBER() OVER (
            PARTITION BY department_id 
            ORDER BY salary DESC, employee_id  -- tiebreaker keeps the result deterministic
        ) AS dept_rank
    FROM employees
)
SELECT * 
FROM ranked_employees
WHERE dept_rank <= 3;  -- Top 3 per department
 
-- RANK for handling ties fairly
SELECT 
    product_id,
    product_name,
    total_sales,
    RANK() OVER (ORDER BY total_sales DESC) AS sales_rank
FROM (
    SELECT 
        p.product_id,
        p.product_name,
        SUM(oi.quantity * oi.unit_price) AS total_sales
    FROM products p
    JOIN order_items oi ON p.product_id = oi.product_id
    GROUP BY p.product_id, p.product_name
) product_sales
WHERE total_sales > 0;
 
-- DENSE_RANK for continuous ranking
SELECT 
    player_name,
    score,
    DENSE_RANK() OVER (ORDER BY score DESC) AS place
FROM tournament_results;
 
-- NTILE for percentile buckets
SELECT 
    customer_id,
    total_spending,
    CASE NTILE(5) OVER (ORDER BY total_spending DESC)
        WHEN 1 THEN 'Top 20%'
        WHEN 2 THEN '20-40%'
        WHEN 3 THEN '40-60%'
        WHEN 4 THEN '60-80%'
        WHEN 5 THEN 'Bottom 20%'
    END AS spending_tier
FROM (
    SELECT customer_id, SUM(total) AS total_spending
    FROM orders
    GROUP BY customer_id
) customer_totals;
 
-- PERCENT_RANK and CUME_DIST for precise percentiles
SELECT 
    employee_id,
    salary,
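    -- Both functions return double precision; PostgreSQL's two-argument ROUND needs NUMERIC
    ROUND((100 * PERCENT_RANK() OVER (ORDER BY salary))::NUMERIC, 2) AS percentile,
    ROUND((100 * CUME_DIST() OVER (ORDER BY salary))::NUMERIC, 2) AS cumulative_dist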
FROM employees;

Interview Essential: Top-N Per Group

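The pattern is ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) in a CTE or subquery, followed by WHERE rank <= N in the outer query. The outer query is required because window functions are evaluated after WHERE, so a rank cannot be filtered at the level where it is computed. This is one of the most frequently asked interview patterns and appears in many variations, so practice it until it is automatic. Swap in RANK or DENSE_RANK when tied rows should all be returned.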
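
One common variation is deduplication: keep only the most recent row per key, which is Top-N per group with N = 1. The sketch below assumes a hypothetical customers_demo data set with an updated_at column; neither appears in the schema used elsewhere on this page.

```sql
-- Hypothetical inline data: two rows share the same email
WITH customers_demo(customer_id, email, updated_at) AS (
    VALUES
        (1, 'a@example.com', DATE '2024-01-01'),
        (2, 'a@example.com', DATE '2024-03-01'),
        (3, 'b@example.com', DATE '2024-02-01')
),
ranked AS (
    SELECT
        customer_id,
        email,
        updated_at,
        ROW_NUMBER() OVER (
            PARTITION BY email
            -- Newest first; customer_id breaks ties deterministically
            ORDER BY updated_at DESC, customer_id DESC
        ) AS rn
    FROM customers_demo
)
SELECT customer_id, email, updated_at
FROM ranked
WHERE rn = 1          -- keep one row per email
ORDER BY email;
-- customer_id | email         | updated_at
--           2 | a@example.com | 2024-03-01
--           3 | b@example.com | 2024-02-01
```

ROW_NUMBER is the right choice here because exactly one survivor per key is wanted; RANK could return several rows when timestamps tie.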

Aggregate Window Functions

Standard aggregate functions (SUM, AVG, COUNT, MIN, MAX) become dramatically more powerful when used as window functions. They can compute running totals, moving averages, cumulative counts, and group comparisons—all while preserving row-level detail.

Key Behavior Change:

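Without ORDER BY in OVER(), an aggregate is computed over the entire partition. With ORDER BY and no explicit frame, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: the aggregate becomes cumulative, covering every row from the start of the partition through the current row and any rows that tie with it on the ORDER BY value.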
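
The effect of ties on the default frame is easy to miss, so here is a small sketch using made-up inline data (payments is a hypothetical name) in which two rows share the same pay_day.

```sql
-- Hypothetical inline data: two payments on day 2
WITH payments(pay_day, amount) AS (
    VALUES (1, 10), (2, 20), (2, 30), (3, 40)
)
SELECT
    pay_day,
    amount,
    -- Default frame (RANGE): tied rows are included together
    SUM(amount) OVER (ORDER BY pay_day) AS default_frame_sum,
    -- Explicit ROWS frame with a tiebreaker: strictly row by row
    SUM(amount) OVER (
        ORDER BY pay_day, amount
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS rows_frame_sum
FROM payments
ORDER BY pay_day, amount;
-- pay_day | amount | default_frame_sum | rows_frame_sum
--       1 |     10 |                10 |             10
--       2 |     20 |                60 |             30
--       2 |     30 |                60 |             60
--       3 |     40 |               100 |            100
```

Both day-2 rows show 60 under the default frame because each one's frame already contains the other. If a strictly incremental running total is expected, spell out the ROWS frame and add a unique tiebreaker to the ORDER BY.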

aggregate_windows.sql
-- Partition-wide aggregates (no ORDER BY)
SELECT 
    order_id,
    customer_id,
    order_date,
    total,
    COUNT(*) OVER (PARTITION BY customer_id) AS customer_order_count,
    AVG(total) OVER (PARTITION BY customer_id) AS customer_avg_order,
    SUM(total) OVER (PARTITION BY customer_id) AS customer_lifetime_value
FROM orders;
 
-- Running totals (with ORDER BY)
SELECT 
    order_date,
    total,
    SUM(total) OVER (ORDER BY order_date) AS running_total,
    AVG(total) OVER (ORDER BY order_date) AS running_avg,
    COUNT(*) OVER (ORDER BY order_date) AS cumulative_count
FROM orders
WHERE customer_id = 1001
ORDER BY order_date;
 
-- Running totals partitioned by group
SELECT 
    customer_id,
    order_date,
    total,
    SUM(total) OVER (
        PARTITION BY customer_id 
        ORDER BY order_date
    ) AS customer_running_total,
    ROW_NUMBER() OVER (
        PARTITION BY customer_id 
        ORDER BY order_date
    ) AS customer_order_number
FROM orders
ORDER BY customer_id, order_date;
 
-- Comparative analysis: Each value vs group statistics
SELECT 
    department_id,
    employee_id,
    salary,
    AVG(salary) OVER (PARTITION BY department_id) AS dept_avg,
    MIN(salary) OVER (PARTITION BY department_id) AS dept_min,
    MAX(salary) OVER (PARTITION BY department_id) AS dept_max,
    ROUND(
        -- 100.0 comes first so integer salaries are not truncated by integer division
        100.0 * (salary - MIN(salary) OVER (PARTITION BY department_id)) /
        NULLIF(
            MAX(salary) OVER (PARTITION BY department_id) - 
            MIN(salary) OVER (PARTITION BY department_id), 
            0
        ), 
        2
    ) AS position_in_range_pct
FROM employees;
 
-- Year-to-date (YTD) totals: the running sum restarts with each year's partition
SELECT 
    EXTRACT(YEAR FROM order_date) AS year,
    EXTRACT(MONTH FROM order_date) AS month,
    SUM(total) AS monthly_total,
    SUM(SUM(total)) OVER (
        PARTITION BY EXTRACT(YEAR FROM order_date)
        ORDER BY EXTRACT(MONTH FROM order_date)
    ) AS ytd_total
FROM orders
GROUP BY EXTRACT(YEAR FROM order_date), EXTRACT(MONTH FROM order_date)
ORDER BY year, month;

Double Aggregation Pattern

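Notice the SUM(SUM(total)) pattern for YTD totals: the inner SUM is the regular GROUP BY aggregate, and the outer SUM is a window function applied to the grouped rows. This works because window functions are evaluated after GROUP BY. The pattern is common when you need both grouped and running calculations in one query.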
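
If the nested form is hard to read, the same result can be written in two steps: aggregate in a CTE, then apply the window function to the aggregated rows. The sketch below uses made-up inline data; orders_demo, monthly, yr, and mon are names chosen for illustration.

```sql
-- Hypothetical inline data: four orders across two years
WITH orders_demo(order_date, total) AS (
    VALUES
        (DATE '2023-01-15', 100), (DATE '2023-01-20', 50),
        (DATE '2023-02-10', 70),  (DATE '2024-01-05', 40)
),
-- Step 1: the ordinary GROUP BY aggregate (the "inner SUM")
monthly AS (
    SELECT
        EXTRACT(YEAR FROM order_date)  AS yr,
        EXTRACT(MONTH FROM order_date) AS mon,
        SUM(total) AS monthly_total
    FROM orders_demo
    GROUP BY EXTRACT(YEAR FROM order_date), EXTRACT(MONTH FROM order_date)
)
-- Step 2: the window function over the grouped rows (the "outer SUM")
SELECT
    yr,
    mon,
    monthly_total,
    SUM(monthly_total) OVER (PARTITION BY yr ORDER BY mon) AS ytd_total
FROM monthly
ORDER BY yr, mon;
-- yr   | mon | monthly_total | ytd_total
-- 2023 |   1 |           150 |       150
-- 2023 |   2 |            70 |       220
-- 2024 |   1 |            40 |        40
```

The two forms are equivalent; the CTE version is often easier to explain aloud in an interview because each step has a name.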

Offset Functions: LAG, LEAD, and Value Access

Offset functions access values from other rows relative to the current row—enabling comparisons, change calculations, and gap filling without self-joins.

Offset Window Functions
Function	Returns	Syntax	Use Case
LAG()	Value from previous row(s)	LAG(col, offset, default)	Period-over-period change
LEAD()	Value from following row(s)	LEAD(col, offset, default)	Forecast, gap detection
FIRST_VALUE()	First value in window frame	FIRST_VALUE(col)	First in group (the minimum only when ordered by that column)
LAST_VALUE()	Last value in window frame	LAST_VALUE(col)	Latest in group (needs an explicit frame; see the gotcha below)
NTH_VALUE()	Nth value in window frame	NTH_VALUE(col, n)	Specific position value

offset_functions.sql
-- LAG: Compare with previous row
SELECT 
    order_date,
    daily_revenue,
    LAG(daily_revenue, 1) OVER (ORDER BY order_date) AS prev_day_revenue,
    daily_revenue - LAG(daily_revenue, 1) OVER (ORDER BY order_date) AS day_over_day_change,
    ROUND(
        100.0 * (daily_revenue - LAG(daily_revenue, 1) OVER (ORDER BY order_date)) /
        NULLIF(LAG(daily_revenue, 1) OVER (ORDER BY order_date), 0),
        2
    ) AS pct_change
FROM (
    SELECT order_date::DATE AS order_date, SUM(total) AS daily_revenue
    FROM orders
    GROUP BY order_date::DATE
) daily_totals
ORDER BY order_date;
 
-- LAG with default value for first row
SELECT 
    customer_id,
    order_date,
    total,
    LAG(total, 1, 0) OVER (
        PARTITION BY customer_id 
        ORDER BY order_date
    ) AS prev_order_value
FROM orders;
 
-- LEAD: Look ahead
SELECT 
    customer_id,
    order_date,
    LEAD(order_date, 1) OVER (
        PARTITION BY customer_id 
        ORDER BY order_date
    ) AS next_order_date,
    -- Cast to DATE so the difference is a whole number of days (PostgreSQL)
    LEAD(order_date::DATE, 1) OVER (
        PARTITION BY customer_id 
        ORDER BY order_date
    ) - order_date::DATE AS days_until_next_order
FROM orders;
 
-- Gap detection with LEAD
SELECT *
FROM (
    SELECT 
        customer_id,
        order_date,
        LEAD(order_date) OVER (
            PARTITION BY customer_id 
            ORDER BY order_date
        ) AS next_order,
        -- Cast to DATE so gap_days is an integer that can be compared with 90
        LEAD(order_date::DATE) OVER (
            PARTITION BY customer_id 
            ORDER BY order_date
        ) - order_date::DATE AS gap_days
    FROM orders
) with_gaps
WHERE gap_days > 90;  -- Gaps of more than 90 days
 
-- FIRST_VALUE and LAST_VALUE
SELECT 
    employee_id,
    department_id,
    hire_date,
    salary,
    FIRST_VALUE(salary) OVER (
        PARTITION BY department_id 
        ORDER BY hire_date
    ) AS first_hire_salary,
    LAST_VALUE(salary) OVER (
        PARTITION BY department_id 
        ORDER BY hire_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS latest_hire_salary
FROM employees;
 
-- NTH_VALUE for specific positions
SELECT 
    product_id,
    product_name,
    category_id,
    unit_price,
    NTH_VALUE(unit_price, 2) OVER (
        PARTITION BY category_id 
        ORDER BY unit_price DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS second_highest_price
FROM products;
 
-- Streak flag: mark orders placed within 7 days of the customer's previous order
WITH order_gaps AS (
    SELECT 
        customer_id,
        order_date,
        -- Cast to DATE so days_since_last is an integer number of days
        order_date::DATE - LAG(order_date::DATE) OVER (
            PARTITION BY customer_id 
            ORDER BY order_date
        ) AS days_since_last
    FROM orders
)
SELECT 
    customer_id,
    order_date,
    days_since_last,
    CASE WHEN days_since_last <= 7 THEN 1 ELSE 0 END AS in_streak
FROM order_gaps;
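
Gap filling, mentioned at the start of this section, needs a variable look-back that a fixed LAG offset cannot express. One common workaround, sketched here for databases where LAG ... IGNORE NULLS is not available, is to count non-NULL values to form groups and then spread each group's single known value. The readings data set and its column names are hypothetical.

```sql
-- Hypothetical inline data: NULLs should inherit the last known value
WITH readings(reading_id, val) AS (
    VALUES (1, 10), (2, NULL), (3, NULL), (4, 40), (5, NULL)
),
grouped AS (
    SELECT
        reading_id,
        val,
        -- COUNT(val) skips NULLs, so the counter only advances on known values:
        -- rows 1-3 get grp = 1, rows 4-5 get grp = 2
        COUNT(val) OVER (ORDER BY reading_id) AS grp
    FROM readings
)
SELECT
    reading_id,
    val,
    -- Each group holds exactly one non-NULL value; MAX picks it up
    MAX(val) OVER (PARTITION BY grp) AS filled_val
FROM grouped
ORDER BY reading_id;
-- reading_id | val  | filled_val
--          1 |   10 |         10
--          2 | NULL |         10
--          3 | NULL |         10
--          4 |   40 |         40
--          5 | NULL |         40
```

Leading NULLs (before the first known value) land in group 0 and stay NULL, which is usually the desired behavior.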

LAST_VALUE Frame Gotcha

With ORDER BY and no explicit frame, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so LAST_VALUE returns the current row's value (or that of the last row tied with it), not the last value in the partition. To get the true last value, specify ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
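
A short side-by-side sketch makes the gotcha visible. The hires data set below is made up for illustration, with one hire per hire_day so that no ties complicate the picture.

```sql
-- Hypothetical inline data: three hires in date order
WITH hires(employee_id, hire_day, salary) AS (
    VALUES (1, 1, 5000), (2, 2, 6000), (3, 3, 7000)
)
SELECT
    employee_id,
    salary,
    -- Default frame ends at the current row
    LAST_VALUE(salary) OVER (ORDER BY hire_day) AS last_default_frame,
    -- Explicit frame spans the whole partition
    LAST_VALUE(salary) OVER (
        ORDER BY hire_day
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS last_full_frame
FROM hires
ORDER BY hire_day;
-- employee_id | salary | last_default_frame | last_full_frame
--           1 |   5000 |               5000 |            7000
--           2 |   6000 |               6000 |            7000
--           3 |   7000 |               7000 |            7000
```

An alternative that sidesteps the frame entirely is FIRST_VALUE with the ORDER BY direction reversed, since the default frame always starts at the first row of the partition.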

Frame Specifications: Precise Window Control

Frame specifications define exactly which rows are included in the window for each row's calculation. This enables moving averages, bounded running totals, and precise analytical computations.

Frame Clause Syntax:

{ROWS | RANGE | GROUPS} BETWEEN start_bound AND end_bound

Frame Bound Options
Bound	Meaning	Example
UNBOUNDED PRECEDING	First row of partition	All previous rows
N PRECEDING	N rows before current	3 PRECEDING = 3 rows back
CURRENT ROW	The current row	Include current row
N FOLLOWING	N rows after current	2 FOLLOWING = 2 rows ahead
UNBOUNDED FOLLOWING	Last row of partition	All remaining rows

frame_specifications.sql
-- Moving average over the last 7 rows (equals 7 days only if every date has a row)
SELECT 
    order_date,
    daily_revenue,
    AVG(daily_revenue) OVER (
        ORDER BY order_date
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS moving_avg_7day
FROM (
    SELECT order_date::DATE AS order_date, SUM(total) AS daily_revenue
    FROM orders
    GROUP BY order_date::DATE
) daily_totals
ORDER BY order_date;
 
-- Moving sum: Current row plus 2 before and 2 after (5-row window)
-- Assumes daily_totals (order_date, daily_revenue) exists as a table, view, or CTE
SELECT 
    order_date,
    daily_revenue,
    SUM(daily_revenue) OVER (
        ORDER BY order_date
        ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING
    ) AS centered_5day_sum
FROM daily_totals;
 
-- Running total with bounded look-back
SELECT 
    month,
    revenue,
    SUM(revenue) OVER (
        ORDER BY month
        ROWS BETWEEN 11 PRECEDING AND CURRENT ROW
    ) AS trailing_12_month_revenue
FROM monthly_revenue;
 
-- Different frame types comparison
SELECT 
    sale_date,
    amount,
    category,
    -- ROWS: Physical row count
    SUM(amount) OVER (
        ORDER BY sale_date
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS rows_sum,
    -- RANGE: Logical value range (same date = same group)
    SUM(amount) OVER (
        ORDER BY sale_date
        RANGE BETWEEN INTERVAL '2 days' PRECEDING AND CURRENT ROW
    ) AS range_sum
FROM sales;
 
-- Excluding current row in calculation
SELECT 
    employee_id,
    department_id,
    salary,
    AVG(salary) OVER (
        PARTITION BY department_id
        ORDER BY employee_id
        ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
    ) AS avg_of_others_before_me
FROM employees;
 
-- Forward-looking calculations
-- Assumes daily_totals (order_date, daily_revenue) exists as a table, view, or CTE
SELECT 
    order_date,
    daily_revenue,
    AVG(daily_revenue) OVER (
        ORDER BY order_date
        ROWS BETWEEN CURRENT ROW AND 6 FOLLOWING
    ) AS next_7day_avg
FROM daily_totals;
 
-- Full frame for partition-wide value
SELECT 
    employee_id,
    department_id,
    salary,
    MAX(salary) OVER (
        PARTITION BY department_id
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS dept_max_salary
FROM employees;

ROWS vs RANGE vs GROUPS

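ROWS counts physical rows relative to the current row. RANGE measures offsets in ORDER BY values, so rows that tie with the current row (peers) always fall in the frame together. GROUPS (added in SQL:2011) counts peer groups, meaning sets of rows with equal ORDER BY values. For most calculations, ROWS is the most predictable choice. Use RANGE or GROUPS when rows with equal ORDER BY values should be treated identically; check support first, since GROUPS is not available in every database.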
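
The three frame units can be compared directly on a small data set. The sketch below uses hypothetical inline data (sales_demo) with a tie on day 1 and a missing day 3, and assumes a database that supports GROUPS (for example PostgreSQL 11 or later).

```sql
-- Hypothetical inline data: tie on day 1, no row for day 3
WITH sales_demo(sale_day, amount) AS (
    VALUES (1, 10), (1, 20), (2, 30), (4, 40)
)
SELECT
    sale_day,
    amount,
    -- ROWS: the previous physical row plus the current row
    SUM(amount) OVER (
        ORDER BY sale_day, amount
        ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
    ) AS rows_sum,
    -- RANGE: every row whose sale_day is within 1 of the current sale_day
    SUM(amount) OVER (
        ORDER BY sale_day
        RANGE BETWEEN 1 PRECEDING AND CURRENT ROW
    ) AS range_sum,
    -- GROUPS: the previous peer group plus the current peer group
    SUM(amount) OVER (
        ORDER BY sale_day
        GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW
    ) AS groups_sum
FROM sales_demo
ORDER BY sale_day, amount;
-- sale_day | amount | rows_sum | range_sum | groups_sum
--        1 |     10 |       10 |        30 |         30
--        1 |     20 |       30 |        30 |         30
--        2 |     30 |       50 |        60 |         60
--        4 |     40 |       70 |        40 |         70
```

The last row shows all three behaviors at once: ROWS and GROUPS reach back to day 2 because it is the previous row or group, while RANGE looks for day 3, finds nothing, and returns only the current row's amount.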

Named Windows and Optimization

When using multiple window functions with the same specification, named windows reduce redundancy and improve clarity. Understanding execution also helps write efficient queries.

named_windows.sql
-- Without named window (repetitive)
SELECT 
    employee_id,
    department_id,
    salary,
    ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary DESC) AS dept_rank,
    SUM(salary) OVER (PARTITION BY department_id ORDER BY salary DESC) AS running_sum,
    AVG(salary) OVER (PARTITION BY department_id ORDER BY salary DESC) AS running_avg
FROM employees;
 
-- With named window (cleaner)
SELECT 
    employee_id,
    department_id,
    salary,
    ROW_NUMBER() OVER dept_salary_window AS dept_rank,
    SUM(salary) OVER dept_salary_window AS running_sum,
    AVG(salary) OVER dept_salary_window AS running_avg
FROM employees
WINDOW dept_salary_window AS (
    PARTITION BY department_id 
    ORDER BY salary DESC
);
 
-- Multiple named windows
SELECT 
    order_id,
    customer_id,
    order_date,
    total,
    -- Customer-level calculations (running, because the window has ORDER BY)
    SUM(total) OVER customer_window AS customer_running_total,
    ROW_NUMBER() OVER customer_window AS customer_order_num,
    -- Date-level calculations  
    SUM(total) OVER date_window AS daily_total,
    COUNT(*) OVER date_window AS daily_order_count
FROM orders
WINDOW 
    customer_window AS (PARTITION BY customer_id ORDER BY order_date),
    date_window AS (PARTITION BY order_date::DATE);
 
-- Window refinement: Extending a named window
SELECT 
    employee_id,
    department_id,
    salary,
    hire_date,
    -- Base window has no ORDER BY, so this is the whole-department average
    AVG(salary) OVER base_window AS dept_avg,
    -- Extend the base window with ORDER BY and a frame for a running sum
    SUM(salary) OVER (
        base_window 
        ORDER BY hire_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_sum
FROM employees
WINDOW base_window AS (PARTITION BY department_id);

Window Function Performance Tips

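• Index to match the window: an index on the PARTITION BY columns followed by the ORDER BY columns can let the database skip a sort
• Minimize window variations: functions that share the same PARTITION BY and ORDER BY can share one sort
• Prefer ROWS over RANGE when either works: ROWS counts physical rows and avoids peer-group checks, which is often cheaper
• Filter early: apply WHERE conditions in a CTE or subquery so the window step processes fewer rows, but only when the removed rows are not needed by the window calculation
• Avoid DISTINCT inside window aggregates: several databases (PostgreSQL among them) reject COUNT(DISTINCT ...) OVER, and where it is supported it tends to be expensive
• Consider materialized views: useful for frequently used window calculations on large tables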
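
As a rough sketch of the indexing tip, the statements below create a hypothetical employees_demo table and an index whose columns mirror a window's PARTITION BY and ORDER BY. The table and index names are made up, and whether the planner actually uses the index depends on the database, its statistics, and the table size, so treat the EXPLAIN step as something to check rather than a guarantee.

```sql
-- Hypothetical table, for illustration only
CREATE TABLE employees_demo (
    employee_id   INT PRIMARY KEY,
    department_id INT NOT NULL,
    salary        NUMERIC(10, 2) NOT NULL
);

-- Index columns mirror PARTITION BY first, then ORDER BY
CREATE INDEX idx_employees_demo_dept_salary
    ON employees_demo (department_id, salary DESC);

-- Inspect the plan: check whether a separate sort step still
-- appears before the window calculation
EXPLAIN
SELECT
    employee_id,
    department_id,
    salary,
    ROW_NUMBER() OVER (
        PARTITION BY department_id
        ORDER BY salary DESC
    ) AS dept_rank
FROM employees_demo;
```

On a small or empty table the planner may still prefer a sequential scan plus sort, so the comparison is only meaningful on realistic data volumes.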

Advanced Window Function Patterns

Beyond basic usage, window functions enable sophisticated analytical patterns that frequently appear in interviews and real-world analytics.

Islands and Gaps: Finding Consecutive Sequences

This classic pattern identifies groups of consecutive values (islands) separated by gaps:

islands_gaps.sql
-- Find consecutive login streaks per user
WITH numbered AS (
    SELECT 
        user_id,
        login_date,
        login_date - (ROW_NUMBER() OVER (
            PARTITION BY user_id 
            ORDER BY login_date
        ) * INTERVAL '1 day') AS grp
    FROM (SELECT DISTINCT user_id, login_date::DATE AS login_date FROM logins) t
),
streaks AS (
    SELECT 
        user_id,
        MIN(login_date) AS streak_start,
        MAX(login_date) AS streak_end,
        COUNT(*) AS streak_length
    FROM numbered
    GROUP BY user_id, grp
)
SELECT * FROM streaks
WHERE streak_length >= 3  -- At least 3 consecutive days
ORDER BY user_id, streak_start;
 
-- Find gaps in sequential IDs
WITH id_gaps AS (
    SELECT 
        order_id,
        LEAD(order_id) OVER (ORDER BY order_id) AS next_id,
        LEAD(order_id) OVER (ORDER BY order_id) - order_id AS gap
    FROM orders
)
SELECT 
    order_id AS gap_starts_after,
    next_id AS gap_ends_before,
    gap - 1 AS missing_count
FROM id_gaps
WHERE gap > 1
ORDER BY order_id;
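
The streak query works because subtracting a row number from a date yields the same value for every date in an unbroken run. The trace below shows that intermediate value for a single user, using made-up inline data (logins_demo) and PostgreSQL's date-minus-integer arithmetic in place of the INTERVAL multiplication used above.

```sql
-- Hypothetical inline data: a 3-day run, a gap, then a 2-day run
WITH logins_demo(login_date) AS (
    VALUES (DATE '2024-01-01'), (DATE '2024-01-02'), (DATE '2024-01-03'),
           (DATE '2024-01-05'), (DATE '2024-01-06')
),
numbered AS (
    SELECT
        login_date,
        ROW_NUMBER() OVER (ORDER BY login_date) AS rn
    FROM logins_demo
)
SELECT
    login_date,
    rn,
    -- Consecutive dates advance in step with rn, so the difference is constant
    login_date - rn::INT AS grp
FROM numbered
ORDER BY login_date;
-- login_date | rn | grp
-- 2024-01-01 |  1 | 2023-12-31
-- 2024-01-02 |  2 | 2023-12-31
-- 2024-01-03 |  3 | 2023-12-31
-- 2024-01-05 |  4 | 2024-01-01
-- 2024-01-06 |  5 | 2024-01-01
```

The grp value itself has no meaning; it only serves as a grouping key. Grouping by it gives one streak of length 3 and one of length 2, which is why duplicate dates must be removed first: a repeated date would advance rn without advancing the date and split the streak.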

Summary: Window Function Mastery

You've now mastered SQL window functions—one of the most powerful features for analytical queries and a strong differentiator in technical interviews.

Key Takeaways:

Core Concepts Mastered

• Window Function Structure: the OVER() clause, with optional PARTITION BY, ORDER BY, and frame specification, defines the calculation window.
• Ranking Functions: ROW_NUMBER (unique ranks), RANK (gaps after ties), DENSE_RANK (no gaps), and NTILE (buckets) each serve specific use cases.
• Aggregate Windows: SUM, AVG, and COUNT as window functions enable running totals, moving averages, and group comparisons without losing row detail.
• Offset Functions: LAG/LEAD for row comparisons and FIRST_VALUE/LAST_VALUE for boundary values are essential for time-series analysis.
• Frame Specifications: ROWS/RANGE/GROUPS BETWEEN defines exactly which rows contribute to each calculation.
• Advanced Patterns: islands and gaps (consecutive streaks and missing IDs) show how ROW_NUMBER and LEAD solve real-world sequence problems.

What's Next:

With window function mastery complete, we'll conclude with Query Optimization in the final page—techniques for writing efficient SQL, understanding query plans, and demonstrating performance awareness in interviews.

Page Complete

You now command the full power of SQL window functions. From basic rankings to analytical patterns like islands and gaps, these skills enable elegant solutions to problems that would otherwise require complex procedural code or multiple queries.

4 / 5

Loading learning content...

Database Management SystemsSQL Query Writing

SQL Query Writing for Interviews

LevelAdvanced

Duration90 mins

TopicSQL Query Writing

4 / 5

Window Functions

The Power of Analytical Queries

What You Will Learn

Window Function Fundamentals

The Window Function Anatomy:

function_name(expression) OVER (
    [PARTITION BY partition_expression, ...]
    [ORDER BY sort_expression [ASC|DESC], ...]
    [frame_clause]
)

Window Function Components
Component	Purpose	Optional?	Example
OVER ()	Declares this is a window function	Required	SUM(amount) OVER ()
PARTITION BY	Divides rows into groups for calculation	Yes	PARTITION BY department
ORDER BY	Defines row order within partition	Sometimes*	ORDER BY hire_date
Frame Clause	Specifies which rows relative to current	Yes	ROWS BETWEEN ...

ORDER BY in Window Functions

window_basics.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
-- Basic window function: Add overall total to each row
SELECT 
    order_id,
    customer_id,
    total,
    SUM(total) OVER () AS grand_total,
    ROUND(100.0 * total / SUM(total) OVER (), 2) AS pct_of_total
FROM orders;
 
-- With PARTITION BY: Calculations within groups
SELECT 
    order_id,
    customer_id,
    total,
    SUM(total) OVER (PARTITION BY customer_id) AS customer_total,
    ROUND(
        100.0 * total / SUM(total) OVER (PARTITION BY customer_id), 
        2
    ) AS pct_of_customer_total
FROM orders;
 
-- With ORDER BY: Running calculations
SELECT 
    order_id,
    customer_id,
    order_date,
    total,
    SUM(total) OVER (
        PARTITION BY customer_id 
        ORDER BY order_date
    ) AS running_total
FROM orders
ORDER BY customer_id, order_date;
 
-- Multiple window functions in one query
SELECT 
    employee_id,
    department_id,
    salary,
    AVG(salary) OVER (PARTITION BY department_id) AS dept_avg,
    salary - AVG(salary) OVER (PARTITION BY department_id) AS diff_from_avg,
    MAX(salary) OVER (PARTITION BY department_id) AS dept_max,
    salary / MAX(salary) OVER (PARTITION BY department_id) AS pct_of_max
FROM employees;

Ranking Functions

Ranking functions assign ordinal positions to rows based on ORDER BY criteria. Understanding the differences between ranking functions is crucial for selecting the right one for your use case.

Ranking Functions Comparison
Function	Ties Handling	Result for Values (100, 100, 90, 80)	Use Case
ROW_NUMBER()	Distinct ranks (arbitrary for ties)	1, 2, 3, 4	Top-N per group, pagination
RANK()	Same rank, skips after ties	1, 1, 3, 4	Competition ranking, sparse
DENSE_RANK()	Same rank, no skips	1, 1, 2, 3	Competition ranking, continuous
NTILE(n)	Distributes into n buckets	1, 1, 2, 3 (for NTILE(4))	Percentiles, quartiles

ranking_functions.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
-- All ranking functions compared
SELECT 
    employee_id,
    department_id,
    salary,
    ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num,
    RANK() OVER (ORDER BY salary DESC) AS rank,
    DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank,
    NTILE(4) OVER (ORDER BY salary DESC) AS quartile
FROM employees;
 
-- ROW_NUMBER for Top-N per group (most common interview pattern!)
WITH ranked_employees AS (
    SELECT 
        employee_id,
        name,
        department_id,
        salary,
        ROW_NUMBER() OVER (
            PARTITION BY department_id 
            ORDER BY salary DESC
        ) AS dept_rank
    FROM employees
)
SELECT * 
FROM ranked_employees
WHERE dept_rank <= 3;  -- Top 3 per department
 
-- RANK for handling ties fairly
SELECT 
    product_id,
    product_name,
    total_sales,
    RANK() OVER (ORDER BY total_sales DESC) AS sales_rank
FROM (
    SELECT 
        p.product_id,
        p.product_name,
        SUM(oi.quantity * oi.unit_price) AS total_sales
    FROM products p
    JOIN order_items oi ON p.product_id = oi.product_id
    GROUP BY p.product_id, p.product_name
) product_sales
WHERE total_sales > 0;
 
-- DENSE_RANK for continuous ranking
SELECT 
    player_name,
    score,
    DENSE_RANK() OVER (ORDER BY score DESC) AS place
FROM tournament_results;
 
-- NTILE for percentile buckets
SELECT 
    customer_id,
    total_spending,
    CASE NTILE(5) OVER (ORDER BY total_spending DESC)
        WHEN 1 THEN 'Top 20%'
        WHEN 2 THEN '20-40%'
        WHEN 3 THEN '40-60%'
        WHEN 4 THEN '60-80%'
        WHEN 5 THEN 'Bottom 20%'
    END AS spending_tier
FROM (
    SELECT customer_id, SUM(total) AS total_spending
    FROM orders
    GROUP BY customer_id
) customer_totals;
 
-- PERCENT_RANK and CUME_DIST for precise percentiles
SELECT 
    employee_id,
    salary,
    ROUND(100 * PERCENT_RANK() OVER (ORDER BY salary), 2) AS percentile,
    ROUND(100 * CUME_DIST() OVER (ORDER BY salary), 2) AS cumulative_dist
FROM employees;

Interview Essential: Top-N Per Group

Aggregate Window Functions

Key Behavior Change:

Without ORDER BY in OVER(), aggregates compute over the entire partition. With ORDER BY, they compute over a cumulative frame (rows from start to current row by default).

aggregate_windows.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
-- Partition-wide aggregates (no ORDER BY)
SELECT 
    order_id,
    customer_id,
    order_date,
    total,
    COUNT(*) OVER (PARTITION BY customer_id) AS customer_order_count,
    AVG(total) OVER (PARTITION BY customer_id) AS customer_avg_order,
    SUM(total) OVER (PARTITION BY customer_id) AS customer_lifetime_value
FROM orders;
 
-- Running totals (with ORDER BY)
SELECT 
    order_date,
    total,
    SUM(total) OVER (ORDER BY order_date) AS running_total,
    AVG(total) OVER (ORDER BY order_date) AS running_avg,
    COUNT(*) OVER (ORDER BY order_date) AS cumulative_count
FROM orders
WHERE customer_id = 1001
ORDER BY order_date;
 
-- Running totals partitioned by group
SELECT 
    customer_id,
    order_date,
    total,
    SUM(total) OVER (
        PARTITION BY customer_id 
        ORDER BY order_date
    ) AS customer_running_total,
    ROW_NUMBER() OVER (
        PARTITION BY customer_id 
        ORDER BY order_date
    ) AS customer_order_number
FROM orders
ORDER BY customer_id, order_date;
 
-- Comparative analysis: Each value vs group statistics
SELECT 
    department_id,
    employee_id,
    salary,
    AVG(salary) OVER (PARTITION BY department_id) AS dept_avg,
    MIN(salary) OVER (PARTITION BY department_id) AS dept_min,
    MAX(salary) OVER (PARTITION BY department_id) AS dept_max,
    ROUND(
        (salary - MIN(salary) OVER (PARTITION BY department_id)) /
        NULLIF(
            MAX(salary) OVER (PARTITION BY department_id) - 
            MIN(salary) OVER (PARTITION BY department_id), 
            0
        ) * 100, 
        2
    ) AS position_in_range_pct
FROM employees;
 
-- Year-over-year totals with partition restart
SELECT 
    EXTRACT(YEAR FROM order_date) AS year,
    EXTRACT(MONTH FROM order_date) AS month,
    SUM(total) AS monthly_total,
    SUM(SUM(total)) OVER (
        PARTITION BY EXTRACT(YEAR FROM order_date)
        ORDER BY EXTRACT(MONTH FROM order_date)
    ) AS ytd_total
FROM orders
GROUP BY EXTRACT(YEAR FROM order_date), EXTRACT(MONTH FROM order_date)
ORDER BY year, month;

Double Aggregation Pattern

Offset Functions: LAG, LEAD, and Value Access

Offset functions access values from other rows relative to the current row—enabling comparisons, change calculations, and gap filling without self-joins.

Offset Window Functions
Function	Returns	Syntax	Use Case
LAG()	Value from previous row(s)	LAG(col, offset, default)	Period-over-period change
LEAD()	Value from following row(s)	LEAD(col, offset, default)	Forecast, gap detection
FIRST_VALUE()	First value in window frame	FIRST_VALUE(col)	Frame minimum, first in group
LAST_VALUE()	Last value in window frame	LAST_VALUE(col)	Frame maximum, current state
NTH_VALUE()	Nth value in window frame	NTH_VALUE(col, n)	Specific position value

offset_functions.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
-- LAG: Compare with previous row
SELECT 
    order_date,
    daily_revenue,
    LAG(daily_revenue, 1) OVER (ORDER BY order_date) AS prev_day_revenue,
    daily_revenue - LAG(daily_revenue, 1) OVER (ORDER BY order_date) AS day_over_day_change,
    ROUND(
        100.0 * (daily_revenue - LAG(daily_revenue, 1) OVER (ORDER BY order_date)) /
        NULLIF(LAG(daily_revenue, 1) OVER (ORDER BY order_date), 0),
        2
    ) AS pct_change
FROM (
    SELECT order_date::DATE, SUM(total) AS daily_revenue
    FROM orders
    GROUP BY order_date::DATE
) daily_totals
ORDER BY order_date;
 
-- LAG with default value for first row
SELECT 
    customer_id,
    order_date,
    total,
    LAG(total, 1, 0) OVER (
        PARTITION BY customer_id 
        ORDER BY order_date
    ) AS prev_order_value
FROM orders;
 
-- LEAD: Look ahead
SELECT 
    customer_id,
    order_date,
    LEAD(order_date, 1) OVER (
        PARTITION BY customer_id 
        ORDER BY order_date
    ) AS next_order_date,
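    -- Cast to DATE so the difference is a whole number of days
    -- (subtracting timestamps would yield an INTERVAL instead)
    LEAD(order_date::DATE, 1) OVER (
        PARTITION BY customer_id 
        ORDER BY order_date
    ) - order_date::DATE AS days_until_next_order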
FROM orders;
 
-- Gap detection with LEAD
SELECT *
FROM (
    SELECT 
        customer_id,
        order_date,
        LEAD(order_date) OVER (
            PARTITION BY customer_id 
            ORDER BY order_date
        ) AS next_order,
        -- Cast to DATE so gap_days is an integer that can be compared with 90
        LEAD(order_date::DATE) OVER (
            PARTITION BY customer_id 
            ORDER BY order_date
        ) - order_date::DATE AS gap_days
    FROM orders
) with_gaps
WHERE gap_days > 90;  -- Gaps of more than 90 days
 
-- FIRST_VALUE and LAST_VALUE
SELECT 
    employee_id,
    department_id,
    hire_date,
    salary,
    FIRST_VALUE(salary) OVER (
        PARTITION BY department_id 
        ORDER BY hire_date
    ) AS first_hire_salary,
    LAST_VALUE(salary) OVER (
        PARTITION BY department_id 
        ORDER BY hire_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS latest_hire_salary
FROM employees;
 
-- NTH_VALUE for specific positions
SELECT 
    product_id,
    product_name,
    category_id,
    unit_price,
    NTH_VALUE(unit_price, 2) OVER (
        PARTITION BY category_id 
        ORDER BY unit_price DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS second_highest_price
FROM products;
 
-- Consecutive streak detection
WITH order_gaps AS (
    SELECT 
        customer_id,
        order_date,
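        -- Cast to DATE so days_since_last is an integer day count (NULL for a customer's first order)
        order_date::DATE - LAG(order_date::DATE) OVER (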
            PARTITION BY customer_id 
            ORDER BY order_date
        ) AS days_since_last
    FROM orders
)
SELECT 
    customer_id,
    order_date,
    days_since_last,
    CASE WHEN days_since_last <= 7 THEN 1 ELSE 0 END AS in_streak
FROM order_gaps;

LAST_VALUE Frame Gotcha
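
When a window has an ORDER BY but no explicit frame, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. LAST_VALUE therefore sees only the rows up to the current row (and its peers), so it usually just echoes the current row's own value. The sketch below contrasts the default frame with an explicit full frame. It assumes PostgreSQL syntax and uses a small hypothetical inline data set in place of the employees table.

```sql
-- Hypothetical inline rows standing in for employees (a single department)
WITH emp(employee_id, hire_date, salary) AS (
    VALUES
        (1, DATE '2020-01-15', 50000),
        (2, DATE '2021-06-01', 60000),
        (3, DATE '2023-03-10', 70000)
)
SELECT
    employee_id,
    hire_date,
    salary,
    -- Default frame stops at the current row: each row gets its own salary back
    LAST_VALUE(salary) OVER (ORDER BY hire_date) AS last_default_frame,
    -- Explicit full frame: every row gets the latest hire's salary
    LAST_VALUE(salary) OVER (
        ORDER BY hire_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS last_full_frame
FROM emp
ORDER BY hire_date;
-- last_default_frame: 50000, 60000, 70000
-- last_full_frame:    70000, 70000, 70000
```

An alternative that avoids the frame clause altogether is FIRST_VALUE(salary) with the ordering reversed (ORDER BY hire_date DESC), since the default frame always starts at the first row of the partition.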

Frame Specifications: Precise Window Control

Frame specifications define exactly which rows are included in the window for each row's calculation. This enables moving averages, bounded running totals, and precise analytical computations.

Frame Clause Syntax:

{ROWS | RANGE | GROUPS} BETWEEN start_bound AND end_bound

Frame Bound Options
Bound	Meaning	Example
UNBOUNDED PRECEDING	First row of partition	All previous rows
N PRECEDING	N rows before current	3 PRECEDING = 3 rows back
CURRENT ROW	The current row	Include current row
N FOLLOWING	N rows after current	2 FOLLOWING = 2 rows ahead
UNBOUNDED FOLLOWING	Last row of partition	All remaining rows

frame_specifications.sql
SQL
-- Moving average: 7-day window
SELECT 
    order_date,
    daily_revenue,
    AVG(daily_revenue) OVER (
        ORDER BY order_date
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS moving_avg_7day
FROM (
    SELECT order_date::DATE AS order_date, SUM(total) AS daily_revenue
    FROM orders
    GROUP BY order_date::DATE
) daily_totals
ORDER BY order_date;
 
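-- Moving sum: Current row plus 2 before and 2 after (5-row window)
-- (daily_totals here and in later queries is assumed to be a table, view, or CTE with one row per day)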
SELECT 
    order_date,
    daily_revenue,
    SUM(daily_revenue) OVER (
        ORDER BY order_date
        ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING
    ) AS centered_5day_sum
FROM daily_totals;
 
-- Running total with bounded look-back
SELECT 
    month,
    revenue,
    SUM(revenue) OVER (
        ORDER BY month
        ROWS BETWEEN 11 PRECEDING AND CURRENT ROW
    ) AS trailing_12_month_revenue
FROM monthly_revenue;
 
-- Different frame types comparison
SELECT 
    sale_date,
    amount,
    category,
    -- ROWS: Physical row count
    SUM(amount) OVER (
        ORDER BY sale_date
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS rows_sum,
    -- RANGE: Logical value range (same date = same group)
    SUM(amount) OVER (
        ORDER BY sale_date
        RANGE BETWEEN INTERVAL '2 days' PRECEDING AND CURRENT ROW
    ) AS range_sum
FROM sales;
 
-- Excluding the current row: average over preceding rows only
-- (NULL for the first row of each partition, where the frame is empty)
SELECT 
    employee_id,
    department_id,
    salary,
    AVG(salary) OVER (
        PARTITION BY department_id
        ORDER BY employee_id
        ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
    ) AS avg_of_others_before_me
FROM employees;
 
-- Forward-looking calculations
SELECT 
    order_date,
    daily_revenue,
    AVG(daily_revenue) OVER (
        ORDER BY order_date
        ROWS BETWEEN CURRENT ROW AND 6 FOLLOWING
    ) AS next_7day_avg
FROM daily_totals;
 
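-- Full frame for partition-wide value
-- (with no ORDER BY the default frame already spans the whole partition; the explicit clause documents intent)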
SELECT 
    employee_id,
    department_id,
    salary,
    MAX(salary) OVER (
        PARTITION BY department_id
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS dept_max_salary
FROM employees;

ROWS vs RANGE vs GROUPS
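
The three frame modes differ in what "1 PRECEDING" counts: ROWS counts physical rows, RANGE counts distance in the ORDER BY value, and GROUPS counts peer groups (sets of rows with equal ORDER BY values). The sketch below runs all three over the same data so the difference shows up on tied rows. It assumes a database that supports GROUPS and RANGE with an offset (PostgreSQL 11 or later, for example), and the inline sales rows with an integer day column are hypothetical.

```sql
-- Hypothetical data: two rows tie on day 2, and day 3 is missing
WITH sales(day, amount) AS (
    VALUES (1, 10), (2, 20), (2, 20), (4, 40)
)
SELECT
    day,
    amount,
    -- Previous physical row + current row
    SUM(amount) OVER (ORDER BY day ROWS   BETWEEN 1 PRECEDING AND CURRENT ROW) AS rows_sum,
    -- All rows whose day lies in [day - 1, day]
    SUM(amount) OVER (ORDER BY day RANGE  BETWEEN 1 PRECEDING AND CURRENT ROW) AS range_sum,
    -- Previous peer group + current peer group
    SUM(amount) OVER (ORDER BY day GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) AS groups_sum
FROM sales
ORDER BY day, rows_sum;
-- day | amount | rows_sum | range_sum | groups_sum
--   1 |     10 |       10 |        10 |         10
--   2 |     20 |       30 |        50 |         50
--   2 |     20 |       40 |        50 |         50
--   4 |     40 |       60 |        40 |         80
```

The tied rows on day 2 get different rows_sum values but identical range_sum and groups_sum values. On day 4, RANGE finds no rows for day 3 and sums only the current row, while GROUPS reaches back to the day 2 peer group regardless of the gap.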

Named Windows and Optimization

When using multiple window functions with the same specification, named windows reduce redundancy and improve clarity. Understanding execution also helps write efficient queries.

named_windows.sql
SQL
-- Without named window (repetitive)
SELECT 
    employee_id,
    department_id,
    salary,
    ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary DESC) AS salary_rank,
    SUM(salary) OVER (PARTITION BY department_id ORDER BY salary DESC) AS running_sum,
    AVG(salary) OVER (PARTITION BY department_id ORDER BY salary DESC) AS running_avg
FROM employees;
 
-- With named window (cleaner)
SELECT 
    employee_id,
    department_id,
    salary,
    ROW_NUMBER() OVER dept_salary_window AS salary_rank,
    SUM(salary) OVER dept_salary_window AS running_sum,
    AVG(salary) OVER dept_salary_window AS running_avg
FROM employees
WINDOW dept_salary_window AS (
    PARTITION BY department_id 
    ORDER BY salary DESC
);
 
-- Multiple named windows
SELECT 
    order_id,
    customer_id,
    order_date,
    total,
    -- Customer-level calculations (the window's ORDER BY makes this SUM a running total)
    SUM(total) OVER customer_window AS customer_running_total,
    ROW_NUMBER() OVER customer_window AS customer_order_num,
    -- Date-level calculations  
    SUM(total) OVER date_window AS daily_total,
    COUNT(*) OVER date_window AS daily_order_count
FROM orders
WINDOW 
    customer_window AS (PARTITION BY customer_id ORDER BY order_date),
    date_window AS (PARTITION BY order_date::DATE);
 
-- Window refinement: Extending a named window
SELECT 
    employee_id,
    department_id,
    salary,
    hire_date,
    -- Use base window (it has an ORDER BY, so the default frame yields a running average)
    AVG(salary) OVER base_window AS running_avg,
    -- Extend with frame
    SUM(salary) OVER (
        base_window 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_sum
FROM employees
WINDOW base_window AS (PARTITION BY department_id ORDER BY hire_date);

Window Function Performance Tips

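•Index for PARTITION BY and ORDER BY: An index on the partition columns followed by the ordering columns can let the planner read rows already sorted and skip the sort step, which is often the dominant cost
•Minimize window variations: Functions that share the same PARTITION BY and ORDER BY can typically be computed from one sort; each distinct specification may add another
•Use ROWS over RANGE when possible: ROWS is generally faster as it counts physical rows, whereas RANGE must also locate peer rows with equal ORDER BY values. The two return different results when the ordering has ties, so switch only when that is acceptable
•Filter before windowing: Use CTEs or subqueries to filter rows before window calculations
•Avoid DISTINCT in windows: Several databases, PostgreSQL among them, reject COUNT(DISTINCT ...) OVER; where it is supported it tends to be expensive
•Consider materialized views: For frequently-used window calculations on large tables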
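
The "filter before windowing" tip can be sketched as follows. WHERE is evaluated before window functions, so a filter in the same SELECT already shrinks the window's input; the costly pattern is computing the window over every row and filtering only in an outer query. A condition on the window result itself (such as rn = 1) still has to live in an outer query. The example assumes PostgreSQL syntax, and the inline orders rows are hypothetical.

```sql
-- Hypothetical inline rows standing in for the orders table
WITH orders(order_id, customer_id, order_date, total) AS (
    VALUES
        (1, 101, DATE '2023-11-20',  80.00),
        (2, 101, DATE '2024-01-05', 120.00),
        (3, 101, DATE '2024-02-10',  60.00),
        (4, 102, DATE '2024-01-15', 200.00)
),
recent_orders AS (
    -- Step 1: shrink the input before any window is evaluated
    SELECT order_id, customer_id, order_date, total
    FROM orders
    WHERE order_date >= DATE '2024-01-01'
),
ranked AS (
    -- Step 2: the window sorts only the rows that survived the filter
    SELECT
        order_id,
        customer_id,
        total,
        ROW_NUMBER() OVER (
            PARTITION BY customer_id
            ORDER BY total DESC
        ) AS rn
    FROM recent_orders
)
-- Step 3: a condition on a window result must be applied in an outer query
SELECT order_id, customer_id, total
FROM ranked
WHERE rn = 1
ORDER BY customer_id;
-- Returns order 2 (customer 101, 120.00) and order 4 (customer 102, 200.00)
```

Filtering first also changes what the window can see: a LAG or running total computed over recent_orders knows nothing about the 2023 order. When the calculation needs the full history, compute the window first and filter afterwards.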

Advanced Window Function Patterns

Beyond basic usage, window functions enable sophisticated analytical patterns that frequently appear in interviews and real-world analytics.

Islands and Gaps: Finding Consecutive Sequences

This classic pattern identifies groups of consecutive values (islands) separated by gaps:

islands_gaps.sql
SQL
-- Find consecutive login streaks per user
WITH numbered AS (
    SELECT 
        user_id,
        login_date,
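        -- Consecutive dates minus an increasing row number collapse to the
        -- same anchor value, so every row of one streak shares a single grp
        login_date - (ROW_NUMBER() OVER (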
            PARTITION BY user_id 
            ORDER BY login_date
        ) * INTERVAL '1 day') AS grp
    FROM (SELECT DISTINCT user_id, login_date::DATE FROM logins) t
),
streaks AS (
    SELECT 
        user_id,
        MIN(login_date) AS streak_start,
        MAX(login_date) AS streak_end,
        COUNT(*) AS streak_length
    FROM numbered
    GROUP BY user_id, grp
)
SELECT * FROM streaks
WHERE streak_length >= 3  -- At least 3 consecutive days
ORDER BY user_id, streak_start;
 
-- Find gaps in sequential IDs
WITH id_gaps AS (
    SELECT 
        order_id,
        LEAD(order_id) OVER (ORDER BY order_id) AS next_id,
        LEAD(order_id) OVER (ORDER BY order_id) - order_id AS gap
    FROM orders
)
SELECT 
    order_id AS gap_starts_after,
    next_id AS gap_ends_before,
    gap - 1 AS missing_count
FROM id_gaps
WHERE gap > 1
ORDER BY order_id;
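
To see why the login-streak query works, it helps to look at the intermediate grp value row by row. The trace below is a minimal sketch using PostgreSQL syntax and a hypothetical inline logins data set for a single user; it subtracts the row number as an integer number of days, which is equivalent to the INTERVAL arithmetic used above.

```sql
-- Hypothetical logins: a 3-day streak, a gap, then a 2-day streak
WITH logins(user_id, login_date) AS (
    VALUES
        (1, DATE '2024-03-10'),
        (1, DATE '2024-03-11'),
        (1, DATE '2024-03-12'),
        (1, DATE '2024-03-16'),
        (1, DATE '2024-03-17')
)
SELECT
    user_id,
    login_date,
    ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY login_date) AS rn,
    -- date minus integer yields a date in PostgreSQL
    login_date - CAST(ROW_NUMBER() OVER (
        PARTITION BY user_id
        ORDER BY login_date
    ) AS INTEGER) AS grp
FROM logins
ORDER BY user_id, login_date;
-- login_date | rn | grp
-- 2024-03-10 |  1 | 2024-03-09
-- 2024-03-11 |  2 | 2024-03-09
-- 2024-03-12 |  3 | 2024-03-09
-- 2024-03-16 |  4 | 2024-03-12
-- 2024-03-17 |  5 | 2024-03-12
```

Within a streak the date and the row number both advance by one, so their difference stays constant. A gap advances the date by more than one while the row number still advances by one, which shifts grp to a new value. Grouping by (user_id, grp) then yields one row per streak.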

Summary: Window Function Mastery

You've now mastered SQL window functions—one of the most powerful features for analytical queries and a strong differentiator in technical interviews.

Key Takeaways:

Core Concepts Mastered

•Window Function Structure — OVER() clause with optional PARTITION BY, ORDER BY, and frame specifications defines the calculation window.
•Ranking Functions — ROW_NUMBER (unique ranks), RANK (gaps for ties), DENSE_RANK (no gaps), NTILE (buckets), each serving specific use cases.
•Aggregate Windows — SUM, AVG, COUNT as window functions enable running totals, moving averages, and group comparisons without losing row detail.
•Offset Functions — LAG/LEAD for row comparisons, FIRST_VALUE/LAST_VALUE for boundary values—essential for time-series analysis.
•Frame Specifications — ROWS/RANGE/GROUPS BETWEEN defines exactly which rows contribute to each calculation.
•Advanced Patterns — Islands and gaps, sessionization, funnel analysis, and deduplication showcase real-world applications.
