Loading learning content...
Window functions represent one of the most powerful features in modern SQL, enabling calculations across related rows without collapsing the result set. They solve problems that previously required complex self-joins, correlated subqueries, or procedural code—with elegant, performant, single-pass solutions.
In technical interviews, window function mastery is a strong differentiator. Candidates who fluently apply ROW_NUMBER, LAG/LEAD, running totals, and moving averages demonstrate advanced SQL thinking that impresses interviewers and solves real business problems.
By the end of this page, you will understand window function anatomy and execution model, master ranking functions (ROW_NUMBER, RANK, DENSE_RANK, NTILE), apply aggregate window functions for running totals and moving averages, use offset functions (LAG, LEAD, FIRST_VALUE, LAST_VALUE) for row comparisons, and construct sophisticated frame specifications for precise calculations.
A window function performs a calculation across a set of rows that are somehow related to the current row—this set is called the window. Unlike GROUP BY which collapses rows, window functions retain all individual rows while adding computed values.
The Window Function Anatomy:
function_name(expression) OVER (
[PARTITION BY partition_expression, ...]
[ORDER BY sort_expression [ASC|DESC], ...]
[frame_clause]
)
| Component | Purpose | Optional? | Example |
|---|---|---|---|
| OVER () | Declares this is a window function | Required | SUM(amount) OVER () |
| PARTITION BY | Divides rows into groups for calculation | Yes | PARTITION BY department |
| ORDER BY | Defines row order within partition | Sometimes* | ORDER BY hire_date |
| Frame Clause | Specifies which rows relative to current | Yes | ROWS BETWEEN ... |
ORDER BY within OVER() is different from ORDER BY at the query end. Window ORDER BY determines which rows come 'before' and 'after' the current row for calculations. Ranking functions require ORDER BY; aggregate functions behave differently with and without it.
1234567891011121314151617181920212223242526272829303132333435363738394041424344
-- Basic window function: Add overall total to each rowSELECT order_id, customer_id, total, SUM(total) OVER () AS grand_total, ROUND(100.0 * total / SUM(total) OVER (), 2) AS pct_of_totalFROM orders; -- With PARTITION BY: Calculations within groupsSELECT order_id, customer_id, total, SUM(total) OVER (PARTITION BY customer_id) AS customer_total, ROUND( 100.0 * total / SUM(total) OVER (PARTITION BY customer_id), 2 ) AS pct_of_customer_totalFROM orders; -- With ORDER BY: Running calculationsSELECT order_id, customer_id, order_date, total, SUM(total) OVER ( PARTITION BY customer_id ORDER BY order_date ) AS running_totalFROM ordersORDER BY customer_id, order_date; -- Multiple window functions in one querySELECT employee_id, department_id, salary, AVG(salary) OVER (PARTITION BY department_id) AS dept_avg, salary - AVG(salary) OVER (PARTITION BY department_id) AS diff_from_avg, MAX(salary) OVER (PARTITION BY department_id) AS dept_max, salary / MAX(salary) OVER (PARTITION BY department_id) AS pct_of_maxFROM employees;Ranking functions assign ordinal positions to rows based on ORDER BY criteria. Understanding the differences between ranking functions is crucial for selecting the right one for your use case.
| Function | Ties Handling | Result for Values (100, 100, 90, 80) | Use Case |
|---|---|---|---|
| ROW_NUMBER() | Distinct ranks (arbitrary for ties) | 1, 2, 3, 4 | Top-N per group, pagination |
| RANK() | Same rank, skips after ties | 1, 1, 3, 4 | Competition ranking, sparse |
| DENSE_RANK() | Same rank, no skips | 1, 1, 2, 3 | Competition ranking, continuous |
| NTILE(n) | Distributes into n buckets | 1, 1, 2, 3 (for NTILE(4)) | Percentiles, quartiles |
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576
-- All ranking functions comparedSELECT employee_id, department_id, salary, ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num, RANK() OVER (ORDER BY salary DESC) AS rank, DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank, NTILE(4) OVER (ORDER BY salary DESC) AS quartileFROM employees; -- ROW_NUMBER for Top-N per group (most common interview pattern!)WITH ranked_employees AS ( SELECT employee_id, name, department_id, salary, ROW_NUMBER() OVER ( PARTITION BY department_id ORDER BY salary DESC ) AS dept_rank FROM employees)SELECT * FROM ranked_employeesWHERE dept_rank <= 3; -- Top 3 per department -- RANK for handling ties fairlySELECT product_id, product_name, total_sales, RANK() OVER (ORDER BY total_sales DESC) AS sales_rankFROM ( SELECT p.product_id, p.product_name, SUM(oi.quantity * oi.unit_price) AS total_sales FROM products p JOIN order_items oi ON p.product_id = oi.product_id GROUP BY p.product_id, p.product_name) product_salesWHERE total_sales > 0; -- DENSE_RANK for continuous rankingSELECT player_name, score, DENSE_RANK() OVER (ORDER BY score DESC) AS placeFROM tournament_results; -- NTILE for percentile bucketsSELECT customer_id, total_spending, CASE NTILE(5) OVER (ORDER BY total_spending DESC) WHEN 1 THEN 'Top 20%' WHEN 2 THEN '20-40%' WHEN 3 THEN '40-60%' WHEN 4 THEN '60-80%' WHEN 5 THEN 'Bottom 20%' END AS spending_tierFROM ( SELECT customer_id, SUM(total) AS total_spending FROM orders GROUP BY customer_id) customer_totals; -- PERCENT_RANK and CUME_DIST for precise percentilesSELECT employee_id, salary, ROUND(100 * PERCENT_RANK() OVER (ORDER BY salary), 2) AS percentile, ROUND(100 * CUME_DIST() OVER (ORDER BY salary), 2) AS cumulative_distFROM employees;The pattern 'ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) + filter in outer query WHERE rank <= N' is one of the most frequently asked interview patterns. Master it thoroughly—it appears in many variations.
Standard aggregate functions (SUM, AVG, COUNT, MIN, MAX) become dramatically more powerful when used as window functions. They can compute running totals, moving averages, cumulative counts, and group comparisons—all while preserving row-level detail.
Key Behavior Change:
Without ORDER BY in OVER(), aggregates compute over the entire partition. With ORDER BY, they compute over a cumulative frame (rows from start to current row by default).
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869
-- Partition-wide aggregates (no ORDER BY)SELECT order_id, customer_id, order_date, total, COUNT(*) OVER (PARTITION BY customer_id) AS customer_order_count, AVG(total) OVER (PARTITION BY customer_id) AS customer_avg_order, SUM(total) OVER (PARTITION BY customer_id) AS customer_lifetime_valueFROM orders; -- Running totals (with ORDER BY)SELECT order_date, total, SUM(total) OVER (ORDER BY order_date) AS running_total, AVG(total) OVER (ORDER BY order_date) AS running_avg, COUNT(*) OVER (ORDER BY order_date) AS cumulative_countFROM ordersWHERE customer_id = 1001ORDER BY order_date; -- Running totals partitioned by groupSELECT customer_id, order_date, total, SUM(total) OVER ( PARTITION BY customer_id ORDER BY order_date ) AS customer_running_total, ROW_NUMBER() OVER ( PARTITION BY customer_id ORDER BY order_date ) AS customer_order_numberFROM ordersORDER BY customer_id, order_date; -- Comparative analysis: Each value vs group statisticsSELECT department_id, employee_id, salary, AVG(salary) OVER (PARTITION BY department_id) AS dept_avg, MIN(salary) OVER (PARTITION BY department_id) AS dept_min, MAX(salary) OVER (PARTITION BY department_id) AS dept_max, ROUND( (salary - MIN(salary) OVER (PARTITION BY department_id)) / NULLIF( MAX(salary) OVER (PARTITION BY department_id) - MIN(salary) OVER (PARTITION BY department_id), 0 ) * 100, 2 ) AS position_in_range_pctFROM employees; -- Year-over-year totals with partition restartSELECT EXTRACT(YEAR FROM order_date) AS year, EXTRACT(MONTH FROM order_date) AS month, SUM(total) AS monthly_total, SUM(SUM(total)) OVER ( PARTITION BY EXTRACT(YEAR FROM order_date) ORDER BY EXTRACT(MONTH FROM order_date) ) AS ytd_totalFROM ordersGROUP BY EXTRACT(YEAR FROM order_date), EXTRACT(MONTH FROM order_date)ORDER BY year, month;Notice the SUM(SUM(total)) pattern for YTD totals—the inner SUM is a regular GROUP BY aggregate, the outer SUM is a window function. This pattern is common when you need both grouped and running calculations.
Offset functions access values from other rows relative to the current row—enabling comparisons, change calculations, and gap filling without self-joins.
| Function | Returns | Syntax | Use Case |
|---|---|---|---|
| LAG() | Value from previous row(s) | LAG(col, offset, default) | Period-over-period change |
| LEAD() | Value from following row(s) | LEAD(col, offset, default) | Forecast, gap detection |
| FIRST_VALUE() | First value in window frame | FIRST_VALUE(col) | Frame minimum, first in group |
| LAST_VALUE() | Last value in window frame | LAST_VALUE(col) | Frame maximum, current state |
| NTH_VALUE() | Nth value in window frame | NTH_VALUE(col, n) | Specific position value |
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108
-- LAG: Compare with previous rowSELECT order_date, daily_revenue, LAG(daily_revenue, 1) OVER (ORDER BY order_date) AS prev_day_revenue, daily_revenue - LAG(daily_revenue, 1) OVER (ORDER BY order_date) AS day_over_day_change, ROUND( 100.0 * (daily_revenue - LAG(daily_revenue, 1) OVER (ORDER BY order_date)) / NULLIF(LAG(daily_revenue, 1) OVER (ORDER BY order_date), 0), 2 ) AS pct_changeFROM ( SELECT order_date::DATE, SUM(total) AS daily_revenue FROM orders GROUP BY order_date::DATE) daily_totalsORDER BY order_date; -- LAG with default value for first rowSELECT customer_id, order_date, total, LAG(total, 1, 0) OVER ( PARTITION BY customer_id ORDER BY order_date ) AS prev_order_valueFROM orders; -- LEAD: Look aheadSELECT customer_id, order_date, LEAD(order_date, 1) OVER ( PARTITION BY customer_id ORDER BY order_date ) AS next_order_date, LEAD(order_date, 1) OVER ( PARTITION BY customer_id ORDER BY order_date ) - order_date AS days_until_next_orderFROM orders; -- Gap detection with LEADSELECT *FROM ( SELECT customer_id, order_date, LEAD(order_date) OVER ( PARTITION BY customer_id ORDER BY order_date ) AS next_order, LEAD(order_date) OVER ( PARTITION BY customer_id ORDER BY order_date ) - order_date AS gap_days FROM orders) with_gapsWHERE gap_days > 90; -- Gaps of more than 90 days -- FIRST_VALUE and LAST_VALUESELECT employee_id, department_id, hire_date, salary, FIRST_VALUE(salary) OVER ( PARTITION BY department_id ORDER BY hire_date ) AS first_hire_salary, LAST_VALUE(salary) OVER ( PARTITION BY department_id ORDER BY hire_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS latest_hire_salaryFROM employees; -- NTH_VALUE for specific positionsSELECT product_id, product_name, category_id, unit_price, NTH_VALUE(unit_price, 2) OVER ( PARTITION BY category_id ORDER BY unit_price DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS second_highest_priceFROM products; -- Consecutive streak detectionWITH order_gaps AS ( SELECT customer_id, order_date, order_date - LAG(order_date) OVER ( PARTITION BY customer_id ORDER BY order_date ) AS days_since_last FROM orders)SELECT customer_id, order_date, days_since_last, CASE WHEN days_since_last <= 7 THEN 1 ELSE 0 END AS in_streakFROM order_gaps;By default, with ORDER BY, the frame is 'UNBOUNDED PRECEDING TO CURRENT ROW'—so LAST_VALUE returns the current row's value! To get the actual last value in the partition, specify 'ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING'.
Frame specifications define exactly which rows are included in the window for each row's calculation. This enables moving averages, bounded running totals, and precise analytical computations.
Frame Clause Syntax:
{ROWS | RANGE | GROUPS} BETWEEN start_bound AND end_bound
| Bound | Meaning | Example |
|---|---|---|
| UNBOUNDED PRECEDING | First row of partition | All previous rows |
| N PRECEDING | N rows before current | 3 PRECEDING = 3 rows back |
| CURRENT ROW | The current row | Include current row |
| N FOLLOWING | N rows after current | 2 FOLLOWING = 2 rows ahead |
| UNBOUNDED FOLLOWING | Last row of partition | All remaining rows |
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384
-- Moving average: 7-day windowSELECT order_date, daily_revenue, AVG(daily_revenue) OVER ( ORDER BY order_date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW ) AS moving_avg_7dayFROM ( SELECT order_date::DATE, SUM(total) AS daily_revenue FROM orders GROUP BY order_date::DATE) daily_totalsORDER BY order_date; -- Moving sum: Current row plus 2 before and 2 after (5-row window)SELECT order_date, daily_revenue, SUM(daily_revenue) OVER ( ORDER BY order_date ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING ) AS centered_5day_sumFROM daily_totals; -- Running total with bounded look-backSELECT month, revenue, SUM(revenue) OVER ( ORDER BY month ROWS BETWEEN 11 PRECEDING AND CURRENT ROW ) AS trailing_12_month_revenueFROM monthly_revenue; -- Different frame types comparisonSELECT sale_date, amount, category, -- ROWS: Physical row count SUM(amount) OVER ( ORDER BY sale_date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW ) AS rows_sum, -- RANGE: Logical value range (same date = same group) SUM(amount) OVER ( ORDER BY sale_date RANGE BETWEEN INTERVAL '2 days' PRECEDING AND CURRENT ROW ) AS range_sumFROM sales; -- Excluding current row in calculationSELECT employee_id, department_id, salary, AVG(salary) OVER ( PARTITION BY department_id ORDER BY employee_id ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING ) AS avg_of_others_before_meFROM employees; -- Forward-looking calculationsSELECT order_date, daily_revenue, AVG(daily_revenue) OVER ( ORDER BY order_date ROWS BETWEEN CURRENT ROW AND 6 FOLLOWING ) AS next_7day_avgFROM daily_totals; -- Full frame for partition-wide valueSELECT employee_id, department_id, salary, MAX(salary) OVER ( PARTITION BY department_id ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS dept_max_salaryFROM employees;ROWS counts physical rows. RANGE groups rows with equal ORDER BY values together. GROUPS (SQL:2011) counts groups of tied values. For most calculations, ROWS provides the most predictable behavior. Use RANGE when logically-equal values should be treated identically.
When using multiple window functions with the same specification, named windows reduce redundancy and improve clarity. Understanding execution also helps write efficient queries.
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556
-- Without named window (repetitive)SELECT employee_id, department_id, salary, ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rank, SUM(salary) OVER (PARTITION BY department_id ORDER BY salary DESC) AS running_sum, AVG(salary) OVER (PARTITION BY department_id ORDER BY salary DESC) AS running_avgFROM employees; -- With named window (cleaner)SELECT employee_id, department_id, salary, ROW_NUMBER() OVER dept_salary_window AS rank, SUM(salary) OVER dept_salary_window AS running_sum, AVG(salary) OVER dept_salary_window AS running_avgFROM employeesWINDOW dept_salary_window AS ( PARTITION BY department_id ORDER BY salary DESC); -- Multiple named windowsSELECT order_id, customer_id, order_date, total, -- Customer-level calculations SUM(total) OVER customer_window AS customer_total, ROW_NUMBER() OVER customer_window AS customer_order_num, -- Date-level calculations SUM(total) OVER date_window AS daily_total, COUNT(*) OVER date_window AS daily_order_countFROM ordersWINDOW customer_window AS (PARTITION BY customer_id ORDER BY order_date), date_window AS (PARTITION BY order_date::DATE); -- Window refinement: Extending a named windowSELECT employee_id, department_id, salary, hire_date, -- Use base window AVG(salary) OVER base_window AS dept_avg, -- Extend with frame SUM(salary) OVER ( base_window ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS running_sumFROM employeesWINDOW base_window AS (PARTITION BY department_id ORDER BY hire_date);Beyond basic usage, window functions enable sophisticated analytical patterns that frequently appear in interviews and real-world analytics.
Islands and Gaps: Finding Consecutive Sequences
This classic pattern identifies groups of consecutive values (islands) separated by gaps:
123456789101112131415161718192021222324252627282930313233343536373839
-- Find consecutive login streaks per userWITH numbered AS ( SELECT user_id, login_date, login_date - (ROW_NUMBER() OVER ( PARTITION BY user_id ORDER BY login_date ) * INTERVAL '1 day') AS grp FROM (SELECT DISTINCT user_id, login_date::DATE FROM logins) t),streaks AS ( SELECT user_id, MIN(login_date) AS streak_start, MAX(login_date) AS streak_end, COUNT(*) AS streak_length FROM numbered GROUP BY user_id, grp)SELECT * FROM streaksWHERE streak_length >= 3 -- At least 3 consecutive daysORDER BY user_id, streak_start; -- Find gaps in sequential IDsWITH id_gaps AS ( SELECT order_id, LEAD(order_id) OVER (ORDER BY order_id) AS next_id, LEAD(order_id) OVER (ORDER BY order_id) - order_id AS gap FROM orders)SELECT order_id AS gap_starts_after, next_id AS gap_ends_before, gap - 1 AS missing_countFROM id_gapsWHERE gap > 1ORDER BY order_id;You've now mastered SQL window functions—one of the most powerful features for analytical queries and a strong differentiator in technical interviews.
Key Takeaways:
What's Next:
With window function mastery complete, we'll conclude with Query Optimization in the final page—techniques for writing efficient SQL, understanding query plans, and demonstrating performance awareness in interviews.
You now command the full power of SQL window functions. From basic rankings to complex analytical patterns like sessionization and funnel analysis, these skills enable elegant solutions to problems that would otherwise require complex procedural code or multiple queries.