SQLWindow Functions Basics

Window Functions Basics

LevelIntermediate

Duration75 mins

TopicWindow Functions Basics

5 / 5

Frame Specification

Precision Control Over Window Boundaries

The frame specification is the most powerful and nuanced aspect of window functions. While PARTITION BY divides your data into windows and ORDER BY defines the sequence within those windows, the frame determines exactly which rows around the current row participate in the calculation.

Consider the difference between:

7-day moving average — Average of current day plus 6 preceding days
Running total — Sum from the start up to (and including) the current row
Centered average — Average of 3 rows before, current row, and 3 rows after
Excluding current — Sum of all peers except the current row

All of these require different frame specifications. Without frames, you're limited to either the entire partition or the implicit "start to current row" default. With frames, you have surgical precision.

What You Will Learn

By the end of this page, you will understand frame syntax completely, the critical difference between ROWS, RANGE, and GROUPS modes, how to specify frame boundaries, practical applications like moving averages and cumulative calculations, and the EXCLUDE clause for edge cases.

Frame Specification Syntax

The frame specification is the most complex part of the OVER clause syntax. Here's the complete structure:

OVER (
    [PARTITION BY ...]
    [ORDER BY ...]
    [frame_clause]
)

frame_clause:
    frame_mode BETWEEN frame_start AND frame_end [EXCLUDE clause]

frame_mode:
    ROWS | RANGE | GROUPS

frame_start and frame_end:
    UNBOUNDED PRECEDING
  | offset PRECEDING
  | CURRENT ROW
  | offset FOLLOWING
  | UNBOUNDED FOLLOWING

EXCLUDE clause:
    EXCLUDE CURRENT ROW
  | EXCLUDE GROUP
  | EXCLUDE TIES
  | EXCLUDE NO OTHERS

The frame specification only applies when ORDER BY is present in the window definition. Without ORDER BY, specifying a frame is usually meaningless (though some databases allow it).

frame_syntax_examples.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
-- Basic frame examples:
 
-- All rows from partition start to current row (default when ORDER BY present)
SUM(amount) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
 
-- All rows in partition (explicit version of default when no ORDER BY)
SUM(amount) OVER (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
 
-- Previous 2 rows plus current row (3-row window)
AVG(price) OVER (ORDER BY date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
 
-- Previous 6 days plus current day (7-day moving average with RANGE)
AVG(value) OVER (ORDER BY date RANGE BETWEEN INTERVAL '6 days' PRECEDING AND CURRENT ROW)
 
-- Centered window: 2 before, current, 2 after
AVG(score) OVER (ORDER BY position ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)
 
-- All following rows (not including current)
SUM(remaining) OVER (ORDER BY date ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)

Frame Boundary Options
Boundary	Meaning
UNBOUNDED PRECEDING	Start of the partition (first row after ORDER BY)
n PRECEDING	n rows/ranges/groups before current row
CURRENT ROW	The current row being processed
n FOLLOWING	n rows/ranges/groups after current row
UNBOUNDED FOLLOWING	End of the partition (last row after ORDER BY)

ROWS vs RANGE vs GROUPS

The frame mode (ROWS, RANGE, or GROUPS) determines how frame boundaries are interpreted:

ROWS: Physical row positions. "2 PRECEDING" means exactly 2 rows before the current row in the ordered sequence.

RANGE: Logical value ranges. "2 PRECEDING" means rows whose ORDER BY value is within 2 of the current row's value.

GROUPS: Peer group positions. "2 PRECEDING" means 2 groups of peers before the current row's peer group.

This distinction is critical for correct window function behavior, especially with duplicate ORDER BY values (peers).

rows_vs_range_vs_groups.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
-- Data with duplicates in ORDER BY column:
-- | row | date       | amount |
-- |-----|------------|--------|
-- | 1   | 2024-01-01 | 100    |
-- | 2   | 2024-01-02 | 200    |
-- | 3   | 2024-01-02 | 150    |  -- Same date as row 2 (peers)
-- | 4   | 2024-01-03 | 300    |
-- | 5   | 2024-01-04 | 250    |
 
-- ROWS: counts physical rows
SELECT 
    date,
    amount,
    SUM(amount) OVER (
        ORDER BY date 
        ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
    ) as rows_sum
FROM data;
 
-- Result:
-- | date       | amount | rows_sum |
-- |------------|--------|----------|
-- | 2024-01-01 | 100    | 100      |  -- No preceding row
-- | 2024-01-02 | 200    | 300      |  -- Row 1 + Row 2 (100+200)
-- | 2024-01-02 | 150    | 350      |  -- Row 2 + Row 3 (200+150)
-- | 2024-01-03 | 300    | 450      |  -- Row 3 + Row 4 (150+300)
-- | 2024-01-04 | 250    | 550      |  -- Row 4 + Row 5 (300+250)
 
-- RANGE: looks at value differences
SELECT 
    date,
    amount,
    SUM(amount) OVER (
        ORDER BY date 
        RANGE BETWEEN INTERVAL '1 day' PRECEDING AND CURRENT ROW
    ) as range_sum
FROM data;
 
-- Result:
-- | date       | amount | range_sum |
-- |------------|--------|-----------|
-- | 2024-01-01 | 100    | 100       |  -- Just Jan 1
-- | 2024-01-02 | 200    | 450       |  -- Jan 1-2: 100+200+150
-- | 2024-01-02 | 150    | 450       |  -- Same! All Jan 1-2 peers included
-- | 2024-01-03 | 300    | 650       |  -- Jan 2-3: 200+150+300
-- | 2024-01-04 | 250    | 550       |  -- Jan 3-4: 300+250
 
-- GROUPS: counts peer groups
SELECT 
    date,
    amount,
    SUM(amount) OVER (
        ORDER BY date 
        GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW
    ) as groups_sum
FROM data;
 
-- Result:
-- | date       | amount | groups_sum |
-- |------------|--------|------------|
-- | 2024-01-01 | 100    | 100        |  -- Group 1 only
-- | 2024-01-02 | 200    | 450        |  -- Groups 1+2: 100+200+150
-- | 2024-01-02 | 150    | 450        |  -- Same group 2, same result
-- | 2024-01-03 | 300    | 650        |  -- Groups 2+3: 200+150+300
-- | 2024-01-04 | 250    | 550        |  -- Groups 3+4: 300+250

Frame Mode Comparison
Mode	Unit of Measurement	Peers Handling	Common Use Case
ROWS	Physical row positions	Each peer row counted separately	Fixed-size sliding windows (e.g., 7 rows)
RANGE	ORDER BY value differences	All peers included together	Time-based windows (e.g., 7 days)
GROUPS	Peer group positions	Peer groups as atomic units	Group-based analysis with duplicates

RANGE Requires Specific ORDER BY Types

RANGE with numeric offsets (like '2 PRECEDING') requires the ORDER BY column to be numeric, date, or interval type. For text or other types, only UNBOUNDED and CURRENT ROW boundaries work. ROWS and GROUPS work with any data type.

Default Frame Behavior

Understanding default frame behavior is crucial because most window function calls don't explicitly specify frames. The defaults depend on whether ORDER BY is present:

Without ORDER BY

•Default frame: All rows in partition
•Equivalent to: ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
•Result: Same aggregate value for every row in partition
•Effect: SUM() returns partition total, AVG() returns partition average

With ORDER BY

•Default frame: Start to current row
•Equivalent to: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
•Result: Running/cumulative values
•Effect: SUM() returns running total, AVG() returns running average

default_frames.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
-- Demonstrating default frame behavior
 
-- Without ORDER BY: entire partition
SELECT 
    name,
    department,
    salary,
    SUM(salary) OVER (PARTITION BY department) as dept_total
FROM employees;
-- dept_total is the same for all employees in each department
 
-- With ORDER BY: RANGE UNBOUNDED PRECEDING TO CURRENT ROW
SELECT 
    name,
    hire_date,
    salary,
    SUM(salary) OVER (ORDER BY hire_date) as running_total
FROM employees;
-- running_total grows as we move through rows
 
-- IMPORTANT: Default is RANGE, not ROWS!
-- This matters when there are ties (peers)
 
-- Data with same hire_date:
-- | name  | hire_date  | salary |
-- |-------|------------|--------|
-- | Alice | 2024-01-01 | 50000  |
-- | Bob   | 2024-01-02 | 60000  |
-- | Carol | 2024-01-02 | 55000  |  -- Same date as Bob
-- | David | 2024-01-03 | 70000  |
 
SELECT 
    name,
    hire_date,
    salary,
    SUM(salary) OVER (ORDER BY hire_date) as running_total
FROM employees;
 
-- Result:
-- | name  | hire_date  | salary | running_total |
-- |-------|------------|--------|---------------|
-- | Alice | 2024-01-01 | 50000  | 50000         |
-- | Bob   | 2024-01-02 | 60000  | 165000        |  -- Includes Carol!
-- | Carol | 2024-01-02 | 55000  | 165000        |  -- Same as Bob (RANGE default)
-- | David | 2024-01-03 | 70000  | 235000        |
 
-- If you want strictly cumulative (each row adds one value):
SELECT 
    name,
    hire_date,
    salary,
    SUM(salary) OVER (
        ORDER BY hire_date 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) as strict_running_total
FROM employees;
 
-- Result:
-- | name  | hire_date  | salary | strict_running_total |
-- |-------|------------|--------|----------------------|
-- | Alice | 2024-01-01 | 50000  | 50000                |
-- | Bob   | 2024-01-02 | 60000  | 110000               |  -- 50000+60000
-- | Carol | 2024-01-02 | 55000  | 165000               |  -- 50000+60000+55000
-- | David | 2024-01-03 | 70000  | 235000               |

When in Doubt, Be Explicit

Given the subtle differences between ROWS and RANGE defaults, consider explicitly specifying your frame clause when the behavior matters. This makes your intent clear and prevents surprises from peer rows.

Practical Frame Patterns

Here are the most commonly used frame patterns in real-world analytics:

running_total.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
-- Running total from partition start to current row
-- (This is the default, but being explicit is clearer)
 
SELECT 
    transaction_date,
    amount,
    SUM(amount) OVER (
        ORDER BY transaction_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) as running_total
FROM transactions;
 
-- With partitioning (running total resets per category):
SELECT 
    category,
    transaction_date,
    amount,
    SUM(amount) OVER (
        PARTITION BY category
        ORDER BY transaction_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) as category_running_total
FROM transactions;

Frame Boundaries in Detail

Let's examine each boundary type and how they interact:

boundary_details.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
-- Sample data (5 rows, already ordered):
-- | row | value |
-- |-----|-------|
-- | 1   | 10    |
-- | 2   | 20    |
-- | 3   | 30    |  <-- Current row for examples
-- | 4   | 40    |
-- | 5   | 50    |
 
-- Frame boundaries shown for current row = 3:
 
-- UNBOUNDED PRECEDING AND CURRENT ROW
-- Frame: rows 1, 2, 3 → SUM = 60
SUM(value) OVER (ORDER BY row ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
 
-- CURRENT ROW AND UNBOUNDED FOLLOWING  
-- Frame: rows 3, 4, 5 → SUM = 120
SUM(value) OVER (ORDER BY row ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
 
-- 1 PRECEDING AND 1 FOLLOWING
-- Frame: rows 2, 3, 4 → SUM = 90
SUM(value) OVER (ORDER BY row ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
 
-- 2 PRECEDING AND CURRENT ROW
-- Frame: rows 1, 2, 3 → SUM = 60
SUM(value) OVER (ORDER BY row ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
 
-- CURRENT ROW AND 2 FOLLOWING
-- Frame: rows 3, 4, 5 → SUM = 120
SUM(value) OVER (ORDER BY row ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING)
 
-- 1 PRECEDING AND 2 FOLLOWING
-- Frame: rows 2, 3, 4, 5 → SUM = 140
SUM(value) OVER (ORDER BY row ROWS BETWEEN 1 PRECEDING AND 2 FOLLOWING)
 
-- UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING (entire partition)
-- Frame: rows 1, 2, 3, 4, 5 → SUM = 150
SUM(value) OVER (ORDER BY row ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)

Edge cases at partition boundaries:

When the frame extends beyond partition boundaries, only existing rows are included:

edge_cases.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
-- Same data (5 rows):
-- | row | value |
-- |-----|-------|
-- | 1   | 10    |  <-- At start: "2 PRECEDING" only gets 0 rows before
-- | 2   | 20    |
-- | 3   | 30    |
-- | 4   | 40    |
-- | 5   | 50    |  <-- At end: "2 FOLLOWING" only gets 0 rows after
 
SELECT 
    row_num,
    value,
    SUM(value) OVER (
        ORDER BY row_num 
        ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING
    ) as sum_5
FROM data;
 
-- Result:
-- | row_num | value | sum_5 |
-- |---------|-------|-------|
-- | 1       | 10    | 60    |  -- Only rows 1,2,3 (no preceding available)
-- | 2       | 20    | 100   |  -- Rows 1,2,3,4 (only 1 preceding available)
-- | 3       | 30    | 150   |  -- Rows 1,2,3,4,5 (full window)
-- | 4       | 40    | 140   |  -- Rows 2,3,4,5 (only 1 following available)
-- | 5       | 50    | 120   |  -- Rows 3,4,5 (no following available)
 
-- For moving average, this affects edge values:
SELECT 
    row_num,
    value,
    AVG(value) OVER (
        ORDER BY row_num 
        ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING
    ) as avg_5,
    COUNT(*) OVER (
        ORDER BY row_num 
        ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING
    ) as window_size  -- Shows actual rows in frame
FROM data;
 
-- Result:
-- | row_num | value | avg_5 | window_size |
-- |---------|-------|-------|-------------|
-- | 1       | 10    | 20    | 3           |  -- 60/3
-- | 2       | 20    | 25    | 4           |  -- 100/4
-- | 3       | 30    | 30    | 5           |  -- 150/5
-- | 4       | 40    | 35    | 4           |  -- 140/4
-- | 5       | 50    | 40    | 3           |  -- 120/3

Frame Boundary Rules

The frame start must come before or at the frame end in logical order. You cannot have '2 FOLLOWING AND 2 PRECEDING'. Also, if the computed frame is empty (start > end for a row), aggregate functions return NULL.

The EXCLUDE Clause

The EXCLUDE clause provides fine-grained control over which rows to exclude from the frame, beyond what boundaries specify. This is particularly useful when dealing with peers (rows with the same ORDER BY value).

EXCLUDE options:

EXCLUDE NO OTHERS — Default; include all rows in frame
EXCLUDE CURRENT ROW — Exclude just the current row
EXCLUDE GROUP — Exclude current row and all its peers
EXCLUDE TIES — Exclude peers of current row (but keep current row)

exclude_examples.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
-- Data with peers (same ORDER BY value):
-- | id | score | team |
-- |----|-------|------|
-- | 1  | 100   | A    |
-- | 2  | 90    | A    |
-- | 3  | 90    | B    |  -- Tied with id=2
-- | 4  | 80    | A    |
 
-- EXCLUDE CURRENT ROW: compare to others only
SELECT 
    id,
    score,
    AVG(score) OVER (
        ORDER BY score DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        EXCLUDE CURRENT ROW
    ) as avg_others
FROM players;
 
-- Result:
-- | id | score | avg_others |
-- |----|-------|------------|
-- | 1  | 100   | 86.67      |  -- Average of 90, 90, 80
-- | 2  | 90    | 90         |  -- Average of 100, 90, 80
-- | 3  | 90    | 90         |  -- Average of 100, 90, 80
-- | 4  | 80    | 93.33      |  -- Average of 100, 90, 90
 
-- EXCLUDE GROUP: exclude current row and all peers
SELECT 
    id,
    score,
    AVG(score) OVER (
        ORDER BY score DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        EXCLUDE GROUP
    ) as avg_non_peers
FROM players;
 
-- Result:
-- | id | score | avg_non_peers |
-- |----|-------|---------------|
-- | 1  | 100   | 86.67         |  -- Average of 90, 90, 80
-- | 2  | 90    | 90            |  -- Average of 100, 80 (excludes both 90s)
-- | 3  | 90    | 90            |  -- Average of 100, 80
-- | 4  | 80    | 93.33         |  -- Average of 100, 90, 90
 
-- EXCLUDE TIES: keep current row, exclude only its peers
SELECT 
    id,
    score,
    AVG(score) OVER (
        ORDER BY score DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        EXCLUDE TIES
    ) as avg_exclude_ties
FROM players;
 
-- Result:
-- | id | score | avg_exclude_ties |
-- |----|-------|------------------|
-- | 1  | 100   | 90               |  -- No ties, same as full average
-- | 2  | 90    | 90               |  -- 100, 90 (this row), 80 = 270/3
-- | 3  | 90    | 90               |  -- 100, 90 (this row), 80 = 270/3
-- | 4  | 80    | 90               |  -- No ties for 80

Database Support for EXCLUDE

The EXCLUDE clause is part of the SQL standard but is not universally supported. PostgreSQL, SQLite, and recent database versions support it. SQL Server and MySQL (as of 8.0) have limited or no support. Check your database documentation.

Practical use case: Leave-one-out statistics

The EXCLUDE clause is perfect for "leave-one-out" calculations, commonly used in statistical analysis:

leave_one_out.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
-- Calculate how each employee's salary compares to average of all OTHERS
SELECT 
    name,
    department,
    salary,
    AVG(salary) OVER (
        PARTITION BY department
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        EXCLUDE CURRENT ROW
    ) as dept_avg_without_me,
    salary - AVG(salary) OVER (
        PARTITION BY department
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        EXCLUDE CURRENT ROW
    ) as diff_from_peers
FROM employees;
 
-- This avoids the issue where including yourself in the average
-- dilutes the comparison metric

Performance Considerations

Frame specifications can significantly impact query performance. Understanding these implications helps you write efficient analytical queries.

Performance Impact Factors

•Fixed-size ROWS frames are fastest — The database knows exactly how many rows to include. 'ROWS BETWEEN 6 PRECEDING AND CURRENT ROW' is very efficient.
•RANGE frames with value offsets are slower — The database must compare values to determine frame membership, not just count rows.
•UNBOUNDED requires full partition scan — 'UNBOUNDED PRECEDING' or 'UNBOUNDED FOLLOWING' means processing must consider all rows in the partition.
•Large frames on large partitions are expensive — A 30-day moving average on 10 years of daily data touches many rows per output row.
•EXCLUDE clause adds comparison overhead — The database must identify and exclude peers or current row from aggregation.

performance_tips.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
-- EFFICIENT: Fixed ROWS frame
AVG(price) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
 
-- LESS EFFICIENT: RANGE with interval (must compare dates)
AVG(price) OVER (ORDER BY date RANGE BETWEEN INTERVAL '6 days' PRECEDING AND CURRENT ROW)
 
-- EFFICIENT: Running total (cumulative, no lookback per row)
SUM(amount) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
 
-- POTENTIALLY EXPENSIVE: Full partition for every row
AVG(amount) OVER (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
 
-- TIP: Use the same frame specification for multiple functions
SELECT 
    date,
    amount,
    SUM(amount) OVER w as running_sum,
    AVG(amount) OVER w as running_avg,
    COUNT(*) OVER w as running_count
FROM transactions
WINDOW w AS (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW);
 
-- The database can compute all three in a single pass through the data

Optimization Strategies

For expensive window calculations on large datasets: (1) Pre-filter data to reduce partition sizes, (2) Use ROWS instead of RANGE when possible, (3) Consider pre-aggregating to larger time buckets, (4) Reuse window definitions with WINDOW clause to help optimizer.

Summary: Frame Specification

We've completed our exploration of window function fundamentals with the frame specification. Let's consolidate the key takeaways:

Key Takeaways

•Frame specification controls exactly which rows participate — Beyond partitions and ordering, frames give surgical precision over calculation scope.
•ROWS, RANGE, and GROUPS are fundamentally different — ROWS counts physical positions, RANGE uses value differences, GROUPS counts peer groups.
•Default frame changes with ORDER BY — Without ORDER BY: entire partition. With ORDER BY: RANGE UNBOUNDED PRECEDING TO CURRENT ROW.
•Common patterns have standard frame expressions — Running totals, moving averages, centered windows, remaining totals all have well-known frame syntax.
•Frame boundaries handle edges gracefully — When frame extends beyond partition, only available rows are included without error.
•EXCLUDE clause enables leave-one-out analysis — Exclude current row or its peers for statistical comparisons.
•Performance varies by frame type — Fixed ROWS frames are fastest; RANGE with value comparisons and UNBOUNDED frames are more expensive.

Module Complete!

You've now mastered the fundamentals of SQL window functions:

Window Function Concept — What they are and why they exist
OVER Clause — The gateway syntax for window functions
PARTITION BY — Dividing data into independent windows
ORDER BY in Windows — Defining calculation order and enabling running calculations
Frame Specification — Precise control over row participation

With these foundations, you're ready to tackle ranking functions, analytic functions, and advanced window function patterns in subsequent modules.

Module Complete

Congratulations! You now have a comprehensive understanding of window function basics—the conceptual model, core syntax components, and the frame specification system. You can construct sophisticated window function queries for running totals, moving averages, and custom analytical windows. Next, explore ranking functions and analytic functions to expand your window function toolkit.

5 / 5

Loading learning content...

SQLWindow Functions Basics

Window Functions Basics

LevelIntermediate

Duration75 mins

TopicWindow Functions Basics

5 / 5

Frame Specification

Precision Control Over Window Boundaries

Consider the difference between:

7-day moving average — Average of current day plus 6 preceding days
Running total — Sum from the start up to (and including) the current row
Centered average — Average of 3 rows before, current row, and 3 rows after
Excluding current — Sum of all peers except the current row

What You Will Learn

Frame Specification Syntax

The frame specification is the most complex part of the OVER clause syntax. Here's the complete structure:

OVER (
    [PARTITION BY ...]
    [ORDER BY ...]
    [frame_clause]
)

frame_clause:
    frame_mode BETWEEN frame_start AND frame_end [EXCLUDE clause]

frame_mode:
    ROWS | RANGE | GROUPS

frame_start and frame_end:
    UNBOUNDED PRECEDING
  | offset PRECEDING
  | CURRENT ROW
  | offset FOLLOWING
  | UNBOUNDED FOLLOWING

EXCLUDE clause:
    EXCLUDE CURRENT ROW
  | EXCLUDE GROUP
  | EXCLUDE TIES
  | EXCLUDE NO OTHERS

The frame specification only applies when ORDER BY is present in the window definition. Without ORDER BY, specifying a frame is usually meaningless (though some databases allow it).

frame_syntax_examples.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
-- Basic frame examples:
 
-- All rows from partition start to current row (default when ORDER BY present)
SUM(amount) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
 
-- All rows in partition (explicit version of default when no ORDER BY)
SUM(amount) OVER (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
 
-- Previous 2 rows plus current row (3-row window)
AVG(price) OVER (ORDER BY date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
 
-- Previous 6 days plus current day (7-day moving average with RANGE)
AVG(value) OVER (ORDER BY date RANGE BETWEEN INTERVAL '6 days' PRECEDING AND CURRENT ROW)
 
-- Centered window: 2 before, current, 2 after
AVG(score) OVER (ORDER BY position ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)
 
-- All following rows (not including current)
SUM(remaining) OVER (ORDER BY date ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)

Frame Boundary Options
Boundary	Meaning
UNBOUNDED PRECEDING	Start of the partition (first row after ORDER BY)
n PRECEDING	n rows/ranges/groups before current row
CURRENT ROW	The current row being processed
n FOLLOWING	n rows/ranges/groups after current row
UNBOUNDED FOLLOWING	End of the partition (last row after ORDER BY)

ROWS vs RANGE vs GROUPS

The frame mode (ROWS, RANGE, or GROUPS) determines how frame boundaries are interpreted:

ROWS: Physical row positions. "2 PRECEDING" means exactly 2 rows before the current row in the ordered sequence.

RANGE: Logical value ranges. "2 PRECEDING" means rows whose ORDER BY value is within 2 of the current row's value.

GROUPS: Peer group positions. "2 PRECEDING" means 2 groups of peers before the current row's peer group.

This distinction is critical for correct window function behavior, especially with duplicate ORDER BY values (peers).

rows_vs_range_vs_groups.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
-- Data with duplicates in ORDER BY column:
-- | row | date       | amount |
-- |-----|------------|--------|
-- | 1   | 2024-01-01 | 100    |
-- | 2   | 2024-01-02 | 200    |
-- | 3   | 2024-01-02 | 150    |  -- Same date as row 2 (peers)
-- | 4   | 2024-01-03 | 300    |
-- | 5   | 2024-01-04 | 250    |
 
-- ROWS: counts physical rows
SELECT 
    date,
    amount,
    SUM(amount) OVER (
        ORDER BY date 
        ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
    ) as rows_sum
FROM data;
 
-- Result:
-- | date       | amount | rows_sum |
-- |------------|--------|----------|
-- | 2024-01-01 | 100    | 100      |  -- No preceding row
-- | 2024-01-02 | 200    | 300      |  -- Row 1 + Row 2 (100+200)
-- | 2024-01-02 | 150    | 350      |  -- Row 2 + Row 3 (200+150)
-- | 2024-01-03 | 300    | 450      |  -- Row 3 + Row 4 (150+300)
-- | 2024-01-04 | 250    | 550      |  -- Row 4 + Row 5 (300+250)
 
-- RANGE: looks at value differences
SELECT 
    date,
    amount,
    SUM(amount) OVER (
        ORDER BY date 
        RANGE BETWEEN INTERVAL '1 day' PRECEDING AND CURRENT ROW
    ) as range_sum
FROM data;
 
-- Result:
-- | date       | amount | range_sum |
-- |------------|--------|-----------|
-- | 2024-01-01 | 100    | 100       |  -- Just Jan 1
-- | 2024-01-02 | 200    | 450       |  -- Jan 1-2: 100+200+150
-- | 2024-01-02 | 150    | 450       |  -- Same! All Jan 1-2 peers included
-- | 2024-01-03 | 300    | 650       |  -- Jan 2-3: 200+150+300
-- | 2024-01-04 | 250    | 550       |  -- Jan 3-4: 300+250
 
-- GROUPS: counts peer groups
SELECT 
    date,
    amount,
    SUM(amount) OVER (
        ORDER BY date 
        GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW
    ) as groups_sum
FROM data;
 
-- Result:
-- | date       | amount | groups_sum |
-- |------------|--------|------------|
-- | 2024-01-01 | 100    | 100        |  -- Group 1 only
-- | 2024-01-02 | 200    | 450        |  -- Groups 1+2: 100+200+150
-- | 2024-01-02 | 150    | 450        |  -- Same group 2, same result
-- | 2024-01-03 | 300    | 650        |  -- Groups 2+3: 200+150+300
-- | 2024-01-04 | 250    | 550        |  -- Groups 3+4: 300+250

Frame Mode Comparison
Mode	Unit of Measurement	Peers Handling	Common Use Case
ROWS	Physical row positions	Each peer row counted separately	Fixed-size sliding windows (e.g., 7 rows)
RANGE	ORDER BY value differences	All peers included together	Time-based windows (e.g., 7 days)
GROUPS	Peer group positions	Peer groups as atomic units	Group-based analysis with duplicates

RANGE Requires Specific ORDER BY Types

Default Frame Behavior

Understanding default frame behavior is crucial because most window function calls don't explicitly specify frames. The defaults depend on whether ORDER BY is present:

Without ORDER BY

•Default frame: All rows in partition
•Equivalent to: ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
•Result: Same aggregate value for every row in partition
•Effect: SUM() returns partition total, AVG() returns partition average

With ORDER BY

•Default frame: Start to current row
•Equivalent to: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
•Result: Running/cumulative values
•Effect: SUM() returns running total, AVG() returns running average

default_frames.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
-- Demonstrating default frame behavior
 
-- Without ORDER BY: entire partition
SELECT 
    name,
    department,
    salary,
    SUM(salary) OVER (PARTITION BY department) as dept_total
FROM employees;
-- dept_total is the same for all employees in each department
 
-- With ORDER BY: RANGE UNBOUNDED PRECEDING TO CURRENT ROW
SELECT 
    name,
    hire_date,
    salary,
    SUM(salary) OVER (ORDER BY hire_date) as running_total
FROM employees;
-- running_total grows as we move through rows
 
-- IMPORTANT: Default is RANGE, not ROWS!
-- This matters when there are ties (peers)
 
-- Data with same hire_date:
-- | name  | hire_date  | salary |
-- |-------|------------|--------|
-- | Alice | 2024-01-01 | 50000  |
-- | Bob   | 2024-01-02 | 60000  |
-- | Carol | 2024-01-02 | 55000  |  -- Same date as Bob
-- | David | 2024-01-03 | 70000  |
 
SELECT 
    name,
    hire_date,
    salary,
    SUM(salary) OVER (ORDER BY hire_date) as running_total
FROM employees;
 
-- Result:
-- | name  | hire_date  | salary | running_total |
-- |-------|------------|--------|---------------|
-- | Alice | 2024-01-01 | 50000  | 50000         |
-- | Bob   | 2024-01-02 | 60000  | 165000        |  -- Includes Carol!
-- | Carol | 2024-01-02 | 55000  | 165000        |  -- Same as Bob (RANGE default)
-- | David | 2024-01-03 | 70000  | 235000        |
 
-- If you want strictly cumulative (each row adds one value):
SELECT 
    name,
    hire_date,
    salary,
    SUM(salary) OVER (
        ORDER BY hire_date 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) as strict_running_total
FROM employees;
 
-- Result:
-- | name  | hire_date  | salary | strict_running_total |
-- |-------|------------|--------|----------------------|
-- | Alice | 2024-01-01 | 50000  | 50000                |
-- | Bob   | 2024-01-02 | 60000  | 110000               |  -- 50000+60000
-- | Carol | 2024-01-02 | 55000  | 165000               |  -- 50000+60000+55000
-- | David | 2024-01-03 | 70000  | 235000               |

When in Doubt, Be Explicit

Practical Frame Patterns

Here are the most commonly used frame patterns in real-world analytics:

running_total.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
-- Running total from partition start to current row
-- (This is the default, but being explicit is clearer)
 
SELECT 
    transaction_date,
    amount,
    SUM(amount) OVER (
        ORDER BY transaction_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) as running_total
FROM transactions;
 
-- With partitioning (running total resets per category):
SELECT 
    category,
    transaction_date,
    amount,
    SUM(amount) OVER (
        PARTITION BY category
        ORDER BY transaction_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) as category_running_total
FROM transactions;

Frame Boundaries in Detail

Let's examine each boundary type and how they interact:

boundary_details.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
-- Sample data (5 rows, already ordered):
-- | row | value |
-- |-----|-------|
-- | 1   | 10    |
-- | 2   | 20    |
-- | 3   | 30    |  <-- Current row for examples
-- | 4   | 40    |
-- | 5   | 50    |
 
-- Frame boundaries shown for current row = 3:
 
-- UNBOUNDED PRECEDING AND CURRENT ROW
-- Frame: rows 1, 2, 3 → SUM = 60
SUM(value) OVER (ORDER BY row ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
 
-- CURRENT ROW AND UNBOUNDED FOLLOWING  
-- Frame: rows 3, 4, 5 → SUM = 120
SUM(value) OVER (ORDER BY row ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
 
-- 1 PRECEDING AND 1 FOLLOWING
-- Frame: rows 2, 3, 4 → SUM = 90
SUM(value) OVER (ORDER BY row ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
 
-- 2 PRECEDING AND CURRENT ROW
-- Frame: rows 1, 2, 3 → SUM = 60
SUM(value) OVER (ORDER BY row ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
 
-- CURRENT ROW AND 2 FOLLOWING
-- Frame: rows 3, 4, 5 → SUM = 120
SUM(value) OVER (ORDER BY row ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING)
 
-- 1 PRECEDING AND 2 FOLLOWING
-- Frame: rows 2, 3, 4, 5 → SUM = 140
SUM(value) OVER (ORDER BY row ROWS BETWEEN 1 PRECEDING AND 2 FOLLOWING)
 
-- UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING (entire partition)
-- Frame: rows 1, 2, 3, 4, 5 → SUM = 150
SUM(value) OVER (ORDER BY row ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)

Edge cases at partition boundaries:

When the frame extends beyond partition boundaries, only existing rows are included:

edge_cases.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
-- Same data (5 rows):
-- | row | value |
-- |-----|-------|
-- | 1   | 10    |  <-- At start: "2 PRECEDING" only gets 0 rows before
-- | 2   | 20    |
-- | 3   | 30    |
-- | 4   | 40    |
-- | 5   | 50    |  <-- At end: "2 FOLLOWING" only gets 0 rows after
 
SELECT 
    row_num,
    value,
    SUM(value) OVER (
        ORDER BY row_num 
        ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING
    ) as sum_5
FROM data;
 
-- Result:
-- | row_num | value | sum_5 |
-- |---------|-------|-------|
-- | 1       | 10    | 60    |  -- Only rows 1,2,3 (no preceding available)
-- | 2       | 20    | 100   |  -- Rows 1,2,3,4 (only 1 preceding available)
-- | 3       | 30    | 150   |  -- Rows 1,2,3,4,5 (full window)
-- | 4       | 40    | 140   |  -- Rows 2,3,4,5 (only 1 following available)
-- | 5       | 50    | 120   |  -- Rows 3,4,5 (no following available)
 
-- For moving average, this affects edge values:
SELECT 
    row_num,
    value,
    AVG(value) OVER (
        ORDER BY row_num 
        ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING
    ) as avg_5,
    COUNT(*) OVER (
        ORDER BY row_num 
        ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING
    ) as window_size  -- Shows actual rows in frame
FROM data;
 
-- Result:
-- | row_num | value | avg_5 | window_size |
-- |---------|-------|-------|-------------|
-- | 1       | 10    | 20    | 3           |  -- 60/3
-- | 2       | 20    | 25    | 4           |  -- 100/4
-- | 3       | 30    | 30    | 5           |  -- 150/5
-- | 4       | 40    | 35    | 4           |  -- 140/4
-- | 5       | 50    | 40    | 3           |  -- 120/3

Frame Boundary Rules

The EXCLUDE Clause

EXCLUDE options:

EXCLUDE NO OTHERS — Default; include all rows in frame
EXCLUDE CURRENT ROW — Exclude just the current row
EXCLUDE GROUP — Exclude current row and all its peers
EXCLUDE TIES — Exclude peers of current row (but keep current row)

exclude_examples.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
-- Data with peers (same ORDER BY value):
-- | id | score | team |
-- |----|-------|------|
-- | 1  | 100   | A    |
-- | 2  | 90    | A    |
-- | 3  | 90    | B    |  -- Tied with id=2
-- | 4  | 80    | A    |
 
-- EXCLUDE CURRENT ROW: compare to others only
SELECT 
    id,
    score,
    AVG(score) OVER (
        ORDER BY score DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        EXCLUDE CURRENT ROW
    ) as avg_others
FROM players;
 
-- Result:
-- | id | score | avg_others |
-- |----|-------|------------|
-- | 1  | 100   | 86.67      |  -- Average of 90, 90, 80
-- | 2  | 90    | 90         |  -- Average of 100, 90, 80
-- | 3  | 90    | 90         |  -- Average of 100, 90, 80
-- | 4  | 80    | 93.33      |  -- Average of 100, 90, 90
 
-- EXCLUDE GROUP: exclude current row and all peers
SELECT 
    id,
    score,
    AVG(score) OVER (
        ORDER BY score DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        EXCLUDE GROUP
    ) as avg_non_peers
FROM players;
 
-- Result:
-- | id | score | avg_non_peers |
-- |----|-------|---------------|
-- | 1  | 100   | 86.67         |  -- Average of 90, 90, 80
-- | 2  | 90    | 90            |  -- Average of 100, 80 (excludes both 90s)
-- | 3  | 90    | 90            |  -- Average of 100, 80
-- | 4  | 80    | 93.33         |  -- Average of 100, 90, 90
 
-- EXCLUDE TIES: keep current row, exclude only its peers
SELECT 
    id,
    score,
    AVG(score) OVER (
        ORDER BY score DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        EXCLUDE TIES
    ) as avg_exclude_ties
FROM players;
 
-- Result:
-- | id | score | avg_exclude_ties |
-- |----|-------|------------------|
-- | 1  | 100   | 90               |  -- No ties, same as full average
-- | 2  | 90    | 90               |  -- 100, 90 (this row), 80 = 270/3
-- | 3  | 90    | 90               |  -- 100, 90 (this row), 80 = 270/3
-- | 4  | 80    | 90               |  -- No ties for 80

Database Support for EXCLUDE

Practical use case: Leave-one-out statistics

The EXCLUDE clause is perfect for "leave-one-out" calculations, commonly used in statistical analysis:

leave_one_out.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
-- Calculate how each employee's salary compares to average of all OTHERS
SELECT 
    name,
    department,
    salary,
    AVG(salary) OVER (
        PARTITION BY department
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        EXCLUDE CURRENT ROW
    ) as dept_avg_without_me,
    salary - AVG(salary) OVER (
        PARTITION BY department
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        EXCLUDE CURRENT ROW
    ) as diff_from_peers
FROM employees;
 
-- This avoids the issue where including yourself in the average
-- dilutes the comparison metric

Performance Considerations

Frame specifications can significantly impact query performance. Understanding these implications helps you write efficient analytical queries.

Performance Impact Factors

•Fixed-size ROWS frames are fastest — The database knows exactly how many rows to include. 'ROWS BETWEEN 6 PRECEDING AND CURRENT ROW' is very efficient.
•RANGE frames with value offsets are slower — The database must compare values to determine frame membership, not just count rows.
•UNBOUNDED requires full partition scan — 'UNBOUNDED PRECEDING' or 'UNBOUNDED FOLLOWING' means processing must consider all rows in the partition.
•Large frames on large partitions are expensive — A 30-day moving average on 10 years of daily data touches many rows per output row.
•EXCLUDE clause adds comparison overhead — The database must identify and exclude peers or current row from aggregation.

performance_tips.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
-- EFFICIENT: Fixed ROWS frame
AVG(price) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
 
-- LESS EFFICIENT: RANGE with interval (must compare dates)
AVG(price) OVER (ORDER BY date RANGE BETWEEN INTERVAL '6 days' PRECEDING AND CURRENT ROW)
 
-- EFFICIENT: Running total (cumulative, no lookback per row)
SUM(amount) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
 
-- POTENTIALLY EXPENSIVE: Full partition for every row
AVG(amount) OVER (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
 
-- TIP: Use the same frame specification for multiple functions
SELECT 
    date,
    amount,
    SUM(amount) OVER w as running_sum,
    AVG(amount) OVER w as running_avg,
    COUNT(*) OVER w as running_count
FROM transactions
WINDOW w AS (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW);
 
-- The database can compute all three in a single pass through the data

Optimization Strategies

Summary: Frame Specification

We've completed our exploration of window function fundamentals with the frame specification. Let's consolidate the key takeaways:

Key Takeaways

•Frame specification controls exactly which rows participate — Beyond partitions and ordering, frames give surgical precision over calculation scope.
•ROWS, RANGE, and GROUPS are fundamentally different — ROWS counts physical positions, RANGE uses value differences, GROUPS counts peer groups.
•Default frame changes with ORDER BY — Without ORDER BY: entire partition. With ORDER BY: RANGE UNBOUNDED PRECEDING TO CURRENT ROW.
•Common patterns have standard frame expressions — Running totals, moving averages, centered windows, remaining totals all have well-known frame syntax.
•Frame boundaries handle edges gracefully — When frame extends beyond partition, only available rows are included without error.
•EXCLUDE clause enables leave-one-out analysis — Exclude current row or its peers for statistical comparisons.
•Performance varies by frame type — Fixed ROWS frames are fastest; RANGE with value comparisons and UNBOUNDED frames are more expensive.

Module Complete!

You've now mastered the fundamentals of SQL window functions:

Window Function Concept — What they are and why they exist
OVER Clause — The gateway syntax for window functions
PARTITION BY — Dividing data into independent windows
ORDER BY in Windows — Defining calculation order and enabling running calculations
Frame Specification — Precise control over row participation

With these foundations, you're ready to tackle ranking functions, analytic functions, and advanced window function patterns in subsequent modules.

Module Complete

5 / 5