Database Management SystemsAnalytic Functions

Analytic Functions

LevelAdvanced

Duration120 mins

TopicAnalytic Functions

3 / 5

Running Totals

The Power of Cumulative Computation

Some of the most valuable business questions involve accumulation: What's our year-to-date revenue? How many users have we acquired since launch? What's the running balance of this bank account? How many units have been shipped cumulatively this quarter?

Before window functions, computing running totals required either self-joins (quadratically expensive), correlated subqueries (even worse), or procedural code that processed rows sequentially. Each approach was either slow, complex, or required moving logic outside SQL.

Running totals with window functions elegantly solve this problem. By combining aggregate functions like SUM() with the OVER clause and appropriate frame specifications, you can compute cumulative values in a single, efficient table scan. The result is clean, declarative SQL that the optimizer can execute efficiently.

This capability transforms financial reporting, inventory management, time-series analysis, and countless other domains where understanding the accumulation pattern over time or sequence is essential.

What You Will Learn

By the end of this page, you will master the SUM() OVER pattern for running totals, understand how frame specifications control accumulation boundaries, explore variations with COUNT, AVG, and other aggregates, apply running totals to financial, inventory, and time-series scenarios, and optimize cumulative computations for large datasets.

Conceptual Foundation

A running total (also called a cumulative sum or prefix sum) is the sum of all values from the beginning of a sequence up to and including the current position. For each row, it asks: What's the total so far?

Mathematical Definition:

For a sequence of values v₁, v₂, v₃, ..., vₙ, the running total at position i is:

RT(i) = v₁ + v₂ + ... + vᵢ = Σⱼ₌₁ⁱ vⱼ

Example Walkthrough:

Running Total Calculation Example
Day	Sales	Calculation	Running Total
Monday	100	100	100
Tuesday	150	100 + 150	250
Wednesday	80	100 + 150 + 80	330
Thursday	200	100 + 150 + 80 + 200	530
Friday	175	100 + 150 + 80 + 200 + 175	705

Key Insight: Incremental Computation

While the definition sums all values from the start, a running total can be computed incrementally:

RT(i) = RT(i-1) + vᵢ

This property enables efficient single-pass algorithms—the database doesn't need to re-sum all previous values for each row; it just adds the current value to the previous running total.

Converting Mermaid diagram...

Running Totals Preserve All Rows

Unlike GROUP BY aggregation which collapses rows, running totals produce a result for each input row. This is the essence of window functions: compute aggregate-like values while preserving row-level detail.

Basic Syntax and Patterns

The fundamental pattern for running totals uses SUM() with an OVER clause that specifies ordering. Let's build from the simplest form to more sophisticated variations.

running_total_basic.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
-- Basic running total syntax
SELECT 
    column,
    value,
    SUM(value) OVER (ORDER BY column) AS running_total
FROM table_name;
 
-- Complete syntax with all options
SUM(expression) OVER (
    [PARTITION BY partition_expression, ...]
    ORDER BY sort_expression [ASC | DESC] [NULLS FIRST | NULLS LAST]
    [frame_clause]  -- Usually implicit for running totals
)
 
-- Practical example: Daily sales with running total
SELECT 
    sale_date,
    daily_revenue,
    SUM(daily_revenue) OVER (ORDER BY sale_date) AS cumulative_revenue
FROM daily_sales;
 
-- Result:
-- | sale_date  | daily_revenue | cumulative_revenue |
-- |------------|---------------|---------------------|
-- | 2024-01-01 | 1000          | 1000                |
-- | 2024-01-02 | 1500          | 2500                |
-- | 2024-01-03 | 800           | 3300                |
-- | 2024-01-04 | 2000          | 5300                |
-- | 2024-01-05 | 1750          | 7050                |

Understanding the Default Frame:

When you specify ORDER BY without an explicit frame clause, the default is:

RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

This means: sum all rows from the partition start up to and including rows with ORDER BY values equal to the current row.

RANGE vs ROWS for Running Totals:

The distinction matters when ORDER BY values have duplicates:

range_vs_rows.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
-- Data with duplicate dates (multiple sales same day)
-- | sale_date  | amount |
-- |------------|--------|
-- | 2024-01-01 | 100    |
-- | 2024-01-02 | 150    |
-- | 2024-01-02 | 50     |  <-- Same date!
-- | 2024-01-03 | 200    |
 
-- RANGE (default): Groups duplicate ORDER BY values together
SELECT 
    sale_date,
    amount,
    SUM(amount) OVER (ORDER BY sale_date) AS running_total_range
FROM sales;
-- Result: Both Jan 2 rows show 300 (100 + 150 + 50) because they're in same RANGE
 
-- ROWS: Strict physical row ordering
SELECT 
    sale_date,
    amount,
    SUM(amount) OVER (ORDER BY sale_date ROWS UNBOUNDED PRECEDING) AS running_total_rows
FROM sales;
-- Result: First Jan 2 row shows 250 (100 + 150), second shows 300 (100 + 150 + 50)

RANGE vs ROWS for Running Totals
Aspect	RANGE (Default)	ROWS
Duplicate handling	Groups ties together	Different for each physical row
Determinism with ties	Deterministic (same result for ties)	Non-deterministic without tiebreaker
Semantic	"All values up through this value"	"All values up through this row"
Use case	When ties should be treated as one point	When exact row position matters

Best Practice: Be Explicit

For clarity and consistent behavior across databases, explicitly specify your frame: ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW for strict running totals, or RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW when grouping ties is intentional.

Partitioned Running Totals

Often, running totals should reset at logical boundaries—per customer, per product, per fiscal period. The PARTITION BY clause creates these independent running totals.

partitioned_running_totals.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
-- Running total per customer (resets for each customer)
SELECT 
    customer_id,
    order_date,
    order_total,
    SUM(order_total) OVER (
        PARTITION BY customer_id 
        ORDER BY order_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS customer_running_total
FROM orders
ORDER BY customer_id, order_date;
 
-- Result:
-- | customer_id | order_date | order_total | customer_running_total |
-- |-------------|------------|-------------|------------------------|
-- | 1001        | 2024-01-05 | 250         | 250                    |
-- | 1001        | 2024-01-12 | 175         | 425                    |
-- | 1001        | 2024-01-20 | 300         | 725                    |
-- | 1002        | 2024-01-03 | 500         | 500    <-- Resets!     |
-- | 1002        | 2024-01-18 | 225         | 725                    |
 
-- Multiple partition columns: Running total per product per month
SELECT 
    product_id,
    DATE_TRUNC('month', sale_date) AS sale_month,
    sale_date,
    quantity,
    SUM(quantity) OVER (
        PARTITION BY product_id, DATE_TRUNC('month', sale_date)
        ORDER BY sale_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS month_to_date_quantity
FROM product_sales
ORDER BY product_id, sale_date;

Common Partitioning Patterns:

Partitioning Strategies for Running Totals

•By entity: PARTITION BY customer_id, PARTITION BY account_number — Track individual histories
•By time period: PARTITION BY YEAR(date), PARTITION BY fiscal_quarter — Reset at period boundaries
•By category: PARTITION BY product_category, PARTITION BY region — Compare accumulation patterns across segments
•By hierarchy: PARTITION BY department, team — Nested running totals within organizational structure
•Compound: PARTITION BY customer_id, YEAR(date) — Per-customer yearly running totals

year_to_date_example.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
-- Year-to-Date (YTD) revenue calculation
SELECT 
    sale_date,
    EXTRACT(YEAR FROM sale_date) AS year,
    daily_revenue,
    SUM(daily_revenue) OVER (
        PARTITION BY EXTRACT(YEAR FROM sale_date)
        ORDER BY sale_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS ytd_revenue
FROM daily_sales
ORDER BY sale_date;
 
-- Compare current day to same YTD point last year
WITH ytd_data AS (
    SELECT 
        sale_date,
        EXTRACT(YEAR FROM sale_date) AS year,
        EXTRACT(DOY FROM sale_date) AS day_of_year,
        daily_revenue,
        SUM(daily_revenue) OVER (
            PARTITION BY EXTRACT(YEAR FROM sale_date)
            ORDER BY sale_date
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS ytd_revenue
    FROM daily_sales
)
SELECT 
    curr.sale_date,
    curr.ytd_revenue AS current_ytd,
    prev.ytd_revenue AS prior_year_ytd,
    ROUND(100.0 * (curr.ytd_revenue - prev.ytd_revenue) / NULLIF(prev.ytd_revenue, 0), 2) AS ytd_growth_pct
FROM ytd_data curr
LEFT JOIN ytd_data prev 
    ON curr.day_of_year = prev.day_of_year 
    AND curr.year = prev.year + 1
WHERE curr.year = 2024
ORDER BY curr.sale_date;

Partition Granularity Trade-offs

Finer partitions (more groups) mean shorter sequences and faster per-row computation, but more partition overhead. Coarser partitions (larger groups) can leverage the incremental nature of running totals across longer sequences. Choose based on your analytical needs and data distribution.

Variations Beyond SUM

While SUM() is the most common, any aggregate function can compute cumulative values. Let's explore other aggregate-based running computations.

running_aggregates.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
-- Running COUNT: How many transactions so far?
SELECT 
    transaction_date,
    amount,
    COUNT(*) OVER (ORDER BY transaction_date ROWS UNBOUNDED PRECEDING) AS transaction_count
FROM transactions;
 
-- Running AVG: Average of all values up to current point
SELECT 
    sale_date,
    daily_revenue,
    ROUND(AVG(daily_revenue) OVER (
        ORDER BY sale_date 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ), 2) AS running_avg_revenue
FROM daily_sales;
 
-- Running MAX: Maximum value seen so far (useful for high-water marks)
SELECT 
    trade_date,
    close_price,
    MAX(close_price) OVER (
        ORDER BY trade_date 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS all_time_high
FROM stock_prices
WHERE symbol = 'AAPL';
 
-- Running MIN: Minimum value seen so far (useful for low-water marks)
SELECT 
    trade_date,
    close_price,
    MIN(close_price) OVER (
        ORDER BY trade_date 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS all_time_low
FROM stock_prices
WHERE symbol = 'AAPL';

Common Running Aggregate Patterns
Function	Running Interpretation	Typical Use Case
SUM()	Cumulative total	Revenue YTD, account balance
COUNT()	Cumulative count	User acquisition, event sequence number
AVG()	Expanding average	Performance trending, quality metrics
MAX()	High-water mark	Peak price, record values
MIN()	Low-water mark	Trough values, minimum achieved
STDDEV()	Expanding volatility	Risk metrics, variance tracking

advanced_running_patterns.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
-- Combining running aggregates for analysis
SELECT 
    trade_date,
    close_price,
    SUM(volume) OVER w AS cumulative_volume,
    MAX(close_price) OVER w AS running_high,
    MIN(close_price) OVER w AS running_low,
    close_price - MIN(close_price) OVER w AS gain_from_low,
    MAX(close_price) OVER w - close_price AS drop_from_high
FROM stock_prices
WHERE symbol = 'AAPL' AND trade_date >= '2024-01-01'
WINDOW w AS (ORDER BY trade_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW);
 
-- Running count with filter (conditional accumulation)
SELECT 
    event_date,
    event_type,
    COUNT(*) FILTER (WHERE event_type = 'purchase') OVER (
        ORDER BY event_date 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS purchases_so_far,
    COUNT(*) FILTER (WHERE event_type = 'return') OVER (
        ORDER BY event_date 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS returns_so_far
FROM customer_events;
 
-- Note: FILTER syntax is PostgreSQL. For other databases, use CASE:
SELECT 
    event_date,
    event_type,
    SUM(CASE WHEN event_type = 'purchase' THEN 1 ELSE 0 END) OVER (
        ORDER BY event_date 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS purchases_so_far
FROM customer_events;

Running DISTINCT COUNT

COUNT(DISTINCT ...) OVER is not supported in most databases because maintaining a running distinct count requires tracking all seen values. Workarounds include using DENSE_RANK() for surrogate counting or materializing intermediate results.

Practical Business Applications

Running totals are ubiquitous in business analytics. Let's explore the most important real-world applications with production-ready SQL patterns.

Financial Statements and Account Balances:

Running totals are the foundation of financial reporting—computing balances, YTD figures, and period-over-period comparisons.

financial_reporting.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
-- Bank account running balance
SELECT 
    transaction_date,
    transaction_type,
    CASE 
        WHEN transaction_type IN ('deposit', 'credit') THEN amount
        ELSE -amount 
    END AS signed_amount,
    SUM(CASE 
        WHEN transaction_type IN ('deposit', 'credit') THEN amount
        ELSE -amount 
    END) OVER (
        PARTITION BY account_id
        ORDER BY transaction_date, transaction_id
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_balance
FROM bank_transactions
WHERE account_id = 'ACC-001'
ORDER BY transaction_date, transaction_id;
 
-- Monthly revenue with YTD and full-year comparison
WITH monthly_revenue AS (
    SELECT 
        DATE_TRUNC('month', sale_date) AS month,
        EXTRACT(YEAR FROM sale_date) AS year,
        SUM(amount) AS monthly_revenue
    FROM sales
    GROUP BY 1, 2
)
SELECT 
    month,
    monthly_revenue,
    SUM(monthly_revenue) OVER (
        PARTITION BY year
        ORDER BY month
    ) AS ytd_revenue,
    SUM(monthly_revenue) OVER (
        PARTITION BY year
    ) AS full_year_revenue,
    ROUND(100.0 * SUM(monthly_revenue) OVER (
        PARTITION BY year ORDER BY month
    ) / SUM(monthly_revenue) OVER (
        PARTITION BY year
    ), 1) AS pct_of_annual_target
FROM monthly_revenue
ORDER BY month;

Resetting and Conditional Running Totals

Sometimes running totals need to reset based on conditions beyond simple partitioning—for example, resetting when a value exceeds a threshold or when a specific event occurs. These patterns require creative solutions.

resetting_running_totals.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
-- Pattern: Reset running total on specific event type
-- Use case: Cumulative order value since last return
 
WITH event_flags AS (
    SELECT 
        *,
        CASE WHEN event_type = 'return' THEN 1 ELSE 0 END AS reset_flag
    FROM customer_events
),
reset_groups AS (
    SELECT 
        *,
        SUM(reset_flag) OVER (
            PARTITION BY customer_id 
            ORDER BY event_date 
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS reset_group
    FROM event_flags
)
SELECT 
    customer_id,
    event_date,
    event_type,
    amount,
    SUM(amount) OVER (
        PARTITION BY customer_id, reset_group
        ORDER BY event_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total_since_reset
FROM reset_groups
ORDER BY customer_id, event_date;
 
-- Pattern: Running total that resets when exceeding threshold
-- Use case: Batch accumulation (fill boxes until 10kg, then start new box)
WITH ordered_items AS (
    SELECT 
        *,
        ROW_NUMBER() OVER (ORDER BY item_id) AS rn
    FROM items
)
-- This requires recursive CTE for true threshold-based resetting
-- Simplified: using partition-based approach
SELECT 
    item_id,
    weight,
    -- This is a simplified simulation; true threshold reset needs procedural logic
    SUM(weight) OVER (ORDER BY item_id ROWS UNBOUNDED PRECEDING) AS naive_cumulative
FROM ordered_items;

Conditional Accumulation:

Use CASE expressions or FILTER clauses to accumulate only when conditions are met:

conditional_accumulation.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
-- Running total of only positive amounts (exclude refunds/returns)
SELECT 
    transaction_date,
    transaction_type,
    amount,
    SUM(CASE WHEN amount > 0 THEN amount ELSE 0 END) OVER (
        ORDER BY transaction_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS positive_running_total,
    SUM(CASE WHEN amount < 0 THEN amount ELSE 0 END) OVER (
        ORDER BY transaction_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS negative_running_total
FROM transactions;
 
-- PostgreSQL: Using FILTER for cleaner syntax
SELECT 
    transaction_date,
    transaction_type,
    amount,
    SUM(amount) FILTER (WHERE amount > 0) OVER (
        ORDER BY transaction_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS positive_running_total
FROM transactions;
 
-- Multiple simultaneous running totals by category
SELECT 
    sale_date,
    product_category,
    sale_amount,
    SUM(sale_amount) OVER (ORDER BY sale_date ROWS UNBOUNDED PRECEDING) AS total_running,
    SUM(CASE WHEN product_category = 'Electronics' THEN sale_amount ELSE 0 END) 
        OVER (ORDER BY sale_date ROWS UNBOUNDED PRECEDING) AS electronics_running,
    SUM(CASE WHEN product_category = 'Clothing' THEN sale_amount ELSE 0 END) 
        OVER (ORDER BY sale_date ROWS UNBOUNDED PRECEDING) AS clothing_running
FROM sales;

True Threshold-Based Reset Is Hard

Resetting a running total when it reaches a threshold (e.g., batch filling) cannot be done with pure window functions—the reset decision depends on the running total itself, creating a circular dependency. These problems require recursive CTEs or procedural code.

Performance Optimization

Running totals can be computed efficiently, but understanding the performance characteristics helps you scale to large datasets.

Algorithmic Complexity:

Optimal algorithm: O(n) — single pass with accumulation Sort overhead: O(n log n) if data isn't pre-sorted Total: O(n log n) dominated by sort, or O(n) with index-provided order

Key insight: The running total computation itself is linear—the only expensive part is sorting. If you have an index that provides the ORDER BY sequence, no sort is needed.

performance_optimization.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
-- Index recommendation for running totals
-- For: SUM(amount) OVER (PARTITION BY customer_id ORDER BY date)
CREATE INDEX idx_customer_date ON transactions(customer_id, transaction_date);
-- This index provides both partition ordering and sort ordering
 
-- Check execution plan for sort operations
EXPLAIN ANALYZE
SELECT 
    customer_id,
    transaction_date,
    amount,
    SUM(amount) OVER (PARTITION BY customer_id ORDER BY transaction_date) AS running_total
FROM transactions
WHERE customer_id IN (1001, 1002, 1003);
 
-- Optimization: Filter before windowing
-- GOOD: Filter first (reduces window function input)
SELECT 
    customer_id,
    transaction_date,
    amount,
    SUM(amount) OVER (PARTITION BY customer_id ORDER BY transaction_date) AS running_total
FROM transactions
WHERE transaction_date >= '2024-01-01'  -- Filter reduces rows before windowing
ORDER BY customer_id, transaction_date;
 
-- Optimization: Consolidate window specifications
SELECT 
    date,
    amount,
    SUM(amount) OVER w AS running_sum,
    COUNT(*) OVER w AS running_count,
    AVG(amount) OVER w AS running_avg
FROM transactions
WINDOW w AS (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW);
-- All three aggregates computed in single pass over same window

Performance Best Practices

•Index ORDER BY columns: Create indexes matching (partition_columns, order_column) to eliminate sort operations
•Filter early: Apply WHERE clauses before windowing to reduce input size
•Consolidate windows: Use WINDOW clause to define once and reference multiple times
•Avoid unnecessary ordering: If only care about final totals, use GROUP BY instead of window functions
•Partition appropriately: Smaller partitions process faster but create more overhead; balance based on data
•Materialize if reused: If the same running totals are queried repeatedly, consider materialized views

Performance Impact Factors
Factor	Impact	Mitigation
Missing ORDER BY index	Full table sort O(n log n)	Create covering index
Large partition sizes	More memory per partition	Filter or use finer partitions
Multiple window definitions	Potential multiple sorts	Consolidate with WINDOW clause
RANGE vs ROWS	RANGE may be slower	Use ROWS when exact ordering exists
No partition parallelism	Sequential processing	Ensure partitions enable parallel execution

Check EXPLAIN Output

Always check your query plan for 'Sort' nodes. If you see a sort and expected the index to provide order, the index may not match the window specification. Also look for 'WindowAgg' operators and ensure they're not repeated unnecessarily.

Summary: Running Totals

Running totals transform how we analyze sequential and time-based data, enabling cumulative analysis that would otherwise require complex procedural code. Let's consolidate the essential concepts.

Key Takeaways

•Running totals accumulate values: SUM(value) OVER (ORDER BY col) computes the total of all values from the start up to the current row.
•Default frame is RANGE: With ORDER BY, the default frame groups ties together. Use ROWS for strict physical row ordering.
•PARTITION BY creates independent totals: Running totals reset at partition boundaries—essential for per-entity or per-period analysis.
•Any aggregate works: COUNT, AVG, MAX, MIN, and STDDEV all support running computations with the same pattern.
•Business applications are ubiquitous: Financial balances, YTD revenue, inventory levels, user growth, and quota tracking all rely on running totals.
•Conditional accumulation uses CASE: Selectively include values with CASE expressions or FILTER clauses.
•Performance is typically O(n log n): Dominated by sorting; indexes matching ORDER BY eliminate sort overhead.
•Threshold-based reset is complex: Resetting when a running total hits a threshold requires recursive CTEs or procedural code.

What's Next:

Building on running totals, we'll explore moving averages—a specialized form of windowed aggregation that considers only a recent subset of values. Moving averages are essential for trend smoothing, anomaly detection, and time-series analysis.

Page Complete

You now have deep understanding of running totals—from basic SUM() OVER patterns through partitioning, conditional accumulation, and performance optimization. These patterns form the backbone of financial and operational analytics. Practice applying them to your own time-series and sequential data.

3 / 5

Loading learning content...

Database Management SystemsAnalytic Functions

Analytic Functions

LevelAdvanced

Duration120 mins

TopicAnalytic Functions

3 / 5

Running Totals

The Power of Cumulative Computation

What You Will Learn

Conceptual Foundation

Mathematical Definition:

For a sequence of values v₁, v₂, v₃, ..., vₙ, the running total at position i is:

RT(i) = v₁ + v₂ + ... + vᵢ = Σⱼ₌₁ⁱ vⱼ

Example Walkthrough:

Running Total Calculation Example
Day	Sales	Calculation	Running Total
Monday	100	100	100
Tuesday	150	100 + 150	250
Wednesday	80	100 + 150 + 80	330
Thursday	200	100 + 150 + 80 + 200	530
Friday	175	100 + 150 + 80 + 200 + 175	705

Key Insight: Incremental Computation

While the definition sums all values from the start, a running total can be computed incrementally:

RT(i) = RT(i-1) + vᵢ

This property enables efficient single-pass algorithms—the database doesn't need to re-sum all previous values for each row; it just adds the current value to the previous running total.

Converting Mermaid diagram...

Running Totals Preserve All Rows

Basic Syntax and Patterns

The fundamental pattern for running totals uses SUM() with an OVER clause that specifies ordering. Let's build from the simplest form to more sophisticated variations.

running_total_basic.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
-- Basic running total syntax
SELECT 
    column,
    value,
    SUM(value) OVER (ORDER BY column) AS running_total
FROM table_name;
 
-- Complete syntax with all options
SUM(expression) OVER (
    [PARTITION BY partition_expression, ...]
    ORDER BY sort_expression [ASC | DESC] [NULLS FIRST | NULLS LAST]
    [frame_clause]  -- Usually implicit for running totals
)
 
-- Practical example: Daily sales with running total
SELECT 
    sale_date,
    daily_revenue,
    SUM(daily_revenue) OVER (ORDER BY sale_date) AS cumulative_revenue
FROM daily_sales;
 
-- Result:
-- | sale_date  | daily_revenue | cumulative_revenue |
-- |------------|---------------|---------------------|
-- | 2024-01-01 | 1000          | 1000                |
-- | 2024-01-02 | 1500          | 2500                |
-- | 2024-01-03 | 800           | 3300                |
-- | 2024-01-04 | 2000          | 5300                |
-- | 2024-01-05 | 1750          | 7050                |

Understanding the Default Frame:

When you specify ORDER BY without an explicit frame clause, the default is:

RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

This means: sum all rows from the partition start up to and including rows with ORDER BY values equal to the current row.

RANGE vs ROWS for Running Totals:

The distinction matters when ORDER BY values have duplicates:

range_vs_rows.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
-- Data with duplicate dates (multiple sales same day)
-- | sale_date  | amount |
-- |------------|--------|
-- | 2024-01-01 | 100    |
-- | 2024-01-02 | 150    |
-- | 2024-01-02 | 50     |  <-- Same date!
-- | 2024-01-03 | 200    |
 
-- RANGE (default): Groups duplicate ORDER BY values together
SELECT 
    sale_date,
    amount,
    SUM(amount) OVER (ORDER BY sale_date) AS running_total_range
FROM sales;
-- Result: Both Jan 2 rows show 300 (100 + 150 + 50) because they're in same RANGE
 
-- ROWS: Strict physical row ordering
SELECT 
    sale_date,
    amount,
    SUM(amount) OVER (ORDER BY sale_date ROWS UNBOUNDED PRECEDING) AS running_total_rows
FROM sales;
-- Result: First Jan 2 row shows 250 (100 + 150), second shows 300 (100 + 150 + 50)

RANGE vs ROWS for Running Totals
Aspect	RANGE (Default)	ROWS
Duplicate handling	Groups ties together	Different for each physical row
Determinism with ties	Deterministic (same result for ties)	Non-deterministic without tiebreaker
Semantic	"All values up through this value"	"All values up through this row"
Use case	When ties should be treated as one point	When exact row position matters

Best Practice: Be Explicit

Partitioned Running Totals

Often, running totals should reset at logical boundaries—per customer, per product, per fiscal period. The PARTITION BY clause creates these independent running totals.

partitioned_running_totals.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
-- Running total per customer (resets for each customer)
SELECT 
    customer_id,
    order_date,
    order_total,
    SUM(order_total) OVER (
        PARTITION BY customer_id 
        ORDER BY order_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS customer_running_total
FROM orders
ORDER BY customer_id, order_date;
 
-- Result:
-- | customer_id | order_date | order_total | customer_running_total |
-- |-------------|------------|-------------|------------------------|
-- | 1001        | 2024-01-05 | 250         | 250                    |
-- | 1001        | 2024-01-12 | 175         | 425                    |
-- | 1001        | 2024-01-20 | 300         | 725                    |
-- | 1002        | 2024-01-03 | 500         | 500    <-- Resets!     |
-- | 1002        | 2024-01-18 | 225         | 725                    |
 
-- Multiple partition columns: Running total per product per month
SELECT 
    product_id,
    DATE_TRUNC('month', sale_date) AS sale_month,
    sale_date,
    quantity,
    SUM(quantity) OVER (
        PARTITION BY product_id, DATE_TRUNC('month', sale_date)
        ORDER BY sale_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS month_to_date_quantity
FROM product_sales
ORDER BY product_id, sale_date;

Common Partitioning Patterns:

Partitioning Strategies for Running Totals

•By entity: PARTITION BY customer_id, PARTITION BY account_number — Track individual histories
•By time period: PARTITION BY YEAR(date), PARTITION BY fiscal_quarter — Reset at period boundaries
•By category: PARTITION BY product_category, PARTITION BY region — Compare accumulation patterns across segments
•By hierarchy: PARTITION BY department, team — Nested running totals within organizational structure
•Compound: PARTITION BY customer_id, YEAR(date) — Per-customer yearly running totals

year_to_date_example.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
-- Year-to-Date (YTD) revenue calculation
SELECT 
    sale_date,
    EXTRACT(YEAR FROM sale_date) AS year,
    daily_revenue,
    SUM(daily_revenue) OVER (
        PARTITION BY EXTRACT(YEAR FROM sale_date)
        ORDER BY sale_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS ytd_revenue
FROM daily_sales
ORDER BY sale_date;
 
-- Compare current day to same YTD point last year
WITH ytd_data AS (
    SELECT 
        sale_date,
        EXTRACT(YEAR FROM sale_date) AS year,
        EXTRACT(DOY FROM sale_date) AS day_of_year,
        daily_revenue,
        SUM(daily_revenue) OVER (
            PARTITION BY EXTRACT(YEAR FROM sale_date)
            ORDER BY sale_date
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS ytd_revenue
    FROM daily_sales
)
SELECT 
    curr.sale_date,
    curr.ytd_revenue AS current_ytd,
    prev.ytd_revenue AS prior_year_ytd,
    ROUND(100.0 * (curr.ytd_revenue - prev.ytd_revenue) / NULLIF(prev.ytd_revenue, 0), 2) AS ytd_growth_pct
FROM ytd_data curr
LEFT JOIN ytd_data prev 
    ON curr.day_of_year = prev.day_of_year 
    AND curr.year = prev.year + 1
WHERE curr.year = 2024
ORDER BY curr.sale_date;

Partition Granularity Trade-offs

Variations Beyond SUM

While SUM() is the most common, any aggregate function can compute cumulative values. Let's explore other aggregate-based running computations.

running_aggregates.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
-- Running COUNT: How many transactions so far?
SELECT 
    transaction_date,
    amount,
    COUNT(*) OVER (ORDER BY transaction_date ROWS UNBOUNDED PRECEDING) AS transaction_count
FROM transactions;
 
-- Running AVG: Average of all values up to current point
SELECT 
    sale_date,
    daily_revenue,
    ROUND(AVG(daily_revenue) OVER (
        ORDER BY sale_date 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ), 2) AS running_avg_revenue
FROM daily_sales;
 
-- Running MAX: Maximum value seen so far (useful for high-water marks)
SELECT 
    trade_date,
    close_price,
    MAX(close_price) OVER (
        ORDER BY trade_date 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS all_time_high
FROM stock_prices
WHERE symbol = 'AAPL';
 
-- Running MIN: Minimum value seen so far (useful for low-water marks)
SELECT 
    trade_date,
    close_price,
    MIN(close_price) OVER (
        ORDER BY trade_date 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS all_time_low
FROM stock_prices
WHERE symbol = 'AAPL';

Common Running Aggregate Patterns
Function	Running Interpretation	Typical Use Case
SUM()	Cumulative total	Revenue YTD, account balance
COUNT()	Cumulative count	User acquisition, event sequence number
AVG()	Expanding average	Performance trending, quality metrics
MAX()	High-water mark	Peak price, record values
MIN()	Low-water mark	Trough values, minimum achieved
STDDEV()	Expanding volatility	Risk metrics, variance tracking

advanced_running_patterns.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
-- Combining running aggregates for analysis
SELECT 
    trade_date,
    close_price,
    SUM(volume) OVER w AS cumulative_volume,
    MAX(close_price) OVER w AS running_high,
    MIN(close_price) OVER w AS running_low,
    close_price - MIN(close_price) OVER w AS gain_from_low,
    MAX(close_price) OVER w - close_price AS drop_from_high
FROM stock_prices
WHERE symbol = 'AAPL' AND trade_date >= '2024-01-01'
WINDOW w AS (ORDER BY trade_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW);
 
-- Running count with filter (conditional accumulation)
SELECT 
    event_date,
    event_type,
    COUNT(*) FILTER (WHERE event_type = 'purchase') OVER (
        ORDER BY event_date 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS purchases_so_far,
    COUNT(*) FILTER (WHERE event_type = 'return') OVER (
        ORDER BY event_date 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS returns_so_far
FROM customer_events;
 
-- Note: FILTER syntax is PostgreSQL. For other databases, use CASE:
SELECT 
    event_date,
    event_type,
    SUM(CASE WHEN event_type = 'purchase' THEN 1 ELSE 0 END) OVER (
        ORDER BY event_date 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS purchases_so_far
FROM customer_events;

Running DISTINCT COUNT

Practical Business Applications

Running totals are ubiquitous in business analytics. Let's explore the most important real-world applications with production-ready SQL patterns.

Financial Statements and Account Balances:

Running totals are the foundation of financial reporting—computing balances, YTD figures, and period-over-period comparisons.

financial_reporting.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
-- Bank account running balance
SELECT 
    transaction_date,
    transaction_type,
    CASE 
        WHEN transaction_type IN ('deposit', 'credit') THEN amount
        ELSE -amount 
    END AS signed_amount,
    SUM(CASE 
        WHEN transaction_type IN ('deposit', 'credit') THEN amount
        ELSE -amount 
    END) OVER (
        PARTITION BY account_id
        ORDER BY transaction_date, transaction_id
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_balance
FROM bank_transactions
WHERE account_id = 'ACC-001'
ORDER BY transaction_date, transaction_id;
 
-- Monthly revenue with YTD and full-year comparison
WITH monthly_revenue AS (
    SELECT 
        DATE_TRUNC('month', sale_date) AS month,
        EXTRACT(YEAR FROM sale_date) AS year,
        SUM(amount) AS monthly_revenue
    FROM sales
    GROUP BY 1, 2
)
SELECT 
    month,
    monthly_revenue,
    SUM(monthly_revenue) OVER (
        PARTITION BY year
        ORDER BY month
    ) AS ytd_revenue,
    SUM(monthly_revenue) OVER (
        PARTITION BY year
    ) AS full_year_revenue,
    ROUND(100.0 * SUM(monthly_revenue) OVER (
        PARTITION BY year ORDER BY month
    ) / SUM(monthly_revenue) OVER (
        PARTITION BY year
    ), 1) AS pct_of_annual_target
FROM monthly_revenue
ORDER BY month;

Resetting and Conditional Running Totals

resetting_running_totals.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
-- Pattern: Reset running total on specific event type
-- Use case: Cumulative order value since last return
 
WITH event_flags AS (
    SELECT 
        *,
        CASE WHEN event_type = 'return' THEN 1 ELSE 0 END AS reset_flag
    FROM customer_events
),
reset_groups AS (
    SELECT 
        *,
        SUM(reset_flag) OVER (
            PARTITION BY customer_id 
            ORDER BY event_date 
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS reset_group
    FROM event_flags
)
SELECT 
    customer_id,
    event_date,
    event_type,
    amount,
    SUM(amount) OVER (
        PARTITION BY customer_id, reset_group
        ORDER BY event_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total_since_reset
FROM reset_groups
ORDER BY customer_id, event_date;
 
-- Pattern: Running total that resets when exceeding threshold
-- Use case: Batch accumulation (fill boxes until 10kg, then start new box)
WITH ordered_items AS (
    SELECT 
        *,
        ROW_NUMBER() OVER (ORDER BY item_id) AS rn
    FROM items
)
-- This requires recursive CTE for true threshold-based resetting
-- Simplified: using partition-based approach
SELECT 
    item_id,
    weight,
    -- This is a simplified simulation; true threshold reset needs procedural logic
    SUM(weight) OVER (ORDER BY item_id ROWS UNBOUNDED PRECEDING) AS naive_cumulative
FROM ordered_items;

Conditional Accumulation:

Use CASE expressions or FILTER clauses to accumulate only when conditions are met:

conditional_accumulation.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
-- Running total of only positive amounts (exclude refunds/returns)
SELECT 
    transaction_date,
    transaction_type,
    amount,
    SUM(CASE WHEN amount > 0 THEN amount ELSE 0 END) OVER (
        ORDER BY transaction_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS positive_running_total,
    SUM(CASE WHEN amount < 0 THEN amount ELSE 0 END) OVER (
        ORDER BY transaction_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS negative_running_total
FROM transactions;
 
-- PostgreSQL: Using FILTER for cleaner syntax
SELECT 
    transaction_date,
    transaction_type,
    amount,
    SUM(amount) FILTER (WHERE amount > 0) OVER (
        ORDER BY transaction_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS positive_running_total
FROM transactions;
 
-- Multiple simultaneous running totals by category
SELECT 
    sale_date,
    product_category,
    sale_amount,
    SUM(sale_amount) OVER (ORDER BY sale_date ROWS UNBOUNDED PRECEDING) AS total_running,
    SUM(CASE WHEN product_category = 'Electronics' THEN sale_amount ELSE 0 END) 
        OVER (ORDER BY sale_date ROWS UNBOUNDED PRECEDING) AS electronics_running,
    SUM(CASE WHEN product_category = 'Clothing' THEN sale_amount ELSE 0 END) 
        OVER (ORDER BY sale_date ROWS UNBOUNDED PRECEDING) AS clothing_running
FROM sales;

True Threshold-Based Reset Is Hard

Performance Optimization

Running totals can be computed efficiently, but understanding the performance characteristics helps you scale to large datasets.

Algorithmic Complexity:

Optimal algorithm: O(n) — single pass with accumulation Sort overhead: O(n log n) if data isn't pre-sorted Total: O(n log n) dominated by sort, or O(n) with index-provided order

Key insight: The running total computation itself is linear—the only expensive part is sorting. If you have an index that provides the ORDER BY sequence, no sort is needed.

performance_optimization.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
-- Index recommendation for running totals
-- For: SUM(amount) OVER (PARTITION BY customer_id ORDER BY date)
CREATE INDEX idx_customer_date ON transactions(customer_id, transaction_date);
-- This index provides both partition ordering and sort ordering
 
-- Check execution plan for sort operations
EXPLAIN ANALYZE
SELECT 
    customer_id,
    transaction_date,
    amount,
    SUM(amount) OVER (PARTITION BY customer_id ORDER BY transaction_date) AS running_total
FROM transactions
WHERE customer_id IN (1001, 1002, 1003);
 
-- Optimization: Filter before windowing
-- GOOD: Filter first (reduces window function input)
SELECT 
    customer_id,
    transaction_date,
    amount,
    SUM(amount) OVER (PARTITION BY customer_id ORDER BY transaction_date) AS running_total
FROM transactions
WHERE transaction_date >= '2024-01-01'  -- Filter reduces rows before windowing
ORDER BY customer_id, transaction_date;
 
-- Optimization: Consolidate window specifications
SELECT 
    date,
    amount,
    SUM(amount) OVER w AS running_sum,
    COUNT(*) OVER w AS running_count,
    AVG(amount) OVER w AS running_avg
FROM transactions
WINDOW w AS (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW);
-- All three aggregates computed in single pass over same window

Performance Best Practices

•Index ORDER BY columns: Create indexes matching (partition_columns, order_column) to eliminate sort operations
•Filter early: Apply WHERE clauses before windowing to reduce input size
•Consolidate windows: Use WINDOW clause to define once and reference multiple times
•Avoid unnecessary ordering: If only care about final totals, use GROUP BY instead of window functions
•Partition appropriately: Smaller partitions process faster but create more overhead; balance based on data
•Materialize if reused: If the same running totals are queried repeatedly, consider materialized views

Performance Impact Factors
Factor	Impact	Mitigation
Missing ORDER BY index	Full table sort O(n log n)	Create covering index
Large partition sizes	More memory per partition	Filter or use finer partitions
Multiple window definitions	Potential multiple sorts	Consolidate with WINDOW clause
RANGE vs ROWS	RANGE may be slower	Use ROWS when exact ordering exists
No partition parallelism	Sequential processing	Ensure partitions enable parallel execution

Check EXPLAIN Output

Summary: Running Totals

Running totals transform how we analyze sequential and time-based data, enabling cumulative analysis that would otherwise require complex procedural code. Let's consolidate the essential concepts.

Key Takeaways

•Running totals accumulate values: SUM(value) OVER (ORDER BY col) computes the total of all values from the start up to the current row.
•Default frame is RANGE: With ORDER BY, the default frame groups ties together. Use ROWS for strict physical row ordering.
•PARTITION BY creates independent totals: Running totals reset at partition boundaries—essential for per-entity or per-period analysis.
•Any aggregate works: COUNT, AVG, MAX, MIN, and STDDEV all support running computations with the same pattern.
•Business applications are ubiquitous: Financial balances, YTD revenue, inventory levels, user growth, and quota tracking all rely on running totals.
•Conditional accumulation uses CASE: Selectively include values with CASE expressions or FILTER clauses.
•Performance is typically O(n log n): Dominated by sorting; indexes matching ORDER BY eliminate sort overhead.
•Threshold-based reset is complex: Resetting when a running total hits a threshold requires recursive CTEs or procedural code.

What's Next:

Page Complete

3 / 5