Loading content...
While LAG and LEAD access values at fixed offsets from the current row, many analytical questions require accessing values at the boundaries of a window frame: What was the first transaction of the day? What's the most recent salary for each employee? How does the current stock price compare to the opening price?
FIRST_VALUE and LAST_VALUE answer these questions directly. They retrieve the first or last value of an expression within the current window frame, enabling comparisons against reference points, baseline analysis, and aggregation-style lookups while preserving individual row context.
These functions are conceptually similar to aggregate functions like MIN and MAX, but with a crucial difference: they return specific row values (the first or last by some ordering) rather than mathematical extremes. The 'first' value might not be the minimum; it's simply the value that appears first in the ordered sequence.
Understanding FIRST_VALUE and LAST_VALUE requires understanding window frames—the subset of partition rows that each function considers. This interplay between function and frame is where the real power (and complexity) lies.
By the end of this page, you will fully understand FIRST_VALUE and LAST_VALUE syntax and execution semantics, master window frame specifications and their critical impact on results, handle NULL values with IGNORE NULLS and RESPECT NULLS, apply these functions to real-world scenarios from baseline comparisons to session analysis, and optimize performance for production workloads.
FIRST_VALUE and LAST_VALUE are positional window functions that extract values from specific positions within a window frame. Let's build a solid mental model before diving into syntax.
The Window Frame Model:
For each row being processed, the window frame defines which other rows are 'visible' to the window function. By default, the frame includes all rows from the partition start up to and including the current row—not the entire partition.
FIRST_VALUE: Returns the value of the expression evaluated at the first row of the current window frame.
LAST_VALUE: Returns the value of the expression evaluated at the last row of the current window frame.
Critical Insight: The window frame boundaries determine what 'first' and 'last' mean. With the default frame (RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW), LAST_VALUE always returns the current row's value—probably not what you expect!
This is the single most common mistake with LAST_VALUE. Due to default frame semantics, LAST_VALUE often returns the current row's value rather than the partition's last row. You almost always need to specify ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING to get the true last value.
Comparison with Related Functions:
| Function | Access Pattern | Relationship to Current Row | Frame-Dependent? |
|---|---|---|---|
| LAG | Fixed offset backward | Relative (n rows before) | No (ignores frame) |
| LEAD | Fixed offset forward | Relative (n rows after) | No (ignores frame) |
| FIRST_VALUE | First in frame | Absolute (frame start) | Yes (frame determines 'first') |
| LAST_VALUE | Last in frame | Absolute (frame end) | Yes (frame determines 'last') |
| NTH_VALUE | Nth position in frame | Absolute (position n) | Yes (frame determines scope) |
Let's examine the complete syntax for FIRST_VALUE and LAST_VALUE, including all optional clauses.
123456789101112131415161718192021222324252627
-- Full FIRST_VALUE syntaxFIRST_VALUE(expression [IGNORE NULLS | RESPECT NULLS])OVER ( [PARTITION BY partition_expression, ...] ORDER BY sort_expression [ASC | DESC] [NULLS FIRST | NULLS LAST], ... [frame_clause]) -- Full LAST_VALUE syntaxLAST_VALUE(expression [IGNORE NULLS | RESPECT NULLS])OVER ( [PARTITION BY partition_expression, ...] ORDER BY sort_expression [ASC | DESC] [NULLS FIRST | NULLS LAST], ... [frame_clause]) -- Frame clause options:frame_clause ::= { ROWS | RANGE | GROUPS } BETWEEN frame_start AND frame_end frame_bound ::= UNBOUNDED PRECEDING | offset PRECEDING | CURRENT ROW | offset FOLLOWING | UNBOUNDED FOLLOWING| Parameter | Required | Default | Description |
|---|---|---|---|
| expression | Yes | — | Column or expression whose boundary value to retrieve |
| IGNORE/RESPECT NULLS | No | RESPECT NULLS | Whether to skip NULL values when finding first/last |
| PARTITION BY | No | Entire result set | Divides rows into independent groups |
| ORDER BY | Required* | — | Defines sequence within partition (*needed for meaningful results) |
| Frame clause | No | RANGE UNBOUNDED PRECEDING | Boundaries for 'first' and 'last' determination |
12345678910111213141516171819202122232425262728293031323334353637383940414243444546
-- Example table: daily_stock_prices-- | trade_date | symbol | open_price | close_price |-- |------------|--------|------------|-------------|-- | 2024-01-02 | AAPL | 185.00 | 186.50 |-- | 2024-01-03 | AAPL | 186.75 | 184.25 |-- | 2024-01-04 | AAPL | 184.00 | 187.00 |-- | 2024-01-05 | AAPL | 187.25 | 188.50 | -- Get first close price (start of period) for comparisonSELECT trade_date, symbol, close_price, FIRST_VALUE(close_price) OVER ( PARTITION BY symbol ORDER BY trade_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS first_close_price, close_price - FIRST_VALUE(close_price) OVER ( PARTITION BY symbol ORDER BY trade_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS change_from_startFROM daily_stock_prices; -- Result:-- | trade_date | symbol | close_price | first_close_price | change_from_start |-- |------------|--------|-------------|-------------------|-------------------|-- | 2024-01-02 | AAPL | 186.50 | 186.50 | 0.00 |-- | 2024-01-03 | AAPL | 184.25 | 186.50 | -2.25 |-- | 2024-01-04 | AAPL | 187.00 | 186.50 | 0.50 |-- | 2024-01-05 | AAPL | 188.50 | 186.50 | 2.00 | -- Get last (most recent) value - NOTE the frame specification!SELECT trade_date, symbol, close_price, LAST_VALUE(close_price) OVER ( PARTITION BY symbol ORDER BY trade_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING -- Critical! ) AS latest_close_priceFROM daily_stock_prices; -- Result: latest_close_price is 188.50 for ALL rowsFor FIRST_VALUE, the default frame usually works (since 'first' from start-to-current equals first of entire partition). For LAST_VALUE, always explicitly specify ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING to avoid the trap of getting the current row's value.
Window frames are central to FIRST_VALUE and LAST_VALUE behavior. Let's build deep understanding of frame types and their effects.
Frame Types:
ROWS: Physical row-based boundaries. ROWS BETWEEN 3 PRECEDING AND 1 FOLLOWING means exactly 3 rows before to 1 row after the current row.
RANGE: Logical value-based boundaries. RANGE BETWEEN 7 PRECEDING AND CURRENT ROW includes all rows within a value difference of 7 from the current row's order column.
GROUPS: Group-based boundaries (PostgreSQL 11+, others). Groups rows with equal ORDER BY values into logical groups, then counts groups for frame bounds.
12345678910111213141516171819202122232425262728293031323334353637
-- ROWS vs RANGE demonstration-- Table: daily_sales (date, revenue)-- | date | revenue |-- |------------|---------|-- | 2024-01-01 | 100 |-- | 2024-01-02 | 150 |-- | 2024-01-02 | 150 | <-- duplicate date!-- | 2024-01-03 | 200 |-- | 2024-01-04 | 175 | -- With ROWS: Physical row boundaries (ignores duplicate dates)SELECT date, revenue, FIRST_VALUE(revenue) OVER ( ORDER BY date ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING ) AS first_in_3_rows, LAST_VALUE(revenue) OVER ( ORDER BY date ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING ) AS last_in_3_rowsFROM daily_sales; -- With RANGE: Logical value boundaries (groups duplicate dates)SELECT date, revenue, FIRST_VALUE(revenue) OVER ( ORDER BY date RANGE BETWEEN INTERVAL '1 day' PRECEDING AND INTERVAL '1 day' FOLLOWING ) AS first_in_date_range, LAST_VALUE(revenue) OVER ( ORDER BY date RANGE BETWEEN INTERVAL '1 day' PRECEDING AND INTERVAL '1 day' FOLLOWING ) AS last_in_date_rangeFROM daily_sales;Frame Boundary Options:
| Boundary | Meaning | Example Context |
|---|---|---|
| UNBOUNDED PRECEDING | Start of partition | All rows from beginning |
| n PRECEDING | n rows/values before current | Last 7 days |
| CURRENT ROW | Current row (ROWS) or value group (RANGE) | Up to/from now |
| n FOLLOWING | n rows/values after current | Next 3 rows |
| UNBOUNDED FOLLOWING | End of partition | All rows to end |
12345678910111213141516171819
-- Common frame patterns for FIRST_VALUE / LAST_VALUE -- Entire partition (most common for LAST_VALUE)ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING -- Rolling window: last 7 valuesROWS BETWEEN 6 PRECEDING AND CURRENT ROW -- Forward-looking: current plus next 3ROWS BETWEEN CURRENT ROW AND 3 FOLLOWING -- Symmetric window: 5 rows centered on currentROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING -- Date-based rolling (requires RANGE)RANGE BETWEEN INTERVAL '30 days' PRECEDING AND CURRENT ROW -- Full history up to but not including currentROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDINGWhen ORDER BY is present: default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. When ORDER BY is absent: default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING (entire partition). This difference often surprises developers.
Real-world data often contains NULL values. FIRST_VALUE and LAST_VALUE offer explicit control over how NULLs are handled through the IGNORE NULLS and RESPECT NULLS clauses.
RESPECT NULLS (Default): If the first/last row in the frame has a NULL value for the expression, NULL is returned.
IGNORE NULLS: Skip NULL values when determining first/last, returning the first/last non-NULL value instead.
12345678910111213141516171819202122232425262728293031323334353637
-- Example: Sensor readings with gaps (NULLs)-- | reading_time | sensor | value |-- |--------------|--------|-------|-- | 10:00 | A | 100 |-- | 10:01 | A | NULL | <-- sensor offline-- | 10:02 | A | NULL | <-- still offline-- | 10:03 | A | 105 |-- | 10:04 | A | 108 | -- RESPECT NULLS (default): Returns NULL if boundary value is NULLSELECT reading_time, sensor, value, FIRST_VALUE(value) OVER ( PARTITION BY sensor ORDER BY reading_time DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS latest_value_respect_nulls -- Would be 108 (first when DESC)FROM sensor_readings; -- IGNORE NULLS: Skip NULLs to find first non-NULLSELECT reading_time, sensor, value, FIRST_VALUE(value IGNORE NULLS) OVER ( PARTITION BY sensor ORDER BY reading_time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS first_non_null_value, -- 100 (skips subsequent NULLs) LAST_VALUE(value IGNORE NULLS) OVER ( PARTITION BY sensor ORDER BY reading_time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS last_non_null_value -- 108 (skips intervening NULLs)FROM sensor_readings;Use Cases for IGNORE NULLS:
123456789101112131415161718192021222324252627282930313233343536373839404142434445
-- Pattern: Forward-fill NULLs with last known valueSELECT event_time, customer_id, subscription_tier, FIRST_VALUE(subscription_tier IGNORE NULLS) OVER ( PARTITION BY customer_id ORDER BY event_time ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS effective_tier -- Carries forward last known tier through NULLsFROM customer_events; -- Pattern: Find first and last actual transactions (skipping failed/null amounts)SELECT customer_id, FIRST_VALUE(transaction_date IGNORE NULLS) OVER ( PARTITION BY customer_id ORDER BY transaction_date ) AS first_successful_transaction, LAST_VALUE(amount IGNORE NULLS) OVER ( PARTITION BY customer_id ORDER BY transaction_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS most_recent_valid_amountFROM transactionsWHERE customer_id = 12345; -- Pattern: Database-agnostic fallback (if IGNORE NULLS not supported)-- Use subquery or COALESCE with LAGWITH non_null_values AS ( SELECT reading_time, sensor, value, ROW_NUMBER() OVER (PARTITION BY sensor ORDER BY reading_time) AS rn FROM sensor_readings WHERE value IS NOT NULL)SELECT s.reading_time, s.sensor, s.value, (SELECT value FROM non_null_values n WHERE n.sensor = s.sensor AND n.rn = 1) AS first_non_nullFROM sensor_readings s;IGNORE NULLS is part of the SQL standard but not universally implemented. Oracle, PostgreSQL 14+, and SQL Server 2022+ support it. For older systems, you may need workarounds using filtered subqueries or CTEs with ROW_NUMBER.
FIRST_VALUE and LAST_VALUE unlock powerful analytical patterns. Let's explore the most impactful real-world applications.
Comparing to a Fixed Reference Point:
Comparing current values to a baseline (first value of period, last known state, etc.) is foundational for financial and operational analytics.
1234567891011121314151617181920212223242526272829303132333435
-- Financial: Compare stock price to period openSELECT trade_date, symbol, close_price, FIRST_VALUE(open_price) OVER ( PARTITION BY symbol, DATE_TRUNC('month', trade_date) ORDER BY trade_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS month_open, ROUND(100.0 * (close_price - FIRST_VALUE(open_price) OVER ( PARTITION BY symbol, DATE_TRUNC('month', trade_date) ORDER BY trade_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING )) / FIRST_VALUE(open_price) OVER ( PARTITION BY symbol, DATE_TRUNC('month', trade_date) ORDER BY trade_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ), 2) AS pct_change_from_month_openFROM daily_stock_prices; -- Operations: Compare current inventory to day-start levelSELECT inventory_snapshot_time, product_id, quantity, FIRST_VALUE(quantity) OVER ( PARTITION BY product_id, DATE(inventory_snapshot_time) ORDER BY inventory_snapshot_time ) AS day_start_quantity, quantity - FIRST_VALUE(quantity) OVER ( PARTITION BY product_id, DATE(inventory_snapshot_time) ORDER BY inventory_snapshot_time ) AS quantity_change_todayFROM inventory_snapshots;NTH_VALUE generalizes FIRST_VALUE and LAST_VALUE, allowing you to access any positional value within the window frame—not just the first or last.
NTH_VALUE(expression, n) returns the value of the expression evaluated at the nth row of the window frame, where n is a positive integer.
Conceptually:
FIRST_VALUE(expr) ≈ NTH_VALUE(expr, 1)123456789101112131415161718192021222324252627282930313233343536373839
-- NTH_VALUE syntaxNTH_VALUE(expression, n [FROM FIRST | FROM LAST] [IGNORE NULLS | RESPECT NULLS])OVER ( [PARTITION BY partition_expression, ...] ORDER BY sort_expression [ASC | DESC] [frame_clause]) -- Get second-place product in each categorySELECT product_name, category, sales_amount, NTH_VALUE(product_name, 2) OVER ( PARTITION BY category ORDER BY sales_amount DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS second_place_productFROM product_sales; -- Get median value (middle of 5 in rolling window)SELECT date, value, NTH_VALUE(value, 3) OVER ( ORDER BY date ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING -- 5-row window ) AS middle_valueFROM daily_values; -- FROM LAST: Count from end of frame (PostgreSQL, Oracle)SELECT date, value, NTH_VALUE(value, 2 FROM LAST) OVER ( ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS second_to_last_valueFROM daily_values;Use Cases for NTH_VALUE:
If you request NTH_VALUE(expr, 5) but the frame only has 3 rows, NULL is returned. Unlike FIRST_VALUE (which always exists if the frame is non-empty), NTH_VALUE may return NULL for sparse frames.
FIRST_VALUE and LAST_VALUE are generally efficient, but understanding their performance characteristics ensures optimal query design.
Execution Characteristics:
1. Sorting Overhead: Like all window functions with ORDER BY, these require sorted input. Without a matching index, the database performs a full sort—O(n log n).
2. Frame Processing: The frame specification affects memory and computation:
3. Multiple Window Functions: When multiple window functions share the same OVER clause, the database often computes them together in a single pass.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748
-- Optimization: Consolidate window specifications-- GOOD: Define window once for multiple functionsSELECT date, value, FIRST_VALUE(value) OVER w AS first_val, LAST_VALUE(value) OVER w AS last_val, AVG(value) OVER w AS avg_valFROM daily_valuesWINDOW w AS ( ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING); -- GOOD: Indexes that match window ORDER BYCREATE INDEX idx_stock_symbol_date ON stock_prices(symbol, trade_date);-- Now PARTITION BY symbol ORDER BY trade_date uses index efficiently -- GOOD: Filter before windowingWITH recent_data AS ( SELECT * FROM stock_prices WHERE trade_date >= CURRENT_DATE - INTERVAL '30 days')SELECT trade_date, symbol, FIRST_VALUE(close_price) OVER ( PARTITION BY symbol ORDER BY trade_date ) AS month_first_closeFROM recent_data; -- BE AWARE: UNBOUNDED FOLLOWING with large partitions-- This may require buffering entire partitionSELECT transaction_id, LAST_VALUE(amount) OVER ( ORDER BY transaction_time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING -- Needs full partition ) AS last_amount -- Consider: do we really need the last value for each row?FROM huge_transactions; -- Alternative: Compute last value once and joinWITH summary AS ( SELECT MAX(transaction_time) AS last_time FROM huge_transactions)SELECT t.*, (SELECT amount FROM huge_transactions WHERE transaction_time = s.last_time) AS last_amountFROM huge_transactions t, summary s;| Frame Type | Memory Usage | Optimization Potential | Notes |
|---|---|---|---|
| UNBOUNDED ... UNBOUNDED | High (full partition) | Limited | May materialize entire partition |
| UNBOUNDED PRECEDING only | Low (streaming) | Good | Can process without lookahead |
| Fixed ROWS (e.g., 7 PRECEDING) | Fixed (window size) | Excellent | Sliding window optimization |
| RANGE with expressions | Variable | Database-dependent | May require complex comparison |
Before using LAST_VALUE with UNBOUNDED FOLLOWING on huge datasets, ask: 'Does every row really need the last value?' Often, you can compute boundary values once in a CTE or subquery and join, rather than computing per row.
FIRST_VALUE and LAST_VALUE provide powerful access to boundary values within window frames, enabling sophisticated baseline comparisons and state lookups. Let's consolidate the essential knowledge.
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING to get the true last value.What's Next:
Having mastered FIRST_VALUE and LAST_VALUE, we'll explore running totals—cumulative aggregations that maintain a running sum, count, or other aggregate as you move through ordered data. Running totals combine the power of aggregation with the row-preservation of window functions.
You now understand FIRST_VALUE and LAST_VALUE in depth—from frame semantics to NULL handling to performance optimization. These functions enable powerful baseline comparisons and boundary lookups that would otherwise require complex self-joins or subqueries. Practice with your own data to solidify these concepts.