SQL Ranking Functions

Level: Intermediate

Duration: 60 mins

Topic: Ranking Functions

Ranking Use Cases

Bringing Ranking Functions to Life

You've mastered the mechanics of ROW_NUMBER(), RANK(), DENSE_RANK(), and NTILE(). Now it's time to see how they work together in real production systems.

Ranking functions are workhorses of analytical SQL—appearing in everything from e-commerce product recommendations to financial reporting, from gaming leaderboards to HR performance reviews. Understanding common patterns helps you recognize opportunities to apply these tools when you encounter similar problems.

This final page synthesizes everything you've learned into practical, production-ready patterns that you'll encounter repeatedly throughout your career as a database professional.

What You Will Master

By the end of this page, you will know how to combine ranking functions for complex scenarios, handle real-world edge cases, choose the right function for each problem, and implement ranking queries that perform well at scale.

Top-N Queries: The Universal Pattern

Top-N queries—finding the best, worst, most recent, or most significant items—are the most common ranking application. Each ranking function produces subtly different results.

The Core Pattern:

Compute ranks with a window function
Filter in a subquery/CTE to the desired range
Choose the ranking function based on tie semantics

Pattern Comparison: Top 3 Products Per Category

Top-N Pattern Variants
-- Sample data: products with sales figures
CREATE TABLE products (
    product_id INT,
    product_name VARCHAR(100),
    category VARCHAR(50),
    monthly_sales DECIMAL(12,2)
);
 
-- Pattern A: Exactly N rows per group (use ROW_NUMBER)
-- "Give me exactly 3 products per category"
WITH ranked AS (
    SELECT 
        *,
        ROW_NUMBER() OVER (
            PARTITION BY category 
            ORDER BY monthly_sales DESC
        ) AS rn
    FROM products
)
SELECT * FROM ranked WHERE rn <= 3;
-- Result: At most 3 rows per category (fewer if the category has fewer products); ties broken arbitrarily
 
-- Pattern B: All tied for top N positions (use RANK)
-- "Give me everyone who placed 1st, 2nd, or 3rd"
WITH ranked AS (
    SELECT 
        *,
        RANK() OVER (
            PARTITION BY category 
            ORDER BY monthly_sales DESC
        ) AS rnk
    FROM products
)
SELECT * FROM ranked WHERE rnk <= 3;
-- Result: May be more than 3 per category if ties exist
-- No row gets rank 3 when ties fill ranks 1-2 with three or more rows (e.g. ranks 1, 1, 1, 4)
 
-- Pattern C: Top N distinct values (use DENSE_RANK)
-- "Give me products with the 3 highest sales amounts"
WITH ranked AS (
    SELECT 
        *,
        DENSE_RANK() OVER (
            PARTITION BY category 
            ORDER BY monthly_sales DESC
        ) AS drnk
    FROM products
)
SELECT * FROM ranked WHERE drnk <= 3;
-- Result: All products sharing top 3 sales values
-- Guarantees ranks 1, 2, 3 all present (if enough distinct values)

Choosing the Right Top-N Function
Requirement	Use	Behavior with Ties
Exactly N rows	ROW_NUMBER()	Arbitrarily picks among ties
N-th place positions	RANK()	Includes all ties; may exceed N rows
N-th highest values	DENSE_RANK()	Includes all ties; ranks stay consecutive (no gaps after ties)

Interview Question: 'Second Highest Salary'

The classic interview question 'Find the second highest salary' is ambiguous! If two people tie for first, what's 'second'? ROW_NUMBER picks one arbitrarily as 2nd. RANK says there is no 2nd (next is 3rd). DENSE_RANK says the next lower salary is 2nd. Always clarify the requirement!
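
To make the ambiguity concrete, the sketch below runs all three functions over a small salary list in which two people tie for the top salary. The salaries data and its column names are hypothetical, and the VALUES-based CTE is written in the form PostgreSQL and SQLite accept.

```sql
-- Hypothetical data: two employees tie for the highest salary
WITH salaries (employee_name, salary) AS (
    VALUES ('Ana', 300), ('Ben', 300), ('Cho', 200), ('Dev', 100)
),
ranked AS (
    SELECT
        employee_name,
        salary,
        ROW_NUMBER() OVER (ORDER BY salary DESC, employee_name) AS rn,
        RANK()       OVER (ORDER BY salary DESC)                AS rnk,
        DENSE_RANK() OVER (ORDER BY salary DESC)                AS drnk
    FROM salaries
)
SELECT employee_name, salary, rn, rnk, drnk
FROM ranked
ORDER BY rn;
-- employee_name | salary | rn | rnk | drnk
-- Ana           |    300 |  1 |   1 |    1
-- Ben           |    300 |  2 |   1 |    1
-- Cho           |    200 |  3 |   3 |    2
-- Dev           |    100 |  4 |   4 |    3
```

Filtering on rn = 2 returns Ben (still 300), rnk = 2 returns no row at all, and drnk = 2 returns Cho (200), which is usually what the interviewer means.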

Pagination and Keyset Navigation

Pagination displays results in navigable chunks—essential for web applications, reports, and any large dataset display.

Traditional OFFSET Pagination:

While LIMIT/OFFSET is simpler, it has problems:

Performance degrades for high page numbers (must scan and discard rows)
Results can shift during navigation (INSERT/DELETE between pages)
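
For reference, the OFFSET form that the list above describes looks like the following sketch. The table contents are hypothetical and the page size is shrunk to 2 so the result is easy to check by hand; LIMIT ... OFFSET is the PostgreSQL, MySQL, and SQLite spelling.

```sql
-- Hypothetical data: five products, newest first after sorting
WITH products (product_id, created_at) AS (
    VALUES (1, '2024-01-05'), (2, '2024-01-04'), (3, '2024-01-03'),
           (4, '2024-01-02'), (5, '2024-01-01')
)
SELECT product_id, created_at
FROM products
ORDER BY created_at DESC, product_id
LIMIT 2 OFFSET 2;   -- page 2 with 2 rows per page: skip (2 - 1) * 2 rows
-- Returns product_id 3 and 4; the first 2 rows are still produced and then discarded
```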

ROW_NUMBER Pagination:

ROW_NUMBER Pagination
-- Page-based navigation with ROW_NUMBER
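-- Parameters: @page_number (1-based), @page_size (rows per page)
-- DECLARE with @variables is T-SQL (SQL Server); other engines use bind parameters instead
DECLARE @page_number INT = 5;
DECLARE @page_size INT = 20;
 
WITH numbered AS (
    SELECT 
        *,
        ROW_NUMBER() OVER (
            ORDER BY created_at DESC, product_id
        ) AS row_num
    FROM products
    WHERE is_active = 1  -- BIT column in SQL Server; write is_active = true in PostgreSQL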
)
SELECT *
FROM numbered
WHERE row_num > (@page_number - 1) * @page_size
  AND row_num <= @page_number * @page_size
ORDER BY row_num;
 
-- For page 5 with 20 per page:
-- row_num > 80 AND row_num <= 100 → rows 81-100

Keyset Pagination (Cursor-Based):

For better performance and stability, track the last-seen key:

Keyset Pagination
-- First page: no cursor
SELECT 
    product_id,
    product_name,
    created_at,
    ROW_NUMBER() OVER (ORDER BY created_at DESC, product_id DESC) AS row_num
FROM products
WHERE is_active = true
ORDER BY created_at DESC, product_id DESC
LIMIT 20;
 
-- Subsequent pages: use cursor (last seen values)
-- After seeing created_at='2024-01-15 10:00:00', product_id=12345
-- The row comparison is only correct because both sort keys run in the same direction (DESC)
SELECT 
    product_id,
    product_name,
    created_at
FROM products
WHERE is_active = true
  AND (created_at, product_id) < ('2024-01-15 10:00:00', 12345)
ORDER BY created_at DESC, product_id DESC
LIMIT 20;
 
-- Benefits:
-- 1. Uses index efficiently (no scanning skipped rows)
-- 2. Stable results (new inserts don't shift pages)
-- 3. Cost per page stays roughly constant regardless of page number, given an index on the sort keys

When to Use Each Approach

Use OFFSET for simple internal tools or small datasets. Use ROW_NUMBER for page-number-based UI (pages 1, 2, 3...). Use keyset/cursor pagination for high-performance APIs, infinite scroll, or large datasets where page numbers aren't needed.
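
Keyset pagination only pays off when an index matches the sort keys. The following is a sketch in PostgreSQL syntax under assumed names (demo_products and the index name are hypothetical). Both sort keys run in the same direction here, which is what allows a single row-value comparison to serve as the cursor predicate.

```sql
-- Hypothetical table for illustration
CREATE TABLE demo_products (
    product_id   INT PRIMARY KEY,
    product_name VARCHAR(100),
    created_at   TIMESTAMP NOT NULL,
    is_active    BOOLEAN NOT NULL DEFAULT true
);

-- Index whose column order and direction match the ORDER BY of the page query
CREATE INDEX idx_demo_products_keyset
    ON demo_products (created_at DESC, product_id DESC)
    WHERE is_active;  -- partial index: only active rows are ever paged

-- Next page after the cursor (created_at, product_id) = ('2024-01-15 10:00:00', 12345)
SELECT product_id, product_name, created_at
FROM demo_products
WHERE is_active
  AND (created_at, product_id) < (TIMESTAMP '2024-01-15 10:00:00', 12345)
ORDER BY created_at DESC, product_id DESC
LIMIT 20;
```

If the sort keys run in different directions (for example created_at DESC, product_id ASC), the row comparison no longer matches the ordering, and the predicate has to be spelled out as created_at < :ts OR (created_at = :ts AND product_id > :id).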

Deduplication: Eliminating Unwanted Duplicates

Real-world data often contains duplicates that need elimination based on business rules. ROW_NUMBER() excels at selecting which row to keep.

Pattern: Keep Most Recent Record

Deduplication Patterns
-- Pattern 1: Keep most recent order per customer
WITH ranked AS (
    SELECT 
        *,
        ROW_NUMBER() OVER (
            PARTITION BY customer_id 
            ORDER BY order_date DESC, order_id DESC  -- order_id breaks same-date ties repeatably
        ) AS recency_rank
    FROM orders
)
SELECT * FROM ranked WHERE recency_rank = 1;
 
-- Pattern 2: Keep order with highest value per customer
WITH ranked AS (
    SELECT 
        *,
        ROW_NUMBER() OVER (
            PARTITION BY customer_id 
            ORDER BY order_total DESC, order_id DESC
        ) AS value_rank
    FROM orders
)
SELECT * FROM ranked WHERE value_rank = 1;
 
-- Pattern 3: Delete duplicates, keeping one
-- First, identify duplicates
WITH to_delete AS (
    SELECT 
        order_id,
        ROW_NUMBER() OVER (
            PARTITION BY customer_id, product_id, order_date 
            ORDER BY order_id
        ) AS dup_num
    FROM orders
)
DELETE FROM orders
WHERE order_id IN (
    SELECT order_id FROM to_delete WHERE dup_num > 1
);
 
-- Pattern 4: Merge duplicates with priority
-- Keep the record with most complete information
WITH ranked AS (
    SELECT 
        *,
        ROW_NUMBER() OVER (
            PARTITION BY email 
            ORDER BY 
                CASE WHEN phone IS NOT NULL THEN 0 ELSE 1 END,
                CASE WHEN address IS NOT NULL THEN 0 ELSE 1 END,
                created_at DESC
        ) AS completeness_rank
    FROM contacts
)
SELECT * FROM ranked WHERE completeness_rank = 1;

Deduplication DELETE Caution

Always test deduplication DELETE queries with SELECT first! Use transactions and verify the count of rows to be deleted. An incorrect PARTITION BY clause can delete essential records.
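
One way to follow that advice is sketched below in PostgreSQL syntax. The orders_demo table and its rows are hypothetical stand-ins for the orders table: the doomed rows are counted first, the DELETE runs inside a transaction, and the transaction is committed only if the reported row count matches.

```sql
-- Hypothetical demo table with three duplicate rows
CREATE TEMP TABLE orders_demo (
    order_id    INT PRIMARY KEY,
    customer_id INT,
    product_id  INT,
    order_date  DATE
);
INSERT INTO orders_demo VALUES
    (1, 10, 500, '2024-01-01'),
    (2, 10, 500, '2024-01-01'),  -- duplicate of order 1
    (3, 10, 501, '2024-01-01'),
    (4, 11, 500, '2024-01-02'),
    (5, 11, 500, '2024-01-02'),  -- duplicate of order 4
    (6, 11, 500, '2024-01-02');  -- duplicate of order 4

BEGIN;

-- Step 1: preview how many rows the DELETE would remove (3 for this data)
WITH to_delete AS (
    SELECT
        order_id,
        ROW_NUMBER() OVER (
            PARTITION BY customer_id, product_id, order_date
            ORDER BY order_id
        ) AS dup_num
    FROM orders_demo
)
SELECT COUNT(*) AS rows_to_delete
FROM to_delete
WHERE dup_num > 1;

-- Step 2: run the same logic as a DELETE and compare the reported row count with step 1
WITH to_delete AS (
    SELECT
        order_id,
        ROW_NUMBER() OVER (
            PARTITION BY customer_id, product_id, order_date
            ORDER BY order_id
        ) AS dup_num
    FROM orders_demo
)
DELETE FROM orders_demo
WHERE order_id IN (SELECT order_id FROM to_delete WHERE dup_num > 1);

-- Step 3: keep the change only if the counts agree
ROLLBACK;  -- replace with COMMIT once the numbers have been verified
```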

Gap and Island Detection

The 'islands and gaps' technique uses ROW_NUMBER() to identify consecutive sequences (islands) and breaks in sequences (gaps).

The Core Insight:

For consecutive values, subtracting ROW_NUMBER() produces a constant. When there's a gap, the constant changes, creating a natural grouping key.

Pattern: Finding Consecutive Login Streaks

Islands and Gaps
-- Find consecutive login day streaks for each user
WITH daily_logins AS (
    SELECT DISTINCT 
        user_id, 
        DATE(login_time) AS login_date
    FROM user_sessions
),
with_row_num AS (
    SELECT 
        user_id,
        login_date,
        login_date - INTERVAL ROW_NUMBER() OVER (
            PARTITION BY user_id 
            ORDER BY login_date
        ) DAY AS island_id
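        -- For consecutive dates, (date - row_num) is constant
        -- When there's a gap, the constant changes
        -- INTERVAL ... DAY is MySQL syntax; in PostgreSQL, write
        -- login_date - CAST(ROW_NUMBER() OVER (...) AS INT) instead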
    FROM daily_logins
)
SELECT 
    user_id,
    MIN(login_date) AS streak_start,
    MAX(login_date) AS streak_end,
    COUNT(*) AS streak_length
FROM with_row_num
GROUP BY user_id, island_id
HAVING COUNT(*) >= 3  -- Only streaks of 3+ days
ORDER BY user_id, streak_start;
 
-- Example:
-- Dates: Jan 1, Jan 2, Jan 3, Jan 5, Jan 6
-- Row nums: 1, 2, 3, 4, 5
-- Date - row_num: Dec 31, Dec 31, Dec 31, Jan 1, Jan 1
-- Two islands: [Jan 1-3] and [Jan 5-6]

Pattern: Finding Gaps in Sequences

Gap Detection
-- Find gaps in invoice number sequence
WITH with_next AS (
    SELECT 
        invoice_number,
        LEAD(invoice_number) OVER (ORDER BY invoice_number) AS next_invoice
    FROM invoices
)
SELECT 
    invoice_number AS gap_start_after,
    next_invoice AS gap_ends_before,
    next_invoice - invoice_number - 1 AS missing_count
FROM with_next
WHERE next_invoice - invoice_number > 1
ORDER BY invoice_number;
 
-- Alternative using ROW_NUMBER: compare each row's expected position with its actual offset
-- Note: this flags every row that comes after the first gap, not only the gap boundaries
WITH numbered AS (
    SELECT 
        invoice_number,
        ROW_NUMBER() OVER (ORDER BY invoice_number) AS expected_position,
        invoice_number - (SELECT MIN(invoice_number) FROM invoices) AS actual_offset
    FROM invoices
)
SELECT *
FROM numbered
WHERE actual_offset != expected_position - 1;  -- Rows preceded by at least one gap

The Island Trick

The 'sequence - row_number = constant for consecutive values' trick works for dates, integers, and any regularly-spaced sequence. It's one of the most elegant applications of window functions.
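
The same idea applied to plain integers, as a small sketch with made-up seat numbers (the seats data is hypothetical):

```sql
-- Hypothetical data: occupied seat numbers with two breaks in the sequence
WITH seats (seat_no) AS (
    VALUES (1), (2), (3), (7), (8), (10)
),
keyed AS (
    SELECT
        seat_no,
        seat_no - ROW_NUMBER() OVER (ORDER BY seat_no) AS island_id
    FROM seats
)
SELECT
    MIN(seat_no) AS island_start,
    MAX(seat_no) AS island_end,
    COUNT(*)     AS island_size
FROM keyed
GROUP BY island_id
ORDER BY island_start;
-- seat_no - row_number: 0, 0, 0, 3, 3, 4
-- island_start | island_end | island_size
--            1 |          3 |           3
--            7 |          8 |           2
--           10 |         10 |           1
```

The trick assumes the values are distinct, which is why the login example deduplicates dates with SELECT DISTINCT before numbering them.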

Comparative and Multi-Dimensional Rankings

Complex analytics often require ranking from multiple perspectives simultaneously.

Pattern: Rank Across Multiple Dimensions

Multi-Dimensional Ranking
-- Rank employees by salary: globally, by department, by tenure cohort
SELECT 
    employee_name,
    department,
    hire_year,
    salary,
    -- Global salary rank
    RANK() OVER (
        ORDER BY salary DESC
    ) AS global_rank,
    -- Within department
    RANK() OVER (
        PARTITION BY department 
        ORDER BY salary DESC
    ) AS dept_rank,
    -- Within tenure cohort
    RANK() OVER (
        PARTITION BY hire_year 
        ORDER BY salary DESC
    ) AS cohort_rank,
    -- Percentile positions
    NTILE(100) OVER (ORDER BY salary) AS global_percentile,
    NTILE(100) OVER (
        PARTITION BY department 
        ORDER BY salary
    ) AS dept_percentile
FROM employees
ORDER BY global_rank;

Pattern: Change in Rank Over Time

Rank Change Tracking
-- Track how product rankings change month-over-month
WITH monthly_ranks AS (
    SELECT 
        product_id,
        product_name,
        sale_month,
        revenue,
        RANK() OVER (
            PARTITION BY sale_month 
            ORDER BY revenue DESC
        ) AS monthly_rank
    FROM monthly_product_sales
),
rank_changes AS (
    SELECT 
        product_id,
        product_name,
        sale_month,
        revenue,
        monthly_rank,
        LAG(monthly_rank) OVER (
            PARTITION BY product_id 
            ORDER BY sale_month
        ) AS prev_rank,
        LAG(revenue) OVER (
            PARTITION BY product_id 
            ORDER BY sale_month
        ) AS prev_revenue
    FROM monthly_ranks
)
SELECT 
    product_name,
    sale_month,
    revenue,
    monthly_rank,
    prev_rank,
    COALESCE(prev_rank - monthly_rank, 0) AS rank_change,
    CASE 
        WHEN prev_rank IS NULL THEN 'New Entry'
        WHEN monthly_rank < prev_rank THEN '↑ Up ' || (prev_rank - monthly_rank)
        WHEN monthly_rank > prev_rank THEN '↓ Down ' || (monthly_rank - prev_rank)
        ELSE '→ Unchanged'
    END AS trend,
    ROUND((revenue - prev_revenue) / NULLIF(prev_revenue, 0) * 100, 1) AS pct_change
FROM rank_changes
WHERE sale_month = '2024-01'
ORDER BY monthly_rank;

Leaderboards and Competition Systems

Gaming, sports, and competitive systems require sophisticated ranking displays. This is where understanding RANK vs DENSE_RANK matters most.

Full Leaderboard System

Competition Leaderboard
-- Full competition leaderboard with all ranking variants
WITH player_stats AS (
    SELECT 
        player_id,
        player_name,
        region,
        total_score,
        games_played,
        ROUND(total_score::numeric / NULLIF(games_played, 0), 2) AS avg_score
    FROM players
    WHERE season = 'current'
      AND games_played >= 10  -- Minimum games to qualify
)
SELECT 
    player_name,
    region,
    total_score,
    games_played,
    avg_score,
    -- Primary display rank (handles ties correctly)
    RANK() OVER (ORDER BY total_score DESC) AS overall_rank,
    -- Regional rankings
    RANK() OVER (
        PARTITION BY region 
        ORDER BY total_score DESC
    ) AS regional_rank,
    -- Unique position for tiebreaker display
    ROW_NUMBER() OVER (
        ORDER BY total_score DESC, avg_score DESC, player_id
    ) AS tiebreak_position,
    -- Percentile for personal achievement
    NTILE(100) OVER (ORDER BY total_score) AS percentile,
    -- Medal assignment
    CASE RANK() OVER (ORDER BY total_score DESC)
        WHEN 1 THEN '🥇 Gold'
        WHEN 2 THEN '🥈 Silver'
        WHEN 3 THEN '🥉 Bronze'
        ELSE NULL
    END AS medal
FROM player_stats
ORDER BY overall_rank, tiebreak_position
LIMIT 100;

Finding a Player's Rank and Context

Player Context Query
-- Find specific player's rank and nearby competitors
-- Aliases avoid the bare word rank, which is reserved in MySQL 8.0
WITH all_ranked AS (
    SELECT 
        player_id,
        player_name,
        total_score,
        RANK() OVER (ORDER BY total_score DESC) AS player_rank,
        ROW_NUMBER() OVER (ORDER BY total_score DESC, player_id) AS list_position,
        COUNT(*) OVER () AS total_players
    FROM players
    WHERE season = 'current'
)
SELECT 
    player_name,
    total_score,
    player_rank,
    list_position,
    total_players,
    ROUND(100.0 - (player_rank - 1) * 100.0 / total_players, 1) AS percentile_rank
FROM all_ranked
WHERE list_position BETWEEN (
    SELECT list_position - 2 FROM all_ranked WHERE player_id = 12345
) AND (
    SELECT list_position + 2 FROM all_ranked WHERE player_id = 12345
)
ORDER BY list_position;
-- Shows the player plus up to 2 competitors above and 2 below

Real-Time Leaderboards

For high-traffic real-time leaderboards, ranking queries can be expensive. Consider materialized views refreshed periodically, Redis sorted sets for live updates, or approximate rankings with periodic exact calculations.
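
As one illustration of the materialized-view option, here is a sketch in PostgreSQL syntax. The table, view, and index names and the refresh cadence are assumptions made for the example; only the season and total_score columns mirror the leaderboard queries.

```sql
-- Hypothetical base table
CREATE TABLE players_demo (
    player_id   INT PRIMARY KEY,
    player_name VARCHAR(100),
    season      VARCHAR(20),
    total_score INT
);

-- Precompute the ranking once instead of on every page view
CREATE MATERIALIZED VIEW leaderboard_snapshot AS
SELECT
    player_id,
    player_name,
    total_score,
    RANK() OVER (ORDER BY total_score DESC) AS overall_rank
FROM players_demo
WHERE season = 'current';

-- REFRESH ... CONCURRENTLY requires a unique index on the view
CREATE UNIQUE INDEX idx_leaderboard_snapshot_player
    ON leaderboard_snapshot (player_id);

-- Run on a schedule (for example once a minute); readers are not blocked meanwhile
REFRESH MATERIALIZED VIEW CONCURRENTLY leaderboard_snapshot;

-- Reads become a cheap lookup against the snapshot
SELECT player_name, total_score, overall_rank
FROM leaderboard_snapshot
ORDER BY overall_rank, player_id
LIMIT 100;
```

The trade-off is staleness: ranks are only as fresh as the last refresh, which is often acceptable for a public leaderboard but not for end-of-match results.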

Performance Optimization Strategies

Ranking queries can be expensive. Here are key optimization strategies.

Strategy 1: Optimal Indexing

Index Optimization
-- For query: RANK() OVER (PARTITION BY department ORDER BY salary DESC)
-- Create compound index matching partition + order:
CREATE INDEX idx_dept_salary ON employees (department, salary DESC);
 
-- To let the query be answered from the index alone, consider a covering index
-- (INCLUDE is PostgreSQL 11+ and SQL Server syntax):
CREATE INDEX idx_rankings ON employees (
    department, 
    salary DESC
) INCLUDE (employee_name, hire_date);
 
-- Check if index is being used:
EXPLAIN ANALYZE
SELECT 
    department,
    employee_name,
    salary,
    RANK() OVER (PARTITION BY department ORDER BY salary DESC)
FROM employees;

Strategy 2: Limit Early with CTEs

Early Filtering
-- DON'T: Rank everything, then filter
SELECT * FROM (
    SELECT *, RANK() OVER (ORDER BY score DESC) AS rnk
    FROM all_players  -- 10 million rows
) ranked
WHERE rnk <= 100;
-- Must rank all 10 million rows!
 
-- DO: Pre-filter if possible
WITH top_scorers AS (
    SELECT * FROM all_players
    ORDER BY score DESC
    LIMIT 1000  -- Approximate, generous buffer
)
SELECT *, RANK() OVER (ORDER BY score DESC) AS rnk
FROM top_scorers
ORDER BY rnk
LIMIT 100;
-- Only ranks 1000 rows, much faster
 
-- For partitioned rankings, use LATERAL joins (PostgreSQL syntax):
SELECT ranked.*  -- ranked already carries every employee column plus rnk
FROM (SELECT DISTINCT department FROM employees) d
CROSS JOIN LATERAL (
    SELECT *, RANK() OVER (ORDER BY salary DESC) AS rnk
    FROM employees e
    WHERE e.department = d.department
    ORDER BY salary DESC
    LIMIT 10
) ranked;

Strategy 3: Shared Window Definitions

Window Clause Reuse
-- Use WINDOW clause to define once, reference multiple times
SELECT 
    employee_name,
    department,
    salary,
    ROW_NUMBER() OVER dept_salary AS row_num,
    RANK() OVER dept_salary AS rank_val,
    DENSE_RANK() OVER dept_salary AS dense_rank_val,
    NTILE(4) OVER dept_salary AS quartile
FROM employees
WINDOW dept_salary AS (
    PARTITION BY department 
    ORDER BY salary DESC
);
-- Database can potentially share sorting work across all functions

Execution Plan Analysis

Always examine execution plans for ranking queries. Look for 'Sort' operations that could be eliminated with indexes, and watch for large 'WindowAgg' operations that indicate memory-intensive processing. Large partitions without indexes are common performance killers.
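
A sketch of that workflow in PostgreSQL, using a throwaway table filled by generate_series. The table name, index name, and row count are assumptions for illustration. Running the same EXPLAIN before and after creating the index makes the difference visible; the exact plan depends on server version, settings, and statistics, so no output is shown here.

```sql
-- Hypothetical test data: 200,000 employees spread over 20 departments
CREATE TEMP TABLE employees_demo AS
SELECT
    g AS employee_id,
    'dept_' || (g % 20) AS department,
    (random() * 100000)::numeric(12,2) AS salary
FROM generate_series(1, 200000) AS g;

ANALYZE employees_demo;

-- Without a matching index, expect a Sort node feeding the WindowAgg node
EXPLAIN (ANALYZE, BUFFERS)
SELECT department, employee_id, salary,
       RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank
FROM employees_demo;

-- Index matching PARTITION BY + ORDER BY
CREATE INDEX idx_employees_demo_dept_salary
    ON employees_demo (department, salary DESC);

-- With the index in place, the planner may read rows pre-sorted and drop the Sort node
EXPLAIN (ANALYZE, BUFFERS)
SELECT department, employee_id, salary,
       RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank
FROM employees_demo;
```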

Real-World Case Studies

Let's examine complete solutions to real business problems.

Case Study 1: E-Commerce Product Recommendations

Problem: Show 'customers also bought' recommendations—top 3 products frequently purchased together, excluding the current product.

Product Recommendations
-- Find top 3 products frequently bought with product_id = 101
WITH co_purchases AS (
    -- Find all orders containing product 101
    SELECT DISTINCT o1.order_id
    FROM order_items o1
    WHERE o1.product_id = 101
),
co_products AS (
    -- Find other products in those orders
    SELECT 
        oi.product_id,
        p.product_name,
        COUNT(*) AS co_purchase_count
    FROM order_items oi
    JOIN co_purchases cp ON oi.order_id = cp.order_id
    JOIN products p ON oi.product_id = p.product_id
    WHERE oi.product_id != 101  -- Exclude the source product
    GROUP BY oi.product_id, p.product_name
),
ranked AS (
    SELECT 
        *,
        ROW_NUMBER() OVER (
            ORDER BY co_purchase_count DESC, product_id  -- product_id makes tie order repeatable
        ) AS rn
    FROM co_products
)
SELECT product_id, product_name, co_purchase_count
FROM ranked
WHERE rn <= 3;

Case Study 2: Sales Performance Tiering

Problem: Assign salespeople to performance tiers (Gold/Silver/Bronze) based on quarterly results, with top 20% as Gold, middle 50% as Silver, bottom 30% as Bronze.

Performance Tiering
-- Assign performance tiers using NTILE
WITH ranked AS (
    SELECT 
        salesperson_id,
        salesperson_name,
        region,
        quarterly_sales,
        NTILE(10) OVER (ORDER BY quarterly_sales DESC) AS decile
    FROM sales_performance
    WHERE quarter = '2024-Q1'
)
SELECT 
    salesperson_name,
    region,
    quarterly_sales,
    decile,
    CASE 
        WHEN decile <= 2 THEN 'Gold (Top 20%)'
        WHEN decile <= 7 THEN 'Silver (Middle 50%)'
        ELSE 'Bronze (Bottom 30%)'
    END AS performance_tier,
    CASE 
        WHEN decile <= 2 THEN quarterly_sales * 0.10  -- 10% bonus
        WHEN decile <= 7 THEN quarterly_sales * 0.05  -- 5% bonus
        ELSE 0
    END AS bonus_amount
FROM ranked
ORDER BY decile, quarterly_sales DESC;

Case Study 3: Session Time Analysis

Problem: For each user session, calculate the time spent and identify the longest sessions per user.

Session Analysis
-- Identify user sessions and find longest per user
WITH session_bounds AS (
    SELECT 
        user_id,
        event_time,
        -- Gap > 30 min starts new session
        CASE 
            WHEN event_time - LAG(event_time) OVER (
                PARTITION BY user_id ORDER BY event_time
            ) > INTERVAL '30 minutes'
            THEN 1 
            ELSE 0 
        END AS is_new_session
    FROM user_events
),
session_numbered AS (
    SELECT 
        user_id,
        event_time,
        SUM(is_new_session) OVER (
            PARTITION BY user_id 
            ORDER BY event_time
        ) + 1 AS session_number
    FROM session_bounds
),
session_stats AS (
    SELECT 
        user_id,
        session_number,
        MIN(event_time) AS session_start,
        MAX(event_time) AS session_end,
        EXTRACT(EPOCH FROM MAX(event_time) - MIN(event_time)) / 60 AS duration_minutes,
        COUNT(*) AS event_count
    FROM session_numbered
    GROUP BY user_id, session_number
),
ranked_sessions AS (
    SELECT 
        *,
        RANK() OVER (
            PARTITION BY user_id 
            ORDER BY duration_minutes DESC
        ) AS duration_rank
    FROM session_stats
)
SELECT 
    user_id,
    session_number,
    session_start,
    session_end,
    ROUND(duration_minutes::numeric, 1) AS duration_minutes,  -- cast needed where EXTRACT returns double precision (PostgreSQL < 14)
    event_count,
    duration_rank
FROM ranked_sessions
WHERE duration_rank <= 3  -- Top 3 longest sessions per user
ORDER BY user_id, duration_rank;

Module Summary: The Complete Ranking Toolkit

You've now mastered SQL ranking functions—from individual mechanics to complex real-world applications. Let's consolidate everything:

Complete Ranking Function Reference
Function	Ties	Gaps	Use Case
ROW_NUMBER()	Unique arbitrary	Never	Pagination, deduplication, exact counts
RANK()	Same rank	After ties	Competition positions, ordinal placement
DENSE_RANK()	Same rank	Never	Nth value queries, tier assignment
NTILE(n)	Arbitrary within tile	N/A	Percentiles, load balancing, bucketing

Module Key Takeaways

• ROW_NUMBER assigns unique sequential integers; use for pagination, deduplication, and when you need exactly N rows
• RANK assigns same rank to ties with gaps afterward; use for competition-style rankings where position matters
• DENSE_RANK assigns same rank to ties without gaps; use when you need the Nth highest value or consecutive tier numbers
• NTILE divides data into equal-count buckets; use for percentiles, quartiles, and balanced distribution
• Always specify ORDER BY for deterministic results; include unique tiebreakers for reproducibility
• Optimize with indexes matching PARTITION BY + ORDER BY columns to avoid expensive sorts
• Use CTEs to filter before ranking when possible; ranking all rows then filtering is wasteful

Decision Quick-Reference:

"Give me exactly 3" → ROW_NUMBER
"Who placed 1st, 2nd, 3rd?" → RANK
"What's the 3rd highest value?" → DENSE_RANK
"Divide into 4 equal groups" → NTILE(4)

With these four functions mastered, you can solve virtually any positional, comparative, or distribution problem in SQL. They form the foundation for advanced analytics, business intelligence, and data-driven decision making.
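
As a compact recap of the quick-reference above, the sketch below applies all four functions to one small, made-up score list (the scores data is hypothetical; the VALUES-based CTE runs as written in PostgreSQL and SQLite).

```sql
-- Hypothetical data with two ties: A/B at 100 and D/E at 80
WITH scores (player, score) AS (
    VALUES ('A', 100), ('B', 100), ('C', 90), ('D', 80), ('E', 80), ('F', 70)
)
SELECT
    player,
    score,
    ROW_NUMBER() OVER (ORDER BY score DESC, player) AS row_num,
    RANK()       OVER (ORDER BY score DESC)         AS rnk,
    DENSE_RANK() OVER (ORDER BY score DESC)         AS dense_rnk,
    NTILE(4)     OVER (ORDER BY score DESC, player) AS quartile
FROM scores
ORDER BY row_num;
-- player | score | row_num | rnk | dense_rnk | quartile
-- A      |   100 |       1 |   1 |         1 |        1
-- B      |   100 |       2 |   1 |         1 |        1
-- C      |    90 |       3 |   3 |         2 |        2
-- D      |    80 |       4 |   4 |         3 |        2
-- E      |    80 |       5 |   4 |         3 |        3
-- F      |    70 |       6 |   6 |         4 |        4
```

D and E share a score yet fall into different quartiles, because NTILE balances row counts and ignores ties. With 6 rows and 4 buckets, the first two buckets receive the extra rows.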

Module Complete: Ranking Functions

Congratulations! You have achieved comprehensive mastery of SQL ranking functions. You understand not just the mechanics of each function, but when to apply each one, how to combine them for complex scenarios, and how to optimize their performance. This knowledge will serve you throughout your career in database development and analytics.

5 / 5

Loading learning content...

Database Management SystemsRanking Functions

SQL Ranking Functions

LevelIntermediate

Duration60 mins

TopicRanking Functions

5 / 5

Ranking Use Cases

Bringing Ranking Functions to Life

You've mastered the mechanics of ROW_NUMBER(), RANK(), DENSE_RANK(), and NTILE(). Now it's time to see how they work together in real production systems.

This final page synthesizes everything you've learned into practical, production-ready patterns that you'll encounter repeatedly throughout your career as a database professional.

What You Will Master

Top-N Queries: The Universal Pattern

Top-N queries—finding the best, worst, most recent, or most significant items—are the most common ranking application. Each ranking function produces subtly different results.

The Core Pattern:

Compute ranks with a window function
Filter in a subquery/CTE to the desired range
Choose the ranking function based on tie semantics

Pattern Comparison: Top 3 Products Per Category

Top-N Pattern Variants
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
-- Sample data: products with sales figures
CREATE TABLE products (
    product_id INT,
    product_name VARCHAR(100),
    category VARCHAR(50),
    monthly_sales DECIMAL(12,2)
);
 
-- Pattern A: Exactly N rows per group (use ROW_NUMBER)
-- "Give me exactly 3 products per category"
WITH ranked AS (
    SELECT 
        *,
        ROW_NUMBER() OVER (
            PARTITION BY category 
            ORDER BY monthly_sales DESC
        ) AS rn
    FROM products
)
SELECT * FROM ranked WHERE rn <= 3;
-- Result: Exactly 3 rows per category (ties broken arbitrarily)
 
-- Pattern B: All tied for top N positions (use RANK)
-- "Give me everyone who placed 1st, 2nd, or 3rd"
WITH ranked AS (
    SELECT 
        *,
        RANK() OVER (
            PARTITION BY category 
            ORDER BY monthly_sales DESC
        ) AS rnk
    FROM products
)
SELECT * FROM ranked WHERE rnk <= 3;
-- Result: May be more than 3 per category if ties exist
-- Position 3 may be empty if 3+ tie for positions 1-2
 
-- Pattern C: Top N distinct values (use DENSE_RANK)
-- "Give me products with the 3 highest sales amounts"
WITH ranked AS (
    SELECT 
        *,
        DENSE_RANK() OVER (
            PARTITION BY category 
            ORDER BY monthly_sales DESC
        ) AS drnk
    FROM products
)
SELECT * FROM ranked WHERE drnk <= 3;
-- Result: All products sharing top 3 sales values
-- Guarantees ranks 1, 2, 3 all present (if enough distinct values)

Choosing the Right Top-N Function
Requirement	Use	Behavior with Ties
Exactly N rows	ROW_NUMBER()	Arbitrarily picks among ties
N-th place positions	RANK()	Includes all ties; may exceed N rows
N-th highest values	DENSE_RANK()	Includes all ties; gaps don't skip positions

Interview Question: 'Second Highest Salary'

Pagination and Keyset Navigation

Pagination displays results in navigable chunks—essential for web applications, reports, and any large dataset display.

Traditional OFFSET Pagination:

While LIMIT/OFFSET is simpler, it has problems:

Performance degrades for high page numbers (must scan and discard rows)
Results can shift during navigation (INSERT/DELETE between pages)

ROW_NUMBER Pagination:

ROW_NUMBER Pagination
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
-- Page-based navigation with ROW_NUMBER
-- Parameters: @page_number (1-based), @page_size (rows per page)
DECLARE @page_number INT = 5;
DECLARE @page_size INT = 20;
 
WITH numbered AS (
    SELECT 
        *,
        ROW_NUMBER() OVER (
            ORDER BY created_at DESC, product_id
        ) AS row_num
    FROM products
    WHERE is_active = true
)
SELECT *
FROM numbered
WHERE row_num > (@page_number - 1) * @page_size
  AND row_num <= @page_number * @page_size
ORDER BY row_num;
 
-- For page 5 with 20 per page:
-- row_num > 80 AND row_num <= 100 → rows 81-100

Keyset Pagination (Cursor-Based):

For better performance and stability, track the last-seen key:

Keyset Pagination
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
-- First page: no cursor
SELECT 
    product_id,
    product_name,
    created_at,
    ROW_NUMBER() OVER (ORDER BY created_at DESC, product_id)
FROM products
WHERE is_active = true
ORDER BY created_at DESC, product_id
LIMIT 20;
 
-- Subsequent pages: use cursor (last seen values)
-- After seeing created_at='2024-01-15 10:00:00', product_id=12345
SELECT 
    product_id,
    product_name,
    created_at
FROM products
WHERE is_active = true
  AND (created_at, product_id) < ('2024-01-15 10:00:00', 12345)
ORDER BY created_at DESC, product_id
LIMIT 20;
 
-- Benefits:
-- 1. Uses index efficiently (no scanning skipped rows)
-- 2. Stable results (new inserts don't shift pages)
-- 3. Constant performance regardless of page number

When to Use Each Approach

Deduplication: Eliminating Unwanted Duplicates

Real-world data often contains duplicates that need elimination based on business rules. ROW_NUMBER() excels at selecting which row to keep.

Pattern: Keep Most Recent Record

Deduplication Patterns
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
-- Pattern 1: Keep most recent order per customer
WITH ranked AS (
    SELECT 
        *,
        ROW_NUMBER() OVER (
            PARTITION BY customer_id 
            ORDER BY order_date DESC
        ) AS recency_rank
    FROM orders
)
SELECT * FROM ranked WHERE recency_rank = 1;
 
-- Pattern 2: Keep order with highest value per customer
WITH ranked AS (
    SELECT 
        *,
        ROW_NUMBER() OVER (
            PARTITION BY customer_id 
            ORDER BY order_total DESC, order_id DESC
        ) AS value_rank
    FROM orders
)
SELECT * FROM ranked WHERE value_rank = 1;
 
-- Pattern 3: Delete duplicates, keeping one
-- First, identify duplicates
WITH to_delete AS (
    SELECT 
        order_id,
        ROW_NUMBER() OVER (
            PARTITION BY customer_id, product_id, order_date 
            ORDER BY order_id
        ) AS dup_num
    FROM orders
)
DELETE FROM orders
WHERE order_id IN (
    SELECT order_id FROM to_delete WHERE dup_num > 1
);
 
-- Pattern 4: Merge duplicates with priority
-- Keep the record with most complete information
WITH ranked AS (
    SELECT 
        *,
        ROW_NUMBER() OVER (
            PARTITION BY email 
            ORDER BY 
                CASE WHEN phone IS NOT NULL THEN 0 ELSE 1 END,
                CASE WHEN address IS NOT NULL THEN 0 ELSE 1 END,
                created_at DESC
        ) AS completeness_rank
    FROM contacts
)
SELECT * FROM ranked WHERE completeness_rank = 1;

Deduplication DELETE Caution

Always test deduplication DELETE queries with SELECT first! Use transactions and verify the count of rows to be deleted. An incorrect PARTITION BY clause can delete essential records.

Gap and Island Detection

The 'islands and gaps' technique uses ROW_NUMBER() to identify consecutive sequences (islands) and breaks in sequences (gaps).

The Core Insight:

For consecutive values, subtracting ROW_NUMBER() produces a constant. When there's a gap, the constant changes, creating a natural grouping key.

Pattern: Finding Consecutive Login Streaks

Islands and Gaps
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
-- Find consecutive login day streaks for each user
WITH daily_logins AS (
    SELECT DISTINCT 
        user_id, 
        DATE(login_time) AS login_date
    FROM user_sessions
),
with_row_num AS (
    SELECT 
        user_id,
        login_date,
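        -- MySQL-style interval arithmetic. In PostgreSQL, write
        -- login_date - CAST(ROW_NUMBER() OVER (...) AS INTEGER) instead
        login_date - INTERVAL ROW_NUMBER() OVER (
            PARTITION BY user_id 
            ORDER BY login_date
        ) DAY AS island_id
        -- For consecutive dates, (date - row_num) is constant
        -- When there's a gap, the constant changes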
    FROM daily_logins
)
SELECT 
    user_id,
    MIN(login_date) AS streak_start,
    MAX(login_date) AS streak_end,
    COUNT(*) AS streak_length
FROM with_row_num
GROUP BY user_id, island_id
HAVING COUNT(*) >= 3  -- Only streaks of 3+ days
ORDER BY user_id, streak_start;
 
-- Example:
-- Dates: Jan 1, Jan 2, Jan 3, Jan 5, Jan 6
-- Row nums: 1, 2, 3, 4, 5
-- Date - row_num: Dec 31, Dec 31, Dec 31, Jan 1, Jan 1
-- Two islands: [Jan 1-3] and [Jan 5-6]

Pattern: Finding Gaps in Sequences

Gap Detection
-- Find gaps in invoice number sequence
WITH with_next AS (
    SELECT 
        invoice_number,
        LEAD(invoice_number) OVER (ORDER BY invoice_number) AS next_invoice
    FROM invoices
)
SELECT 
    invoice_number AS gap_start_after,
    next_invoice AS gap_ends_before,
    next_invoice - invoice_number - 1 AS missing_count
FROM with_next
WHERE next_invoice - invoice_number > 1
ORDER BY invoice_number;
 
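-- Alternative using ROW_NUMBER: compare each row's expected position with
-- its actual offset from the smallest invoice number
WITH numbered AS (
    SELECT 
        invoice_number,
        ROW_NUMBER() OVER (ORDER BY invoice_number) AS expected_position,
        invoice_number - (SELECT MIN(invoice_number) FROM invoices) AS actual_offset
    FROM invoices
)
SELECT *
FROM numbered
WHERE actual_offset != expected_position - 1;
-- Returns every row that sits after at least one gap, not only the first row
-- after each gap; actual_offset - (expected_position - 1) is the number of
-- invoice numbers missing before that row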
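
Once the gap boundaries are known, the missing numbers themselves can be listed. The sketch below assumes PostgreSQL, since it relies on generate_series and LATERAL; the inline invoices data and the missing_number column name are hypothetical.

```sql
-- Hypothetical inline data: 1003-1004 and 1007-1008 are missing
WITH invoices(invoice_number) AS (
    VALUES (1001), (1002), (1005), (1006), (1009)
),
with_next AS (
    SELECT
        invoice_number,
        LEAD(invoice_number) OVER (ORDER BY invoice_number) AS next_invoice
    FROM invoices
)
SELECT missing.missing_number
FROM with_next w
-- Expand each gap into one row per missing invoice number
CROSS JOIN LATERAL generate_series(
    w.invoice_number + 1,
    w.next_invoice - 1
) AS missing(missing_number)
WHERE w.next_invoice - w.invoice_number > 1
ORDER BY missing.missing_number;

-- Result: 1003, 1004, 1007, 1008
```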

The Island Trick

The 'sequence - row_number = constant for consecutive values' trick works for dates, integers, and any regularly-spaced sequence. It's one of the most elegant applications of window functions.
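
For a sequence whose step is not 1, one option is to divide by the step first so that the unit-step trick applies again. This is a sketch under the assumption that every value is an exact multiple of the step; the readings data and column names are hypothetical, and the syntax is PostgreSQL-style.

```sql
-- Hypothetical readings expected every 10 units; 130 and 140 are missing
WITH readings(slot) AS (
    VALUES (100), (110), (120), (150), (160)
),
keyed AS (
    SELECT
        slot,
        -- Integer division by the step (10) maps the slots to 10, 11, 12, 15, 16
        slot / 10 - ROW_NUMBER() OVER (ORDER BY slot) AS island_key
    FROM readings
)
SELECT
    MIN(slot) AS island_start,
    MAX(slot) AS island_end,
    COUNT(*)  AS readings_in_island
FROM keyed
GROUP BY island_key
ORDER BY island_start;

-- island_key: 9, 9, 9, 11, 11
-- Result: (100, 120, 3) and (150, 160, 2)
```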

Comparative and Multi-Dimensional Rankings

Complex analytics often require ranking from multiple perspectives simultaneously.

Pattern: Rank Across Multiple Dimensions

Multi-Dimensional Ranking
-- Rank employees by salary: globally, by department, by tenure cohort
SELECT 
    employee_name,
    department,
    hire_year,
    salary,
    -- Global salary rank
    RANK() OVER (
        ORDER BY salary DESC
    ) AS global_rank,
    -- Within department
    RANK() OVER (
        PARTITION BY department 
        ORDER BY salary DESC
    ) AS dept_rank,
    -- Within tenure cohort
    RANK() OVER (
        PARTITION BY hire_year 
        ORDER BY salary DESC
    ) AS cohort_rank,
    -- Percentile positions
    NTILE(100) OVER (ORDER BY salary) AS global_percentile,
    NTILE(100) OVER (
        PARTITION BY department 
        ORDER BY salary
    ) AS dept_percentile
FROM employees
ORDER BY global_rank;

Pattern: Change in Rank Over Time

Rank Change Tracking
-- Track how product rankings change month-over-month
WITH monthly_ranks AS (
    SELECT 
        product_id,
        product_name,
        sale_month,
        revenue,
        RANK() OVER (
            PARTITION BY sale_month 
            ORDER BY revenue DESC
        ) AS monthly_rank
    FROM monthly_product_sales
),
rank_changes AS (
    SELECT 
        product_id,
        product_name,
        sale_month,
        revenue,
        monthly_rank,
        LAG(monthly_rank) OVER (
            PARTITION BY product_id 
            ORDER BY sale_month
        ) AS prev_rank,
        LAG(revenue) OVER (
            PARTITION BY product_id 
            ORDER BY sale_month
        ) AS prev_revenue
    FROM monthly_ranks
)
SELECT 
    product_name,
    sale_month,
    revenue,
    monthly_rank,
    prev_rank,
    COALESCE(prev_rank - monthly_rank, 0) AS rank_change,
    CASE 
        WHEN prev_rank IS NULL THEN 'New Entry'
        WHEN monthly_rank < prev_rank THEN '↑ Up ' || (prev_rank - monthly_rank)
        WHEN monthly_rank > prev_rank THEN '↓ Down ' || (monthly_rank - prev_rank)
        ELSE '→ Unchanged'
    END AS trend,
    ROUND((revenue - prev_revenue) / NULLIF(prev_revenue, 0) * 100, 1) AS pct_change
FROM rank_changes
WHERE sale_month = '2024-01'
ORDER BY monthly_rank;

Leaderboards and Competition Systems

Gaming, sports, and competitive systems require sophisticated ranking displays. This is where understanding RANK vs DENSE_RANK matters most.
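
A small tied leaderboard shows the difference. The sketch below uses hypothetical inline scores and PostgreSQL-style syntax; the player names and column aliases are made up for illustration.

```sql
-- Hypothetical scores: Ben and Cy are tied for second place
WITH scores(player_name, total_score) AS (
    VALUES ('Ava', 980), ('Ben', 950), ('Cy', 950), ('Dee', 900)
)
SELECT
    player_name,
    total_score,
    RANK()       OVER (ORDER BY total_score DESC) AS rank_val,
    DENSE_RANK() OVER (ORDER BY total_score DESC) AS dense_rank_val,
    -- player_name acts as a tiebreaker so the row numbers are reproducible
    ROW_NUMBER() OVER (ORDER BY total_score DESC, player_name) AS row_num
FROM scores
ORDER BY row_num;

-- player_name | total_score | rank_val | dense_rank_val | row_num
-- Ava         | 980         | 1        | 1              | 1
-- Ben         | 950         | 2        | 2              | 2
-- Cy          | 950         | 2        | 2              | 3
-- Dee         | 900         | 4        | 3              | 4
```

With RANK, Dee is shown in 4th place because three players scored higher or tied above her; with DENSE_RANK she would be shown as 3rd, which suits tier numbering better than competition placement.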

Full Leaderboard System

Competition Leaderboard
-- Full competition leaderboard with all ranking variants
WITH player_stats AS (
    SELECT 
        player_id,
        player_name,
        region,
        total_score,
        games_played,
        ROUND(total_score::numeric / NULLIF(games_played, 0), 2) AS avg_score
    FROM players
    WHERE season = 'current'
      AND games_played >= 10  -- Minimum games to qualify
)
SELECT 
    player_name,
    region,
    total_score,
    games_played,
    avg_score,
    -- Primary display rank (handles ties correctly)
    RANK() OVER (ORDER BY total_score DESC) AS overall_rank,
    -- Regional rankings
    RANK() OVER (
        PARTITION BY region 
        ORDER BY total_score DESC
    ) AS regional_rank,
    -- Unique position for tiebreaker display
    ROW_NUMBER() OVER (
        ORDER BY total_score DESC, avg_score DESC, player_id
    ) AS tiebreak_position,
    -- Percentile for personal achievement
    NTILE(100) OVER (ORDER BY total_score) AS percentile,
    -- Medal assignment
    CASE RANK() OVER (ORDER BY total_score DESC)
        WHEN 1 THEN '🥇 Gold'
        WHEN 2 THEN '🥈 Silver'
        WHEN 3 THEN '🥉 Bronze'
        ELSE NULL
    END AS medal
FROM player_stats
ORDER BY overall_rank, tiebreak_position
LIMIT 100;

Finding a Player's Rank and Context

Player Context Query
-- Find specific player's rank and nearby competitors
WITH all_ranked AS (
    SELECT 
        player_id,
        player_name,
        total_score,
        RANK() OVER (ORDER BY total_score DESC) AS rank,
        ROW_NUMBER() OVER (ORDER BY total_score DESC, player_id) AS position,
        COUNT(*) OVER () AS total_players
    FROM players
    WHERE season = 'current'
)
SELECT 
    player_name,
    total_score,
    rank,
    position,
    total_players,
    ROUND(100.0 - (rank - 1) * 100.0 / total_players, 1) AS percentile_rank
FROM all_ranked
WHERE position BETWEEN (
    SELECT position - 2 FROM all_ranked WHERE player_id = 12345
) AND (
    SELECT position + 2 FROM all_ranked WHERE player_id = 12345
)
ORDER BY position;
-- Shows player and 2 competitors above and below

Real-Time Leaderboards

Performance Optimization Strategies

Ranking queries can be expensive. Here are key optimization strategies.

Strategy 1: Optimal Indexing

Index Optimization
-- For query: RANK() OVER (PARTITION BY department ORDER BY salary DESC)
-- Create compound index matching partition + order:
CREATE INDEX idx_dept_salary ON employees (department, salary DESC);
 
-- For multiple rankings in same query, consider covering indexes:
CREATE INDEX idx_rankings ON employees (
    department, 
    salary DESC
) INCLUDE (employee_name, hire_date);
 
-- Check if index is being used:
EXPLAIN ANALYZE
SELECT 
    department,
    employee_name,
    salary,
    RANK() OVER (PARTITION BY department ORDER BY salary DESC)
FROM employees;

Strategy 2: Limit Early with CTEs

Early Filtering
-- DON'T: Rank everything, then filter
SELECT * FROM (
    SELECT *, RANK() OVER (ORDER BY score DESC) AS rnk
    FROM all_players  -- 10 million rows
) ranked
WHERE rnk <= 100;
-- Must rank all 10 million rows!
 
-- DO: Pre-filter if possible
WITH top_scorers AS (
    SELECT * FROM all_players
    ORDER BY score DESC
    LIMIT 1000  -- Approximate, generous buffer
)
SELECT *, RANK() OVER (ORDER BY score DESC) AS rnk
FROM top_scorers
ORDER BY rnk
LIMIT 100;
-- Only ranks 1000 rows, much faster
 
-- For partitioned rankings, use LATERAL joins:
SELECT ranked.*
FROM (SELECT DISTINCT department FROM employees) d
CROSS JOIN LATERAL (
    SELECT *, RANK() OVER (ORDER BY salary DESC) AS rnk
    FROM employees e
    WHERE e.department = d.department
    ORDER BY salary DESC
    LIMIT 10
) ranked;

Strategy 3: Shared Window Definitions

Window Clause Reuse
-- Use WINDOW clause to define once, reference multiple times
SELECT 
    employee_name,
    department,
    salary,
    ROW_NUMBER() OVER dept_salary AS row_num,
    RANK() OVER dept_salary AS rank_val,
    DENSE_RANK() OVER dept_salary AS dense_rank_val,
    NTILE(4) OVER dept_salary AS quartile
FROM employees
WINDOW dept_salary AS (
    PARTITION BY department 
    ORDER BY salary DESC
);
-- Database can potentially share sorting work across all functions

Execution Plan Analysis

Real-World Case Studies

Let's examine complete solutions to real business problems.

Case Study 1: E-Commerce Product Recommendations

Problem: Show 'customers also bought' recommendations—top 3 products frequently purchased together, excluding the current product.

Product Recommendations
-- Find top 3 products frequently bought with product_id = 101
WITH co_purchases AS (
    -- Find all orders containing product 101
    SELECT DISTINCT o1.order_id
    FROM order_items o1
    WHERE o1.product_id = 101
),
co_products AS (
    -- Find other products in those orders
    SELECT 
        oi.product_id,
        p.product_name,
        COUNT(*) AS co_purchase_count
    FROM order_items oi
    JOIN co_purchases cp ON oi.order_id = cp.order_id
    JOIN products p ON oi.product_id = p.product_id
    WHERE oi.product_id != 101  -- Exclude the source product
    GROUP BY oi.product_id, p.product_name
),
ranked AS (
    SELECT 
        *,
        -- product_id breaks ties so the top 3 are the same on every run
        ROW_NUMBER() OVER (ORDER BY co_purchase_count DESC, product_id) AS rank
    FROM co_products
)
SELECT product_id, product_name, co_purchase_count
FROM ranked
WHERE rank <= 3;

Case Study 2: Sales Performance Tiering

Problem: Assign salespeople to performance tiers (Gold/Silver/Bronze) based on quarterly results, with top 20% as Gold, middle 50% as Silver, bottom 30% as Bronze.

Performance Tiering
-- Assign performance tiers using NTILE
WITH ranked AS (
    SELECT 
        salesperson_id,
        salesperson_name,
        region,
        quarterly_sales,
        NTILE(10) OVER (ORDER BY quarterly_sales DESC) AS decile
    FROM sales_performance
    WHERE quarter = '2024-Q1'
)
SELECT 
    salesperson_name,
    region,
    quarterly_sales,
    decile,
    CASE 
        WHEN decile <= 2 THEN 'Gold (Top 20%)'
        WHEN decile <= 7 THEN 'Silver (Middle 50%)'
        ELSE 'Bronze (Bottom 30%)'
    END AS performance_tier,
    CASE 
        WHEN decile <= 2 THEN quarterly_sales * 0.10  -- 10% bonus
        WHEN decile <= 7 THEN quarterly_sales * 0.05  -- 5% bonus
        ELSE 0
    END AS bonus_amount
FROM ranked
ORDER BY decile, quarterly_sales DESC;
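
NTILE produces equal-sized buckets only when the row count divides evenly, so the tier percentages are approximate for small teams. The following sketch assumes PostgreSQL (it uses generate_series) and a hypothetical team of 23 salespeople with made-up sales figures; it counts how many people land in each tier.

```sql
-- Hypothetical team of 23 salespeople with distinct sales figures
WITH reps AS (
    SELECT n AS salesperson_id, n * 1000 AS quarterly_sales
    FROM generate_series(1, 23) AS g(n)
),
ranked AS (
    SELECT
        salesperson_id,
        NTILE(10) OVER (ORDER BY quarterly_sales DESC) AS decile
    FROM reps
)
SELECT
    CASE
        WHEN decile <= 2 THEN 'Gold'
        WHEN decile <= 7 THEN 'Silver'
        ELSE 'Bronze'
    END AS performance_tier,
    COUNT(*) AS headcount,
    -- Share of the whole team that ended up in this tier
    ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER (), 1) AS pct_of_team
FROM ranked
GROUP BY 1
ORDER BY MIN(decile);

-- 23 rows into 10 buckets: the first 3 buckets get 3 rows, the other 7 get 2
-- performance_tier | headcount | pct_of_team
-- Gold             | 6         | 26.1
-- Silver           | 11        | 47.8
-- Bronze           | 6         | 26.1
```

NTILE also ignores ties, so two salespeople with identical sales can fall on opposite sides of a tier boundary; if that matters for bonuses, a rule based on the sales values themselves may be a safer choice.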

Case Study 3: Session Time Analysis

Problem: For each user session, calculate the time spent and identify the longest sessions per user.

Session Analysis
-- Identify user sessions and find longest per user
WITH session_bounds AS (
    SELECT 
        user_id,
        event_time,
        -- Gap > 30 min starts new session
        CASE 
            WHEN event_time - LAG(event_time) OVER (
                PARTITION BY user_id ORDER BY event_time
            ) > INTERVAL '30 minutes'
            THEN 1 
            ELSE 0 
        END AS is_new_session
    FROM user_events
),
session_numbered AS (
    SELECT 
        user_id,
        event_time,
        SUM(is_new_session) OVER (
            PARTITION BY user_id 
            ORDER BY event_time
        ) + 1 AS session_number
    FROM session_bounds
),
session_stats AS (
    SELECT 
        user_id,
        session_number,
        MIN(event_time) AS session_start,
        MAX(event_time) AS session_end,
        EXTRACT(EPOCH FROM MAX(event_time) - MIN(event_time)) / 60 AS duration_minutes,
        COUNT(*) AS event_count
    FROM session_numbered
    GROUP BY user_id, session_number
),
ranked_sessions AS (
    SELECT 
        *,
        RANK() OVER (
            PARTITION BY user_id 
            ORDER BY duration_minutes DESC
        ) AS duration_rank
    FROM session_stats
)
SELECT 
    user_id,
    session_number,
    session_start,
    session_end,
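    -- Cast to numeric: PostgreSQL versions before 14 return double precision
    -- from EXTRACT, and ROUND(double precision, integer) does not exist
    ROUND(duration_minutes::numeric, 1) AS duration_minutes,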
    event_count,
    duration_rank
FROM ranked_sessions
WHERE duration_rank <= 3  -- Top 3 longest sessions per user
ORDER BY user_id, duration_rank;
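
The first two steps of this query (flagging session starts, then taking a running sum of the flags) can be traced on a handful of events. This is a minimal sketch with hypothetical inline events in PostgreSQL-style syntax.

```sql
-- Hypothetical events for one user; the 50-minute pause starts a new session
WITH user_events(user_id, event_time) AS (
    VALUES
        (1, TIMESTAMP '2024-01-01 09:00:00'),
        (1, TIMESTAMP '2024-01-01 09:10:00'),
        (1, TIMESTAMP '2024-01-01 10:00:00'),
        (1, TIMESTAMP '2024-01-01 10:20:00')
),
flagged AS (
    SELECT
        user_id,
        event_time,
        -- The first event has no predecessor, so LAG is NULL and the flag is 0
        CASE
            WHEN event_time - LAG(event_time) OVER (
                PARTITION BY user_id ORDER BY event_time
            ) > INTERVAL '30 minutes'
            THEN 1
            ELSE 0
        END AS is_new_session
    FROM user_events
)
SELECT
    user_id,
    event_time,
    is_new_session,
    -- Running total of the flags numbers the sessions 1, 2, 3, ...
    SUM(is_new_session) OVER (
        PARTITION BY user_id
        ORDER BY event_time
    ) + 1 AS session_number
FROM flagged
ORDER BY user_id, event_time;

-- event_time | is_new_session | session_number
-- 09:00      | 0              | 1
-- 09:10      | 0              | 1
-- 10:00      | 1              | 2
-- 10:20      | 0              | 2
```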

Module Summary: The Complete Ranking Toolkit

You've now mastered SQL ranking functions—from individual mechanics to complex real-world applications. Let's consolidate everything:

Complete Ranking Function Reference
Function	Ties	Gaps	Use Case
ROW_NUMBER()	Unique numbers; order among ties is arbitrary	Never	Pagination, deduplication, exact counts
RANK()	Same rank	After ties	Competition positions, ordinal placement
DENSE_RANK()	Same rank	Never	Nth value queries, tier assignment
NTILE(n)	Ignored; tied rows may land in different tiles	N/A	Percentiles, load balancing, bucketing

Module Key Takeaways

•ROW_NUMBER assigns unique sequential integers; use for pagination, deduplication, and when you need exactly N rows
•RANK assigns same rank to ties with gaps afterward; use for competition-style rankings where position matters
•DENSE_RANK assigns same rank to ties without gaps; use when you need the Nth highest value or consecutive tier numbers
•NTILE divides data into equal-count buckets; use for percentiles, quartiles, and balanced distribution
•Always specify ORDER BY for deterministic results; include unique tiebreakers for reproducibility
•Optimize with indexes matching PARTITION BY + ORDER BY columns to avoid expensive sorts
•Filter rows before ranking when possible (for example, in a CTE); ranking every row and then discarding most of them is wasteful
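
To make the DENSE_RANK takeaway concrete, the sketch below looks up the 3rd highest salary two ways on hypothetical inline data (PostgreSQL-style syntax; names and figures are made up). Only DENSE_RANK counts distinct values.

```sql
-- Hypothetical salaries with ties at the top and bottom
WITH salaries(employee_name, salary) AS (
    VALUES ('Ana', 9000), ('Bo', 9000), ('Cam', 8000), ('Dev', 7000), ('Eli', 7000)
),
ranked AS (
    SELECT
        salary,
        DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank_val,
        ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num
    FROM salaries
)
SELECT
    -- Distinct salary levels: 9000 (1st), 8000 (2nd), 7000 (3rd)
    MAX(CASE WHEN dense_rank_val = 3 THEN salary END) AS third_highest_salary,
    -- The third row is Cam, because the two 9000 rows occupy rows 1 and 2
    MAX(CASE WHEN row_num = 3 THEN salary END) AS salary_on_third_row
FROM ranked;

-- third_highest_salary | salary_on_third_row
-- 7000                 | 8000
```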

Decision Quick-Reference:

"Give me exactly 3" → ROW_NUMBER
"Who placed 1st, 2nd, 3rd?" → RANK
"What's the 3rd highest value?" → DENSE_RANK
"Divide into 4 equal groups" → NTILE(4)

Module Complete: Ranking Functions