Loading learning content...
You've mastered the mechanics of ROW_NUMBER(), RANK(), DENSE_RANK(), and NTILE(). Now it's time to see how they work together in real production systems.
Ranking functions are workhorses of analytical SQL—appearing in everything from e-commerce product recommendations to financial reporting, from gaming leaderboards to HR performance reviews. Understanding common patterns helps you recognize opportunities to apply these tools when you encounter similar problems.
This final page synthesizes everything you've learned into practical, production-ready patterns that you'll encounter repeatedly throughout your career as a database professional.
By the end of this page, you will know how to combine ranking functions for complex scenarios, handle real-world edge cases, choose the right function for each problem, and implement ranking queries that perform well at scale.
Top-N queries—finding the best, worst, most recent, or most significant items—are the most common ranking application. Each ranking function produces subtly different results.
The Core Pattern:
Pattern Comparison: Top 3 Products Per Category
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051
-- Sample data: products with sales figuresCREATE TABLE products ( product_id INT, product_name VARCHAR(100), category VARCHAR(50), monthly_sales DECIMAL(12,2)); -- Pattern A: Exactly N rows per group (use ROW_NUMBER)-- "Give me exactly 3 products per category"WITH ranked AS ( SELECT *, ROW_NUMBER() OVER ( PARTITION BY category ORDER BY monthly_sales DESC ) AS rn FROM products)SELECT * FROM ranked WHERE rn <= 3;-- Result: Exactly 3 rows per category (ties broken arbitrarily) -- Pattern B: All tied for top N positions (use RANK)-- "Give me everyone who placed 1st, 2nd, or 3rd"WITH ranked AS ( SELECT *, RANK() OVER ( PARTITION BY category ORDER BY monthly_sales DESC ) AS rnk FROM products)SELECT * FROM ranked WHERE rnk <= 3;-- Result: May be more than 3 per category if ties exist-- Position 3 may be empty if 3+ tie for positions 1-2 -- Pattern C: Top N distinct values (use DENSE_RANK)-- "Give me products with the 3 highest sales amounts"WITH ranked AS ( SELECT *, DENSE_RANK() OVER ( PARTITION BY category ORDER BY monthly_sales DESC ) AS drnk FROM products)SELECT * FROM ranked WHERE drnk <= 3;-- Result: All products sharing top 3 sales values-- Guarantees ranks 1, 2, 3 all present (if enough distinct values)| Requirement | Use | Behavior with Ties |
|---|---|---|
| Exactly N rows | ROW_NUMBER() | Arbitrarily picks among ties |
| N-th place positions | RANK() | Includes all ties; may exceed N rows |
| N-th highest values | DENSE_RANK() | Includes all ties; gaps don't skip positions |
The classic interview question 'Find the second highest salary' is ambiguous! If two people tie for first, what's 'second'? ROW_NUMBER picks one arbitrarily as 2nd. RANK says there is no 2nd (next is 3rd). DENSE_RANK says the next lower salary is 2nd. Always clarify the requirement!
Pagination displays results in navigable chunks—essential for web applications, reports, and any large dataset display.
Traditional OFFSET Pagination:
While LIMIT/OFFSET is simpler, it has problems:
ROW_NUMBER Pagination:
12345678910111213141516171819202122
-- Page-based navigation with ROW_NUMBER-- Parameters: @page_number (1-based), @page_size (rows per page)DECLARE @page_number INT = 5;DECLARE @page_size INT = 20; WITH numbered AS ( SELECT *, ROW_NUMBER() OVER ( ORDER BY created_at DESC, product_id ) AS row_num FROM products WHERE is_active = true)SELECT *FROM numberedWHERE row_num > (@page_number - 1) * @page_size AND row_num <= @page_number * @page_sizeORDER BY row_num; -- For page 5 with 20 per page:-- row_num > 80 AND row_num <= 100 → rows 81-100Keyset Pagination (Cursor-Based):
For better performance and stability, track the last-seen key:
123456789101112131415161718192021222324252627
-- First page: no cursorSELECT product_id, product_name, created_at, ROW_NUMBER() OVER (ORDER BY created_at DESC, product_id)FROM productsWHERE is_active = trueORDER BY created_at DESC, product_idLIMIT 20; -- Subsequent pages: use cursor (last seen values)-- After seeing created_at='2024-01-15 10:00:00', product_id=12345SELECT product_id, product_name, created_atFROM productsWHERE is_active = true AND (created_at, product_id) < ('2024-01-15 10:00:00', 12345)ORDER BY created_at DESC, product_idLIMIT 20; -- Benefits:-- 1. Uses index efficiently (no scanning skipped rows)-- 2. Stable results (new inserts don't shift pages)-- 3. Constant performance regardless of page numberUse OFFSET for simple internal tools or small datasets. Use ROW_NUMBER for page-number-based UI (pages 1, 2, 3...). Use keyset/cursor pagination for high-performance APIs, infinite scroll, or large datasets where page numbers aren't needed.
Real-world data often contains duplicates that need elimination based on business rules. ROW_NUMBER() excels at selecting which row to keep.
Pattern: Keep Most Recent Record
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455
-- Pattern 1: Keep most recent order per customerWITH ranked AS ( SELECT *, ROW_NUMBER() OVER ( PARTITION BY customer_id ORDER BY order_date DESC ) AS recency_rank FROM orders)SELECT * FROM ranked WHERE recency_rank = 1; -- Pattern 2: Keep order with highest value per customerWITH ranked AS ( SELECT *, ROW_NUMBER() OVER ( PARTITION BY customer_id ORDER BY order_total DESC, order_id DESC ) AS value_rank FROM orders)SELECT * FROM ranked WHERE value_rank = 1; -- Pattern 3: Delete duplicates, keeping one-- First, identify duplicatesWITH to_delete AS ( SELECT order_id, ROW_NUMBER() OVER ( PARTITION BY customer_id, product_id, order_date ORDER BY order_id ) AS dup_num FROM orders)DELETE FROM ordersWHERE order_id IN ( SELECT order_id FROM to_delete WHERE dup_num > 1); -- Pattern 4: Merge duplicates with priority-- Keep the record with most complete informationWITH ranked AS ( SELECT *, ROW_NUMBER() OVER ( PARTITION BY email ORDER BY CASE WHEN phone IS NOT NULL THEN 0 ELSE 1 END, CASE WHEN address IS NOT NULL THEN 0 ELSE 1 END, created_at DESC ) AS completeness_rank FROM contacts)SELECT * FROM ranked WHERE completeness_rank = 1;Always test deduplication DELETE queries with SELECT first! Use transactions and verify the count of rows to be deleted. An incorrect PARTITION BY clause can delete essential records.
The 'islands and gaps' technique uses ROW_NUMBER() to identify consecutive sequences (islands) and breaks in sequences (gaps).
The Core Insight:
For consecutive values, subtracting ROW_NUMBER() produces a constant. When there's a gap, the constant changes, creating a natural grouping key.
Pattern: Finding Consecutive Login Streaks
12345678910111213141516171819202122232425262728293031323334
-- Find consecutive login day streaks for each userWITH daily_logins AS ( SELECT DISTINCT user_id, DATE(login_time) AS login_date FROM user_sessions),with_row_num AS ( SELECT user_id, login_date, login_date - INTERVAL ROW_NUMBER() OVER ( PARTITION BY user_id ORDER BY login_date ) DAY AS island_id -- For consecutive dates, (date - row_num) is constant -- When there's a gap, the constant changes FROM daily_logins)SELECT user_id, MIN(login_date) AS streak_start, MAX(login_date) AS streak_end, COUNT(*) AS streak_lengthFROM with_row_numGROUP BY user_id, island_idHAVING COUNT(*) >= 3 -- Only streaks of 3+ daysORDER BY user_id, streak_start; -- Example:-- Dates: Jan 1, Jan 2, Jan 3, Jan 5, Jan 6-- Row nums: 1, 2, 3, 4, 5-- Date - row_num: Dec 31, Dec 31, Dec 31, Jan 1, Jan 1-- Two islands: [Jan 1-3] and [Jan 5-6]Pattern: Finding Gaps in Sequences
1234567891011121314151617181920212223242526
-- Find gaps in invoice number sequenceWITH with_next AS ( SELECT invoice_number, LEAD(invoice_number) OVER (ORDER BY invoice_number) AS next_invoice FROM invoices)SELECT invoice_number AS gap_start_after, next_invoice AS gap_ends_before, next_invoice - invoice_number - 1 AS missing_countFROM with_nextWHERE next_invoice - invoice_number > 1ORDER BY invoice_number; -- Alternative using ROW_NUMBER for expected vs actualWITH numbered AS ( SELECT invoice_number, ROW_NUMBER() OVER (ORDER BY invoice_number) AS expected_position, invoice_number - (SELECT MIN(invoice_number) FROM invoices) AS actual_offset FROM invoices)SELECT *FROM numberedWHERE actual_offset != expected_position - 1; -- Gaps detectedThe 'sequence - row_number = constant for consecutive values' trick works for dates, integers, and any regularly-spaced sequence. It's one of the most elegant applications of window functions.
Complex analytics often require ranking from multiple perspectives simultaneously.
Pattern: Rank Across Multiple Dimensions
12345678910111213141516171819202122232425262728
-- Rank employees by salary: globally, by department, by tenure cohortSELECT employee_name, department, hire_year, salary, -- Global salary rank RANK() OVER ( ORDER BY salary DESC ) AS global_rank, -- Within department RANK() OVER ( PARTITION BY department ORDER BY salary DESC ) AS dept_rank, -- Within tenure cohort RANK() OVER ( PARTITION BY hire_year ORDER BY salary DESC ) AS cohort_rank, -- Percentile positions NTILE(100) OVER (ORDER BY salary) AS global_percentile, NTILE(100) OVER ( PARTITION BY department ORDER BY salary ) AS dept_percentileFROM employeesORDER BY global_rank;Pattern: Change in Rank Over Time
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647
-- Track how product rankings change month-over-monthWITH monthly_ranks AS ( SELECT product_id, product_name, sale_month, revenue, RANK() OVER ( PARTITION BY sale_month ORDER BY revenue DESC ) AS monthly_rank FROM monthly_product_sales),rank_changes AS ( SELECT product_id, product_name, sale_month, revenue, monthly_rank, LAG(monthly_rank) OVER ( PARTITION BY product_id ORDER BY sale_month ) AS prev_rank, LAG(revenue) OVER ( PARTITION BY product_id ORDER BY sale_month ) AS prev_revenue FROM monthly_ranks)SELECT product_name, sale_month, revenue, monthly_rank, prev_rank, COALESCE(prev_rank - monthly_rank, 0) AS rank_change, CASE WHEN prev_rank IS NULL THEN 'New Entry' WHEN monthly_rank < prev_rank THEN '↑ Up ' || (prev_rank - monthly_rank) WHEN monthly_rank > prev_rank THEN '↓ Down ' || (monthly_rank - prev_rank) ELSE '→ Unchanged' END AS trend, ROUND((revenue - prev_revenue) / NULLIF(prev_revenue, 0) * 100, 1) AS pct_changeFROM rank_changesWHERE sale_month = '2024-01'ORDER BY monthly_rank;Gaming, sports, and competitive systems require sophisticated ranking displays. This is where understanding RANK vs DENSE_RANK matters most.
Full Leaderboard System
123456789101112131415161718192021222324252627282930313233343536373839404142
-- Full competition leaderboard with all ranking variantsWITH player_stats AS ( SELECT player_id, player_name, region, total_score, games_played, ROUND(total_score::numeric / NULLIF(games_played, 0), 2) AS avg_score FROM players WHERE season = 'current' AND games_played >= 10 -- Minimum games to qualify)SELECT player_name, region, total_score, games_played, avg_score, -- Primary display rank (handles ties correctly) RANK() OVER (ORDER BY total_score DESC) AS overall_rank, -- Regional rankings RANK() OVER ( PARTITION BY region ORDER BY total_score DESC ) AS regional_rank, -- Unique position for tiebreaker display ROW_NUMBER() OVER ( ORDER BY total_score DESC, avg_score DESC, player_id ) AS tiebreak_position, -- Percentile for personal achievement NTILE(100) OVER (ORDER BY total_score) AS percentile, -- Medal assignment CASE RANK() OVER (ORDER BY total_score DESC) WHEN 1 THEN '🥇 Gold' WHEN 2 THEN '🥈 Silver' WHEN 3 THEN '🥉 Bronze' ELSE NULL END AS medalFROM player_statsORDER BY overall_rank, tiebreak_positionLIMIT 100;Finding a Player's Rank and Context
123456789101112131415161718192021222324252627
-- Find specific player's rank and nearby competitorsWITH all_ranked AS ( SELECT player_id, player_name, total_score, RANK() OVER (ORDER BY total_score DESC) AS rank, ROW_NUMBER() OVER (ORDER BY total_score DESC, player_id) AS position, COUNT(*) OVER () AS total_players FROM players WHERE season = 'current')SELECT player_name, total_score, rank, position, total_players, ROUND(100.0 - (rank - 1) * 100.0 / total_players, 1) AS percentile_rankFROM all_rankedWHERE position BETWEEN ( SELECT position - 2 FROM all_ranked WHERE player_id = 12345) AND ( SELECT position + 2 FROM all_ranked WHERE player_id = 12345)ORDER BY position;-- Shows player and 2 competitors above and belowFor high-traffic real-time leaderboards, ranking queries can be expensive. Consider materialized views refreshed periodically, Redis sorted sets for live updates, or approximate rankings with periodic exact calculations.
Ranking queries can be expensive. Here are key optimization strategies.
Strategy 1: Optimal Indexing
123456789101112131415161718
-- For query: RANK() OVER (PARTITION BY department ORDER BY salary DESC)-- Create compound index matching partition + order:CREATE INDEX idx_dept_salary ON employees (department, salary DESC); -- For multiple rankings in same query, consider covering indexes:CREATE INDEX idx_rankings ON employees ( department, salary DESC) INCLUDE (employee_name, hire_date); -- Check if index is being used:EXPLAIN ANALYZESELECT department, employee_name, salary, RANK() OVER (PARTITION BY department ORDER BY salary DESC)FROM employees;Strategy 2: Limit Early with CTEs
123456789101112131415161718192021222324252627282930
-- DON'T: Rank everything, then filterSELECT * FROM ( SELECT *, RANK() OVER (ORDER BY score DESC) AS rnk FROM all_players -- 10 million rows) rankedWHERE rnk <= 100;-- Must rank all 10 million rows! -- DO: Pre-filter if possibleWITH top_scorers AS ( SELECT * FROM all_players ORDER BY score DESC LIMIT 1000 -- Approximate, generous buffer)SELECT *, RANK() OVER (ORDER BY score DESC) AS rnkFROM top_scorersORDER BY rnkLIMIT 100;-- Only ranks 1000 rows, much faster -- For partitioned rankings, use LATERAL joins:SELECT p.*, ranked.rnkFROM (SELECT DISTINCT department FROM employees) dCROSS JOIN LATERAL ( SELECT *, RANK() OVER (ORDER BY salary DESC) AS rnk FROM employees e WHERE e.department = d.department ORDER BY salary DESC LIMIT 10) ranked;Strategy 3: Shared Window Definitions
123456789101112131415
-- Use WINDOW clause to define once, reference multiple timesSELECT employee_name, department, salary, ROW_NUMBER() OVER dept_salary AS row_num, RANK() OVER dept_salary AS rank_val, DENSE_RANK() OVER dept_salary AS dense_rank_val, NTILE(4) OVER dept_salary AS quartileFROM employeesWINDOW dept_salary AS ( PARTITION BY department ORDER BY salary DESC);-- Database can potentially share sorting work across all functionsAlways examine execution plans for ranking queries. Look for 'Sort' operations that could be eliminated with indexes, and watch for large 'WindowAgg' operations that indicate memory-intensive processing. Large partitions without indexes are common performance killers.
Let's examine complete solutions to real business problems.
Case Study 1: E-Commerce Product Recommendations
Problem: Show 'customers also bought' recommendations—top 3 products frequently purchased together, excluding the current product.
12345678910111213141516171819202122232425262728
-- Find top 3 products frequently bought with product_id = 101WITH co_purchases AS ( -- Find all orders containing product 101 SELECT DISTINCT o1.order_id FROM order_items o1 WHERE o1.product_id = 101),co_products AS ( -- Find other products in those orders SELECT oi.product_id, p.product_name, COUNT(*) AS co_purchase_count FROM order_items oi JOIN co_purchases cp ON oi.order_id = cp.order_id JOIN products p ON oi.product_id = p.product_id WHERE oi.product_id != 101 -- Exclude the source product GROUP BY oi.product_id, p.product_name),ranked AS ( SELECT *, ROW_NUMBER() OVER (ORDER BY co_purchase_count DESC) AS rank FROM co_products)SELECT product_id, product_name, co_purchase_countFROM rankedWHERE rank <= 3;Case Study 2: Sales Performance Tiering
Problem: Assign salespeople to performance tiers (Gold/Silver/Bronze) based on quarterly results, with top 20% as Gold, middle 50% as Silver, bottom 30% as Bronze.
12345678910111213141516171819202122232425262728
-- Assign performance tiers using NTILEWITH ranked AS ( SELECT salesperson_id, salesperson_name, region, quarterly_sales, NTILE(10) OVER (ORDER BY quarterly_sales DESC) AS decile FROM sales_performance WHERE quarter = '2024-Q1')SELECT salesperson_name, region, quarterly_sales, decile, CASE WHEN decile <= 2 THEN 'Gold (Top 20%)' WHEN decile <= 7 THEN 'Silver (Middle 50%)' ELSE 'Bronze (Bottom 30%)' END AS performance_tier, CASE WHEN decile <= 2 THEN quarterly_sales * 0.10 -- 10% bonus WHEN decile <= 7 THEN quarterly_sales * 0.05 -- 5% bonus ELSE 0 END AS bonus_amountFROM rankedORDER BY decile, quarterly_sales DESC;Case Study 3: Session Time Analysis
Problem: For each user session, calculate the time spent and identify the longest sessions per user.
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556
-- Identify user sessions and find longest per userWITH session_bounds AS ( SELECT user_id, event_time, -- Gap > 30 min starts new session CASE WHEN event_time - LAG(event_time) OVER ( PARTITION BY user_id ORDER BY event_time ) > INTERVAL '30 minutes' THEN 1 ELSE 0 END AS is_new_session FROM user_events),session_numbered AS ( SELECT user_id, event_time, SUM(is_new_session) OVER ( PARTITION BY user_id ORDER BY event_time ) + 1 AS session_number FROM session_bounds),session_stats AS ( SELECT user_id, session_number, MIN(event_time) AS session_start, MAX(event_time) AS session_end, EXTRACT(EPOCH FROM MAX(event_time) - MIN(event_time)) / 60 AS duration_minutes, COUNT(*) AS event_count FROM session_numbered GROUP BY user_id, session_number),ranked_sessions AS ( SELECT *, RANK() OVER ( PARTITION BY user_id ORDER BY duration_minutes DESC ) AS duration_rank FROM session_stats)SELECT user_id, session_number, session_start, session_end, ROUND(duration_minutes, 1) AS duration_minutes, event_count, duration_rankFROM ranked_sessionsWHERE duration_rank <= 3 -- Top 3 longest sessions per userORDER BY user_id, duration_rank;You've now mastered SQL ranking functions—from individual mechanics to complex real-world applications. Let's consolidate everything:
| Function | Ties | Gaps | Use Case |
|---|---|---|---|
| ROW_NUMBER() | Unique arbitrary | Never | Pagination, deduplication, exact counts |
| RANK() | Same rank | After ties | Competition positions, ordinal placement |
| DENSE_RANK() | Same rank | Never | Nth value queries, tier assignment |
| NTILE(n) | Arbitrary within tile | N/A | Percentiles, load balancing, bucketing |
Decision Quick-Reference:
With these four functions mastered, you can solve virtually any positional, comparative, or distribution problem in SQL. They form the foundation for advanced analytics, business intelligence, and data-driven decision making.
Congratulations! You have achieved comprehensive mastery of SQL ranking functions. You understand not just the mechanics of each function, but when to apply each one, how to combine them for complex scenarios, and how to optimize their performance. This knowledge will serve you throughout your career in database development and analytics.