Loading learning content...
The previous ranking functions—ROW_NUMBER, RANK, and DENSE_RANK—assign positions based on ordering. But sometimes you don't need positions; you need categories. You want to divide your data into groups of roughly equal size: quartiles, deciles, percentiles, or any custom bucket count.
NTILE() serves this purpose. Given N buckets, it distributes rows as evenly as possible across those buckets, assigning each row a bucket number from 1 to N.
This capability is fundamental for:
By the end of this page, you will deeply understand NTILE()'s bucket distribution mechanics, how it handles rows that don't divide evenly, and how to apply it for percentile calculations, load balancing, stratified sampling, and other equal-division scenarios.
NTILE(n) is a window function that divides an ordered partition into n approximately equal groups (tiles), assigning each row an integer from 1 to n indicating its group membership.
Formal Definition:
NTILE(n) partitions the rows in the ordered partition into n groups of approximately equal size, assigning tile numbers 1 through n. If the partition has more rows than tiles, each tile receives at least one row. If the partition has fewer rows than tiles, some tiles remain empty (the maximum tile number equals the row count).
The Distribution Rule:
For N rows distributed into T tiles:
- Each tile receives at least ⌊N/T⌋ rows
- The first (N mod T) tiles receive one extra row
1234567891011121314151617181920
-- Basic SyntaxNTILE(number_of_tiles) OVER ( [PARTITION BY partition_expression, ...] ORDER BY sort_expression [ASC | DESC], ...) -- Split into 4 groups (quartiles)NTILE(4) OVER (ORDER BY score DESC) -- Split each department into 3 tiersNTILE(3) OVER ( PARTITION BY department ORDER BY performance_score DESC) -- Split into 100 groups (percentiles)NTILE(100) OVER (ORDER BY metric_value) -- Key difference from other ranking functions:-- NTILE takes an argument: the number of tiles (buckets)Understanding NTILE's row assignment algorithm is essential for predictable results.
The Algorithm:
Example: 10 Rows into 4 Tiles
12345678910111213141516171819202122232425
-- 10 salespeople divided into 4 performance quartilesCREATE TABLE salespeople ( id INT PRIMARY KEY, name VARCHAR(100), total_sales DECIMAL(12,2)); INSERT INTO salespeople VALUES(1, 'Alice', 100000),(2, 'Bob', 90000),(3, 'Carol', 85000),(4, 'David', 80000),(5, 'Emma', 75000),(6, 'Frank', 70000),(7, 'Grace', 65000),(8, 'Henry', 60000),(9, 'Ivy', 55000),(10, 'Jack', 50000); -- Assign to quartiles (4 groups)SELECT name, total_sales, NTILE(4) OVER (ORDER BY total_sales DESC) AS quartileFROM salespeople;| name | total_sales | quartile |
|---|---|---|
| Alice | 100000 | 1 |
| Bob | 90000 | 1 |
| Carol | 85000 | 1 |
| David | 80000 | 2 |
| Emma | 75000 | 2 |
| Frank | 70000 | 2 |
| Grace | 65000 | 3 |
| Henry | 60000 | 3 |
| Ivy | 55000 | 4 |
| Jack | 50000 | 4 |
Analysis:
Why ORDER BY Matters:
The ORDER BY determines which rows go into which tiles:
Unlike other ranking functions where ORDER BY affects the rank value, with NTILE it also determines group membership. Without ORDER BY, rows are assigned to tiles arbitrarily—usually useless. Always include ORDER BY for meaningful results.
Real-world data rarely divides evenly. Understanding NTILE's boundary conditions prevents surprises.
Edge Case 1: Fewer Rows Than Tiles
If you request 10 tiles but have only 7 rows:
12345678
-- 7 rows into 10 tilesWITH sample AS ( SELECT n FROM generate_series(1, 7) AS n)SELECT n AS value, NTILE(10) OVER (ORDER BY n) AS tileFROM sample;| value | tile |
|---|---|
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 5 | 5 |
| 6 | 6 |
| 7 | 7 |
Result: Each row gets its own tile (1-7). Tiles 8, 9, 10 have zero rows—they simply don't appear in the output.
Edge Case 2: One Row Per Tile
When rows exactly equal tiles:
12345678910
-- 5 rows into 5 tiles: perfect divisionWITH sample AS ( SELECT n FROM generate_series(1, 5) AS n)SELECT n AS value, NTILE(5) OVER (ORDER BY n) AS tileFROM sample; -- Result: Each row gets exactly one unique tile (1, 2, 3, 4, 5)Edge Case 3: All Rows in One Tile
NTILE(1) is valid—all rows go to tile 1:
123456
-- All rows in tile 1SELECT name, NTILE(1) OVER (ORDER BY total_sales) AS tileFROM salespeople;-- Every row returns tile = 1Edge Case 4: Ties in ORDER BY
NTILE must assign each row to exactly one tile, even if ORDER BY values are tied. The assignment among tied rows is arbitrary:
12345678910111213141516171819202122
-- Ties in ORDER BY: arbitrary assignment within tiesCREATE TABLE scores (id INT, score INT);INSERT INTO scores VALUES(1, 100), (2, 100), (3, 100), (4, 90), (5, 90), (6, 80); SELECT id, score, NTILE(2) OVER (ORDER BY score DESC) AS tileFROM scores; -- Possible result (tile assignment for tied scores is arbitrary):-- 100: tiles 1, 1, 1 (or 1, 1, 2 - depends on engine)-- 90: tiles 1, 2-- 80: tile 2 -- Add secondary order for determinism:SELECT id, score, NTILE(2) OVER (ORDER BY score DESC, id) AS tileFROM scores;For reproducible results, include enough ORDER BY columns to break all ties. Adding a unique key (like ID) as the final sort column ensures deterministic tile assignment.
NTILE's primary use case is computing statistical divisions: quartiles, quintiles, deciles, and percentiles.
Common Statistical Divisions:
| Division | NTILE Arg | Description |
|---|---|---|
| Halves | 2 | Median split |
| Tertiles | 3 | Three equal groups |
| Quartiles | 4 | 25% each (Q1, Q2, Q3, Q4) |
| Quintiles | 5 | 20% each |
| Deciles | 10 | 10% each |
| Percentiles | 100 | 1% each |
Application 1: Quartile Analysis
123456789101112131415161718192021222324
-- Analyze customer spending by quartileWITH customer_quartiles AS ( SELECT customer_id, customer_name, total_lifetime_value, NTILE(4) OVER (ORDER BY total_lifetime_value DESC) AS value_quartile FROM customers)SELECT value_quartile, CASE value_quartile WHEN 1 THEN 'Top 25% (VIP)' WHEN 2 THEN 'Upper Mid (Growth)' WHEN 3 THEN 'Lower Mid (Develop)' WHEN 4 THEN 'Bottom 25% (Nurture)' END AS segment_name, COUNT(*) AS customer_count, AVG(total_lifetime_value) AS avg_ltv, MIN(total_lifetime_value) AS min_ltv, MAX(total_lifetime_value) AS max_ltvFROM customer_quartilesGROUP BY value_quartileORDER BY value_quartile;Application 2: Percentile Rank
Computing each row's percentile position:
12345678910111213141516171819202122232425262728293031
-- Calculate percentile rank for each employee's salaryWITH percentiles AS ( SELECT employee_id, employee_name, department, salary, NTILE(100) OVER (ORDER BY salary) AS percentile, NTILE(100) OVER ( PARTITION BY department ORDER BY salary ) AS dept_percentile FROM employees)SELECT employee_name, department, salary, percentile AS global_percentile, dept_percentile AS department_percentile, CASE WHEN percentile >= 90 THEN 'Top 10%' WHEN percentile >= 75 THEN 'Top 25%' WHEN percentile >= 50 THEN 'Above Median' ELSE 'Below Median' END AS performance_tierFROM percentilesORDER BY global_percentile DESC; -- Note: NTILE(100) assigns 1 to lowest 1%, 100 to highest 1%-- For conventional percentile (100 = top), ORDER BY ascendingNTILE(100) gives integer percentiles 1-100 with equal row counts per tile. PERCENT_RANK() gives continuous percentile values 0-1 based on relative position. Use NTILE when you need discrete buckets; use PERCENT_RANK for precise percentile values.
Application 3: Decile Analysis for Performance Evaluation
Dividing employees into 10 performance buckets:
12345678910111213141516171819202122232425
-- Decile-based performance evaluationWITH performance_deciles AS ( SELECT employee_id, employee_name, performance_score, NTILE(10) OVER (ORDER BY performance_score DESC) AS decile FROM employee_reviews WHERE review_year = 2024)SELECT decile, CONCAT('Decile ', decile, ' (', (decile - 1) * 10, '-', decile * 10, ' percentile)') AS description, COUNT(*) AS employee_count, ROUND(AVG(performance_score), 2) AS avg_score, CASE WHEN decile <= 2 THEN 'Exceptional - Promote' WHEN decile <= 4 THEN 'Strong - Bonus' WHEN decile <= 7 THEN 'Meeting Expectations' WHEN decile <= 9 THEN 'Needs Improvement' ELSE 'Performance Plan' END AS action_itemFROM performance_decilesGROUP BY decileORDER BY decile;NTILE's equal distribution capability extends beyond statistical analysis to operational use cases.
Application 1: Load Balancing Across Workers
Distributing tasks evenly across N processing nodes:
12345678910111213141516171819202122232425262728
-- Distribute orders to 5 fulfillment warehouses-- Each warehouse gets roughly equal order countWITH order_assignment AS ( SELECT order_id, order_date, total_amount, NTILE(5) OVER (ORDER BY order_id) AS warehouse_id FROM pending_orders)SELECT warehouse_id, COUNT(*) AS order_count, SUM(total_amount) AS total_valueFROM order_assignmentGROUP BY warehouse_idORDER BY warehouse_id; -- Alternative: Balance by value, not countWITH value_balanced AS ( SELECT order_id, total_amount, SUM(total_amount) OVER (ORDER BY total_amount DESC) AS running_total, NTILE(5) OVER (ORDER BY total_amount DESC) AS warehouse_id FROM pending_orders)SELECT * FROM value_balanced;Application 2: Stratified Sampling
Selecting representative samples from each tier:
12345678910111213141516171819202122232425262728293031323334353637
-- Select 2 samples from each quartile of customer spendingWITH quartiled AS ( SELECT customer_id, total_spending, NTILE(4) OVER (ORDER BY total_spending) AS spending_quartile, ROW_NUMBER() OVER ( PARTITION BY NTILE(4) OVER (ORDER BY total_spending) ORDER BY RANDOM() ) AS random_rank FROM customers)-- Workaround: can't nest window functions directly, rerank AS ( SELECT customer_id, total_spending, spending_quartile, ROW_NUMBER() OVER ( PARTITION BY spending_quartile ORDER BY RANDOM() ) AS rand_within_quartile FROM ( SELECT customer_id, total_spending, NTILE(4) OVER (ORDER BY total_spending) AS spending_quartile FROM customers ) t)SELECT customer_id, total_spending, spending_quartileFROM rerankWHERE rand_within_quartile <= 2ORDER BY spending_quartile, total_spending;Application 3: Batch Processing
Dividing large datasets into processable chunks:
1234567891011121314151617181920212223
-- Create 10 batches for nightly processing-- Process one batch per hour overnightCREATE VIEW processing_batches ASSELECT record_id, data_payload, NTILE(10) OVER (ORDER BY record_id) AS batch_number, CURRENT_DATE AS scheduled_dateFROM large_processing_queueWHERE status = 'pending'; -- Query for specific batchSELECT * FROM processing_batches WHERE batch_number = 3; -- Verify batch sizes are balancedSELECT batch_number, COUNT(*) AS record_count, MIN(record_id) AS first_id, MAX(record_id) AS last_idFROM processing_batchesGROUP BY batch_numberORDER BY batch_number;To create batches of specific size rather than specific count, divide total rows by desired batch size: NTILE(CEIL(COUNT(*) / 1000.0)) for ~1000-row batches. Compute the count separately and use it as the NTILE argument in a CTE.
NTILE operates independently within each partition, creating separate equal distributions per group.
Example: Department-Level Quintiles
1234567891011121314
-- Assign performance quintiles within each departmentSELECT department, employee_name, performance_score, NTILE(5) OVER ( PARTITION BY department ORDER BY performance_score DESC ) AS dept_quintile, NTILE(5) OVER ( ORDER BY performance_score DESC ) AS global_quintileFROM employeesORDER BY department, dept_quintile;| department | employee_name | performance_score | dept_quintile | global_quintile |
|---|---|---|---|---|
| Engineering | Alice | 95 | 1 | 1 |
| Engineering | Bob | 92 | 1 | 1 |
| Engineering | Carol | 88 | 2 | 2 |
| Engineering | David | 85 | 2 | 2 |
| Engineering | Emma | 80 | 3 | 3 |
| Sales | Frank | 78 | 1 | 3 |
| Sales | Grace | 75 | 1 | 4 |
| Sales | Henry | 72 | 2 | 4 |
| Sales | Ivy | 70 | 2 | 5 |
| Sales | Jack | 65 | 3 | 5 |
Key Observation:
This captures the nuance: being top performer in a lower-performing department vs. middle performer in a high-performing department. Both quintile perspectives are valuable:
When partitions have different row counts, NTILE independently distributes within each. A 5-row partition with NTILE(4) gives tiles of size 2, 1, 1, 1. A 100-row partition gives 25-row tiles. The tile sizes aren't comparable across partitions.
NTILE enables sophisticated analytical patterns for segmentation and distribution analysis.
Pattern 1: Dynamic Tile Count
Using a variable or calculation for the tile count:
123456789101112131415161718192021222324
-- Create tiles based on data characteristics-- Example: One tile per 100 rows (approximately)WITH metadata AS ( SELECT COUNT(*) AS total_rows, GREATEST(1, CEIL(COUNT(*) / 100.0)) AS tile_count FROM large_table)SELECT t.*, NTILE(m.tile_count::int) OVER (ORDER BY t.sort_key) AS tile_numFROM large_table tCROSS JOIN metadata m; -- Note: Some databases don't allow NTILE with a non-literal argument-- Alternative using ROW_NUMBER:WITH numbered AS ( SELECT *, ROW_NUMBER() OVER (ORDER BY sort_key) AS rn, COUNT(*) OVER () AS total FROM large_table)SELECT *, CEIL(rn * 10.0 / total) AS manual_decileFROM numbered;Pattern 2: Cross-Tile Analysis
Comparing metrics across tile boundaries:
123456789101112131415161718192021222324
-- Analyze behavior differences between customer quartilesWITH customer_tiles AS ( SELECT customer_id, total_spending, order_count, avg_order_value, NTILE(4) OVER (ORDER BY total_spending DESC) AS spending_quartile FROM customer_metrics)SELECT spending_quartile, COUNT(*) AS customer_count, AVG(total_spending) AS avg_spending, AVG(order_count) AS avg_order_count, AVG(avg_order_value) AS avg_order_value, -- Compare to previous quartile LAG(AVG(total_spending)) OVER (ORDER BY spending_quartile) AS prev_quartile_spending, AVG(total_spending) / NULLIF(LAG(AVG(total_spending)) OVER (ORDER BY spending_quartile), 0) AS spending_ratioFROM customer_tilesGROUP BY spending_quartileORDER BY spending_quartile; -- This shows: How much more do Q1 customers spend vs Q2? Q2 vs Q3?Pattern 3: Tile-Based Cohort Creation
Creating balanced cohorts for A/B testing or experimentation:
1234567891011121314151617181920212223242526272829303132333435
-- Create balanced test/control cohorts-- Stratified by spending level to ensure fair comparisonWITH stratified AS ( SELECT customer_id, spending_level, -- First, stratify by spending NTILE(10) OVER (ORDER BY total_spending) AS spending_stratum FROM customers),cohort_assigned AS ( SELECT customer_id, spending_stratum, -- Within each stratum, randomly assign to test/control NTILE(2) OVER ( PARTITION BY spending_stratum ORDER BY RANDOM() ) AS cohort FROM stratified)SELECT customer_id, CASE cohort WHEN 1 THEN 'test' ELSE 'control' END AS cohort_name, spending_stratumFROM cohort_assigned; -- Verify balance: each stratum should have ~50% test, ~50% controlSELECT spending_stratum, SUM(CASE cohort WHEN 1 THEN 1 ELSE 0 END) AS test_count, SUM(CASE cohort WHEN 2 THEN 1 ELSE 0 END) AS control_countFROM cohort_assignedGROUP BY spending_stratumORDER BY spending_stratum;Using NTILE with PARTITION BY enables stratified randomization—ensuring test/control groups have similar distributions across important dimensions. This reduces confounding and improves experiment validity.
Understanding when NTILE is the right tool—and when alternatives work better—improves solution quality.
| Approach | Best For | Limitation |
|---|---|---|
| NTILE(n) | Equal-count buckets | Doesn't consider value distributions |
| Width-based bucketing | Equal-width value ranges | May produce very unequal counts |
| PERCENT_RANK() | Continuous percentile values | No discrete bucket assignment |
| CUME_DIST() | Cumulative distribution | No direct bucket assignment |
| CASE with thresholds | Custom bucket boundaries | Requires knowing thresholds in advance |
1234567891011121314151617181920212223242526
-- Compare NTILE (equal count) vs width-based (equal range) -- NTILE: Equal number of rows per bucketSELECT value, NTILE(4) OVER (ORDER BY value) AS ntile_bucketFROM (VALUES (10), (20), (30), (35), (37), (38), (39), (100)) AS t(value);-- Each bucket gets 2 rows regardless of value distribution -- Width-based: Equal value range per bucketWITH bounds AS ( SELECT MIN(value) AS min_val, MAX(value) AS max_val FROM sample_data),width_bucketed AS ( SELECT value, FLOOR((value - min_val) / ((max_val - min_val) / 4.0)) + 1 AS width_bucket FROM sample_data, bounds)SELECT * FROM width_bucketed;-- Buckets may have very different row counts -- NTILE with skewed data:-- Values: 1, 2, 3, 4, 5, 100, 200, 300-- NTILE(4): (1,2), (3,4), (5,100), (200,300) - 2 rows each-- Width(4): (1-75), (75-150), (150-225), (225-300) - very unequal countsWhen to Use NTILE:
When to Use Alternatives:
NTILE divides rows equally, not values. A dataset with values 1-10 and 1000 gets even row distribution even though the value 1000 is an extreme outlier. For outlier-sensitive bucketing, consider percentile-based approaches or preprocessing.
NTILE() provides equal-count distribution into N buckets—essential for percentile analysis, load balancing, and stratified operations. Let's consolidate the key knowledge:
| Function | Assigns | Ties | Range | Primary Use |
|---|---|---|---|---|
| ROW_NUMBER() | Unique sequence | Arbitrary | 1 to N | Pagination, deduplication |
| RANK() | Position with gaps | Same rank | 1 to N | Competition rankings |
| DENSE_RANK() | Position no gaps | Same rank | 1 to D | Nth value queries |
| NTILE(n) | Bucket assignment | Arbitrary within | 1 to n | Percentiles, load balancing |
What's Next:
With all four ranking functions mastered, we'll explore practical ranking use cases in the final page of this module. We'll see how to combine these functions for complex analytical scenarios, handle real-world edge cases, and apply ranking patterns to solve business problems.
You now possess comprehensive knowledge of NTILE()—from distribution mechanics through advanced patterns. With all four ranking functions understood, you're equipped to handle percentile analysis, load balancing, stratified sampling, and any equal-distribution scenario.