Loading learning content...
In the previous page, we established that immutable data grants us unlimited preprocessing freedom. But freedom without strategy leads to waste—we could precompute everything and drown in O(n²) space, or precompute nothing and suffer O(n) queries. The genius of Sparse Tables lies in finding the perfect balance: precompute just enough to answer any query in O(1), while keeping space and preprocessing time at O(n log n).
This page unveils the preprocessing strategy that makes this possible. By the end, you'll understand exactly which ranges to precompute, how to organize them efficiently, and why this specific choice leads to constant-time queries. The mathematics is elegant, the implementation is straightforward, and the performance gains are substantial.
You will master the Sparse Table construction algorithm, understand why we precompute ranges of power-of-two lengths, build the complete table with O(n log n) time and space, implement the floor-log lookup table for efficient queries, and trace through detailed examples that cement your understanding.
The central insight behind Sparse Tables is deceptively simple: any integer can be represented as a sum of distinct powers of two. This is just the binary representation we use for numbers. But the implication for range queries is profound:
Any range of length L can be covered by at most ⌈log₂(L)⌉ non-overlapping ranges of power-of-two lengths.
However, for operations that handle overlap gracefully (minimum, maximum, GCD), we can do even better:
Any range can be covered by just two overlapping ranges of power-of-two length.
This means if we precompute answers for all ranges whose length is a power of two, we can answer any query by combining at most two precomputed results. For idempotent operations, this gives us O(1) queries.
The mathematical foundation:
Consider a query range [L, R] with length len = R - L + 1. Let k = ⌊log₂(len)⌋ be the largest power-of-two exponent that fits within this length. Then:
These two ranges overlap in the middle but together cover exactly [L, R]. For minimum/maximum, overlap doesn't matter—the minimum of overlapping regions is still correct.
12345678910111213141516171819202122232425262728
# Example: Query range [2, 9] with length 8# Array indices: 0 1 2 3 4 5 6 7 8 9 10# Query range: [---------------] # Length = 9 - 2 + 1 = 8# k = floor(log2(8)) = 3# 2^k = 8 # Range 1: [2, 2 + 8 - 1] = [2, 9] ← starts at L# Range 2: [9 - 8 + 1, 9] = [2, 9] ← ends at R # In this case, the ranges are identical! (length equals power of 2) # Example: Query range [2, 10] with length 9# Length = 10 - 2 + 1 = 9# k = floor(log2(9)) = 3# 2^k = 8 # Range 1: [2, 2 + 8 - 1] = [2, 9] ← covers left 8 elements# Range 2: [10 - 8 + 1, 10] = [3, 10] ← covers right 8 elements## [2, 9]: [-------]# [3, 10]: [-------]# Overlap: [-----]# Union: [---------] ← exactly [2, 10] ✓ # For min/max, we just take: min(min[2,9], min[3,10])# The overlap is harmless for idempotent operations!For any length L, 2^⌊log₂(L)⌋ ≥ L/2. This means a power-of-two range covers at least half the query range. Two such ranges (one from each end) must therefore overlap in the middle, ensuring complete coverage. This geometric argument guarantees O(1) queries for any range.
A Sparse Table is a 2D array table[k][i] where:
k represents the exponent: ranges of length 2^ki represents the starting indextable[k][i] stores the answer (minimum, maximum, GCD, etc.) for the range starting at index i with length 2^kIn formal notation:
table[k][i] = operation(arr[i], arr[i+1], ..., arr[i + 2^k - 1])
Dimensions of the table:
K = ⌊log₂(n)⌋ + 1 (we need exponents from 0 to ⌊log₂(n)⌋)n (one for each starting position)n × log₂(n)Example visualization:
For an array arr = [1, 3, 2, 7, 9, 11, 3, 5] with n = 8:
| k (power) | Length 2^k | table[k][0] | table[k][1] | table[k][2] | table[k][3] | table[k][4] | table[k][5] | table[k][6] | table[k][7] |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 3 | 2 | 7 | 9 | 11 | 3 | 5 |
| 1 | 2 | 1 | 2 | 2 | 7 | 9 | 3 | 3 | |
| 2 | 4 | 1 | 2 | 2 | 3 | 3 | |||
| 3 | 8 | 1 |
Reading the table:
table[0][i] = arr[i] (ranges of length 1 are just individual elements)table[1][0] = min(arr[0], arr[1]) = min(1, 3) = 1 (range [0, 1])table[1][2] = min(arr[2], arr[3]) = min(2, 7) = 2 (range [2, 3])table[2][0] = min(arr[0..3]) = min(1, 3, 2, 7) = 1 (range [0, 3])table[3][0] = min(arr[0..7]) = 1 (entire array)Why some cells are '-' (undefined):
For table[k][i], the range [i, i + 2^k - 1] must fit within the array. If i + 2^k > n, the cell is invalid and we don't compute it. This is why higher rows have fewer valid entries.
The table has K ≈ log₂(n) rows, but not all columns are valid in each row. Row k has approximately n - 2^k + 1 valid entries. Summing across all rows gives O(n log n) total cells. For n = 100,000, this is about 1.7 million entries—very manageable.
Building the Sparse Table is a beautiful application of dynamic programming. The key recurrence relation exploits the fact that a range of length 2^k can be split into two consecutive ranges of length 2^(k-1):
table[k][i] = combine(table[k-1][i], table[k-1][i + 2^(k-1)])
For minimum:
table[k][i] = min(table[k-1][i], table[k-1][i + 2^(k-1)])
Visual intuition:
The range [i, i + 2^k - 1] of length 2^k can be split into:
Since we compute rows in order of increasing k, when we compute row k, row k-1 is already complete. This is classic dynamic programming—larger subproblems depend on smaller subproblems.
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455
from typing import Listimport math def build_sparse_table_rmq(arr: List[int]) -> List[List[int]]: """ Build a Sparse Table for Range Minimum Query. Time Complexity: O(n log n) Space Complexity: O(n log n) Returns a 2D table where table[k][i] = min(arr[i:i+2^k]) """ n = len(arr) if n == 0: return [] # Calculate the number of rows needed # We need k values from 0 to floor(log2(n)) K = int(math.log2(n)) + 1 # Initialize the table # table[k][i] = minimum of range starting at i, length 2^k table = [[0] * n for _ in range(K)] # Base case: ranges of length 1 (k = 0) # Each element is the minimum of its single-element range for i in range(n): table[0][i] = arr[i] # Fill the table using dynamic programming # For each power k from 1 to K-1 for k in range(1, K): # Length of ranges at this level length = 1 << k # 2^k # For each valid starting position # Range [i, i + 2^k - 1] must fit in array for i in range(n - length + 1): # Combine two ranges of length 2^(k-1) # Left half: table[k-1][i] # Right half: table[k-1][i + 2^(k-1)] left_half = table[k - 1][i] right_half = table[k - 1][i + (1 << (k - 1))] table[k][i] = min(left_half, right_half) return table # Example walkthrougharr = [1, 3, 2, 7, 9, 11, 3, 5]table = build_sparse_table_rmq(arr) # Verify table[2][1] = min of arr[1:5] = min(3, 2, 7, 9) = 2# table[2][1] = min(table[1][1], table[1][3])# = min(min(3,2), min(7,9))# = min(2, 7) = 2 ✓Throughout Sparse Table implementations, you'll see 1 << k instead of 2**k or Math.pow(2, k). Bit shifting is faster (a single CPU instruction) and makes the code's intent clear: we're working with powers of two. This idiom is universal in competitive programming.
For O(1) queries, we need to compute ⌊log₂(len)⌋ for the query range length. While programming languages provide log functions, these involve floating-point operations and can be slow. More importantly, floating-point precision issues can cause subtle bugs.
The solution: precompute a lookup table of floor(log₂(i)) for all i from 1 to n. This adds O(n) to preprocessing but makes each query's log computation a single array lookup.
Building the log₂ table:
The key insight is that ⌊log₂(i)⌋ = ⌊log₂(i/2)⌋ + 1 for i > 1. This gives us a simple linear-time algorithm:
1234567891011121314151617181920212223242526272829303132
def build_log2_table(max_n: int) -> List[int]: """ Build a lookup table for floor(log2(i)) for all i from 1 to max_n. Time Complexity: O(n) Space Complexity: O(n) log2[i] = floor(log2(i)) """ log2 = [0] * (max_n + 1) # log2[1] = 0 (2^0 = 1) # For i > 1, use the recurrence: log2[i] = log2[i/2] + 1 for i in range(2, max_n + 1): log2[i] = log2[i // 2] + 1 return log2 # Example: build log2 table for n = 20log2 = build_log2_table(20) # Verify:# log2[1] = 0 (2^0 = 1)# log2[2] = 1 (2^1 = 2)# log2[3] = 1 (2^1 = 2 <= 3 < 4 = 2^2)# log2[4] = 2 (2^2 = 4)# log2[5] = 2 (2^2 = 4 <= 5 < 8 = 2^3)# log2[8] = 3 (2^3 = 8)# log2[16] = 4 (2^4 = 16) print(log2)# [0, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4]Why is this recurrence correct?
For any positive integer i:
Integer division handles both cases correctly, giving us a simple, elegant algorithm with no floating-point concerns.
Using math.log2() or Math.log2() for floor(log2()) can cause bugs due to floating-point precision. For example, log2(8) might return 2.9999999999 due to floating-point representation, and floor() would give 2 instead of 3. The integer-based lookup table avoids all such issues.
Now let's combine everything into a complete, production-ready Sparse Table implementation. This includes both the log₂ lookup table and the sparse table itself, encapsulated in a clean class structure.
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576777879808182838485868788
from typing import List, Tuple, Callableimport math class SparseTableRMQ: """ Sparse Table for Range Minimum Query (RMQ). Preprocessing: O(n log n) time and space Query: O(1) time """ def __init__(self, arr: List[int]): """ Build the sparse table from the input array. Args: arr: The input array (will not be modified) """ self.n = len(arr) if self.n == 0: self.table = [] self.log2 = [] return # Build log2 lookup table self.log2 = self._build_log2_table() # Build the sparse table self.table = self._build_sparse_table(arr) def _build_log2_table(self) -> List[int]: """Build floor(log2) lookup table.""" log2 = [0] * (self.n + 1) for i in range(2, self.n + 1): log2[i] = log2[i // 2] + 1 return log2 def _build_sparse_table(self, arr: List[int]) -> List[List[int]]: """Build the sparse table using dynamic programming.""" K = self.log2[self.n] + 1 table = [[0] * self.n for _ in range(K)] # Base case: k = 0 for i in range(self.n): table[0][i] = arr[i] # DP: build each level from the previous for k in range(1, K): for i in range(self.n - (1 << k) + 1): table[k][i] = min( table[k - 1][i], table[k - 1][i + (1 << (k - 1))] ) return table def query(self, left: int, right: int) -> int: """ Query the minimum value in range [left, right] (inclusive). Args: left: Left index (0-based) right: Right index (0-based, inclusive) Returns: Minimum value in arr[left:right+1] Time Complexity: O(1) """ length = right - left + 1 k = self.log2[length] # Two overlapping ranges of length 2^k return min( self.table[k][left], # [left, left + 2^k - 1] self.table[k][right - (1 << k) + 1] # [right - 2^k + 1, right] ) # Demonstrationif __name__ == "__main__": arr = [1, 3, 2, 7, 9, 11, 3, 5] st = SparseTableRMQ(arr) # Test queries print(f"min([0, 7]) = {st.query(0, 7)}") # Expected: 1 print(f"min([2, 4]) = {st.query(2, 4)}") # Expected: 2 print(f"min([5, 7]) = {st.query(5, 7)}") # Expected: 3 print(f"min([3, 3]) = {st.query(3, 3)}") # Expected: 7 (single element)Let's trace through the entire process with a concrete example to solidify understanding. We'll use the array [5, 2, 8, 1, 6, 3, 9, 4] and build a Range Minimum Query Sparse Table.
Step 1: Initialize log₂ table
123456789101112131415
n = 8 Building log2 table:i = 1: log2[1] = 0 (base case)i = 2: log2[2] = log2[1] + 1 = 0 + 1 = 1i = 3: log2[3] = log2[1] + 1 = 0 + 1 = 1i = 4: log2[4] = log2[2] + 1 = 1 + 1 = 2i = 5: log2[5] = log2[2] + 1 = 1 + 1 = 2i = 6: log2[6] = log2[3] + 1 = 1 + 1 = 2i = 7: log2[7] = log2[3] + 1 = 1 + 1 = 2i = 8: log2[8] = log2[4] + 1 = 2 + 1 = 3 log2 = [0, 0, 1, 1, 2, 2, 2, 2, 3] ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ 0 1 2 3 4 5 6 7 8Step 2: Build Sparse Table
Number of rows K = log2[8] + 1 = 3 + 1 = 4
123456789101112131415161718192021222324252627
Array: [5, 2, 8, 1, 6, 3, 9, 4] Index: 0 1 2 3 4 5 6 7 Row k=0 (length 1): Just copy the arraytable[0] = [5, 2, 8, 1, 6, 3, 9, 4] Row k=1 (length 2): Combine pairs from k=0table[1][0] = min(table[0][0], table[0][1]) = min(5, 2) = 2table[1][1] = min(table[0][1], table[0][2]) = min(2, 8) = 2table[1][2] = min(table[0][2], table[0][3]) = min(8, 1) = 1table[1][3] = min(table[0][3], table[0][4]) = min(1, 6) = 1table[1][4] = min(table[0][4], table[0][5]) = min(6, 3) = 3table[1][5] = min(table[0][5], table[0][6]) = min(3, 9) = 3table[1][6] = min(table[0][6], table[0][7]) = min(9, 4) = 4table[1] = [2, 2, 1, 1, 3, 3, 4, -] Row k=2 (length 4): Combine pairs from k=1table[2][0] = min(table[1][0], table[1][2]) = min(2, 1) = 1table[2][1] = min(table[1][1], table[1][3]) = min(2, 1) = 1table[2][2] = min(table[1][2], table[1][4]) = min(1, 3) = 1table[2][3] = min(table[1][3], table[1][5]) = min(1, 3) = 1table[2][4] = min(table[1][4], table[1][6]) = min(3, 4) = 3table[2] = [1, 1, 1, 1, 3, -, -, -] Row k=3 (length 8): Combine pairs from k=2table[3][0] = min(table[2][0], table[2][4]) = min(1, 3) = 1table[3] = [1, -, -, -, -, -, -, -]Step 3: Answer Queries
Now let's trace through several queries:
12345678910111213141516171819202122232425262728293031323334353637383940414243
Query 1: min([1, 5]) - indices 1 through 5 inclusive length = 5 - 1 + 1 = 5 k = log2[5] = 2 2^k = 4 Range 1: [1, 1+4-1] = [1, 4] → table[2][1] = 1 Range 2: [5-4+1, 5] = [2, 5] → table[2][2] = 1 Answer: min(1, 1) = 1 ✓ Verification: arr[1:6] = [2, 8, 1, 6, 3] → min = 1 ✓ Query 2: min([4, 7]) - indices 4 through 7 inclusive length = 7 - 4 + 1 = 4 k = log2[4] = 2 2^k = 4 Range 1: [4, 4+4-1] = [4, 7] → table[2][4] = 3 Range 2: [7-4+1, 7] = [4, 7] → table[2][4] = 3 Answer: min(3, 3) = 3 Verification: arr[4:8] = [6, 3, 9, 4] → min = 3 ✓ Query 3: min([2, 3]) - indices 2 through 3 inclusive length = 3 - 2 + 1 = 2 k = log2[2] = 1 2^k = 2 Range 1: [2, 2+2-1] = [2, 3] → table[1][2] = 1 Range 2: [3-2+1, 3] = [2, 3] → table[1][2] = 1 Answer: min(1, 1) = 1 Verification: arr[2:4] = [8, 1] → min = 1 ✓ Query 4: min([0, 0]) - single element length = 0 - 0 + 1 = 1 k = log2[1] = 0 2^k = 1 Range 1: [0, 0+1-1] = [0, 0] → table[0][0] = 5 Range 2: [0-1+1, 0] = [0, 0] → table[0][0] = 5 Answer: min(5, 5) = 5 Verification: arr[0] = 5 ✓Notice how in Query 1, the two ranges [1,4] and [2,5] overlap in [2,4]. For minimum, this overlap is harmless—the minimum of the overlapping portion is counted twice, but min(a, a) = a. This is the essence of idempotency, which we'll explore deeply in the next page.
The query formula deserves careful attention because it's the heart of the O(1) claim:
k = log2[right - left + 1]
result = min(table[k][left], table[k][right - (1 << k) + 1])
Let's break down each component:
Computing k:
right - left + 1 is the length of the query rangek = log2[length] gives the largest power k such that 2^k ≤ lengthFirst lookup: table[k][left]
Second lookup: table[k][right - (1 << k) + 1]
j = right - (1 << k) + 1 = right - 2^k + 1Why these ranges cover [left, right]:
1234567891011121314151617181920212223242526272829
Claim: The union of [left, left + 2^k - 1] and [right - 2^k + 1, right] equals [left, right]. Proof: Let length = right - left + 1 We know: 2^k ≤ length < 2^(k+1) Therefore: 2^k ≤ length, which means 2^k ≤ right - left + 1 Rearranging: left + 2^k - 1 ≤ right Similarly: 2^k ≤ right - left + 1 means right - 2^k + 1 ≥ left So: - Range 1: [left, left + 2^k - 1] where left + 2^k - 1 ≤ right - Range 2: [right - 2^k + 1, right] where right - 2^k + 1 ≥ left The ranges overlap because: - Range 1 ends at: left + 2^k - 1 - Range 2 starts at: right - 2^k + 1 For overlap, we need: right - 2^k + 1 ≤ left + 2^k - 1 Simplifying: right - left + 2 ≤ 2 × 2^k = 2^(k+1) Since length < 2^(k+1), we have right - left + 1 < 2^(k+1) So right - left < 2^(k+1) - 1, meaning right - left + 2 ≤ 2^(k+1) ✓ Union covers [left, right] because: - Range 1 covers left end - Range 2 covers right end - They overlap in between (guaranteed by math above)The inequality 2^k ≤ length < 2^(k+1) is crucial. It ensures that each 2^k-length range covers more than half the query range, guaranteeing overlap. If we used a smaller power, we'd need more than two ranges. If we used a larger power, the ranges would extend outside [left, right].
We've now fully understood the preprocessing mechanism that powers Sparse Tables. Let's consolidate the key insights:
| Operation | Time Complexity | Space Complexity |
|---|---|---|
| Build log₂ table | O(n) | O(n) |
| Build sparse table | O(n log n) | O(n log n) |
| Total preprocessing | O(n log n) | O(n log n) |
| Single query | O(1) | |
| Q queries (after build) | O(Q) |
What's next:
The query formula relies on overlapping ranges—but this only works because minimum is an idempotent operation. The next page explores idempotent operations in depth: what they are, why they enable overlap-based queries, and which common operations (minimum, maximum, GCD, bitwise AND/OR) work with Sparse Tables versus which don't (sum, product).
You now understand the complete preprocessing mechanism behind Sparse Tables. You can build the table, understand why it works, and trace through queries step by step. Next, we'll discover which operations are compatible with this overlapping-range approach through the lens of idempotency.