Data Structures & AlgorithmsSparse Tables

Sparse Tables — Immutable Range Queries

LevelAdvanced

Duration60 mins

TopicSparse Tables

2 / 4

Preprocessing for O(1) Queries

The Art of Strategic Preprocessing

In the previous page, we established that immutable data grants us unlimited preprocessing freedom. But freedom without strategy leads to waste—we could precompute everything and drown in O(n²) space, or precompute nothing and suffer O(n) queries. The genius of Sparse Tables lies in finding the perfect balance: precompute just enough to answer any query in O(1), while keeping space and preprocessing time at O(n log n).

This page unveils the preprocessing strategy that makes this possible. By the end, you'll understand exactly which ranges to precompute, how to organize them efficiently, and why this specific choice leads to constant-time queries. The mathematics is elegant, the implementation is straightforward, and the performance gains are substantial.

What You Will Learn

You will master the Sparse Table construction algorithm, understand why we precompute ranges of power-of-two lengths, build the complete table with O(n log n) time and space, implement the floor-log lookup table for efficient queries, and trace through detailed examples that cement your understanding.

The Power-of-Two Insight

The central insight behind Sparse Tables is deceptively simple: any integer can be represented as a sum of distinct powers of two. This is just the binary representation we use for numbers. But the implication for range queries is profound:

Any range of length L can be covered by at most ⌈log₂(L)⌉ non-overlapping ranges of power-of-two lengths.

However, for operations that handle overlap gracefully (minimum, maximum, GCD), we can do even better:

Any range can be covered by just two overlapping ranges of power-of-two length.

This means if we precompute answers for all ranges whose length is a power of two, we can answer any query by combining at most two precomputed results. For idempotent operations, this gives us O(1) queries.

The mathematical foundation:

Consider a query range [L, R] with length len = R - L + 1. Let k = ⌊log₂(len)⌋ be the largest power-of-two exponent that fits within this length. Then:

Range 1: [L, L + 2^k - 1] — a range of length 2^k starting at L
Range 2: [R - 2^k + 1, R] — a range of length 2^k ending at R

These two ranges overlap in the middle but together cover exactly [L, R]. For minimum/maximum, overlap doesn't matter—the minimum of overlapping regions is still correct.

power_of_two_covering.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
# Example: Query range [2, 9] with length 8
# Array indices: 0  1  2  3  4  5  6  7  8  9  10
# Query range:         [---------------]
 
# Length = 9 - 2 + 1 = 8
# k = floor(log2(8)) = 3
# 2^k = 8
 
# Range 1: [2, 2 + 8 - 1] = [2, 9]  ← starts at L
# Range 2: [9 - 8 + 1, 9] = [2, 9]  ← ends at R
 
# In this case, the ranges are identical! (length equals power of 2)
 
# Example: Query range [2, 10] with length 9
# Length = 10 - 2 + 1 = 9
# k = floor(log2(9)) = 3
# 2^k = 8
 
# Range 1: [2, 2 + 8 - 1] = [2, 9]   ← covers left 8 elements
# Range 2: [10 - 8 + 1, 10] = [3, 10] ← covers right 8 elements
#
#         [2, 9]:     [-------]
#         [3, 10]:      [-------]
#         Overlap:      [-----]
#         Union:      [---------]  ← exactly [2, 10] ✓
 
# For min/max, we just take: min(min[2,9], min[3,10])
# The overlap is harmless for idempotent operations!

Why Two Ranges Always Suffice

For any length L, 2^⌊log₂(L)⌋ ≥ L/2. This means a power-of-two range covers at least half the query range. Two such ranges (one from each end) must therefore overlap in the middle, ensuring complete coverage. This geometric argument guarantees O(1) queries for any range.

The Sparse Table Structure

A Sparse Table is a 2D array table[k][i] where:

k represents the exponent: ranges of length 2^k
i represents the starting index
table[k][i] stores the answer (minimum, maximum, GCD, etc.) for the range starting at index i with length 2^k

In formal notation:

table[k][i] = operation(arr[i], arr[i+1], ..., arr[i + 2^k - 1])

Dimensions of the table:

Number of rows: K = ⌊log₂(n)⌋ + 1 (we need exponents from 0 to ⌊log₂(n)⌋)
Number of columns: n (one for each starting position)
Total cells: approximately n × log₂(n)

Example visualization:

For an array arr = [1, 3, 2, 7, 9, 11, 3, 5] with n = 8:

Sparse Table for Range Minimum Query (n=8)
k (power)	Length 2^k	table[k][0]	table[k][1]	table[k][2]	table[k][3]	table[k][4]	table[k][5]	table[k][6]	table[k][7]
0	1	1	3	2	7	9	11	3	5
1	2	1	2	2	7	9	3	3
2	4	1	2	2	3	3
3	8	1

Reading the table:

table[0][i] = arr[i] (ranges of length 1 are just individual elements)
table[1][0] = min(arr[0], arr[1]) = min(1, 3) = 1 (range [0, 1])
table[1][2] = min(arr[2], arr[3]) = min(2, 7) = 2 (range [2, 3])
table[2][0] = min(arr[0..3]) = min(1, 3, 2, 7) = 1 (range [0, 3])
table[3][0] = min(arr[0..7]) = 1 (entire array)

Why some cells are '-' (undefined):

For table[k][i], the range [i, i + 2^k - 1] must fit within the array. If i + 2^k > n, the cell is invalid and we don't compute it. This is why higher rows have fewer valid entries.

Space Analysis

The table has K ≈ log₂(n) rows, but not all columns are valid in each row. Row k has approximately n - 2^k + 1 valid entries. Summing across all rows gives O(n log n) total cells. For n = 100,000, this is about 1.7 million entries—very manageable.

The Build Algorithm: Dynamic Programming Elegance

Building the Sparse Table is a beautiful application of dynamic programming. The key recurrence relation exploits the fact that a range of length 2^k can be split into two consecutive ranges of length 2^(k-1):

table[k][i] = combine(table[k-1][i], table[k-1][i + 2^(k-1)])

For minimum:

table[k][i] = min(table[k-1][i], table[k-1][i + 2^(k-1)])

Visual intuition:

The range [i, i + 2^k - 1] of length 2^k can be split into:

Left half: [i, i + 2^(k-1) - 1] of length 2^(k-1)
Right half: [i + 2^(k-1), i + 2^k - 1] of length 2^(k-1)

Since we compute rows in order of increasing k, when we compute row k, row k-1 is already complete. This is classic dynamic programming—larger subproblems depend on smaller subproblems.

sparse_table_build.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
from typing import List
import math
 
def build_sparse_table_rmq(arr: List[int]) -> List[List[int]]:
    """
    Build a Sparse Table for Range Minimum Query.
    
    Time Complexity: O(n log n)
    Space Complexity: O(n log n)
    
    Returns a 2D table where table[k][i] = min(arr[i:i+2^k])
    """
    n = len(arr)
    if n == 0:
        return []
    
    # Calculate the number of rows needed
    # We need k values from 0 to floor(log2(n))
    K = int(math.log2(n)) + 1
    
    # Initialize the table
    # table[k][i] = minimum of range starting at i, length 2^k
    table = [[0] * n for _ in range(K)]
    
    # Base case: ranges of length 1 (k = 0)
    # Each element is the minimum of its single-element range
    for i in range(n):
        table[0][i] = arr[i]
    
    # Fill the table using dynamic programming
    # For each power k from 1 to K-1
    for k in range(1, K):
        # Length of ranges at this level
        length = 1 << k  # 2^k
        
        # For each valid starting position
        # Range [i, i + 2^k - 1] must fit in array
        for i in range(n - length + 1):
            # Combine two ranges of length 2^(k-1)
            # Left half: table[k-1][i]
            # Right half: table[k-1][i + 2^(k-1)]
            left_half = table[k - 1][i]
            right_half = table[k - 1][i + (1 << (k - 1))]
            table[k][i] = min(left_half, right_half)
    
    return table
 
# Example walkthrough
arr = [1, 3, 2, 7, 9, 11, 3, 5]
table = build_sparse_table_rmq(arr)
 
# Verify table[2][1] = min of arr[1:5] = min(3, 2, 7, 9) = 2
# table[2][1] = min(table[1][1], table[1][3])
#             = min(min(3,2), min(7,9))
#             = min(2, 7) = 2 ✓

Bit Shifting for Powers of Two

Throughout Sparse Table implementations, you'll see 1 << k instead of 2**k or Math.pow(2, k). Bit shifting is faster (a single CPU instruction) and makes the code's intent clear: we're working with powers of two. This idiom is universal in competitive programming.

The Log₂ Lookup Table: Essential for O(1) Queries

For O(1) queries, we need to compute ⌊log₂(len)⌋ for the query range length. While programming languages provide log functions, these involve floating-point operations and can be slow. More importantly, floating-point precision issues can cause subtle bugs.

The solution: precompute a lookup table of floor(log₂(i)) for all i from 1 to n. This adds O(n) to preprocessing but makes each query's log computation a single array lookup.

Building the log₂ table:

The key insight is that ⌊log₂(i)⌋ = ⌊log₂(i/2)⌋ + 1 for i > 1. This gives us a simple linear-time algorithm:

log2_lookup.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
def build_log2_table(max_n: int) -> List[int]:
    """
    Build a lookup table for floor(log2(i)) for all i from 1 to max_n.
    
    Time Complexity: O(n)
    Space Complexity: O(n)
    
    log2[i] = floor(log2(i))
    """
    log2 = [0] * (max_n + 1)
    
    # log2[1] = 0 (2^0 = 1)
    # For i > 1, use the recurrence: log2[i] = log2[i/2] + 1
    for i in range(2, max_n + 1):
        log2[i] = log2[i // 2] + 1
    
    return log2
 
# Example: build log2 table for n = 20
log2 = build_log2_table(20)
 
# Verify:
# log2[1] = 0  (2^0 = 1)
# log2[2] = 1  (2^1 = 2)
# log2[3] = 1  (2^1 = 2 <= 3 < 4 = 2^2)
# log2[4] = 2  (2^2 = 4)
# log2[5] = 2  (2^2 = 4 <= 5 < 8 = 2^3)
# log2[8] = 3  (2^3 = 8)
# log2[16] = 4 (2^4 = 16)
 
print(log2)
# [0, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4]

Why is this recurrence correct?

For any positive integer i:

If i is a power of 2: log₂(i) = log₂(i/2) + 1 (exactly doubling doubles the exponent)
If i is not a power of 2: ⌊log₂(i)⌋ = ⌊log₂(⌊i/2⌋)⌋ + 1 (halving reduces the exponent by 1 until we hit the nearest power of 2)

Integer division handles both cases correctly, giving us a simple, elegant algorithm with no floating-point concerns.

Avoid Floating-Point log2

Using math.log2() or Math.log2() for floor(log2()) can cause bugs due to floating-point precision. For example, log2(8) might return 2.9999999999 due to floating-point representation, and floor() would give 2 instead of 3. The integer-based lookup table avoids all such issues.

Complete Sparse Table Build Implementation

Now let's combine everything into a complete, production-ready Sparse Table implementation. This includes both the log₂ lookup table and the sparse table itself, encapsulated in a clean class structure.

sparse_table_complete.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
from typing import List, Tuple, Callable
import math
 
class SparseTableRMQ:
    """
    Sparse Table for Range Minimum Query (RMQ).
    
    Preprocessing: O(n log n) time and space
    Query: O(1) time
    """
    
    def __init__(self, arr: List[int]):
        """
        Build the sparse table from the input array.
        
        Args:
            arr: The input array (will not be modified)
        """
        self.n = len(arr)
        if self.n == 0:
            self.table = []
            self.log2 = []
            return
        
        # Build log2 lookup table
        self.log2 = self._build_log2_table()
        
        # Build the sparse table
        self.table = self._build_sparse_table(arr)
    
    def _build_log2_table(self) -> List[int]:
        """Build floor(log2) lookup table."""
        log2 = [0] * (self.n + 1)
        for i in range(2, self.n + 1):
            log2[i] = log2[i // 2] + 1
        return log2
    
    def _build_sparse_table(self, arr: List[int]) -> List[List[int]]:
        """Build the sparse table using dynamic programming."""
        K = self.log2[self.n] + 1
        table = [[0] * self.n for _ in range(K)]
        
        # Base case: k = 0
        for i in range(self.n):
            table[0][i] = arr[i]
        
        # DP: build each level from the previous
        for k in range(1, K):
            for i in range(self.n - (1 << k) + 1):
                table[k][i] = min(
                    table[k - 1][i],
                    table[k - 1][i + (1 << (k - 1))]
                )
        
        return table
    
    def query(self, left: int, right: int) -> int:
        """
        Query the minimum value in range [left, right] (inclusive).
        
        Args:
            left: Left index (0-based)
            right: Right index (0-based, inclusive)
        
        Returns:
            Minimum value in arr[left:right+1]
        
        Time Complexity: O(1)
        """
        length = right - left + 1
        k = self.log2[length]
        
        # Two overlapping ranges of length 2^k
        return min(
            self.table[k][left],          # [left, left + 2^k - 1]
            self.table[k][right - (1 << k) + 1]  # [right - 2^k + 1, right]
        )
 
# Demonstration
if __name__ == "__main__":
    arr = [1, 3, 2, 7, 9, 11, 3, 5]
    st = SparseTableRMQ(arr)
    
    # Test queries
    print(f"min([0, 7]) = {st.query(0, 7)}")  # Expected: 1
    print(f"min([2, 4]) = {st.query(2, 4)}")  # Expected: 2
    print(f"min([5, 7]) = {st.query(5, 7)}")  # Expected: 3
    print(f"min([3, 3]) = {st.query(3, 3)}")  # Expected: 7 (single element)

Detailed Trace-Through: Building and Querying

Let's trace through the entire process with a concrete example to solidify understanding. We'll use the array [5, 2, 8, 1, 6, 3, 9, 4] and build a Range Minimum Query Sparse Table.

Step 1: Initialize log₂ table

trace_log2.txt

Text

n = 8
 
Building log2 table:
i = 1: log2[1] = 0 (base case)
i = 2: log2[2] = log2[1] + 1 = 0 + 1 = 1
i = 3: log2[3] = log2[1] + 1 = 0 + 1 = 1
i = 4: log2[4] = log2[2] + 1 = 1 + 1 = 2
i = 5: log2[5] = log2[2] + 1 = 1 + 1 = 2
i = 6: log2[6] = log2[3] + 1 = 1 + 1 = 2
i = 7: log2[7] = log2[3] + 1 = 1 + 1 = 2
i = 8: log2[8] = log2[4] + 1 = 2 + 1 = 3
 
log2 = [0, 0, 1, 1, 2, 2, 2, 2, 3]
        ↑  ↑  ↑  ↑  ↑  ↑  ↑  ↑  ↑
        0  1  2  3  4  5  6  7  8

Step 2: Build Sparse Table

Number of rows K = log2[8] + 1 = 3 + 1 = 4

trace_build.txt

Text

Array: [5, 2, 8, 1, 6, 3, 9, 4]
        Index: 0  1  2  3  4  5  6  7
 
Row k=0 (length 1): Just copy the array
table[0] = [5, 2, 8, 1, 6, 3, 9, 4]
 
Row k=1 (length 2): Combine pairs from k=0
table[1][0] = min(table[0][0], table[0][1]) = min(5, 2) = 2
table[1][1] = min(table[0][1], table[0][2]) = min(2, 8) = 2
table[1][2] = min(table[0][2], table[0][3]) = min(8, 1) = 1
table[1][3] = min(table[0][3], table[0][4]) = min(1, 6) = 1
table[1][4] = min(table[0][4], table[0][5]) = min(6, 3) = 3
table[1][5] = min(table[0][5], table[0][6]) = min(3, 9) = 3
table[1][6] = min(table[0][6], table[0][7]) = min(9, 4) = 4
table[1] = [2, 2, 1, 1, 3, 3, 4, -]
 
Row k=2 (length 4): Combine pairs from k=1
table[2][0] = min(table[1][0], table[1][2]) = min(2, 1) = 1
table[2][1] = min(table[1][1], table[1][3]) = min(2, 1) = 1
table[2][2] = min(table[1][2], table[1][4]) = min(1, 3) = 1
table[2][3] = min(table[1][3], table[1][5]) = min(1, 3) = 1
table[2][4] = min(table[1][4], table[1][6]) = min(3, 4) = 3
table[2] = [1, 1, 1, 1, 3, -, -, -]
 
Row k=3 (length 8): Combine pairs from k=2
table[3][0] = min(table[2][0], table[2][4]) = min(1, 3) = 1
table[3] = [1, -, -, -, -, -, -, -]

Step 3: Answer Queries

Now let's trace through several queries:

trace_queries.txt

Text

Query 1: min([1, 5]) - indices 1 through 5 inclusive
  length = 5 - 1 + 1 = 5
  k = log2[5] = 2
  2^k = 4
  
  Range 1: [1, 1+4-1] = [1, 4] → table[2][1] = 1
  Range 2: [5-4+1, 5] = [2, 5] → table[2][2] = 1
  
  Answer: min(1, 1) = 1 ✓
  Verification: arr[1:6] = [2, 8, 1, 6, 3] → min = 1 ✓
 
Query 2: min([4, 7]) - indices 4 through 7 inclusive
  length = 7 - 4 + 1 = 4
  k = log2[4] = 2
  2^k = 4
  
  Range 1: [4, 4+4-1] = [4, 7] → table[2][4] = 3
  Range 2: [7-4+1, 7] = [4, 7] → table[2][4] = 3
  
  Answer: min(3, 3) = 3
  Verification: arr[4:8] = [6, 3, 9, 4] → min = 3 ✓
 
Query 3: min([2, 3]) - indices 2 through 3 inclusive
  length = 3 - 2 + 1 = 2
  k = log2[2] = 1
  2^k = 2
  
  Range 1: [2, 2+2-1] = [2, 3] → table[1][2] = 1
  Range 2: [3-2+1, 3] = [2, 3] → table[1][2] = 1
  
  Answer: min(1, 1) = 1
  Verification: arr[2:4] = [8, 1] → min = 1 ✓
 
Query 4: min([0, 0]) - single element
  length = 0 - 0 + 1 = 1
  k = log2[1] = 0
  2^k = 1
  
  Range 1: [0, 0+1-1] = [0, 0] → table[0][0] = 5
  Range 2: [0-1+1, 0] = [0, 0] → table[0][0] = 5
  
  Answer: min(5, 5) = 5
  Verification: arr[0] = 5 ✓

The Overlap Magic

Notice how in Query 1, the two ranges [1,4] and [2,5] overlap in [2,4]. For minimum, this overlap is harmless—the minimum of the overlapping portion is counted twice, but min(a, a) = a. This is the essence of idempotency, which we'll explore deeply in the next page.

Deep Dive: The Query Formula

The query formula deserves careful attention because it's the heart of the O(1) claim:

k = log2[right - left + 1]
result = min(table[k][left], table[k][right - (1 << k) + 1])

Let's break down each component:

Computing k:

right - left + 1 is the length of the query range
k = log2[length] gives the largest power k such that 2^k ≤ length
This means 2^k ≤ length < 2^(k+1)

First lookup: table[k][left]

This retrieves the minimum of [left, left + 2^k - 1]
This range has length exactly 2^k
It starts at the left edge of the query range

Second lookup: table[k][right - (1 << k) + 1]

Let's call the starting index j = right - (1 << k) + 1 = right - 2^k + 1
This retrieves the minimum of [j, j + 2^k - 1]
But j + 2^k - 1 = (right - 2^k + 1) + 2^k - 1 = right
So this range is [right - 2^k + 1, right]
It has length exactly 2^k and ends at the right edge of the query range

Why these ranges cover [left, right]:

coverage_proof.txt

Text

Claim: The union of [left, left + 2^k - 1] and [right - 2^k + 1, right] 
       equals [left, right].
 
Proof:
  Let length = right - left + 1
  We know: 2^k ≤ length < 2^(k+1)
  
  Therefore: 2^k ≤ length, which means 2^k ≤ right - left + 1
  Rearranging: left + 2^k - 1 ≤ right
  
  Similarly: 2^k ≤ right - left + 1 means right - 2^k + 1 ≥ left
  
  So:
  - Range 1: [left, left + 2^k - 1] where left + 2^k - 1 ≤ right
  - Range 2: [right - 2^k + 1, right] where right - 2^k + 1 ≥ left
  
  The ranges overlap because:
  - Range 1 ends at: left + 2^k - 1
  - Range 2 starts at: right - 2^k + 1
  
  For overlap, we need: right - 2^k + 1 ≤ left + 2^k - 1
  Simplifying: right - left + 2 ≤ 2 × 2^k = 2^(k+1)
  Since length < 2^(k+1), we have right - left + 1 < 2^(k+1)
  So right - left < 2^(k+1) - 1, meaning right - left + 2 ≤ 2^(k+1) ✓
  
  Union covers [left, right] because:
  - Range 1 covers left end
  - Range 2 covers right end
  - They overlap in between (guaranteed by math above)

The Tight Bound

The inequality 2^k ≤ length < 2^(k+1) is crucial. It ensures that each 2^k-length range covers more than half the query range, guaranteeing overlap. If we used a smaller power, we'd need more than two ranges. If we used a larger power, the ranges would extend outside [left, right].

Summary: The Preprocessing Power

We've now fully understood the preprocessing mechanism that powers Sparse Tables. Let's consolidate the key insights:

Key Takeaways

•Power-of-two ranges are key — By precomputing answers only for ranges of length 1, 2, 4, 8, ..., we limit preprocessing to O(n log n) while enabling O(1) queries.
•Two overlapping ranges cover any query — Any range can be covered by two power-of-two ranges from each end that meet (and overlap) in the middle.
•Dynamic programming builds the table — Each row depends only on the previous row: table[k][i] = min(table[k-1][i], table[k-1][i + 2^(k-1)]).
•Log₂ lookup table avoids floating-point — Precomputing floor(log₂(i)) ensures fast, correct exponent lookup without precision issues.
•The query formula is simple and constant-time — Just two table lookups and a min operation, regardless of range size.
•Trade-off is clear — O(n log n) preprocessing and space for O(1) queries, optimal when queries dominate.

Sparse Table Complexity Summary
Operation	Time Complexity	Space Complexity
Build log₂ table	O(n)	O(n)
Build sparse table	O(n log n)	O(n log n)
Total preprocessing	O(n log n)	O(n log n)
Single query	O(1)
Q queries (after build)	O(Q)

What's next:

The query formula relies on overlapping ranges—but this only works because minimum is an idempotent operation. The next page explores idempotent operations in depth: what they are, why they enable overlap-based queries, and which common operations (minimum, maximum, GCD, bitwise AND/OR) work with Sparse Tables versus which don't (sum, product).

Page Complete

You now understand the complete preprocessing mechanism behind Sparse Tables. You can build the table, understand why it works, and trace through queries step by step. Next, we'll discover which operations are compatible with this overlapping-range approach through the lens of idempotency.

2 / 4

Loading learning content...

Data Structures & AlgorithmsSparse Tables

Sparse Tables — Immutable Range Queries

LevelAdvanced

Duration60 mins

TopicSparse Tables

2 / 4

Preprocessing for O(1) Queries

The Art of Strategic Preprocessing

What You Will Learn

The Power-of-Two Insight

Any range of length L can be covered by at most ⌈log₂(L)⌉ non-overlapping ranges of power-of-two lengths.

However, for operations that handle overlap gracefully (minimum, maximum, GCD), we can do even better:

Any range can be covered by just two overlapping ranges of power-of-two length.

The mathematical foundation:

Consider a query range [L, R] with length len = R - L + 1. Let k = ⌊log₂(len)⌋ be the largest power-of-two exponent that fits within this length. Then:

Range 1: [L, L + 2^k - 1] — a range of length 2^k starting at L
Range 2: [R - 2^k + 1, R] — a range of length 2^k ending at R

These two ranges overlap in the middle but together cover exactly [L, R]. For minimum/maximum, overlap doesn't matter—the minimum of overlapping regions is still correct.

power_of_two_covering.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
# Example: Query range [2, 9] with length 8
# Array indices: 0  1  2  3  4  5  6  7  8  9  10
# Query range:         [---------------]
 
# Length = 9 - 2 + 1 = 8
# k = floor(log2(8)) = 3
# 2^k = 8
 
# Range 1: [2, 2 + 8 - 1] = [2, 9]  ← starts at L
# Range 2: [9 - 8 + 1, 9] = [2, 9]  ← ends at R
 
# In this case, the ranges are identical! (length equals power of 2)
 
# Example: Query range [2, 10] with length 9
# Length = 10 - 2 + 1 = 9
# k = floor(log2(9)) = 3
# 2^k = 8
 
# Range 1: [2, 2 + 8 - 1] = [2, 9]   ← covers left 8 elements
# Range 2: [10 - 8 + 1, 10] = [3, 10] ← covers right 8 elements
#
#         [2, 9]:     [-------]
#         [3, 10]:      [-------]
#         Overlap:      [-----]
#         Union:      [---------]  ← exactly [2, 10] ✓
 
# For min/max, we just take: min(min[2,9], min[3,10])
# The overlap is harmless for idempotent operations!

Why Two Ranges Always Suffice

The Sparse Table Structure

A Sparse Table is a 2D array table[k][i] where:

k represents the exponent: ranges of length 2^k
i represents the starting index
table[k][i] stores the answer (minimum, maximum, GCD, etc.) for the range starting at index i with length 2^k

In formal notation:

table[k][i] = operation(arr[i], arr[i+1], ..., arr[i + 2^k - 1])

Dimensions of the table:

Number of rows: K = ⌊log₂(n)⌋ + 1 (we need exponents from 0 to ⌊log₂(n)⌋)
Number of columns: n (one for each starting position)
Total cells: approximately n × log₂(n)

Example visualization:

For an array arr = [1, 3, 2, 7, 9, 11, 3, 5] with n = 8:

Sparse Table for Range Minimum Query (n=8)
k (power)	Length 2^k	table[k][0]	table[k][1]	table[k][2]	table[k][3]	table[k][4]	table[k][5]	table[k][6]	table[k][7]
0	1	1	3	2	7	9	11	3	5
1	2	1	2	2	7	9	3	3
2	4	1	2	2	3	3
3	8	1

Reading the table:

table[0][i] = arr[i] (ranges of length 1 are just individual elements)
table[1][0] = min(arr[0], arr[1]) = min(1, 3) = 1 (range [0, 1])
table[1][2] = min(arr[2], arr[3]) = min(2, 7) = 2 (range [2, 3])
table[2][0] = min(arr[0..3]) = min(1, 3, 2, 7) = 1 (range [0, 3])
table[3][0] = min(arr[0..7]) = 1 (entire array)

Why some cells are '-' (undefined):

For table[k][i], the range [i, i + 2^k - 1] must fit within the array. If i + 2^k > n, the cell is invalid and we don't compute it. This is why higher rows have fewer valid entries.

Space Analysis

The Build Algorithm: Dynamic Programming Elegance

table[k][i] = combine(table[k-1][i], table[k-1][i + 2^(k-1)])

For minimum:

table[k][i] = min(table[k-1][i], table[k-1][i + 2^(k-1)])

Visual intuition:

The range [i, i + 2^k - 1] of length 2^k can be split into:

Left half: [i, i + 2^(k-1) - 1] of length 2^(k-1)
Right half: [i + 2^(k-1), i + 2^k - 1] of length 2^(k-1)

Since we compute rows in order of increasing k, when we compute row k, row k-1 is already complete. This is classic dynamic programming—larger subproblems depend on smaller subproblems.

sparse_table_build.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
from typing import List
import math
 
def build_sparse_table_rmq(arr: List[int]) -> List[List[int]]:
    """
    Build a Sparse Table for Range Minimum Query.
    
    Time Complexity: O(n log n)
    Space Complexity: O(n log n)
    
    Returns a 2D table where table[k][i] = min(arr[i:i+2^k])
    """
    n = len(arr)
    if n == 0:
        return []
    
    # Calculate the number of rows needed
    # We need k values from 0 to floor(log2(n))
    K = int(math.log2(n)) + 1
    
    # Initialize the table
    # table[k][i] = minimum of range starting at i, length 2^k
    table = [[0] * n for _ in range(K)]
    
    # Base case: ranges of length 1 (k = 0)
    # Each element is the minimum of its single-element range
    for i in range(n):
        table[0][i] = arr[i]
    
    # Fill the table using dynamic programming
    # For each power k from 1 to K-1
    for k in range(1, K):
        # Length of ranges at this level
        length = 1 << k  # 2^k
        
        # For each valid starting position
        # Range [i, i + 2^k - 1] must fit in array
        for i in range(n - length + 1):
            # Combine two ranges of length 2^(k-1)
            # Left half: table[k-1][i]
            # Right half: table[k-1][i + 2^(k-1)]
            left_half = table[k - 1][i]
            right_half = table[k - 1][i + (1 << (k - 1))]
            table[k][i] = min(left_half, right_half)
    
    return table
 
# Example walkthrough
arr = [1, 3, 2, 7, 9, 11, 3, 5]
table = build_sparse_table_rmq(arr)
 
# Verify table[2][1] = min of arr[1:5] = min(3, 2, 7, 9) = 2
# table[2][1] = min(table[1][1], table[1][3])
#             = min(min(3,2), min(7,9))
#             = min(2, 7) = 2 ✓

Bit Shifting for Powers of Two

The Log₂ Lookup Table: Essential for O(1) Queries

The solution: precompute a lookup table of floor(log₂(i)) for all i from 1 to n. This adds O(n) to preprocessing but makes each query's log computation a single array lookup.

Building the log₂ table:

The key insight is that ⌊log₂(i)⌋ = ⌊log₂(i/2)⌋ + 1 for i > 1. This gives us a simple linear-time algorithm:

log2_lookup.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
def build_log2_table(max_n: int) -> List[int]:
    """
    Build a lookup table for floor(log2(i)) for all i from 1 to max_n.
    
    Time Complexity: O(n)
    Space Complexity: O(n)
    
    log2[i] = floor(log2(i))
    """
    log2 = [0] * (max_n + 1)
    
    # log2[1] = 0 (2^0 = 1)
    # For i > 1, use the recurrence: log2[i] = log2[i/2] + 1
    for i in range(2, max_n + 1):
        log2[i] = log2[i // 2] + 1
    
    return log2
 
# Example: build log2 table for n = 20
log2 = build_log2_table(20)
 
# Verify:
# log2[1] = 0  (2^0 = 1)
# log2[2] = 1  (2^1 = 2)
# log2[3] = 1  (2^1 = 2 <= 3 < 4 = 2^2)
# log2[4] = 2  (2^2 = 4)
# log2[5] = 2  (2^2 = 4 <= 5 < 8 = 2^3)
# log2[8] = 3  (2^3 = 8)
# log2[16] = 4 (2^4 = 16)
 
print(log2)
# [0, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4]

Why is this recurrence correct?

For any positive integer i:

If i is a power of 2: log₂(i) = log₂(i/2) + 1 (exactly doubling doubles the exponent)
If i is not a power of 2: ⌊log₂(i)⌋ = ⌊log₂(⌊i/2⌋)⌋ + 1 (halving reduces the exponent by 1 until we hit the nearest power of 2)

Integer division handles both cases correctly, giving us a simple, elegant algorithm with no floating-point concerns.

Avoid Floating-Point log2

Complete Sparse Table Build Implementation

sparse_table_complete.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
from typing import List, Tuple, Callable
import math
 
class SparseTableRMQ:
    """
    Sparse Table for Range Minimum Query (RMQ).
    
    Preprocessing: O(n log n) time and space
    Query: O(1) time
    """
    
    def __init__(self, arr: List[int]):
        """
        Build the sparse table from the input array.
        
        Args:
            arr: The input array (will not be modified)
        """
        self.n = len(arr)
        if self.n == 0:
            self.table = []
            self.log2 = []
            return
        
        # Build log2 lookup table
        self.log2 = self._build_log2_table()
        
        # Build the sparse table
        self.table = self._build_sparse_table(arr)
    
    def _build_log2_table(self) -> List[int]:
        """Build floor(log2) lookup table."""
        log2 = [0] * (self.n + 1)
        for i in range(2, self.n + 1):
            log2[i] = log2[i // 2] + 1
        return log2
    
    def _build_sparse_table(self, arr: List[int]) -> List[List[int]]:
        """Build the sparse table using dynamic programming."""
        K = self.log2[self.n] + 1
        table = [[0] * self.n for _ in range(K)]
        
        # Base case: k = 0
        for i in range(self.n):
            table[0][i] = arr[i]
        
        # DP: build each level from the previous
        for k in range(1, K):
            for i in range(self.n - (1 << k) + 1):
                table[k][i] = min(
                    table[k - 1][i],
                    table[k - 1][i + (1 << (k - 1))]
                )
        
        return table
    
    def query(self, left: int, right: int) -> int:
        """
        Query the minimum value in range [left, right] (inclusive).
        
        Args:
            left: Left index (0-based)
            right: Right index (0-based, inclusive)
        
        Returns:
            Minimum value in arr[left:right+1]
        
        Time Complexity: O(1)
        """
        length = right - left + 1
        k = self.log2[length]
        
        # Two overlapping ranges of length 2^k
        return min(
            self.table[k][left],          # [left, left + 2^k - 1]
            self.table[k][right - (1 << k) + 1]  # [right - 2^k + 1, right]
        )
 
# Demonstration
if __name__ == "__main__":
    arr = [1, 3, 2, 7, 9, 11, 3, 5]
    st = SparseTableRMQ(arr)
    
    # Test queries
    print(f"min([0, 7]) = {st.query(0, 7)}")  # Expected: 1
    print(f"min([2, 4]) = {st.query(2, 4)}")  # Expected: 2
    print(f"min([5, 7]) = {st.query(5, 7)}")  # Expected: 3
    print(f"min([3, 3]) = {st.query(3, 3)}")  # Expected: 7 (single element)

Detailed Trace-Through: Building and Querying

Let's trace through the entire process with a concrete example to solidify understanding. We'll use the array [5, 2, 8, 1, 6, 3, 9, 4] and build a Range Minimum Query Sparse Table.

Step 1: Initialize log₂ table

trace_log2.txt

Text

n = 8
 
Building log2 table:
i = 1: log2[1] = 0 (base case)
i = 2: log2[2] = log2[1] + 1 = 0 + 1 = 1
i = 3: log2[3] = log2[1] + 1 = 0 + 1 = 1
i = 4: log2[4] = log2[2] + 1 = 1 + 1 = 2
i = 5: log2[5] = log2[2] + 1 = 1 + 1 = 2
i = 6: log2[6] = log2[3] + 1 = 1 + 1 = 2
i = 7: log2[7] = log2[3] + 1 = 1 + 1 = 2
i = 8: log2[8] = log2[4] + 1 = 2 + 1 = 3
 
log2 = [0, 0, 1, 1, 2, 2, 2, 2, 3]
        ↑  ↑  ↑  ↑  ↑  ↑  ↑  ↑  ↑
        0  1  2  3  4  5  6  7  8

Step 2: Build Sparse Table

Number of rows K = log2[8] + 1 = 3 + 1 = 4

trace_build.txt

Text

Array: [5, 2, 8, 1, 6, 3, 9, 4]
        Index: 0  1  2  3  4  5  6  7
 
Row k=0 (length 1): Just copy the array
table[0] = [5, 2, 8, 1, 6, 3, 9, 4]
 
Row k=1 (length 2): Combine pairs from k=0
table[1][0] = min(table[0][0], table[0][1]) = min(5, 2) = 2
table[1][1] = min(table[0][1], table[0][2]) = min(2, 8) = 2
table[1][2] = min(table[0][2], table[0][3]) = min(8, 1) = 1
table[1][3] = min(table[0][3], table[0][4]) = min(1, 6) = 1
table[1][4] = min(table[0][4], table[0][5]) = min(6, 3) = 3
table[1][5] = min(table[0][5], table[0][6]) = min(3, 9) = 3
table[1][6] = min(table[0][6], table[0][7]) = min(9, 4) = 4
table[1] = [2, 2, 1, 1, 3, 3, 4, -]
 
Row k=2 (length 4): Combine pairs from k=1
table[2][0] = min(table[1][0], table[1][2]) = min(2, 1) = 1
table[2][1] = min(table[1][1], table[1][3]) = min(2, 1) = 1
table[2][2] = min(table[1][2], table[1][4]) = min(1, 3) = 1
table[2][3] = min(table[1][3], table[1][5]) = min(1, 3) = 1
table[2][4] = min(table[1][4], table[1][6]) = min(3, 4) = 3
table[2] = [1, 1, 1, 1, 3, -, -, -]
 
Row k=3 (length 8): Combine pairs from k=2
table[3][0] = min(table[2][0], table[2][4]) = min(1, 3) = 1
table[3] = [1, -, -, -, -, -, -, -]

Step 3: Answer Queries

Now let's trace through several queries:

trace_queries.txt

Text

Query 1: min([1, 5]) - indices 1 through 5 inclusive
  length = 5 - 1 + 1 = 5
  k = log2[5] = 2
  2^k = 4
  
  Range 1: [1, 1+4-1] = [1, 4] → table[2][1] = 1
  Range 2: [5-4+1, 5] = [2, 5] → table[2][2] = 1
  
  Answer: min(1, 1) = 1 ✓
  Verification: arr[1:6] = [2, 8, 1, 6, 3] → min = 1 ✓
 
Query 2: min([4, 7]) - indices 4 through 7 inclusive
  length = 7 - 4 + 1 = 4
  k = log2[4] = 2
  2^k = 4
  
  Range 1: [4, 4+4-1] = [4, 7] → table[2][4] = 3
  Range 2: [7-4+1, 7] = [4, 7] → table[2][4] = 3
  
  Answer: min(3, 3) = 3
  Verification: arr[4:8] = [6, 3, 9, 4] → min = 3 ✓
 
Query 3: min([2, 3]) - indices 2 through 3 inclusive
  length = 3 - 2 + 1 = 2
  k = log2[2] = 1
  2^k = 2
  
  Range 1: [2, 2+2-1] = [2, 3] → table[1][2] = 1
  Range 2: [3-2+1, 3] = [2, 3] → table[1][2] = 1
  
  Answer: min(1, 1) = 1
  Verification: arr[2:4] = [8, 1] → min = 1 ✓
 
Query 4: min([0, 0]) - single element
  length = 0 - 0 + 1 = 1
  k = log2[1] = 0
  2^k = 1
  
  Range 1: [0, 0+1-1] = [0, 0] → table[0][0] = 5
  Range 2: [0-1+1, 0] = [0, 0] → table[0][0] = 5
  
  Answer: min(5, 5) = 5
  Verification: arr[0] = 5 ✓

The Overlap Magic

Deep Dive: The Query Formula

The query formula deserves careful attention because it's the heart of the O(1) claim:

k = log2[right - left + 1]
result = min(table[k][left], table[k][right - (1 << k) + 1])

Let's break down each component:

Computing k:

right - left + 1 is the length of the query range
k = log2[length] gives the largest power k such that 2^k ≤ length
This means 2^k ≤ length < 2^(k+1)

First lookup: table[k][left]

This retrieves the minimum of [left, left + 2^k - 1]
This range has length exactly 2^k
It starts at the left edge of the query range

Second lookup: table[k][right - (1 << k) + 1]

Let's call the starting index j = right - (1 << k) + 1 = right - 2^k + 1
This retrieves the minimum of [j, j + 2^k - 1]
But j + 2^k - 1 = (right - 2^k + 1) + 2^k - 1 = right
So this range is [right - 2^k + 1, right]
It has length exactly 2^k and ends at the right edge of the query range

Why these ranges cover [left, right]:

coverage_proof.txt

Text

Claim: The union of [left, left + 2^k - 1] and [right - 2^k + 1, right] 
       equals [left, right].
 
Proof:
  Let length = right - left + 1
  We know: 2^k ≤ length < 2^(k+1)
  
  Therefore: 2^k ≤ length, which means 2^k ≤ right - left + 1
  Rearranging: left + 2^k - 1 ≤ right
  
  Similarly: 2^k ≤ right - left + 1 means right - 2^k + 1 ≥ left
  
  So:
  - Range 1: [left, left + 2^k - 1] where left + 2^k - 1 ≤ right
  - Range 2: [right - 2^k + 1, right] where right - 2^k + 1 ≥ left
  
  The ranges overlap because:
  - Range 1 ends at: left + 2^k - 1
  - Range 2 starts at: right - 2^k + 1
  
  For overlap, we need: right - 2^k + 1 ≤ left + 2^k - 1
  Simplifying: right - left + 2 ≤ 2 × 2^k = 2^(k+1)
  Since length < 2^(k+1), we have right - left + 1 < 2^(k+1)
  So right - left < 2^(k+1) - 1, meaning right - left + 2 ≤ 2^(k+1) ✓
  
  Union covers [left, right] because:
  - Range 1 covers left end
  - Range 2 covers right end
  - They overlap in between (guaranteed by math above)

The Tight Bound

Summary: The Preprocessing Power

We've now fully understood the preprocessing mechanism that powers Sparse Tables. Let's consolidate the key insights:

Key Takeaways

•Power-of-two ranges are key — By precomputing answers only for ranges of length 1, 2, 4, 8, ..., we limit preprocessing to O(n log n) while enabling O(1) queries.
•Two overlapping ranges cover any query — Any range can be covered by two power-of-two ranges from each end that meet (and overlap) in the middle.
•Dynamic programming builds the table — Each row depends only on the previous row: table[k][i] = min(table[k-1][i], table[k-1][i + 2^(k-1)]).
•Log₂ lookup table avoids floating-point — Precomputing floor(log₂(i)) ensures fast, correct exponent lookup without precision issues.
•The query formula is simple and constant-time — Just two table lookups and a min operation, regardless of range size.
•Trade-off is clear — O(n log n) preprocessing and space for O(1) queries, optimal when queries dominate.

Sparse Table Complexity Summary
Operation	Time Complexity	Space Complexity
Build log₂ table	O(n)	O(n)
Build sparse table	O(n log n)	O(n log n)
Total preprocessing	O(n log n)	O(n log n)
Single query	O(1)
Q queries (after build)	O(Q)

What's next:

Page Complete

2 / 4