We've established that eliminating impossible regions speeds up search. We've learned that invariants guarantee correctness. Now we arrive at the crucial question: How much should we eliminate per step?
The answer—eliminating half the remaining space—is not arbitrary. It represents the optimal strategy for comparison-based search, yielding the celebrated O(log n) time complexity that makes binary search so powerful.
This page explores why 'half' is magical: the mathematics behind it, the strategies to achieve it, and what happens when perfect halving isn't possible.
By the end of this page, you will understand: (1) why eliminating half the space gives O(log n) complexity, (2) the information-theoretic lower bound on comparison-based search, (3) practical techniques for achieving consistent half-elimination, and (4) how to analyze variants that don't eliminate exactly half.
Let's rigorously understand why halving the search space yields logarithmic time complexity.
The Halving Sequence:
Start with n elements. After each comparison that eliminates half, the remaining space shrinks:

n → n/2 → n/4 → n/8 → ... → n/2^k (after k steps)
Finding the Number of Steps:
We continue until 1 element remains (the answer, or confirmation of absence):
n/2^k = 1
Solving for k:

2^k = n
k = log₂(n)
Thus, halving the search space per step guarantees at most ⌈log₂(n)⌉ comparisons.
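This count is easy to verify with a quick sketch (my illustration; `halvingSteps` is a hypothetical helper, not from this page):

```typescript
// Count how many halvings reduce n candidates to a single one.
// In the worst case the larger half survives, so we round up;
// the result equals ⌈log₂(n)⌉ for n ≥ 1.
function halvingSteps(n: number): number {
  let steps = 0;
  let remaining = n;
  while (remaining > 1) {
    remaining = Math.ceil(remaining / 2);
    steps++;
  }
  return steps;
}

console.log(halvingSteps(10));        // 4
console.log(halvingSteps(1_000_000)); // 20
```

These values match the table below: ⌈log₂(10)⌉ = 4 and ⌈log₂(10⁶)⌉ = 20.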
| Array Size (n) | log₂(n) | Max Comparisons | For Perspective |
|---|---|---|---|
| 10 | 3.32 | 4 | A few items |
| 100 | 6.64 | 7 | A short list |
| 1,000 | 9.97 | 10 | About 10 comparisons |
| 1,000,000 | 19.93 | 20 | A million items, 20 comparisons |
| 1,000,000,000 | 29.90 | 30 | A billion items, 30 comparisons |
| 10^12 | 39.86 | 40 | A trillion items, 40 comparisons |
The Power of Exponential Reduction:
These numbers are remarkable. Searching a billion elements requires only 30 comparisons—fewer than counting to 30! This is because each comparison divides the problem size by 2, and 2^30 ≈ 1 billion.
Contrast this with linear search: scanning a billion elements one by one takes up to 1,000,000,000 comparisons, versus 30 for binary search. The ratio is over 33 million times faster.
The logarithm function grows incredibly slowly. Doubling your data only adds ONE more comparison. This is why binary search scales to any practical dataset size—the difference between searching a list of 1,000 items and 1,000,000,000 items is only about 20 extra comparisons.
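The "doubling adds one comparison" claim is easy to verify numerically (a one-off sketch of my own, not from the page):

```typescript
// Worst-case comparisons for halving search on n elements: ⌈log₂(n)⌉.
const worstCase = (n: number): number => Math.ceil(Math.log2(n));

// Each doubling of n adds exactly one comparison:
console.log(worstCase(1_000)); // 10
console.log(worstCase(2_000)); // 11
console.log(worstCase(4_000)); // 12
console.log(worstCase(1e9));   // 30
```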
Is O(log n) the best possible? Can we do better than binary search for comparison-based search? The answer is no—and information theory tells us why.
The Decision Tree Argument:
Any comparison-based search algorithm can be modeled as a decision tree:
For n elements, we need at least n+1 distinct outcomes: one for each of the n positions where the target could be, plus one for "not found".
A binary tree of height h has at most 2^h leaves. To accommodate n+1 outcomes:
2^h ≥ n + 1
h ≥ log₂(n + 1)
h ≥ ⌈log₂(n + 1)⌉ (since h is an integer)
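A quick numeric check of this bound (my sketch; `minHeight` is a hypothetical helper):

```typescript
// Smallest decision-tree height h whose 2^h leaves can cover n + 1 outcomes.
function minHeight(n: number): number {
  let h = 0;
  while (2 ** h < n + 1) h++;
  return h;
}

console.log(minHeight(1000));          // 10, since 2^9 = 512 < 1001 ≤ 1024 = 2^10
console.log(minHeight(1_000_000_000)); // 30
```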
Information-Theoretic Interpretation:
Each comparison provides at most 1 bit of information (binary outcome: less than, or greater-than-or-equal). To distinguish among n+1 possibilities, we need log₂(n+1) bits of information.
Since each comparison provides at most 1 bit, we need at least log₂(n+1) comparisons.
Binary Search Achieves This Bound:
Binary search makes at most ⌈log₂(n+1)⌉ comparisons in the worst case—matching the lower bound—so it is optimal among comparison-based search algorithms.
No amount of cleverness can beat this bound using comparisons alone. The only way to search faster is to use additional structure (hash tables, specialized data structures) or to accept probabilistic answers.
Hash tables achieve O(1) average-case search by using a different approach entirely—they don't compare elements to each other. Instead, they compute a hash function and jump directly to the location. This bypasses the comparison-based lower bound because it's not comparison-based.
In practice, achieving exactly half elimination requires careful midpoint calculation. Several subtle issues can arise:
- Overflow: (left + right) / 2 can overflow when left + right exceeds the maximum integer value. In a 32-bit signed integer system, this happens when left + right > 2^31 - 1.
- Rounding: (left + right) / 2 rounds toward zero, which is usually down for positive numbers. This affects which element becomes mid when the range has even size.
```typescript
// ❌ DANGEROUS: Can overflow when left and right are large
function naiveMidpoint(left: number, right: number): number {
  return Math.floor((left + right) / 2); // left + right may overflow!
}

// ✅ SAFE: Avoids overflow by computing the difference first
function safeMidpoint(left: number, right: number): number {
  // (right - left) is always ≤ right, so no overflow
  // Then we add left, which also won't overflow if right was valid
  return left + Math.floor((right - left) / 2);
}

// ✅ ALTERNATIVE: Using unsigned right shift (in languages that support it)
function safeMidpointBitwise(left: number, right: number): number {
  // Unsigned right shift by 1 is equivalent to division by 2
  // Works correctly even when (left + right) overflows to negative
  return (left + right) >>> 1; // >>> is unsigned right shift in JS
}

// Example of the overflow problem:
// In 32-bit signed integers:
//   left = 2,000,000,000
//   right = 2,000,000,000
//   left + right = 4,000,000,000 → overflows to negative!
//
// Using the safe method:
//   right - left = 0
//   left + 0/2 = 2,000,000,000 ✓
```

Does Perfect Halving Matter?
What if mid isn't exactly in the center? Consider a range of 10 elements where we pick mid = 3 instead of mid = 4:
The larger part is 6 vs 5—a 20% difference. Does this affect complexity?
Analysis: The key insight is that as long as we eliminate a constant fraction of the space, the complexity remains O(log n). Even eliminating just 1/10 of the space per step gives:
n → 0.9n → 0.81n → ... → 1

Steps = log(n) / log(10/9) ≈ 6.6 × log₂(n) = O(log n)
The constant factor changes, but the logarithmic nature is preserved.
Eliminating half is optimal because it minimizes the worst-case number of comparisons. But algorithms that eliminate any constant fraction (1/3, 2/3, even 1/10) still achieve O(log n). The constant multiplier differs, but the fundamental efficiency remains.
Not all search algorithms eliminate exactly half. Let's examine algorithms with different elimination ratios and understand their complexity:
| Algorithm | Elimination Ratio | Time Complexity | Notes |
|---|---|---|---|
| Binary Search | 1/2 | O(log₂ n) | Optimal for comparison-based search |
| Ternary Search | 2/3 | O(2 log₃ n) ≈ O(1.26 × log₂ n) | Two comparisons per step; not faster than binary |
| Interpolation Search (uniform) | ~1 - 1/√n | O(log log n) average | Exploits value distribution; degrades to O(n) worst case |
| Exponential Search | doubles range | O(log n) | First finds range, then binary search |
| Linear Search | 1 element | O(n) | No structure to exploit |
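The "exponential search" row can be sketched as follows—an illustrative implementation under the usual definition (double a bound until it overshoots, then binary search within it), not code from this page:

```typescript
// Exponential search: grow the bound by doubling, then binary search inside it.
function exponentialSearch(arr: number[], target: number): number {
  if (arr.length === 0) return -1;
  if (arr[0] === target) return 0;

  // Double the bound until arr[bound] >= target or we pass the end.
  let bound = 1;
  while (bound < arr.length && arr[bound] < target) bound *= 2;

  // Target, if present, lies in (bound/2, min(bound, length - 1)].
  let left = Math.floor(bound / 2) + 1;
  let right = Math.min(bound, arr.length - 1);

  while (left <= right) {
    const mid = left + Math.floor((right - left) / 2);
    if (arr[mid] === target) return mid;
    if (arr[mid] < target) left = mid + 1;
    else right = mid - 1;
  }
  return -1;
}

const data = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91];
console.log(exponentialSearch(data, 23)); // 5
console.log(exponentialSearch(data, 7));  // -1
```

Both phases are O(log n): finding the range takes O(log i) doublings (where i is the target's index), and the binary search inside it takes O(log i) more.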
Why Not Ternary Search?
A common question: if halving is good, isn't thirding better? Split into three parts and potentially eliminate 2/3?
Let's count comparisons. For n elements:

- Binary search: 1 comparison per step × log₂(n) steps ≈ log₂(n) comparisons
- Ternary search: 2 comparisons per step × log₃(n) steps = 2 log₃(n) ≈ 1.26 × log₂(n) comparisons
Ternary search uses more comparisons! The improvement in elimination ratio doesn't compensate for the extra comparison per step.
The Lesson: When comparing algorithms, count the actual operations, not just the elimination ratio.
```typescript
/**
 * Ternary Search: Split into 3 parts, eliminate 2/3 per iteration
 *
 * Despite eliminating 2/3 per iteration, this requires 2 comparisons
 * per iteration, making it SLOWER than binary search.
 */
function ternarySearch(arr: number[], target: number): number {
  let left = 0;
  let right = arr.length - 1;

  while (left <= right) {
    // Divide into three parts
    const oneThird = left + Math.floor((right - left) / 3);
    const twoThird = right - Math.floor((right - left) / 3);

    // COMPARISON 1
    if (arr[oneThird] === target) return oneThird;
    // COMPARISON 2
    if (arr[twoThird] === target) return twoThird;

    if (arr[oneThird] > target) {
      // Target in left third
      right = oneThird - 1;
    } else if (arr[twoThird] < target) {
      // Target in right third
      left = twoThird + 1;
    } else {
      // Target in middle third
      left = oneThird + 1;
      right = twoThird - 1;
    }
  }

  return -1;
}

// Comparison count analysis for n = 1000:
// Binary search: log₂(1000) ≈ 10 comparisons
// Ternary search: 2 × log₃(1000) ≈ 2 × 6.3 ≈ 12.6 comparisons
// Binary search wins!
```

Ternary search isn't useless—it excels at finding extrema of unimodal functions (functions with a single peak or valley). In that context, the comparison semantics differ, and ternary search is the natural approach.
Half-interval elimination only works if the search space actually shrinks each iteration. A subtle bug can create an infinite loop where the space never decreases. Understanding this is crucial for writing correct binary search.
The Danger Zone: Two-Element Searches
Most infinite loop bugs manifest when the search space contains exactly 2 elements. Consider:
```typescript
// BUGGY CODE: Infinite loop on certain inputs
function binarySearchBuggy(arr: number[], target: number): number {
  let left = 0;
  let right = arr.length - 1;

  while (left < right) { // Note: < not <=
    const mid = left + Math.floor((right - left) / 2);
    if (arr[mid] < target) {
      left = mid; // BUG: Should be mid + 1
    } else {
      right = mid;
    }
  }

  return arr[left] === target ? left : -1;
}

// Trace with arr = [1, 3], target = 3:
//
// Iteration 1:
//   left = 0, right = 1
//   mid = 0 + floor((1-0)/2) = 0
//   arr[0] = 1 < 3, so left = mid = 0
//   After: left = 0, right = 1 (UNCHANGED!)
//
// Iteration 2:
//   left = 0, right = 1 (same as before)
//   mid = 0
//   left = 0
//   After: left = 0, right = 1 (STILL UNCHANGED!)
//
// → INFINITE LOOP: We never make progress
```

The Fix: Guaranteed Shrinkage
The solution is to ensure the search space shrinks every iteration. After checking mid:
- left = mid + 1 (exclude mid, which was checked)
- right = mid - 1 (exclude mid, which was checked)
- right - left must be strictly smaller than before
- Use <= in the loop condition for standard binary search to handle the single-element case
```typescript
function binarySearchCorrect(arr: number[], target: number): number {
  let left = 0;
  let right = arr.length - 1;

  // Loop invariant: If target exists, it's in [left, right]
  // Progress invariant: right - left decreases by at least 1 each iteration
  while (left <= right) {
    const mid = left + Math.floor((right - left) / 2);

    if (arr[mid] === target) {
      return mid; // Found!
    }

    if (arr[mid] < target) {
      left = mid + 1; // ✓ Exclude mid, which we checked
    } else {
      right = mid - 1; // ✓ Exclude mid, which we checked
    }

    // Progress check: if we reach here, either left increased by at
    // least 1 or right decreased by at least 1, so the search space
    // (right - left) is strictly smaller than before.
  }

  return -1; // Exhausted search space
}

// Trace with arr = [1, 3], target = 3:
//
// Iteration 1:
//   left = 0, right = 1
//   mid = 0
//   arr[0] = 1 < 3, so left = mid + 1 = 1
//   After: left = 1, right = 1 (size decreased from 1 to 0)
//
// Iteration 2:
//   left = 1, right = 1
//   mid = 1
//   arr[1] = 3 === 3 → return 1 ✓
```

In half-open interval style [left, right), the update for the right side is right = mid (not mid - 1) because right is already exclusive. Using mid - 1 in this style would incorrectly exclude an element. Know your interval convention!
Let's codify the most common half-interval elimination patterns you'll use repeatedly:
Pattern 1: Standard Binary Search (Find Exact Value)
```typescript
function exactMatch(arr: number[], target: number): number {
  let left = 0, right = arr.length - 1;
  while (left <= right) {
    const mid = left + Math.floor((right - left) / 2);
    if (arr[mid] === target) return mid;
    if (arr[mid] < target) left = mid + 1;
    else right = mid - 1;
  }
  return -1; // Not found
}
```

Pattern 2: Lower Bound (First Element ≥ Target)
```typescript
function lowerBound(arr: number[], target: number): number {
  let left = 0, right = arr.length; // Note: right = length (half-open)

  // Invariant: All elements arr[0..left-1] are < target
  //            All elements arr[right..n-1] are >= target (or unknown)
  // We seek the boundary between these regions
  while (left < right) {
    const mid = left + Math.floor((right - left) / 2);
    if (arr[mid] < target) {
      left = mid + 1; // arr[mid] is too small, first valid is after mid
    } else {
      right = mid; // arr[mid] could be the answer, or answer is before
    }
  }

  // left == right, pointing to first element >= target (or arr.length if none)
  return left;
}
```

Pattern 3: Upper Bound (First Element > Target)
```typescript
function upperBound(arr: number[], target: number): number {
  let left = 0, right = arr.length;

  // Invariant: All elements arr[0..left-1] are <= target
  //            All elements arr[right..n-1] are > target (or unknown)
  while (left < right) {
    const mid = left + Math.floor((right - left) / 2);
    if (arr[mid] <= target) {
      left = mid + 1; // arr[mid] is <= target, first > is after mid
    } else {
      right = mid; // arr[mid] > target, could be the answer
    }
  }

  return left; // First element > target (or arr.length if none)
}

// USEFUL DERIVATIONS:
//
// Count of elements < target:  lowerBound(arr, target)
// Count of elements <= target: upperBound(arr, target)
// Count of elements == target: upperBound(arr, target) - lowerBound(arr, target)
//
// First occurrence of target:
//   const lb = lowerBound(arr, target);
//   return (lb < arr.length && arr[lb] === target) ? lb : -1;
//
// Last occurrence of target:
//   const ub = upperBound(arr, target);
//   return (ub > 0 && arr[ub - 1] === target) ? ub - 1 : -1;
```

These two patterns (lower_bound and upper_bound) can solve nearly every binary search variant. Finding first occurrence, last occurrence, insertion position, count in range—all reduce to these primitives. Master them thoroughly.
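A quick, self-contained usage sketch of these primitives (the two functions are restated so the example runs standalone; the `countInRange` helper is my own illustration):

```typescript
function lowerBound(arr: number[], target: number): number {
  let left = 0, right = arr.length; // half-open [left, right)
  while (left < right) {
    const mid = left + Math.floor((right - left) / 2);
    if (arr[mid] < target) left = mid + 1;
    else right = mid;
  }
  return left; // first index with arr[index] >= target
}

function upperBound(arr: number[], target: number): number {
  let left = 0, right = arr.length;
  while (left < right) {
    const mid = left + Math.floor((right - left) / 2);
    if (arr[mid] <= target) left = mid + 1;
    else right = mid;
  }
  return left; // first index with arr[index] > target
}

// Count of elements in the half-open value range [lo, hi):
function countInRange(arr: number[], lo: number, hi: number): number {
  return lowerBound(arr, hi) - lowerBound(arr, lo);
}

const a = [1, 2, 2, 2, 5, 7, 7, 9];
console.log(upperBound(a, 2) - lowerBound(a, 2)); // occurrences of 2 → 3
console.log(countInRange(a, 2, 8));               // values in [2, 8) → 6
```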
Let's trace a complete binary search to see half-interval elimination in action. We'll search for target = 23 in the array:
[2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
| Step | left | right | Size | mid | arr[mid] | Action |
|---|---|---|---|---|---|---|
| 1 | 0 | 9 | 10 | 4 | 16 | 16 < 23 → eliminate left, left = 5 |
| 2 | 5 | 9 | 5 | 7 | 56 | 56 > 23 → eliminate right, right = 6 |
| 3 | 5 | 6 | 2 | 5 | 23 | 23 = 23 → FOUND! |
Size Reduction Visualization:
Step 0: [2, 5, 8, 12, 16, 23, 38, 56, 72, 91] ← 10 elements
↑ ↑
left=0 right=9
Step 1: [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
╳ ╳ ╳ ╳ ╳ ↑ ↑
(eliminated) left=5 right=9
5 elements remain
Step 2: [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
╳ ╳ ╳ ╳ ╳ ↑ ↑ ╳ ╳ ╳
left=5 right=6
2 elements remain
Step 3: [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
↑
FOUND at index 5
Size progression: 10 → 5 → 2 → 1 (found)
Each step roughly halves the remaining space. The logarithmic relationship is visible: log₂(10) ≈ 3.3, and we found the answer in 3 steps.
Comparison with Linear Search:
Linear search would examine elements in order: 2, 5, 8, 12, 16, 23—requiring 6 comparisons to find 23.
Binary search found it in 3 comparisons by strategically eliminating half the candidates at each step.
For a target at the end (like 91), linear search needs 10 comparisons. Binary search still needs only ~4. The larger the array, the more dramatic the difference.
Best case for binary search: target is exactly at mid on the first check (1 comparison). Worst case: target is in the last remaining position or doesn't exist (⌈log₂(n)⌉ comparisons). Average case: also ~log₂(n) comparisons, since finding target early is rare.
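These comparison counts can be measured directly with instrumented versions of both searches (a sketch of my own; the probe-counting helpers are hypothetical, not from this page):

```typescript
// Instrumented searches: count how many element probes each strategy makes.
function linearProbes(arr: number[], target: number): number {
  let probes = 0;
  for (const x of arr) {
    probes++;
    if (x === target) break;
  }
  return probes;
}

function binaryProbes(arr: number[], target: number): number {
  let left = 0, right = arr.length - 1, probes = 0;
  while (left <= right) {
    const mid = left + Math.floor((right - left) / 2);
    probes++; // one three-way probe of arr[mid]
    if (arr[mid] === target) return probes;
    if (arr[mid] < target) left = mid + 1;
    else right = mid - 1;
  }
  return probes;
}

const sorted = Array.from({ length: 1024 }, (_, i) => i * 2);
console.log(linearProbes(sorted, 2046)); // 1024 probes for the last element
console.log(binaryProbes(sorted, 2046)); // on the order of log₂(1024) = 10
```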
Half-interval elimination is the engine that powers binary search's remarkable efficiency. Let's consolidate the key insights:
- Halving the search space per step gives at most ⌈log₂(n)⌉ comparisons—the optimal bound for comparison-based search.
- Eliminating any constant fraction still yields O(log n); only the constant factor changes.
- Guarantee progress every iteration (left = mid + 1, right = mid - 1) to avoid infinite loops.
- Compute the midpoint as left + (right - left) / 2 to avoid integer overflow.

Looking Ahead:
We've now mastered the three pillars of search space reduction: eliminating impossible regions, maintaining invariants, and half-interval elimination. In the final page, we'll synthesize these concepts into a general problem-solving framework for search problems. You'll learn to identify search problems in disguise and apply these principles to novel situations.
You now understand why halving is the magic behind binary search's efficiency—and why it can't be improved upon for comparison-based search. The logarithmic complexity means searching a billion elements requires only 30 comparisons. Next, we'll apply these principles to general problem-solving.