We've established that eliminating impossible regions speeds up search. We've learned that invariants guarantee correctness. Now we arrive at the crucial question: How much should we eliminate per step?
The answer—eliminating half the remaining space—is not arbitrary. It represents the optimal strategy for comparison-based search, yielding the celebrated O(log n) time complexity that makes binary search so powerful.
This page explores why 'half' is magical: the mathematics behind it, the strategies to achieve it, and what happens when perfect halving isn't possible.
By the end of this page, you will understand: (1) why eliminating half the space gives O(log n) complexity, (2) the information-theoretic lower bound on comparison-based search, (3) practical techniques for achieving consistent half-elimination, and (4) how to analyze variants that don't eliminate exactly half.
Let's rigorously understand why halving the search space yields logarithmic time complexity.
The Halving Sequence:
Start with n elements. After each comparison that eliminates half, the remaining space shrinks:

n → n/2 → n/4 → n/8 → ... → n/2^k (after k steps)
Finding the Number of Steps:
We continue until 1 element remains (the answer, or confirmation of absence):
n/2^k = 1
Solving for k:

2^k = n
k = log₂(n)
Thus, halving the search space per step guarantees at most ⌈log₂(n)⌉ comparisons.
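This count is easy to verify with a quick sketch (my illustration; `halvingSteps` is a hypothetical helper, not from this page):

```typescript
// Count how many halvings reduce n candidates to a single one.
// In the worst case the larger half survives, so we round up;
// the result equals ⌈log₂(n)⌉ for n ≥ 1.
function halvingSteps(n: number): number {
  let steps = 0;
  let remaining = n;
  while (remaining > 1) {
    remaining = Math.ceil(remaining / 2);
    steps++;
  }
  return steps;
}

console.log(halvingSteps(10));        // 4
console.log(halvingSteps(1_000_000)); // 20
```

These values match the table below: ⌈log₂(10)⌉ = 4 and ⌈log₂(10⁶)⌉ = 20.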
| Array Size (n) | log₂(n) | Max Comparisons | For Perspective |
|---|---|---|---|
| 10 | 3.32 | 4 | A few items |
| 100 | 6.64 | 7 | A short list |
| 1,000 | 9.97 | 10 | About 10 comparisons |
| 1,000,000 | 19.93 | 20 | A million items, 20 comparisons |
| 1,000,000,000 | 29.90 | 30 | A billion items, 30 comparisons |
| 10^12 | 39.86 | 40 | A trillion items, 40 comparisons |
The Power of Exponential Reduction:
These numbers are remarkable. Searching a billion elements requires only 30 comparisons—fewer than counting to 30! This is because each comparison divides the problem size by 2, and 2^30 ≈ 1 billion.
Contrast this with linear search: scanning a billion elements one by one takes up to 1,000,000,000 comparisons, versus 30 for binary search. The ratio is over 33 million times faster.
The logarithm function grows incredibly slowly. Doubling your data only adds ONE more comparison. This is why binary search scales to any practical dataset size—the difference between searching a list of 1,000 items and 1,000,000,000 items is only about 20 extra comparisons.
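The "doubling adds one comparison" claim is easy to verify numerically (a one-off sketch of my own, not from the page):

```typescript
// Worst-case comparisons for halving search on n elements: ⌈log₂(n)⌉.
const worstCase = (n: number): number => Math.ceil(Math.log2(n));

// Each doubling of n adds exactly one comparison:
console.log(worstCase(1_000)); // 10
console.log(worstCase(2_000)); // 11
console.log(worstCase(4_000)); // 12
console.log(worstCase(1e9));   // 30
```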
Is O(log n) the best possible? Can we do better than binary search for comparison-based search? The answer is no—and information theory tells us why.
The Decision Tree Argument:
Any comparison-based search algorithm can be modeled as a decision tree:
For n elements, we need at least n+1 distinct outcomes: one for each of the n positions where the target could be, plus one for "not found".
A binary tree of height h has at most 2^h leaves. To accommodate n+1 outcomes:
2^h ≥ n + 1
h ≥ log₂(n + 1)
h ≥ ⌈log₂(n + 1)⌉ (since h is an integer)
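A quick numeric check of this bound (my sketch; `minHeight` is a hypothetical helper):

```typescript
// Smallest decision-tree height h whose 2^h leaves can cover n + 1 outcomes.
function minHeight(n: number): number {
  let h = 0;
  while (2 ** h < n + 1) h++;
  return h;
}

console.log(minHeight(1000));          // 10, since 2^9 = 512 < 1001 ≤ 1024 = 2^10
console.log(minHeight(1_000_000_000)); // 30
```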
Information-Theoretic Interpretation:
Each comparison provides at most 1 bit of information (binary outcome: less than, or greater-than-or-equal). To distinguish among n+1 possibilities, we need log₂(n+1) bits of information.
Since each comparison provides at most 1 bit, we need at least log₂(n+1) comparisons.
Binary Search Achieves This Bound:
Binary search makes at most ⌈log₂(n+1)⌉ comparisons in the worst case—matching the lower bound—so it is optimal among comparison-based search algorithms.
No amount of cleverness can beat this bound using comparisons alone. The only way to search faster is to use additional structure (hash tables, specialized data structures) or to accept probabilistic answers.
Hash tables achieve O(1) average-case search by using a different approach entirely—they don't compare elements to each other. Instead, they compute a hash function and jump directly to the location. This bypasses the comparison-based lower bound because it's not comparison-based.
In practice, achieving exactly half elimination requires careful midpoint calculation. Several subtle issues can arise:
- Overflow: (left + right) / 2 can overflow when left + right exceeds the maximum integer value. In a 32-bit signed integer system, this happens when left + right > 2^31 - 1.
- Rounding: (left + right) / 2 rounds toward zero, which is usually down for positive numbers. This affects which element becomes mid when the range has even size.
```typescript
// ❌ DANGEROUS: Can overflow when left and right are large
function naiveMidpoint(left: number, right: number): number {
  return Math.floor((left + right) / 2); // left + right may overflow!
}

// ✅ SAFE: Avoids overflow by computing the difference first
function safeMidpoint(left: number, right: number): number {
  // (right - left) is always ≤ right, so no overflow
  // Then we add left, which also won't overflow if right was valid
  return left + Math.floor((right - left) / 2);
}

// ✅ ALTERNATIVE: Using unsigned right shift (in languages that support it)
function safeMidpointBitwise(left: number, right: number): number {
  // Unsigned right shift by 1 is equivalent to division by 2
  // Works correctly even when (left + right) overflows to negative
  return (left + right) >>> 1; // >>> is unsigned right shift in JS
}

// Example of the overflow problem:
// In 32-bit signed integers:
//   left = 2,000,000,000
//   right = 2,000,000,000
//   left + right = 4,000,000,000 → overflows to negative!
//
// Using the safe method:
//   right - left = 0
//   left + 0/2 = 2,000,000,000 ✓
```

Does Perfect Halving Matter?
What if mid isn't exactly in the center? Consider a range of 10 elements where we pick mid = 3 instead of mid = 4:
The larger part is 6 vs 5—a 20% difference. Does this affect complexity?
Analysis: The key insight is that as long as we eliminate a constant fraction of the space, the complexity remains O(log n). Even eliminating just 1/10 of the space per step gives:
n → 0.9n → 0.81n → ... → 1

Steps = log(n) / log(10/9) ≈ 6.6 × log₂(n) = O(log n)
The constant factor changes, but the logarithmic nature is preserved.
Eliminating half is optimal because it minimizes the worst-case number of comparisons. But algorithms that eliminate any constant fraction (1/3, 2/3, even 1/10) still achieve O(log n). The constant multiplier differs, but the fundamental efficiency remains.
Not all search algorithms eliminate exactly half. Let's examine algorithms with different elimination ratios and understand their complexity:
| Algorithm | Elimination Ratio | Time Complexity | Notes |
|---|---|---|---|
| Binary Search | 1/2 | O(log₂ n) | Optimal for comparison-based search |
| Ternary Search | 2/3 | O(2 log₃ n) ≈ O(1.26 × log₂ n) | Two comparisons per step; not faster than binary |
| Interpolation Search (uniform) | ~1 - 1/√n | O(log log n) average | Exploits value distribution; degrades to O(n) worst case |
| Exponential Search | doubles range | O(log n) | First finds range, then binary search |
| Linear Search | 1 element | O(n) | No structure to exploit |
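The "exponential search" row can be sketched as follows—an illustrative implementation under the usual definition (double a bound until it overshoots, then binary search within it), not code from this page:

```typescript
// Exponential search: grow the bound by doubling, then binary search inside it.
function exponentialSearch(arr: number[], target: number): number {
  if (arr.length === 0) return -1;
  if (arr[0] === target) return 0;

  // Double the bound until arr[bound] >= target or we pass the end.
  let bound = 1;
  while (bound < arr.length && arr[bound] < target) bound *= 2;

  // Target, if present, lies in (bound/2, min(bound, length - 1)].
  let left = Math.floor(bound / 2) + 1;
  let right = Math.min(bound, arr.length - 1);

  while (left <= right) {
    const mid = left + Math.floor((right - left) / 2);
    if (arr[mid] === target) return mid;
    if (arr[mid] < target) left = mid + 1;
    else right = mid - 1;
  }
  return -1;
}

const data = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91];
console.log(exponentialSearch(data, 23)); // 5
console.log(exponentialSearch(data, 7));  // -1
```

Both phases are O(log n): finding the range takes O(log i) doublings (where i is the target's index), and the binary search inside it takes O(log i) more.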
Why Not Ternary Search?
A common question: if halving is good, isn't thirding better? Split into three parts and potentially eliminate 2/3?
Let's count comparisons. For n elements:

- Binary search: 1 comparison per step × log₂(n) steps ≈ log₂(n) comparisons
- Ternary search: 2 comparisons per step × log₃(n) steps = 2 log₃(n) ≈ 1.26 × log₂(n) comparisons
Ternary search uses more comparisons! The improvement in elimination ratio doesn't compensate for the extra comparison per step.
The Lesson: When comparing algorithms, count the actual operations, not just the elimination ratio.
```typescript
/**
 * Ternary Search: Split into 3 parts, eliminate 2/3 per iteration
 *
 * Despite eliminating 2/3 per iteration, this requires 2 comparisons
 * per iteration, making it SLOWER than binary search.
 */
function ternarySearch(arr: number[], target: number): number {
  let left = 0;
  let right = arr.length - 1;

  while (left <= right) {
    // Divide into three parts
    const oneThird = left + Math.floor((right - left) / 3);
    const twoThird = right - Math.floor((right - left) / 3);

    // COMPARISON 1
    if (arr[oneThird] === target) return oneThird;
    // COMPARISON 2
    if (arr[twoThird] === target) return twoThird;

    if (arr[oneThird] > target) {
      // Target in left third
      right = oneThird - 1;
    } else if (arr[twoThird] < target) {
      // Target in right third
      left = twoThird + 1;
    } else {
      // Target in middle third
      left = oneThird + 1;
      right = twoThird - 1;
    }
  }

  return -1;
}

// Comparison count analysis for n = 1000:
// Binary search: log₂(1000) ≈ 10 comparisons
// Ternary search: 2 × log₃(1000) ≈ 2 × 6.3 ≈ 12.6 comparisons
// Binary search wins!
```

Ternary search isn't useless—it excels at finding extrema of unimodal functions (functions with a single peak or valley). In that context, the comparison semantics differ, and ternary search is the natural approach.
Half-interval elimination only works if the search space actually shrinks each iteration. A subtle bug can create an infinite loop where the space never decreases. Understanding this is crucial for writing correct binary search.
The Danger Zone: Two-Element Searches
Most infinite loop bugs manifest when the search space contains exactly 2 elements. Consider:
```typescript
// BUGGY CODE: Infinite loop on certain inputs
function binarySearchBuggy(arr: number[], target: number): number {
  let left = 0;
  let right = arr.length - 1;

  while (left < right) { // Note: < not <=
    const mid = left + Math.floor((right - left) / 2);
    if (arr[mid] < target) {
      left = mid; // BUG: Should be mid + 1
    } else {
      right = mid;
    }
  }

  return arr[left] === target ? left : -1;
}

// Trace with arr = [1, 3], target = 3:
//
// Iteration 1:
//   left = 0, right = 1
//   mid = 0 + floor((1-0)/2) = 0
//   arr[0] = 1 < 3, so left = mid = 0
//   After: left = 0, right = 1 (UNCHANGED!)
//
// Iteration 2:
//   left = 0, right = 1 (same as before)
//   mid = 0
//   left = 0
//   After: left = 0, right = 1 (STILL UNCHANGED!)
//
// → INFINITE LOOP: We never make progress
```

The Fix: Guaranteed Shrinkage
The solution is to ensure the search space shrinks every iteration. After checking mid:
- left = mid + 1 (exclude mid, which was checked)
- right = mid - 1 (exclude mid, which was checked)
- right - left must be strictly smaller than before
- Use <= in the loop condition for standard binary search to handle the single-element case
```typescript
function binarySearchCorrect(arr: number[], target: number): number {
  let left = 0;
  let right = arr.length - 1;

  // Loop invariant: If target exists, it's in [left, right]
  // Progress invariant: right - left decreases by at least 1 each iteration
  while (left <= right) {
    const mid = left + Math.floor((right - left) / 2);

    if (arr[mid] === target) {
      return mid; // Found!
    }

    if (arr[mid] < target) {
      left = mid + 1; // ✓ Exclude mid, which we checked
    } else {
      right = mid - 1; // ✓ Exclude mid, which we checked
    }

    // Progress check: if we reach here, either left increased by at
    // least 1 or right decreased by at least 1, so the search space
    // (right - left) is strictly smaller than before.
  }

  return -1; // Exhausted search space
}

// Trace with arr = [1, 3], target = 3:
//
// Iteration 1:
//   left = 0, right = 1
//   mid = 0
//   arr[0] = 1 < 3, so left = mid + 1 = 1
//   After: left = 1, right = 1 (size decreased from 1 to 0)
//
// Iteration 2:
//   left = 1, right = 1
//   mid = 1
//   arr[1] = 3 === 3 → return 1 ✓
```

In half-open interval style [left, right), the update for the right side is right = mid (not mid - 1) because right is already exclusive. Using mid - 1 in this style would incorrectly exclude an element. Know your interval convention!
Let's codify the most common half-interval elimination patterns you'll use repeatedly:
Pattern 1: Standard Binary Search (Find Exact Value)
```typescript
function exactMatch(arr: number[], target: number): number {
  let left = 0, right = arr.length - 1;
  while (left <= right) {
    const mid = left + Math.floor((right - left) / 2);
    if (arr[mid] === target) return mid;
    if (arr[mid] < target) left = mid + 1;
    else right = mid - 1;
  }
  return -1; // Not found
}
```

Pattern 2: Lower Bound (First Element ≥ Target)
```typescript
function lowerBound(arr: number[], target: number): number {
  let left = 0, right = arr.length; // Note: right = length (half-open)

  // Invariant: All elements arr[0..left-1] are < target
  //            All elements arr[right..n-1] are >= target (or unknown)
  // We seek the boundary between these regions
  while (left < right) {
    const mid = left + Math.floor((right - left) / 2);
    if (arr[mid] < target) {
      left = mid + 1; // arr[mid] is too small, first valid is after mid
    } else {
      right = mid; // arr[mid] could be the answer, or answer is before
    }
  }

  // left == right, pointing to first element >= target (or arr.length if none)
  return left;
}
```

Pattern 3: Upper Bound (First Element > Target)
```typescript
function upperBound(arr: number[], target: number): number {
  let left = 0, right = arr.length;

  // Invariant: All elements arr[0..left-1] are <= target
  //            All elements arr[right..n-1] are > target (or unknown)
  while (left < right) {
    const mid = left + Math.floor((right - left) / 2);
    if (arr[mid] <= target) {
      left = mid + 1; // arr[mid] is <= target, first > is after mid
    } else {
      right = mid; // arr[mid] > target, could be the answer
    }
  }

  return left; // First element > target (or arr.length if none)
}

// USEFUL DERIVATIONS:
//
// Count of elements < target:  lowerBound(arr, target)
// Count of elements <= target: upperBound(arr, target)
// Count of elements == target: upperBound(arr, target) - lowerBound(arr, target)
//
// First occurrence of target:
//   const lb = lowerBound(arr, target);
//   return (lb < arr.length && arr[lb] === target) ? lb : -1;
//
// Last occurrence of target:
//   const ub = upperBound(arr, target);
//   return (ub > 0 && arr[ub - 1] === target) ? ub - 1 : -1;
```

These two patterns (lower_bound and upper_bound) can solve nearly every binary search variant. Finding first occurrence, last occurrence, insertion position, count in range—all reduce to these primitives. Master them thoroughly.
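A quick, self-contained usage sketch of these primitives (the two functions are restated so the example runs standalone; the `countInRange` helper is my own illustration):

```typescript
function lowerBound(arr: number[], target: number): number {
  let left = 0, right = arr.length; // half-open [left, right)
  while (left < right) {
    const mid = left + Math.floor((right - left) / 2);
    if (arr[mid] < target) left = mid + 1;
    else right = mid;
  }
  return left; // first index with arr[index] >= target
}

function upperBound(arr: number[], target: number): number {
  let left = 0, right = arr.length;
  while (left < right) {
    const mid = left + Math.floor((right - left) / 2);
    if (arr[mid] <= target) left = mid + 1;
    else right = mid;
  }
  return left; // first index with arr[index] > target
}

// Count of elements in the half-open value range [lo, hi):
function countInRange(arr: number[], lo: number, hi: number): number {
  return lowerBound(arr, hi) - lowerBound(arr, lo);
}

const a = [1, 2, 2, 2, 5, 7, 7, 9];
console.log(upperBound(a, 2) - lowerBound(a, 2)); // occurrences of 2 → 3
console.log(countInRange(a, 2, 8));               // values in [2, 8) → 6
```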
Let's trace a complete binary search to see half-interval elimination in action. We'll search for target = 23 in the array:
[2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
| Step | left | right | Size | mid | arr[mid] | Action |
|---|---|---|---|---|---|---|
| 1 | 0 | 9 | 10 | 4 | 16 | 16 < 23 → eliminate left, left = 5 |
| 2 | 5 | 9 | 5 | 7 | 56 | 56 > 23 → eliminate right, right = 6 |
| 3 | 5 | 6 | 2 | 5 | 23 | 23 = 23 → FOUND! |
Size Reduction Visualization:
Step 0: [2, 5, 8, 12, 16, 23, 38, 56, 72, 91] ← 10 elements
↑ ↑
left=0 right=9
Step 1: [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
╳ ╳ ╳ ╳ ╳ ↑ ↑
(eliminated) left=5 right=9
5 elements remain
Step 2: [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
╳ ╳ ╳ ╳ ╳ ↑ ↑ ╳ ╳ ╳
left=5 right=6
2 elements remain
Step 3: [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
↑
FOUND at index 5
Size progression: 10 → 5 → 2 → 1 (found)
Each step roughly halves the remaining space. The logarithmic relationship is visible: log₂(10) ≈ 3.3, and we found the answer in 3 steps.
Comparison with Linear Search:
Linear search would examine elements in order: 2, 5, 8, 12, 16, 23—requiring 6 comparisons to find 23.
Binary search found it in 3 comparisons by strategically eliminating half the candidates at each step.
For a target at the end (like 91), linear search needs 10 comparisons. Binary search still needs only ~4. The larger the array, the more dramatic the difference.
Best case for binary search: target is exactly at mid on the first check (1 comparison). Worst case: target is in the last remaining position or doesn't exist (⌈log₂(n)⌉ comparisons). Average case: also ~log₂(n) comparisons, since finding target early is rare.
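These comparison counts can be measured directly with instrumented versions of both searches (a sketch of my own; the probe-counting helpers are hypothetical, not from this page):

```typescript
// Instrumented searches: count how many element probes each strategy makes.
function linearProbes(arr: number[], target: number): number {
  let probes = 0;
  for (const x of arr) {
    probes++;
    if (x === target) break;
  }
  return probes;
}

function binaryProbes(arr: number[], target: number): number {
  let left = 0, right = arr.length - 1, probes = 0;
  while (left <= right) {
    const mid = left + Math.floor((right - left) / 2);
    probes++; // one three-way probe of arr[mid]
    if (arr[mid] === target) return probes;
    if (arr[mid] < target) left = mid + 1;
    else right = mid - 1;
  }
  return probes;
}

const sorted = Array.from({ length: 1024 }, (_, i) => i * 2);
console.log(linearProbes(sorted, 2046)); // 1024 probes for the last element
console.log(binaryProbes(sorted, 2046)); // on the order of log₂(1024) = 10
```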
Half-interval elimination is the engine that powers binary search's remarkable efficiency. Let's consolidate the key insights:
- Halving the search space per step gives at most ⌈log₂(n)⌉ comparisons—the optimal bound for comparison-based search.
- Eliminating any constant fraction still yields O(log n); only the constant factor changes.
- Guarantee progress every iteration (left = mid + 1, right = mid - 1) to avoid infinite loops.
- Compute the midpoint as left + (right - left) / 2 to avoid integer overflow.

Looking Ahead:
We've now mastered the three pillars of search space reduction: eliminating impossible regions, maintaining invariants, and half-interval elimination. In the final page, we'll synthesize these concepts into a general problem-solving framework for search problems. You'll learn to identify search problems in disguise and apply these principles to novel situations.
You now understand why halving is the magic behind binary search's efficiency—and why it can't be improved upon for comparison-based search. The logarithmic complexity means searching a billion elements requires only 30 comparisons. Next, we'll apply these principles to general problem-solving.