Binary search is often called a "divide and conquer" algorithm, but this label obscures its true nature. Unlike merge sort or quick sort, which genuinely divide problems into independent subproblems, binary search does something more elegant: it eliminates.
Each step of binary search doesn't create new subproblems to solve—it discards half of all remaining possibilities. It's not "divide and conquer"; it's divide and eliminate. This distinction matters because it reveals the algorithm's fundamental insight: we don't need to examine most of the data to find what we're looking for.
Consider searching through a telephone book with 1,000,000 names. Linear search might require examining all million entries. Binary search guarantees you'll find any name in at most 20 comparisons. This isn't magic—it's the power of strategic elimination.
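As a quick sanity check of that "20 comparisons" figure (an added snippet, not part of the original lesson), the worst-case count is just the base-2 logarithm rounded up:

```python
import math

# Worst-case binary search comparisons for 1,000,000 sorted entries
print(math.ceil(math.log2(1_000_000)))  # -> 20
```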
By the end of this page, you will understand:
• The divide-and-eliminate strategy as the conceptual heart of binary search
• How selecting a midpoint enables maximum information gain per comparison
• The geometric intuition behind logarithmic efficiency
• Why this strategy is optimal for sorted data search
• The connection to information theory and decision trees
Let's trace the reasoning that leads to binary search.
The Naive Approach: Linear Search
Given a sorted array and a target value, linear search examines elements one by one:
```
[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]   Target: 13
 ↑
 Check 1: not 13
    ↑
    Check 2: not 13
       ↑
       Check 3: not 13
... (continue)
```
Each comparison eliminates exactly one element. For n elements, we might need n comparisons. But we're ignoring crucial information: the data is sorted.
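For contrast, here is a minimal linear search sketch (the helper `linear_search` is added for illustration), showing that each comparison rules out exactly one element:

```python
def linear_search(arr: list[int], target: int) -> int:
    """Scan left to right; each comparison eliminates exactly one candidate."""
    for i, value in enumerate(arr):
        if value == target:
            return i          # found after i + 1 comparisons
    return -1                 # not found after len(arr) comparisons

arr = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
print(linear_search(arr, 13))  # -> 6, reached only on the 7th comparison
```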
The Key Question:
If we can only make one comparison before deciding which direction to search, which element should we examine to eliminate the most possibilities?
The Answer: The Middle Element
In sorted data, examining the middle element is optimal because exactly one of three cases occurs:

1. The middle element equals the target: we are done.
2. The middle element is less than the target: the target (if present) lies in the right half, so the entire left half can be discarded.
3. The middle element is greater than the target: the target (if present) lies in the left half, so the entire right half can be discarded.

Crucially, in cases 2 and 3, we eliminate half of all remaining elements with a single comparison. No other element choice guarantees as much elimination.
If we pick an element near the beginning, we might eliminate very few elements (if the target is larger). If we pick near the end, we might eliminate very few (if the target is smaller). The middle guarantees we eliminate at least half, regardless of which half contains the target.
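A small sketch of that guarantee (the helper `worst_case_remaining` is added for illustration): probing index i leaves, in the worst case, everything strictly to its left or strictly to its right.

```python
def worst_case_remaining(n: int, probe: int) -> int:
    """Worst-case candidates left after probing index `probe` in a sorted array of n elements."""
    return max(probe, n - 1 - probe)   # all strictly left, or all strictly right

n = 16
for probe in (0, 3, 7, 12, 15):
    print(f"probe index {probe:2d} -> at most {worst_case_remaining(n, probe):2d} candidates remain")
# The middle (index 7) minimizes the worst case: at most 8 of 16 candidates survive.
```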
Let's trace binary search step by step to see elimination in action.
Example: Searching for 14 in a sorted array
```
Initial array (16 elements):

[1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31]
                        ↑
                      mid=15

Step 1: Compare 15 with 14
        15 > 14, so target is in LEFT half
        Eliminate 8 elements (indices 8-15)

[1, 3, 5, 7, 9, 11, 13, 15] | eliminated: [17, 19, 21, 23, 25, 27, 29, 31]
          ↑
        mid=7

Step 2: Compare 7 with 14
        7 < 14, so target is in RIGHT half
        Eliminate 4 elements (indices 0-3)

eliminated: [1, 3, 5, 7] | [9, 11, 13, 15]
                               ↑
                             mid=11

Step 3: Compare 11 with 14
        11 < 14, so target is in RIGHT half
        Eliminate 2 elements

[13, 15]
 ↑
 mid=13

Step 4: Compare 13 with 14
        13 < 14, so target is in RIGHT half

[15]
 ↑
 mid=15

Step 5: Compare 15 with 14
        15 > 14, left half is empty
        Target NOT FOUND!
```
Observations:

• Each comparison cut the remaining candidates at least in half.
• Once a half was eliminated, it was never examined again.
• Even this unsuccessful search ended quickly: after 5 comparisons we know 14 is not present.

The beauty is in the efficiency: we examined only 5 of 16 elements. For 1 million elements, we'd examine only ~20.
| Step | Elements Remaining | Eliminated This Step | Cumulative Comparisons |
|---|---|---|---|
| Start | n | 0 | 0 |
| 1 | n/2 | n/2 | 1 |
| 2 | n/4 | n/4 | 2 |
| 3 | n/8 | n/8 | 3 |
| k | n/2^k | n/2^k | k |
| log₂(n) | 1 | 1 | log₂(n) |
Each step halves the search space. Starting with n elements, after k steps we have n/2^k elements. We're done when n/2^k ≤ 1, which happens when k ≥ log₂(n). This is why binary search takes O(log n) time.
Let's formalize the divide-and-eliminate strategy mathematically.
The Recurrence Relation:
Let T(n) represent the maximum number of comparisons needed for binary search on n elements. Each step makes one comparison and then recurses on at most half of the elements.

This gives us:

T(n) = T(n/2) + 1
T(1) = 1   (base case: one element requires one comparison)
Solving the Recurrence:
Unrolling the recurrence:

T(n) = T(n/2) + 1
     = T(n/4) + 2
     = T(n/8) + 3
     = ...
     = T(n/2^k) + k

We stop when n/2^k = 1, i.e., k = log₂(n):

T(n) = T(1) + log₂(n) = 1 + log₂(n) = O(log n)
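A quick check of this closed form (an added sketch; it assumes n is a power of two so the halving is exact):

```python
import math

def T(n: int) -> int:
    """Worst-case comparisons from the recurrence T(n) = T(n/2) + 1, T(1) = 1."""
    return 1 if n == 1 else T(n // 2) + 1

for n in (1, 2, 16, 1024, 2**20):
    print(n, T(n), 1 + int(math.log2(n)))  # recurrence vs. closed form 1 + log₂(n)
```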
Logarithmic Growth:
Logarithmic functions grow incredibly slowly. To appreciate this:
| n (elements) | log₂(n) (comparisons) | Ratio improvement over linear |
|---|---|---|
| 10 | ~3.3 | 3x faster |
| 100 | ~6.6 | 15x faster |
| 1,000 | ~10 | 100x faster |
| 1,000,000 | ~20 | 50,000x faster |
| 1,000,000,000 | ~30 | 33,000,000x faster |
For a billion elements, binary search needs about 30 comparisons. Linear search might need a billion. The gap is astronomical.
The number of binary search steps equals the number of times you can halve n before reaching 1. This is precisely what logarithms measure: log₂(n) answers 'how many times must I multiply 2 by itself to get n?' or equivalently 'how many times can I divide n by 2?'
There's a beautiful connection between binary search and information theory that reveals why log n is optimal.
The Search as Information Game:
Before searching, the target could be at any of n positions—we have log₂(n) bits of uncertainty. Each yes/no comparison provides at most 1 bit of information. Therefore, we need at least log₂(n) comparisons to uniquely identify the target's position.
Binary Search is Optimal:
Binary search achieves this theoretical minimum. Each comparison ("Is target ≤ mid?") perfectly splits our uncertainty in half—exactly 1 bit of information per comparison. No comparison-based algorithm can do better on worst-case inputs.
```python
import math

def uncertainty_bits(n: int) -> float:
    """Calculate bits of uncertainty for n equally likely positions."""
    return math.log2(n)

def remaining_uncertainty(n: int, comparisons: int) -> float:
    """After k comparisons, remaining uncertainty bits."""
    remaining_elements = n / (2 ** comparisons)
    if remaining_elements <= 1:
        return 0.0
    return math.log2(remaining_elements)

# Trace information gain through binary search
n = 1024  # 2^10 = 1024 elements

print(f"Searching in {n} elements")
print(f"Initial uncertainty: {uncertainty_bits(n):.1f} bits")
print()

for k in range(11):
    remaining = remaining_uncertainty(n, k)
    elements_left = max(1, n // (2 ** k))
    info_gained = uncertainty_bits(n) - remaining
    print(f"After {k:2d} comparisons: "
          f"{elements_left:4d} candidates, "
          f"{remaining:.1f} bits uncertainty, "
          f"{info_gained:.1f} bits gained")
    if remaining == 0:
        print(f"✓ Position uniquely identified in {k} comparisons")
        print(f"  (log₂({n}) = {math.log2(n):.1f})")
        break
```

Decision Tree Interpretation:
Binary search can be visualized as traversing a binary decision tree:
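As an illustration (an added sketch, not the original figure), here is the decision tree for a sorted array of 7 elements; each node names the index compared at that step, and each root-to-leaf path is one possible search:

```
                 arr[3]
               /        \
          arr[1]        arr[5]
         /     \        /     \
     arr[0]  arr[2]  arr[4]  arr[6]
```

Seven positions are distinguished with at most 3 comparisons on any path, consistent with ⌈log₂(7)⌉ = 3.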
The depth of this tree is log₂(n), meaning any search follows a path of at most log₂(n) decisions. This is optimal: a binary tree of depth d has at most 2^d leaves, so to distinguish n positions, we need depth at least log₂(n).
Any comparison-based search algorithm on n elements requires Ω(log n) comparisons in the worst case. Binary search achieves O(log n). Therefore, binary search is asymptotically optimal for searching sorted arrays using comparisons.
Let's crystallize the divide-and-eliminate strategy into precise steps.
Core Strategy:

1. Maintain two pointers, left and right, that bound the current search space.
2. Compute the midpoint and compare it with the target.
3. If the midpoint equals the target, stop; otherwise eliminate the half that cannot contain the target.
4. Repeat until the target is found or the search space is empty.
The Loop Invariant:
At every iteration, we maintain the invariant:
If the target exists in the array, it must be within the current search bounds [left, right]
This invariant is:

• Established before the first iteration: the bounds cover the entire array.
• Preserved by every iteration: we only discard halves that provably cannot contain the target.
• Decisive at termination: if the bounds become empty, the target is not in the array.
```python
def binary_search_explained(arr: list[int], target: int) -> int:
    """
    Binary search with detailed tracing of the divide-and-eliminate strategy.
    """
    left = 0
    right = len(arr) - 1
    step = 0

    print(f"Searching for {target} in {arr}")
    print(f"{'='*60}")

    while left <= right:
        step += 1
        mid = (left + right) // 2
        current_space = arr[left:right+1]

        print(f"\nStep {step}:")
        print(f"  Search space: indices [{left}, {right}] → {current_space}")
        print(f"  Midpoint: index {mid}, value {arr[mid]}")

        if arr[mid] == target:
            print(f"  Comparison: {arr[mid]} == {target} ✓ FOUND!")
            print(f"\nResult: Found at index {mid} in {step} comparisons")
            return mid
        elif arr[mid] < target:
            print(f"  Comparison: {arr[mid]} < {target}")
            print(f"  Decision: Target must be in RIGHT half")
            eliminated = arr[left:mid+1]
            print(f"  Eliminated: indices [{left}, {mid}] → {eliminated}")
            left = mid + 1
        else:
            print(f"  Comparison: {arr[mid]} > {target}")
            print(f"  Decision: Target must be in LEFT half")
            eliminated = arr[mid:right+1]
            print(f"  Eliminated: indices [{mid}, {right}] → {eliminated}")
            right = mid - 1

    print(f"\nStep {step + 1}:")
    print(f"  Search space: empty (left={left} > right={right})")
    print(f"\nResult: Not found after {step} comparisons")
    return -1

# Demo
arr = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]

print("\n" + "="*60)
print("EXAMPLE 1: Searching for an existing element")
print("="*60)
binary_search_explained(arr, 23)

print("\n" + "="*60)
print("EXAMPLE 2: Searching for a non-existing element")
print("="*60)
binary_search_explained(arr, 25)
```

Binary search always terminates because the search space shrinks by at least 1 element each iteration. When left > right, the space is empty and we return 'not found'. This guarantee holds even for empty arrays (the loop never executes, returning -1 immediately).
The midpoint calculation seems trivial: mid = (left + right) / 2. But this simple formula hides a critical bug that has plagued binary search implementations for decades.
The Overflow Problem:
Consider searching in an array with 2 billion elements (within 32-bit integer range). If left = 1,500,000,000 and right = 2,000,000,000:
left + right = 3,500,000,000
This exceeds the maximum 32-bit signed integer (~2.1 billion), causing integer overflow. The result wraps around to a negative number, producing a negative index—a catastrophic bug.
```python
# Python handles arbitrary-precision integers, so overflow isn't a problem.
# However, understanding the issue is crucial for other languages.

def demonstrate_overflow_concept():
    """
    Demonstrates the midpoint overflow problem conceptually.
    In languages with fixed-width integers, this would overflow.
    """
    # Simulating 32-bit signed integer limits
    INT32_MAX = 2**31 - 1  # 2,147,483,647

    left = 1_500_000_000
    right = 2_000_000_000

    print(f"left      = {left:,}")
    print(f"right     = {right:,}")
    print(f"INT32_MAX = {INT32_MAX:,}")
    print()

    # Problematic approach
    naive_sum = left + right
    print(f"left + right = {naive_sum:,}")
    print(f"Exceeds INT32_MAX? {naive_sum > INT32_MAX}")

    if naive_sum > INT32_MAX:
        # Simulate 32-bit overflow (wrap-around)
        overflowed = naive_sum - 2**32
        print(f"After 32-bit overflow: {overflowed:,}")
        print(f"Divide by 2: {overflowed // 2:,} (NEGATIVE INDEX!)")
    print()

    # Safe approach: avoid overflow
    safe_mid = left + (right - left) // 2
    print(f"Safe calculation: left + (right - left) // 2")
    print(f"= {left} + ({right} - {left}) // 2")
    print(f"= {left} + {right - left} // 2")
    print(f"= {left} + {(right - left) // 2}")
    print(f"= {safe_mid:,}")

demonstrate_overflow_concept()

# THE CORRECT FORMULAS:

def midpoint_safe(left: int, right: int) -> int:
    """
    Overflow-safe midpoint calculation.
    Works for any language with fixed-width integers.
    """
    return left + (right - left) // 2

def midpoint_bitwise(left: int, right: int) -> int:
    """
    Alternative using bitwise operations.
    Equivalent to (left + right) // 2 but avoids overflow when using
    an unsigned right shift in languages that support it.
    """
    return (left + right) >> 1  # Safe in Python, but use >>> in Java/C#

# Both methods work correctly
left, right = 1_500_000_000, 2_000_000_000
print(f"\nSafe method:    {midpoint_safe(left, right)}")
print(f"Bitwise method: {midpoint_bitwise(left, right)}")
```

This overflow bug existed in Java's Arrays.binarySearch() for nearly a decade before being discovered in 2006. The lesson: even 'trivial' code can hide subtle bugs. Always use the safe formula: mid = left + (right - left) / 2
Why the Safe Formula Works:
mid = left + (right - left) / 2
• right - left never overflows: since left ≥ 0, it is at most right.
• (right - left) / 2 is at most half of right.
• left + (right - left) / 2 is at most left + (right - left) = right, so the final value never exceeds right either.

Every intermediate value stays at or below right, which is already a valid quantity. The safe formula rearranges the arithmetic to keep intermediate values small.
Binary search divides by 2, but what about other division factors?
Ternary Search (Divide by 3):
Instead of one midpoint, use two: at the 1/3 and 2/3 positions. Each iteration:

• Compares the target against the first midpoint, and possibly the second (up to 2 comparisons).
• Keeps only the one third of the array that could still contain the target.

After k iterations: n/3^k elements remain
Steps to reach 1 element: log₃(n)
Comparisons: 2 × log₃(n) ≈ 2 × (log₂(n) / log₂(3)) ≈ 1.26 × log₂(n)
That's 26% more comparisons than binary search! Ternary search is worse, not better.
| Strategy | Comparisons per Step | Reduction per Step | Total Comparisons (n=1M) |
|---|---|---|---|
| Binary (k=2) | 1 | 50% | ~20 |
| Ternary (k=3) | 2 | 67% | ~25 |
| Quaternary (k=4) | 3 | 75% | ~30 |
| Linear (k=n) | 1 | 0.0001% | ~1,000,000 |
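The "Total Comparisons" column can be reproduced in a few lines (an added sketch assuming (k - 1) comparisons per step and log_k(n) steps for n = 1,000,000):

```python
import math

n = 1_000_000
for k, name in ((2, "Binary"), (3, "Ternary"), (4, "Quaternary")):
    steps = math.log(n, k)       # iterations needed to shrink n candidates to 1
    total = (k - 1) * steps      # k - 1 comparisons per iteration
    print(f"{name:10s} ~{total:.0f} comparisons")
print(f"{'Linear':10s} ~{n:,} comparisons")
```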
Why Binary is Optimal:
The key insight is information per comparison:

• A single yes/no comparison distinguishes at most 2 outcomes, so it carries at most 1 bit of information.
• Binary search extracts that full bit every time: each comparison halves the candidates.
• k-ary search (k > 2) spends k - 1 comparisons per step but only shrinks the space by a factor of k, so each comparison eliminates less than half on average.

Binary search perfectly matches the information content of yes/no comparisons.
Ternary search is useful for finding extrema of unimodal functions (one peak or valley), where comparisons aren't simple yes/no but rather 'increasing' vs 'decreasing'. In this context, its characteristics differ from searching sorted arrays.
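To make that distinction concrete, here is an added sketch of ternary search over a unimodal function (the example function and bounds are purely illustrative); note the comparison is between f(m1) and f(m2), not a yes/no test against a target value:

```python
def ternary_search_max(f, lo: float, hi: float, iters: int = 100) -> float:
    """Approximate the argmax of a unimodal function f on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1          # the peak cannot lie left of m1
        else:
            hi = m2          # the peak cannot lie right of m2
    return (lo + hi) / 2

# Example: f(x) = -(x - 2)^2 peaks at x = 2
print(round(ternary_search_max(lambda x: -(x - 2) ** 2, -10.0, 10.0), 6))  # ≈ 2.0
```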
The divide-and-eliminate strategy transforms searching from a linear scan into a logarithmic descent. By choosing the midpoint, we maximize information gain and guarantee that half the remaining possibilities are eliminated with each comparison.
Let's consolidate the key insights:
• Binary search doesn't divide and conquer; it divides and eliminates, discarding half of all remaining candidates with every comparison.
• Probing the middle element is optimal: it guarantees at least half the search space is eliminated, regardless of where the target lies.
• Repeated halving gives the recurrence T(n) = T(n/2) + 1, which solves to O(log n) comparisons.
• Information theory shows this is optimal: each yes/no comparison yields at most 1 bit, and distinguishing n positions requires log₂(n) bits.
• Compute the midpoint safely as left + (right - left) / 2 to avoid integer overflow in languages with fixed-width integers.

What's Next:
Now that we understand the strategic foundation, we're ready to translate this insight into actual code. The next page presents both iterative and recursive implementations of binary search, analyzing their trade-offs and helping you write bug-free implementations on your first attempt.
You now understand the divide-and-eliminate strategy—the conceptual heart of binary search. This isn't just an algorithm; it's a way of thinking: every decision should maximally reduce uncertainty. With this mental model, you're ready to implement binary search with confidence.