In the previous page, we discovered a frustrating truth: searching unsorted data requires O(n) time. Check every element. No shortcuts.
But what if the data is sorted?
Here's a remarkable fact: searching 1 billion sorted elements takes at most 30 comparisons. Not 30 million. Not 30 thousand. Thirty.
This is the power of binary search—one of the most important algorithms in computer science. It transforms the impossible into the trivial.
By the end of this page, you will understand the intuition behind binary search, why sorted order enables exponential speedup, and the mathematical foundation of O(log n) complexity. This is a conceptual preview—full implementation details come in dedicated modules.
Before formal algorithms, let's think about how humans naturally search sorted data.
The phone book problem:
Imagine you have a physical phone book with millions of names, sorted alphabetically. You want to find 'Nakamura.'
Do you start at page 1 and flip through every page? Of course not. You might:

- Open the book roughly in the middle and check the names on that page.
- If those names come before 'Nakamura' alphabetically, discard the front half; otherwise discard the back half.
- Repeat on the remaining half until you land on the right page.
Each step eliminates half of the remaining possibilities.
When data is sorted, a single comparison tells you not just one fact ('this element isn't the target') but DIRECTION ('the target is in the left/right half'). Each comparison eliminates 50% of the remaining search space. This is exponentially more powerful than linear search.
Contrast with unsorted data:
If the phone book were shuffled randomly, checking 'Johnson' in the middle tells you nothing about where 'Nakamura' might be. It could be anywhere in the remaining pages. You'd have no choice but to check pages one by one.
Sortedness creates relationships between elements that unsorted data lacks. Each comparison leverages these relationships to eliminate massive portions of the search space.
Binary search formalizes the phone book intuition into a precise algorithm.
The algorithm:

1. Compare the target with the middle element of the current search range.
2. If they match, you're done.
3. If the target is smaller, repeat on the left half; if larger, repeat on the right half.
4. Stop when the range becomes empty (the target isn't present).
Each iteration halves the search space. After k iterations, only n/2^k elements remain.
```python
def binary_search(arr, target):
    """
    Search for target in sorted array.
    Returns index if found, -1 otherwise.
    Time: O(log n), Space: O(1)
    """
    low = 0
    high = len(arr) - 1
    while low <= high:
        mid = (low + high) // 2   # Find middle index
        if arr[mid] == target:
            return mid            # Found!
        elif target < arr[mid]:
            high = mid - 1        # Search left half
        else:
            low = mid + 1         # Search right half
    return -1                     # Not found

# Example: Find 23 in sorted array
sorted_arr = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
result = binary_search(sorted_arr, 23)
print(f"Found at index: {result}")  # Output: 5

# Trace for finding 23:
# Step 1: low=0, high=9, mid=4 → arr[4]=16 < 23 → search right
# Step 2: low=5, high=9, mid=7 → arr[7]=56 > 23 → search left
# Step 3: low=5, high=6, mid=5 → arr[5]=23 == 23 → Found!
# Total: 3 comparisons for 10 elements
```

Binary search ONLY works on sorted arrays. If you apply it to unsorted data, it will return incorrect results (false negatives or wrong indices). Always verify sortedness before using binary search, or ensure your data pipeline maintains sorted order.
What does O(log n) actually mean, and why is it so powerful?
The logarithm counts halvings:
log₂(n) answers the question: "How many times must I divide n by 2 to get to 1?"
Binary search halves the search space each iteration. So searching n elements takes at most log₂(n) iterations.
| Array Size (n) | Linear Search (O(n)) | Binary Search (O(log n)) | Speedup Factor |
|---|---|---|---|
| 10 | 10 | 4 | 2.5× |
| 100 | 100 | 7 | 14× |
| 1,000 | 1,000 | 10 | 100× |
| 1,000,000 | 1,000,000 | 20 | 50,000× |
| 1,000,000,000 | 1,000,000,000 | 30 | 33,000,000× |
Notice how the speedup factor grows with n. For a billion elements, binary search is 33 MILLION times faster than linear search. This isn't incremental improvement—it's a completely different scale of performance. O(log n) is why sorted data is so valuable.
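The binary-search column of the table can be checked directly: the worst-case comparison count is ⌈log₂(n)⌉. A quick sketch using Python's standard `math` module:

```python
import math

# Worst-case comparisons for binary search: ceil(log2(n))
for n in [10, 100, 1_000, 1_000_000, 1_000_000_000]:
    comparisons = math.ceil(math.log2(n))
    print(f"n = {n:>13,}: at most {comparisons} comparisons")
# n =            10: at most 4 comparisons
# ...
# n = 1,000,000,000: at most 30 comparisons
```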
Why logarithms grow so slowly:
Logarithms are the inverse of exponentials. Just as 2³⁰ ≈ 1 billion (exponentially fast growth), log₂(1 billion) ≈ 30 (logarithmically slow growth).
This means:

- Doubling the data size adds only one extra comparison: log₂(2n) = log₂(n) + 1.
- Squaring the data size merely doubles the comparisons: log₂(n²) = 2·log₂(n).
O(log n) scales incredibly well. You could search the entire internet's URLs (hundreds of billions) in about 38 comparisons.
Let's trace binary search visually to see the halving in action.
Example: Find 23 in [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
```text
Initial:
  [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
   ^                                ^
  low=0                          high=9

Step 1: mid = (0+9)//2 = 4 → arr[4] = 16
        16 < 23 → target is RIGHT of mid

  [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
                    ^               ^
                 low=5           high=9
  (Left half eliminated: 5 elements gone)

Step 2: mid = (5+9)//2 = 7 → arr[7] = 56
        56 > 23 → target is LEFT of mid

  [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
                    ^   ^
                 low=5  high=6
  (Right portion eliminated: 3 more elements gone)

Step 3: mid = (5+6)//2 = 5 → arr[5] = 23
        23 == 23 → FOUND at index 5!

Total comparisons: 3 (not 10, as linear search could need)
```
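The trace above can be reproduced by instrumenting the search to report its iteration count. A minimal sketch (`binary_search_verbose` is an illustrative name, not from this page):

```python
def binary_search_verbose(arr, target):
    """Binary search that also reports how many comparisons it made."""
    low, high = 0, len(arr) - 1
    comparisons = 0
    while low <= high:
        mid = (low + high) // 2
        comparisons += 1
        print(f"Step {comparisons}: low={low}, high={high}, mid={mid} -> arr[mid]={arr[mid]}")
        if arr[mid] == target:
            return mid, comparisons
        elif target < arr[mid]:
            high = mid - 1
        else:
            low = mid + 1
    return -1, comparisons

index, steps = binary_search_verbose([2, 5, 8, 12, 16, 23, 38, 56, 72, 91], 23)
print(f"Found at index {index} in {steps} comparisons")  # index 5, 3 comparisons
```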
What about searching for 50 (not in array)?
```text
Step 1: mid=4, arr[4]=16 < 50 → search right (low=5)
Step 2: mid=7, arr[7]=56 > 50 → search left  (high=6)
Step 3: mid=5, arr[5]=23 < 50 → search right (low=6)
Step 4: mid=6, arr[6]=38 < 50 → search right (low=7)

Now low (7) > high (6) → NOT FOUND
Total comparisons: 4
```
Even unsuccessful searches take at most log₂(n) comparisons.
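In everyday Python code you rarely write binary search by hand: the standard library's `bisect` module provides it, and for a missing value it hands you the insertion point instead of a failure flag. A quick sketch:

```python
import bisect

sorted_arr = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]

# bisect_left returns the leftmost position where target could be
# inserted while keeping the array sorted.
pos = bisect.bisect_left(sorted_arr, 23)
if pos < len(sorted_arr) and sorted_arr[pos] == 23:
    print(f"Found at index {pos}")             # Found at index 5

# For a missing value, the position is where it WOULD go.
pos = bisect.bisect_left(sorted_arr, 50)
print(f"50 would be inserted at index {pos}")  # index 7 (before 56)
```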
The magic is that each comparison removes ~50% of remaining elements. After 10 comparisons, you've reduced 1,024 elements to 1. After 20 comparisons, you've handled 1 million. After 30, a billion. The reduction is exponential.
Binary search isn't a free lunch. It has strict requirements that must be satisfied.
The trade-off equation:
Total cost = Sorting cost + (Number of searches × Search cost)
Linear search: 0 + (k × n) = k×n
Binary search: n log n + (k × log n) = n log n + k log n
If k (number of searches) is large, binary search wins. If k is small, the sorting overhead may not be worth it.
Binary search pays off when you search the same data multiple times. If you search once and discard the data, linear search on unsorted data may be faster (no sorting overhead). If you search thousands of times, sort once and enjoy O(log n) per search.
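The trade-off equation can be made concrete. This hypothetical sketch (the helper names `linear_total` and `binary_total` are illustrative) compares total comparison counts for k searches over n elements, ignoring constant factors just as the formulas above do:

```python
import math

def linear_total(n, k):
    # k searches, each up to n comparisons; no sorting needed
    return k * n

def binary_total(n, k):
    # Sort once (~n log n comparisons), then k searches of ~log n each
    log_n = math.ceil(math.log2(n))
    return n * log_n + k * log_n

n = 1_000_000
for k in [1, 10, 1_000, 1_000_000]:
    winner = "binary" if binary_total(n, k) < linear_total(n, k) else "linear"
    print(f"k={k:>9,}: linear={linear_total(n, k):,}  "
          f"binary={binary_total(n, k):,}  -> {winner}")
```

For a million elements, linear search wins for a handful of one-off queries, but binary search pulls ahead well before a thousand repeated searches.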
Binary search is famously tricky to implement correctly. Even experienced programmers make subtle errors. Here are the most common mistakes:

- Off-by-one bounds: using `high = mid` instead of `high = mid - 1` (or vice versa), which skips or revisits elements.
- Infinite loops: updating `low = mid` instead of `low = mid + 1`, so a two-element range never shrinks.
- Wrong loop condition: `while low < high` versus `while low <= high`, which can miss the final candidate.
- Integer overflow: computing `mid = (low + high) / 2` in fixed-width-integer languages when `low + high` exceeds the integer maximum.
"Although the basic idea of binary search is comparatively straightforward, the details can be surprisingly tricky." — Donald Knuth. Even the legendary Knuth acknowledged this. The first correct binary search was published in 1946, but the first bug-free implementation in a major programming language took until 1962. Test your implementations carefully!
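One classic bug is worth showing explicitly: the midpoint overflow. Python's integers are arbitrary precision, so the code on this page is safe, but in Java, C, or C++ the naive `(low + high) / 2` can overflow. The portable idiom, as a sketch:

```python
# Portable midpoint computation, written here in Python for illustration.
# In fixed-width-integer languages, (low + high) can overflow when the sum
# exceeds the integer maximum; low + (high - low) // 2 cannot, because
# (high - low) never exceeds the valid index range.
def safe_mid(low, high):
    return low + (high - low) // 2

assert safe_mid(0, 9) == 4   # Same result as (0 + 9) // 2
assert safe_mid(5, 6) == 5
```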
Binary search isn't just for finding exact values. The core idea—eliminating half the search space—applies to many problems.
Finding boundaries: locating the first or last occurrence of a value, or the insertion point that keeps an array sorted.

Searching on the answer: binary searching over a range of candidate answers rather than an explicit array, using a yes/no feasibility check at each midpoint.
Example: First and Last Position
In [1, 2, 2, 2, 2, 3, 4], standard binary search might find any of the 2s. But what if you need:

- The first occurrence of 2 (index 1)?
- The last occurrence of 2 (index 4)?
These require modified binary search algorithms that continue searching even after finding a match, to find the boundary.
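A minimal sketch of the leftmost-occurrence variant (the rightmost one is symmetric; `binary_search_leftmost` is an illustrative name):

```python
def binary_search_leftmost(arr, target):
    """Return the index of the FIRST occurrence of target, or -1."""
    low, high = 0, len(arr) - 1
    result = -1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            result = mid          # Record the match...
            high = mid - 1        # ...but keep searching the left half
        elif target < arr[mid]:
            high = mid - 1
        else:
            low = mid + 1
    return result

print(binary_search_leftmost([1, 2, 2, 2, 2, 3, 4], 2))  # 1 (not 3)
```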
Example: Square Root
Find √x without a sqrt function. Binary search between 0 and x for the largest number whose square is ≤ x. The 'sorted array' here is the implicit sequence 0, 1, 2, 3, ..., x and the 'search' is finding the boundary where n² > x.
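The square-root idea as a sketch, binary searching for the largest m with m² ≤ x (`int_sqrt` is an illustrative name):

```python
def int_sqrt(x):
    """Largest integer m such that m*m <= x, found by binary search."""
    low, high = 0, x
    answer = 0
    while low <= high:
        mid = (low + high) // 2
        if mid * mid <= x:
            answer = mid      # mid is feasible; try something larger
            low = mid + 1
        else:
            high = mid - 1    # mid overshoots; try something smaller
    return answer

print(int_sqrt(10))   # 3
print(int_sqrt(144))  # 12
```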
Binary search is more than an algorithm—it's a paradigm. Whenever you have a monotonic condition (something that changes from false to true or vice versa once), you can binary search to find the transition point. This insight unlocks dozens of advanced problems.
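The paradigm in its most general form: find the first index where a monotonic predicate flips from False to True. A hedged sketch under that assumption (`first_true` is an illustrative name):

```python
def first_true(lo, hi, pred):
    """Smallest i in [lo, hi] with pred(i) True, assuming pred is
    monotonic (False...False True...True).
    Returns hi + 1 if pred is never True on the range."""
    while lo <= hi:
        mid = (lo + hi) // 2
        if pred(mid):
            hi = mid - 1      # mid works; look for an earlier flip
        else:
            lo = mid + 1      # mid fails; the flip must be later
    return lo

# Example: first m whose square exceeds 10 -> 4 (since 4*4 = 16 > 10)
print(first_true(0, 100, lambda m: m * m > 10))
```

Exact-match search, first/last occurrence, and square root are all special cases of choosing the right predicate.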
Here's a decision framework for choosing between linear and binary search:
| Criterion | Choose Linear Search | Choose Binary Search |
|---|---|---|
| Data is sorted? | No / Not feasible to sort | Yes / Can be sorted |
| Search frequency | One-time or rare | Frequent / Repeated |
| Data size | Small (< 100) | Large (> 100) |
| Data structure | Any (linked list OK) | Array (O(1) access) |
| Insertion frequency | High (order hard to maintain) | Low / Static data |
| Find exact match only? | Either | Yes (or variations) |
Common scenarios for binary search:

- Repeated lookups in large, sorted, mostly-static datasets
- Database indexes and dictionary-style lookups
- Finding boundaries, ranges, or insertion points in sorted data

Common scenarios for linear search:

- Small collections, where simplicity beats asymptotics
- Unsorted data searched only once (the sorting overhead isn't worth it)
- Linked lists and other structures without O(1) random access
We've explored one of the most powerful ideas in algorithm design. Let's consolidate:

- Sorted order turns each comparison into a direction, eliminating half the remaining search space.
- Binary search finds any element among n in at most ⌈log₂(n)⌉ comparisons—about 30 for a billion elements.
- The speedup isn't free: it requires sorted data and O(1) random access, and sorting costs O(n log n) up front.
- The halving idea generalizes: any monotonic condition can be binary searched for its transition point.
What's next:
We've covered access (O(1)) and search (O(n) and O(log n)). But arrays have another critical aspect: modification. What happens when you need to insert or delete elements?
Next, we explore insertion and deletion in arrays, and discover why these operations cost O(n) time—even though access is instant.
You now understand the power of binary search and O(log n) complexity. This conceptual foundation prepares you for deep dives into binary search variations, which you'll encounter in dedicated modules. For now, you know: sorted arrays + binary search = incredibly fast lookup.