Binary Search - Learning Module

Loading content...

0/276

Algorithmic Steps — Iterative and Recursive

From Strategy to Implementation

Understanding the divide-and-eliminate strategy is one thing; implementing it correctly is another. Binary search is notoriously tricky to code—surveys have shown that even experienced programmers often write incorrect implementations with subtle off-by-one errors.

This page presents two complete approaches: iterative and recursive. Both achieve the same logarithmic efficiency, but they differ in structure, resource usage, and when each is preferable. We'll develop each implementation step by step, highlighting the critical decisions that separate correct implementations from buggy ones.

What You Will Learn

By the end of this page, you will:

• Implement iterative binary search with confidence, avoiding common pitfalls • Implement recursive binary search and understand its call stack behavior • Recognize the trade-offs between iterative and recursive approaches • Write binary search variants that handle edge cases correctly • Debug binary search implementations systematically

The Iterative Approach

The iterative approach uses a loop to repeatedly narrow the search space. It's the most common implementation because it's efficient, easy to reason about, and uses constant extra space.

Core Components:

Pointers: left and right define the current search boundaries
Loop condition: Continue while the search space is non-empty (left <= right)
Midpoint calculation: Find the middle of current bounds
Comparison and update: Narrow bounds based on comparison result
Return: Index if found, -1 (or appropriate sentinel) if not found

binary_search_iterative.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
from typing import List, Optional
 
def binary_search_iterative(arr: List[int], target: int) -> int:
    """
    Standard iterative binary search.
    
    Args:
        arr: Sorted array in ascending order
        target: Value to search for
        
    Returns:
        Index of target if found, -1 otherwise
        
    Time Complexity: O(log n)
    Space Complexity: O(1)
    """
    # Edge case: empty array
    if not arr:
        return -1
    
    # Initialize search bounds
    left = 0
    right = len(arr) - 1
    
    # Continue while search space is non-empty
    while left <= right:
        # Calculate midpoint (overflow-safe)
        mid = left + (right - left) // 2
        
        # Compare and narrow bounds
        if arr[mid] == target:
            return mid  # Found!
        elif arr[mid] < target:
            # Target is in right half
            left = mid + 1
        else:
            # Target is in left half
            right = mid - 1
    
    # Target not found
    return -1
 
 
# Comprehensive testing
def test_binary_search():
    # Test 1: Normal cases
    arr = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
    
    assert binary_search_iterative(arr, 1) == 0, "First element"
    assert binary_search_iterative(arr, 19) == 9, "Last element"
    assert binary_search_iterative(arr, 9) == 4, "Middle-ish element"
    assert binary_search_iterative(arr, 8) == -1, "Non-existent (between)"
    assert binary_search_iterative(arr, 0) == -1, "Non-existent (before all)"
    assert binary_search_iterative(arr, 20) == -1, "Non-existent (after all)"
    
    # Test 2: Edge cases
    assert binary_search_iterative([], 5) == -1, "Empty array"
    assert binary_search_iterative([5], 5) == 0, "Single element (found)"
    assert binary_search_iterative([5], 3) == -1, "Single element (not found)"
    
    # Test 3: Duplicates (returns one valid index)
    arr_dup = [1, 2, 2, 2, 3, 4, 5]
    result = binary_search_iterative(arr_dup, 2)
    assert result in [1, 2, 3], "Duplicates: should find one of them"
    
    print("All tests passed! ✓")
 
test_binary_search()

Critical Implementation Details

Three decisions that must be correct:

Loop condition: left <= right (not left < right!) ensures single-element ranges are checked
Pointer updates: left = mid + 1 and right = mid - 1 (not mid!) prevent infinite loops
Midpoint calculation: left + (right - left) // 2 prevents integer overflow

Understanding the Loop Invariant

A loop invariant is a property that remains true before and after each iteration. For binary search, the invariant provides correctness:

Invariant: If the target exists in the original array, it lies within indices [left, right] (inclusive).

Proof of correctness using the invariant:

Loop Invariant Analysis

•Initialization: Before the loop, left=0 and right=n-1. The entire array is within bounds, so if the target exists, it's in [left, right]. ✓
•Maintenance: Each iteration narrows bounds correctly. If arr[mid] < target, we know target (if it exists) must be at indices > mid, so setting left = mid + 1 maintains the invariant. Similarly for the other case. ✓
•Termination: The loop ends in two ways: (a) We find the target and return, or (b) left > right, meaning the search space is empty. If the target existed, it would be in [left, right], but that set is empty, so the target doesn't exist. ✓

Why left <= right and not left < right?

Consider a single-element search space where left = right = 5:

With left < right: Loop doesn't execute, element is never checked!
With left <= right: Loop executes, element is properly examined.

The inclusive inequality ensures we check the last remaining candidate.

Why mid + 1 and mid - 1 instead of mid?

After comparing arr[mid] with target:

If arr[mid] ≠ target, we know mid is NOT the answer
Including mid in the new range would re-examine it forever
Excluding mid (mid + 1 or mid - 1) ensures progress

The Infinite Loop Bug:

# BUGGY CODE - DON'T USE
while left <= right:
    mid = (left + right) // 2
    if arr[mid] < target:
        left = mid  # BUG: If left=mid and right=mid+1, mid never changes!
    else:
        right = mid

With left=5, right=6: mid = 5. If arr[5] < target, left stays 5, right stays 6, mid stays 5... forever.

Debugging Strategy

If your binary search loops forever, check two things:

Are you updating pointers with mid + 1 / mid - 1?
Is your loop condition correct for your pointer updates?

The combination must ensure the search space shrinks every iteration.

The Recursive Approach

The recursive approach expresses binary search as a self-referential definition: to search a range, check the middle and recursively search the appropriate half.

Recursive Structure:

Base case: Empty range (left > right) → return -1
Recursive case: Calculate mid, compare, and recurse on the appropriate half

This approach mirrors the mathematical definition of binary search and can be more intuitive for those comfortable with recursion.

binary_search_recursive.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
from typing import List
 
def binary_search_recursive(arr: List[int], target: int) -> int:
    """
    Recursive binary search (public interface).
    
    Args:
        arr: Sorted array in ascending order
        target: Value to search for
        
    Returns:
        Index of target if found, -1 otherwise
        
    Time Complexity: O(log n)
    Space Complexity: O(log n) - recursive call stack
    """
    if not arr:
        return -1
    return _binary_search_helper(arr, target, 0, len(arr) - 1)
 
 
def _binary_search_helper(
    arr: List[int], 
    target: int, 
    left: int, 
    right: int
) -> int:
    """
    Recursive helper with explicit bounds.
    
    The invariant: if target exists, it's in arr[left..right] inclusive.
    """
    # Base case: empty search space
    if left > right:
        return -1
    
    # Calculate midpoint
    mid = left + (right - left) // 2
    
    # Compare and recurse
    if arr[mid] == target:
        return mid
    elif arr[mid] < target:
        # Recurse on right half
        return _binary_search_helper(arr, target, mid + 1, right)
    else:
        # Recurse on left half
        return _binary_search_helper(arr, target, left, mid - 1)
 
 
# Tracing execution
def binary_search_recursive_traced(
    arr: List[int], 
    target: int,
    left: int = None,
    right: int = None,
    depth: int = 0
) -> int:
    """Version with execution tracing for understanding."""
    if left is None:
        left = 0
        right = len(arr) - 1
        print(f"Searching for {target} in {arr}")
        print("=" * 50)
    
    indent = "  " * depth
    print(f"{indent}Call: search([{left}..{right}], target={target})")
    
    if left > right:
        print(f"{indent}  → Base case: empty range, return -1")
        return -1
    
    mid = left + (right - left) // 2
    print(f"{indent}  mid={mid}, arr[mid]={arr[mid]}")
    
    if arr[mid] == target:
        print(f"{indent}  → Found! Return {mid}")
        return mid
    elif arr[mid] < target:
        print(f"{indent}  → {arr[mid]} < {target}, search right half")
        return binary_search_recursive_traced(arr, target, mid + 1, right, depth + 1)
    else:
        print(f"{indent}  → {arr[mid]} > {target}, search left half")
        return binary_search_recursive_traced(arr, target, left, mid - 1, depth + 1)
 
 
# Demo tracing
arr = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
binary_search_recursive_traced(arr, 23)

Call Stack Visualization:

Searching for 23 in [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]:

_helper(arr, 23, 0, 9)        # mid=4, arr[4]=16 < 23, go right
  └─ _helper(arr, 23, 5, 9)   # mid=7, arr[7]=56 > 23, go left
       └─ _helper(arr, 23, 5, 6)  # mid=5, arr[5]=23 == 23, FOUND!
            └─ returns 5
         returns 5
      returns 5

The call stack depth is O(log n), which represents additional memory usage compared to the iterative version.

Tail Recursion and Optimization

The recursive binary search is tail-recursive: the recursive call is the last operation in the function. Some languages/compilers can optimize tail recursion into iteration, eliminating stack overhead.

What is Tail Recursion?

A function is tail-recursive if the recursive call is the final action before returning. Compare:

# Tail recursive (good)
def factorial_tail(n, accumulator=1):
    if n <= 1:
        return accumulator
    return factorial_tail(n - 1, n * accumulator)  # Nothing after the call

# Not tail recursive
def factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)  # Multiplication happens AFTER the call returns

Binary Search is Tail-Recursive:

Our recursive implementation directly returns the recursive call:

return _binary_search_helper(arr, target, mid + 1, right)

No computation happens after the recursive call returns—we simply pass the result up the stack.

Language Support:

Language	Tail Call Optimization
Python	No (by design)
JavaScript	Yes (ES6+, limited)
Java	No
Scala	Yes (with @tailrec)
Rust	Planned
C/C++	Often (compiler-dependent)

Python's Design Choice

Python intentionally doesn't optimize tail calls because it would obscure stack traces, making debugging harder. Guido van Rossum argued that explicit loops are clearer than relying on tail call optimization. For Python, always prefer the iterative version for performance-critical code.

tail_recursion_comparison.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
import sys
import time
 
# Increase recursion limit for testing (careful in production!)
sys.setrecursionlimit(20000)
 
def binary_search_iterative(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = left + (right - left) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1
 
def binary_search_recursive(arr, target, left=None, right=None):
    if left is None:
        left, right = 0, len(arr) - 1
    if left > right:
        return -1
    mid = left + (right - left) // 2
    if arr[mid] == target:
        return mid
    elif arr[mid] < target:
        return binary_search_recursive(arr, target, mid + 1, right)
    else:
        return binary_search_recursive(arr, target, left, mid - 1)
 
 
# Performance comparison
def benchmark():
    arr = list(range(10000000))  # 10 million elements
    target = 7777777
    
    # Iterative
    start = time.perf_counter()
    for _ in range(1000):
        binary_search_iterative(arr, target)
    iter_time = time.perf_counter() - start
    
    # Recursive (fewer iterations due to stack limit concerns)
    start = time.perf_counter()
    for _ in range(1000):
        binary_search_recursive(arr, target)
    rec_time = time.perf_counter() - start
    
    print(f"Array size: {len(arr):,}")
    print(f"Iterative: {iter_time:.4f}s for 1000 searches")
    print(f"Recursive: {rec_time:.4f}s for 1000 searches")
    print(f"Overhead: {((rec_time - iter_time) / iter_time * 100):.1f}% slower")
 
benchmark()

Iterative vs. Recursive: Trade-offs

Both implementations achieve O(log n) time complexity, but they differ in several practical aspects:

Comparison of Iterative and Recursive Binary Search
Aspect	Iterative	Recursive
Time Complexity	O(log n)	O(log n)
Space Complexity	O(1)	O(log n) stack frames
Stack Overflow Risk	None	Possible for huge arrays
Code Length	Slightly longer	Often more concise
Readability	Clear loop structure	Mirrors mathematical definition
Debugging	Easy to trace with prints	Need to track call stack
Optimization	Already optimal	Needs tail-call optimization
Functional Style	Imperative	Functional/declarative

When to Use Iterative

•Production code in most languages
•Performance-critical applications
•Very large datasets (avoiding stack overflow)
•When memory footprint matters
•Technical interviews (shows awareness of trade-offs)

When to Use Recursive

•Functional programming contexts
•Languages with tail-call optimization
•Educational purposes (clearer logic)
•When array size is bounded and small
•Consistency with other recursive algorithms

Interview Recommendation

In technical interviews, implement the iterative version first—it demonstrates awareness of space efficiency. If asked about alternatives, explain the recursive version and discuss trade-offs. This shows depth of understanding.

Common Implementation Bugs and How to Avoid Them

Binary search is infamous for subtle bugs. Research by Jon Bentley found that 90% of professional programmers couldn't write a correct binary search when given 2 hours. Let's examine the most common mistakes:

common_bugs.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
# BUG 1: Wrong loop condition
def binary_search_bug1(arr, target):
    left, right = 0, len(arr) - 1
    while left < right:  # BUG: should be <=
        mid = left + (right - left) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1
 
# This fails on single-element arrays and when target is at bounds:
print(binary_search_bug1([5], 5))  # Returns -1, should return 0!
 
 
# BUG 2: Wrong pointer update (infinite loop)
def binary_search_bug2(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = left + (right - left) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid  # BUG: should be mid + 1
        else:
            right = mid  # BUG: should be mid - 1
    return -1
 
# This causes infinite loops:
# binary_search_bug2([1, 2, 3], 3)  # Hangs forever!
 
 
# BUG 3: Integer overflow (less relevant in Python, critical in Java/C++)
def binary_search_bug3(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2  # BUG in other languages!
        # For large arrays, left + right can overflow
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1
 
 
# BUG 4: Off-by-one in initial bounds
def binary_search_bug4(arr, target):
    left, right = 0, len(arr)  # BUG: right should be len(arr) - 1
    while left <= right:
        mid = left + (right - left) // 2
        if mid >= len(arr):  # Need this guard now
            return -1
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1
 
 
# BUG 5: Wrong comparison direction
def binary_search_bug5(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = left + (right - left) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] > target:  # BUG: comparison reversed!
            left = mid + 1  # Going the wrong direction
        else:
            right = mid - 1
    return -1
 
# This fails for most searches:
print(binary_search_bug5([1, 2, 3, 4, 5], 2))  # Returns -1, should return 1!
 
 
# THE CORRECT IMPLEMENTATION
def binary_search_correct(arr, target):
    left, right = 0, len(arr) - 1  # Correct bounds
    while left <= right:  # Correct condition
        mid = left + (right - left) // 2  # Overflow-safe
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:  # Correct comparison
            left = mid + 1  # Correct update
        else:
            right = mid - 1  # Correct update
    return -1

Bug Prevention Checklist

•Check loop condition: Use left <= right for inclusive bounds
•Verify pointer updates: Ensure mid + 1 and mid - 1 to prevent infinite loops
•Use safe midpoint: left + (right - left) // 2 avoids overflow
•Test edge cases: Empty array, single element, target at ends, target not present
•Match comparison with update: arr[mid] < target → left = mid + 1 (search right)

Generic Binary Search with Custom Comparators

Real-world applications often search arrays of complex objects, not just integers. A generic binary search accepts a comparator function to handle any sortable data type.

generic_binary_search.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
from typing import TypeVar, Sequence, Callable, Optional
from dataclasses import dataclass
 
T = TypeVar('T')
 
def binary_search_generic(
    arr: Sequence[T],
    target: T,
    key: Callable[[T], any] = lambda x: x,
    compare: Callable[[any, any], int] = lambda a, b: (a > b) - (a < b)
) -> int:
    """
    Generic binary search with custom key extraction and comparison.
    
    Args:
        arr: Sorted sequence
        target: Value to search for
        key: Function to extract comparison key from elements
        compare: Comparison function returning negative/zero/positive
        
    Returns:
        Index of target if found, -1 otherwise
    """
    left, right = 0, len(arr) - 1
    target_key = key(target) if not callable(target) or hasattr(target, '__class__') else target
    
    while left <= right:
        mid = left + (right - left) // 2
        mid_key = key(arr[mid])
        
        cmp = compare(mid_key, target_key)
        
        if cmp == 0:
            return mid
        elif cmp < 0:
            left = mid + 1
        else:
            right = mid - 1
    
    return -1
 
 
# Example: Searching employees by ID
@dataclass
class Employee:
    id: int
    name: str
    salary: float
 
employees = [
    Employee(101, "Alice", 75000),
    Employee(205, "Bob", 82000),
    Employee(308, "Charlie", 67000),
    Employee(412, "Diana", 91000),
    Employee(515, "Eve", 78000),
]  # Sorted by id
 
# Search by employee ID
target_id = 308
result = binary_search_generic(
    employees, 
    target_id,
    key=lambda e: e.id if isinstance(e, Employee) else e
)
print(f"Employee with ID {target_id}: {employees[result] if result != -1 else 'Not found'}")
 
 
# Example: Searching strings case-insensitively
words = ["Apple", "banana", "Cherry", "Date", "elderberry"]
# Note: This list is NOT sorted case-insensitively!
words_sorted = sorted(words, key=str.lower)  # Proper sorted order
 
result = binary_search_generic(
    words_sorted,
    "CHERRY",
    key=lambda s: s.lower()
)
print(f"Found 'CHERRY' at index: {result}")  # Works because of case-insensitive key
 
 
# Example: Searching with custom comparison (descending order)
descending_arr = [100, 90, 80, 70, 60, 50, 40, 30, 20, 10]
 
result = binary_search_generic(
    descending_arr,
    70,
    compare=lambda a, b: (b > a) - (b < a)  # Reversed comparison
)
print(f"Found 70 at index: {result}")  # 3

Key Consistency Warning

The array must be sorted using the same key and comparison as the search. Sorting by name then searching by ID will produce incorrect results. Always ensure your sort and search criteria match.

Summary: Mastering Binary Search Implementation

You now have complete implementations of binary search in both iterative and recursive forms. The key to bug-free implementations is understanding the invariants and being precise with boundary conditions.

Let's consolidate the essential points:

Key Takeaways

•The Iterative Version — Uses O(1) space, avoids stack overflow, and is the preferred implementation for most production use cases.
•The Recursive Version — Mirrors the mathematical definition, useful in functional contexts, but uses O(log n) stack space.
•Loop Invariant — 'If the target exists, it's in [left, right]'. Understanding this invariant is key to correct implementations.
•Critical Details — Use left <= right, update with mid ± 1, and calculate midpoint safely to avoid infinite loops and overflow.
•Common Bugs — Most bugs come from wrong loop conditions, wrong pointer updates, or mismatched comparisons. Test edge cases rigorously.
•Generic Implementation — Custom key and comparator functions allow searching any sorted sequence of complex objects.

What's Next:

With core implementations mastered, the final page explores binary search's time complexity in depth. We'll prove the O(log n) bound, compare it with linear search across different scales, and understand why this efficiency makes binary search indispensable.

Page Complete

You can now implement binary search correctly in both iterative and recursive forms. You understand the critical implementation details that separate working code from subtle bugs. With these skills, you're ready to explore the algorithm's complexity analysis and truly appreciate why binary search is one of the most important algorithms in computer science.