Binary Search Variations - Learning Module

Loading content...

0/276

Finding Insertion Position

When the Target Doesn't Exist

In the previous pages, we focused on finding the first or last occurrence of a target that exists in the array. But what happens when the target is not present?

This isn't an error case—it's often the primary use case. Consider these scenarios:

Inserting a new value into a sorted array while maintaining sorted order
Finding how many elements are less than (or greater than) a threshold
Implementing ceiling/floor operations (smallest element ≥ target, largest ≤ target)
Range queries on data that may not contain exact boundary values

In all these cases, we need to find where a value would go if inserted—the insertion position. This page shows that you already know this algorithm: it's simply leftmost binary search, but interpreted differently when the target is missing.

What You Will Learn

By the end of this page, you will understand: (1) The insertion position is the same as the leftmost boundary, (2) How to correctly interpret the result when the target is missing, (3) The difference between insertion-stable and insertion-unstable positions, (4) Practical applications like maintaining sorted order and implementing ceiling/floor operations.

The Insertion Position Concept

The insertion position for a value target in a sorted array is the index i such that:

All elements at indices < i are < target
All elements at indices ≥ i are ≥ target
If we insert target at index i, the array remains sorted

This is exactly what leftmost binary search returns! The boundary position is the insertion position.

Visualization:

Array: [1, 3, 5, 7, 9]

Insertion position for 6:
[1, 3, 5, _, 7, 9]
         ↑
         Position 3 (between 5 and 7)

After insertion:
[1, 3, 5, 6, 7, 9]

Insertion Positions for Various Targets in [1, 3, 5, 7, 9]
Target	Insertion Position	Explanation
0	0	Smaller than all elements → insert at beginning
1	0	Equals first element → insert before or at position 0
2	1	Between 1 and 3 → insert at position 1
3	1	Equals second element → insert at position 1
4	2	Between 3 and 5 → insert at position 2
6	3	Between 5 and 7 → insert at position 3
9	4	Equals last element → insert at position 4
10	5	Greater than all → insert at end (len=5)

The Key Insight

Leftmost binary search always returns a valid insertion position, regardless of whether the target exists. When the target exists, the insertion position happens to be the index of its first occurrence. When it doesn't exist, it's where the target would go to maintain sorted order. The algorithm doesn't need to change—only the interpretation of the result.

Left vs Right Insertion Position

When the target already exists in the array, there are actually two valid insertion positions:

Left insertion position (bisect_left): Insert before all existing copies of target
Right insertion position (bisect_right): Insert after all existing copies of target

Both maintain sorted order, but they have different semantics and use cases.

left_vs_right_insertion.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
import bisect
 
arr = [1, 3, 3, 3, 5, 7]
 
# Target 3 exists multiple times
# Where should we insert a new 3?
 
# LEFT insertion: insert BEFORE existing 3s
left_pos = bisect.bisect_left(arr, 3)
print(f"Left insertion position for 3: {left_pos}")   # 1
 
# RIGHT insertion: insert AFTER existing 3s  
right_pos = bisect.bisect_right(arr, 3)
print(f"Right insertion position for 3: {right_pos}") # 4
 
# Visualizing:
# Original: [1, 3, 3, 3, 5, 7]
#              ↑     ↑
#              1     4
#           left    right
 
# After left insertion of 3:  [1, 3*, 3, 3, 3, 5, 7]  (new 3 at index 1)
# After right insertion of 3: [1, 3, 3, 3, 3*, 5, 7]  (new 3 at index 4)
 
# Both are valid sorted arrays!
# The choice depends on what behavior you want.
 
 
# When target DOESN'T exist, left and right return the SAME position
arr2 = [1, 2, 4, 5]  # No 3
print(f"bisect_left(arr2, 3) = {bisect.bisect_left(arr2, 3)}")   # 2
print(f"bisect_right(arr2, 3) = {bisect.bisect_right(arr2, 3)}") # 2
# Both agree: position 2 is where 3 would go
 
 
# Practical example: maintaining a sorted list with insort
scores = [72, 85, 90]
 
# insort_left: insert new element before equal elements (stable at front)
bisect.insort_left(scores, 85)
print(f"After insort_left(85): {scores}")  # [72, 85, 85, 90]
 
# insort_right (or just insort): insert after equal elements (stable at back)
scores2 = [72, 85, 90]
bisect.insort_right(scores2, 85)
print(f"After insort_right(85): {scores2}")  # [72, 85, 85, 90]
 
# Note: bisect.insort() is an alias for insort_right()

Left vs Right Insertion: When to Use Which
Scenario	Use Left (bisect_left)	Use Right (bisect_right)
Default insertion		✓ (preserves insertion order among equals)
Find first occurrence	✓
Find last occurrence		✓ (then subtract 1)
Count elements < target	✓
Count elements ≤ target		✓
Stable sort (preserve order of equals)		✓
Priority queue (LIFO for equals)	✓

Stability and Insertion Order

Using bisect_right (right insertion) preserves the order in which equal elements were inserted. If you insert A, B, C (all equal), they appear in that order. This is 'stable' behavior. Using bisect_left would reverse this order, with C, B, A appearing in the array. Choose based on your application's needs.

The Insertion Position Algorithm

The insertion position algorithm is identical to leftmost binary search—we're just interpreting the result as "where to insert" rather than "where is the first match."

The Crucial Difference:

When finding the first occurrence, we return -1 if the target doesn't exist. For insertion position, we always return a valid position (0 to len(arr)), because the question "where would this go?" always has an answer.

insertion_position.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
from typing import List
 
def find_insertion_position(arr: List[int], target: int) -> int:
    """
    Find the leftmost position where target could be inserted
    while maintaining sorted order.
    
    This is identical to bisect.bisect_left().
    
    Returns:
        An index i where 0 <= i <= len(arr) such that:
        - All elements at indices < i are strictly < target
        - All elements at indices >= i are >= target
        - Inserting at position i maintains sorted order
    
    Time Complexity: O(log n)
    Space Complexity: O(1)
    """
    left = 0
    right = len(arr)  # Exclusive right boundary; allows returning len(arr)
    
    while left < right:
        mid = left + (right - left) // 2
        
        if arr[mid] < target:
            left = mid + 1
        else:  # arr[mid] >= target
            right = mid
    
    # Always return left - it's always a valid insertion position
    # No need to check if target exists!
    return left
 
 
def find_right_insertion_position(arr: List[int], target: int) -> int:
    """
    Find the rightmost position where target could be inserted
    while maintaining sorted order.
    
    This is identical to bisect.bisect_right().
    """
    left = 0
    right = len(arr)
    
    while left < right:
        mid = left + (right - left) // 2
        
        if arr[mid] <= target:  # Note: <= instead of <
            left = mid + 1
        else:
            right = mid
    
    return left
 
 
# Demonstration of different interpretations
def demo_interpretations():
    arr = [1, 3, 5, 7, 9]
    
    # Case 1: Target exists
    target = 5
    pos = find_insertion_position(arr, target)
    print(f"Target {target} in {arr}")
    print(f"  Insertion position: {pos}")
    print(f"  Element at that position: {arr[pos]}")
    print(f"  Interpretation as 'first occurrence': index {pos}")
    print()
    
    # Case 2: Target doesn't exist
    target = 6
    pos = find_insertion_position(arr, target)
    print(f"Target {target} in {arr}")
    print(f"  Insertion position: {pos}")
    print(f"  Would insert between {arr[pos-1]} and {arr[pos]}")
    print(f"  After insertion: {arr[:pos] + [target] + arr[pos:]}")
    print()
    
    # Case 3: Target smaller than all
    target = 0
    pos = find_insertion_position(arr, target)
    print(f"Target {target} in {arr}")
    print(f"  Insertion position: {pos}")
    print(f"  After insertion: {[target] + arr}")
    print()
    
    # Case 4: Target larger than all
    target = 100
    pos = find_insertion_position(arr, target)
    print(f"Target {target} in {arr}")
    print(f"  Insertion position: {pos}")
    print(f"  After insertion: {arr + [target]}")
 
 
demo_interpretations()

Ceiling and Floor Operations

Two common operations built on insertion position are ceiling and floor:

Ceiling(x): The smallest element in the array that is ≥ x
Floor(x): The largest element in the array that is ≤ x

These naturally emerge from the insertion position:

Operation	Definition	How to Find
Ceiling	Smallest element ≥ x	Element at `bisect_left` position (if valid)
Floor	Largest element ≤ x	Element at `bisect_right - 1` position (if valid)

ceiling_floor.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
import bisect
from typing import List, Optional
 
def ceiling(arr: List[int], x: int) -> Optional[int]:
    """
    Find the smallest element in arr that is >= x.
    Returns None if all elements are < x.
    
    The ceiling is at the insertion position (bisect_left).
    
    Time: O(log n)
    """
    pos = bisect.bisect_left(arr, x)
    
    if pos < len(arr):
        return arr[pos]  # This element is >= x
    return None  # All elements are < x
 
 
def floor(arr: List[int], x: int) -> Optional[int]:
    """
    Find the largest element in arr that is <= x.
    Returns None if all elements are > x.
    
    The floor is just before the right insertion position.
    bisect_right gives first element > x, so floor is at pos - 1.
    
    Time: O(log n)
    """
    pos = bisect.bisect_right(arr, x)
    
    if pos > 0:
        return arr[pos - 1]  # This element is <= x
    return None  # All elements are > x
 
 
# Demonstration
arr = [10, 20, 30, 40, 50]
 
# Ceiling examples
print("Ceiling examples:")
print(f"  ceiling({arr}, 25) = {ceiling(arr, 25)}")  # 30 (smallest >= 25)
print(f"  ceiling({arr}, 30) = {ceiling(arr, 30)}")  # 30 (exact match)
print(f"  ceiling({arr}, 5) = {ceiling(arr, 5)}")    # 10 (all >= 5)
print(f"  ceiling({arr}, 55) = {ceiling(arr, 55)}")  # None (55 > all)
 
# Floor examples
print("\nFloor examples:")
print(f"  floor({arr}, 25) = {floor(arr, 25)}")  # 20 (largest <= 25)
print(f"  floor({arr}, 30) = {floor(arr, 30)}")  # 30 (exact match)
print(f"  floor({arr}, 55) = {floor(arr, 55)}")  # 50 (all <= 55)
print(f"  floor({arr}, 5) = {floor(arr, 5)}")    # None (5 < all)
 
 
# Alternative implementations without bisect
def ceiling_from_scratch(arr: List[int], x: int) -> Optional[int]:
    """Ceiling using manual binary search."""
    left = 0
    right = len(arr)
    
    while left < right:
        mid = left + (right - left) // 2
        if arr[mid] < x:
            left = mid + 1
        else:
            right = mid
    
    if left < len(arr):
        return arr[left]
    return None
 
 
def floor_from_scratch(arr: List[int], x: int) -> Optional[int]:
    """Floor using manual binary search."""
    left = 0
    right = len(arr)
    
    while left < right:
        mid = left + (right - left) // 2
        if arr[mid] <= x:
            left = mid + 1
        else:
            right = mid
    
    if left > 0:
        return arr[left - 1]
    return None
 
 
# Verify implementations match
print("\nVerification against bisect-based implementation:")
test_values = [5, 10, 15, 25, 30, 45, 50, 55]
for val in test_values:
    assert ceiling(arr, val) == ceiling_from_scratch(arr, val), f"Ceiling mismatch for {val}"
    assert floor(arr, val) == floor_from_scratch(arr, val), f"Floor mismatch for {val}"
print("All tests passed!")

Practical Uses of Ceiling and Floor

Ceiling is useful for: finding the next available time slot ≥ requested time, the smallest price ≥ budget, or the first version ≥ target.

Floor is useful for: finding the most recent event before a timestamp, the largest denomination ≤ remaining amount, or the latest version ≤ target.

Counting Applications

Insertion positions directly translate to counting queries. This is because the insertion position tells us how many elements are "before" our target in the sorted order.

Counting Queries Using Insertion Position
Query	Formula	Explanation
Count < x	bisect_left(arr, x)	Elements before insertion point
Count ≤ x	bisect_right(arr, x)	Elements before right insertion point
Count > x	len(arr) - bisect_right(arr, x)	Elements after right insertion point
Count ≥ x	len(arr) - bisect_left(arr, x)	Elements at or after insertion point
Count == x	bisect_right(arr, x) - bisect_left(arr, x)	Between left and right insertion
Count in [a, b]	bisect_right(arr, b) - bisect_left(arr, a)	Range query

counting_queries.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
import bisect
from typing import List
 
class SortedCounter:
    """
    A class that maintains a sorted list and answers counting queries in O(log n).
    """
    def __init__(self, arr: List[int]):
        self.arr = sorted(arr)
    
    def count_less_than(self, x: int) -> int:
        """Count elements strictly less than x."""
        return bisect.bisect_left(self.arr, x)
    
    def count_less_or_equal(self, x: int) -> int:
        """Count elements less than or equal to x."""
        return bisect.bisect_right(self.arr, x)
    
    def count_greater_than(self, x: int) -> int:
        """Count elements strictly greater than x."""
        return len(self.arr) - bisect.bisect_right(self.arr, x)
    
    def count_greater_or_equal(self, x: int) -> int:
        """Count elements greater than or equal to x."""
        return len(self.arr) - bisect.bisect_left(self.arr, x)
    
    def count_equal(self, x: int) -> int:
        """Count elements equal to x."""
        return bisect.bisect_right(self.arr, x) - bisect.bisect_left(self.arr, x)
    
    def count_in_range(self, low: int, high: int) -> int:
        """Count elements in [low, high] (inclusive)."""
        return bisect.bisect_right(self.arr, high) - bisect.bisect_left(self.arr, low)
 
 
# Example usage
scores = SortedCounter([55, 60, 65, 70, 70, 75, 80, 85, 90, 95, 100])
 
print("Score distribution analysis:")
print(f"  Scores: {scores.arr}")
print(f"  Failed (< 60): {scores.count_less_than(60)}")              # 1
print(f"  Passed (>= 60): {scores.count_greater_or_equal(60)}")      # 10
print(f"  Grade A (>= 90): {scores.count_greater_or_equal(90)}")     # 3
print(f"  Grade B [80, 90): {scores.count_in_range(80, 89)}")        # 2
print(f"  Exact 70: {scores.count_equal(70)}")                        # 2
 
 
# Practical example: Rank in a sorted list
def find_rank(arr: List[int], x: int) -> int:
    """
    Find the rank of x in a sorted array.
    Rank is 1-indexed (1st, 2nd, 3rd...).
    
    Rank = count of elements less than x + 1
    """
    return bisect.bisect_left(arr, x) + 1
 
 
# Example: Leaderboard ranking
leaderboard = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
player_score = 550
 
rank = find_rank(leaderboard, player_score)
print(f"\nPlayer with score {player_score} would rank #{rank} out of {len(leaderboard)+1}")
# Rank = 6 (would be 6th if inserted, with 5 scores below)
 
 
# Percentile calculation
def percentile(arr: List[int], x: int) -> float:
    """
    Calculate the percentile rank of x in a sorted array.
    Returns what percentage of values are <= x.
    """
    if not arr:
        return 0.0
    count_leq = bisect.bisect_right(arr, x)
    return (count_leq / len(arr)) * 100
 
 
print(f"Score {player_score} is in the {percentile(leaderboard, player_score):.1f}th percentile")

Maintaining Sorted Order in Dynamic Data

One of the most practical applications of insertion position is maintaining a sorted collection as new elements arrive. This pattern appears in:

Real-time leaderboards
Event logs with sorted timestamps
Order books in trading systems
Sliding window problems

sorted_container.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
import bisect
from typing import List, Optional
 
class SortedList:
    """
    A list that maintains sorted order.
    Insertions: O(log n) to find position + O(n) to shift (total O(n))
    Lookups: O(log n)
    
    Note: For truly efficient O(log n) insertions, use a balanced BST
    or Python's sortedcontainers.SortedList. This implementation
    demonstrates the binary search aspect.
    """
    def __init__(self):
        self._data: List[int] = []
    
    def insert(self, value: int) -> int:
        """
        Insert value into sorted position.
        Returns the insertion index.
        """
        pos = bisect.bisect_left(self._data, value)
        self._data.insert(pos, value)  # O(n) shift
        return pos
    
    def remove(self, value: int) -> bool:
        """
        Remove first occurrence of value.
        Returns True if found and removed.
        """
        pos = bisect.bisect_left(self._data, value)
        if pos < len(self._data) and self._data[pos] == value:
            self._data.pop(pos)  # O(n) shift
            return True
        return False
    
    def __contains__(self, value: int) -> bool:
        """Check if value exists. O(log n)"""
        pos = bisect.bisect_left(self._data, value)
        return pos < len(self._data) and self._data[pos] == value
    
    def __len__(self) -> int:
        return len(self._data)
    
    def __repr__(self) -> str:
        return f"SortedList({self._data})"
    
    def get_all(self) -> List[int]:
        return self._data.copy()
 
 
# Demo: Real-time leaderboard
class Leaderboard:
    """
    A leaderboard that maintains high scores in descending order.
    """
    def __init__(self, max_entries: int = 10):
        self._scores: List[int] = []  # Stored in ascending order
        self._max = max_entries
    
    def add_score(self, score: int) -> Optional[int]:
        """
        Add a score to the leaderboard.
        Returns the rank (1-indexed) if score makes the leaderboard, else None.
        """
        # Find position (ascending order, so we look from the other end)
        pos = bisect.bisect_left(self._scores, score)
        
        # For descending leaderboard, rank = len - pos
        if pos < self._max or len(self._scores) < self._max:
            bisect.insort_left(self._scores, score)
            
            # Trim to max entries (remove lowest if over limit)
            if len(self._scores) > self._max:
                self._scores.pop(0)  # Remove smallest (lowest rank)
            
            # Return rank (1-indexed, descending)
            return len(self._scores) - bisect.bisect_left(self._scores, score)
        
        return None  # Didn't make the leaderboard
    
    def get_leaderboard(self) -> List[int]:
        """Return scores in descending order."""
        return self._scores[::-1]
    
    def get_rank(self, score: int) -> int:
        """Get rank for a given score (hypothetical if not on board)."""
        pos = bisect.bisect_right(self._scores, score)
        return len(self._scores) - pos + 1
 
 
# Demo
lb = Leaderboard(max_entries=5)
incoming_scores = [100, 250, 180, 300, 150, 400, 50, 350, 200, 500]
 
print("Building leaderboard:")
for score in incoming_scores:
    rank = lb.add_score(score)
    if rank:
        print(f"  Score {score} placed at rank #{rank}")
    else:
        print(f"  Score {score} didn't make the board")
 
print(f"\nFinal leaderboard: {lb.get_leaderboard()}")  # [500, 400, 350, 300, 250]

Performance Consideration

While finding the insertion position is O(log n), actually inserting into an array/list is O(n) because elements must be shifted. For truly O(log n) insertions, use a balanced binary search tree (like Python's sortedcontainers.SortedList, Java's TreeSet, or C++'s std::set). The pattern we've learned (find position via binary search) is the same—the underlying data structure changes.

Edge Cases for Insertion Position

Insertion position has fewer failure modes than first/last occurrence searches because it always returns a valid result. However, edge cases still matter for correct interpretation.

Edge Cases for Insertion Position
Edge Case	Array	Target	Left Position	Right Position	Notes
Empty array	[]	5	0	0	Only valid position is 0
Target < all	[3, 5, 7]	1	0	0	Insert at beginning
Target > all	[3, 5, 7]	9	3	3	Insert at end (len)
Target equals first	[3, 5, 7]	3	0	1	Left=before, Right=after
Target equals last	[3, 5, 7]	7	2	3	Left=before last, Right=at len
All equal	[5, 5, 5]	5	0	3	Full range difference
Single element	[5]	5	0	1	Left=0, Right=1
Between elements	[1, 5, 9]	3	1	1	Same for non-existent

Key Points:

Insertion position is always valid: It's always in range [0, len(arr)], never -1.
Position == len(arr) is valid: This means "insert at the end" because the target is greater than all elements.
Left == Right means target doesn't exist: If bisect_left(arr, x) == bisect_right(arr, x), then x is not in the array. This is a reliable existence check.
Empty array returns 0: The only valid position in an empty array is index 0.

Checking Existence via Insertion Position

A clean way to check if a target exists:

pos = bisect.bisect_left(arr, target)
exists = pos < len(arr) and arr[pos] == target

Or equivalently:

exists = bisect.bisect_left(arr, target) != bisect.bisect_right(arr, target)

The second form requires two searches but more clearly expresses "the range of equal elements is non-empty."

Summary: Finding Insertion Position

Finding the insertion position unifies several important concepts: it's the same algorithm as leftmost/rightmost boundary search, just interpreted differently for cases where the target may not exist.

Key Takeaways

•Same algorithm, different interpretation — Leftmost binary search returns the insertion position. When target exists, it's the first occurrence. When not, it's where target would go.
•Left vs Right insertion — bisect_left inserts before equal elements; bisect_right inserts after. Choose based on whether you need stable (ordered) duplicates.
•Always returns valid position — Unlike finding first occurrence, insertion position never returns -1. Range is [0, len(arr)].
•Ceiling and Floor — Ceiling(x) = element at bisect_left position (if valid). Floor(x) = element at bisect_right position - 1 (if valid).
•Counting queries are free — bisect_left gives count < x. bisect_right gives count ≤ x. Combine for any counting query.
•Foundation for dynamic sorted data — Insertion position is how you maintain sorted order in real-time systems.

Insertion Position Quick Reference
Need	Use	Result
Where to insert (before equals)	bisect_left	First position ≥ target
Where to insert (after equals)	bisect_right	First position > target
Smallest element ≥ x	arr[bisect_left(x)]	Ceiling (check bounds)
Largest element ≤ x	arr[bisect_right(x)-1]	Floor (check bounds)
Count elements < x	bisect_left(x)	Direct result
Count elements ≤ x	bisect_right(x)	Direct result
Does x exist?	bisect_left(x) != bisect_right(x)	True if exists

What's Next:

In the next page, we'll tackle Handling Duplicates Correctly—a comprehensive look at how duplicates complicate binary search and the systematic approaches to handle them. We'll see how leftmost and rightmost search are the building blocks for all duplicate-handling strategies.

Page Complete

You've mastered insertion position finding and its many applications. You understand how it relates to first/last occurrence searching, how to implement ceiling/floor operations, and how to answer counting queries in O(log n). This knowledge forms the foundation for maintaining sorted dynamic data structures.