The merge operation is the crown jewel of merge sort—the moment when two sorted halves fuse into a single, perfectly sorted whole. While the divide phase is trivial (O(1)) and the conquer phase is recursive delegation, the combine phase is where the real work happens.
Merging two sorted arrays might seem straightforward: "just combine them in order." But executing this efficiently—achieving O(n) time while maintaining stability—requires careful algorithmic design. In this page, we'll dissect the merge operation completely, understanding every comparison, every copy, and every invariant that makes it correct.
By the end of this page, you will understand the merge operation at the deepest level: the two-pointer technique, the loop invariants that guarantee correctness, handling remaining elements, the O(n) auxiliary space requirement, and how stability is preserved. You'll be able to implement merge correctly and debug any merge-related issues.
Before diving into implementation, let's precisely define the merge problem:
Given:
- An array A with a subarray A[left..right]
- An index mid with left ≤ mid < right such that:
  - A[left..mid] is sorted (call this the "left half")
  - A[mid+1..right] is sorted (call this the "right half")

Goal:
- Rearrange A[left..right] so that it contains the same elements, but in sorted order

Constraints:
- Run in O(n) time, where n = right - left + 1
- Preserve stability: equal elements keep their original relative order
The merge operation is efficient precisely because its inputs are sorted. If we had two unsorted arrays, combining them would require O(n log n) sorting. But with sorted inputs, we can exploit a powerful property: the smallest unprocessed element is always at the front of one of the two halves.
Why Merge Is O(n)
The efficiency of merge comes from a simple observation:
Each comparison places one element in its final position.
With n elements total and each comparison resolving one element's position, we need at most n-1 comparisons. Adding the copies to and from the auxiliary array (both O(n)), total time is O(n).
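To see the bound concretely, here's a minimal standalone sketch (separate from the implementation later on this page) that merges two sorted lists while counting comparisons; for n total elements the count never exceeds n - 1:

```python
def merge_count(left_half, right_half):
    """Merge two already-sorted lists, returning the result and the comparison count."""
    merged = []
    comparisons = 0
    i = j = 0
    while i < len(left_half) and j < len(right_half):
        comparisons += 1
        if left_half[i] <= right_half[j]:
            merged.append(left_half[i])
            i += 1
        else:
            merged.append(right_half[j])
            j += 1
    merged.extend(left_half[i:])   # one of these two extends is empty,
    merged.extend(right_half[j:])  # and the other needs no comparisons at all
    return merged, comparisons


result, comparisons = merge_count([27, 38, 43], [3, 9, 82])
print(result, comparisons)  # [3, 9, 27, 38, 43, 82] 5 -- exactly n - 1 for this input
```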
This linear-time merge is what makes merge sort's O(n log n) total complexity possible. If merge were O(n²), the overall algorithm would be O(n² log n)—no better than simple quadratic sorts!
The merge operation uses the two-pointer technique—a fundamental algorithmic pattern that appears throughout computer science. The core idea: keep one pointer at the front of each half, repeatedly compare the two elements under the pointers, copy the smaller one to the output, and advance only that pointer, repeating until one half is exhausted.
Visualization:
Left half: [27, 38, 43] Right half: [3, 9, 82]
↑ i ↑ j
Output: []
Compare 27 vs 3: 3 is smaller
Left half: [27, 38, 43] Right half: [3, 9, 82]
↑ i ↑ j (advanced)
Output: [3]
Compare 27 vs 9: 9 is smaller
Left half: [27, 38, 43] Right half: [3, 9, 82]
↑ i ↑ j (advanced)
Output: [3, 9]
Compare 27 vs 82: 27 is smaller
Left half: [27, 38, 43] Right half: [3, 9, 82]
↑ i (advanced) ↑ j
Output: [3, 9, 27]
... and so on until both halves are exhausted
```python
def merge(arr, left, mid, right):
    """
    Merge two sorted subarrays arr[left..mid] and arr[mid+1..right].
    Uses O(n) auxiliary space.
    """
    # Step 1: Create copies of the two halves
    left_half = arr[left:mid + 1]        # Copy left half
    right_half = arr[mid + 1:right + 1]  # Copy right half

    # Step 2: Initialize pointers
    i = 0     # Pointer for left_half
    j = 0     # Pointer for right_half
    k = left  # Pointer for the main array (output position)

    # Step 3: Main merge loop - compare and place
    while i < len(left_half) and j < len(right_half):
        if left_half[i] <= right_half[j]:
            # Left element is smaller (or equal - stability!)
            arr[k] = left_half[i]
            i += 1
        else:
            # Right element is smaller
            arr[k] = right_half[j]
            j += 1
        k += 1

    # Step 4: Copy remaining elements from left half (if any)
    while i < len(left_half):
        arr[k] = left_half[i]
        i += 1
        k += 1

    # Step 5: Copy remaining elements from right half (if any)
    while j < len(right_half):
        arr[k] = right_half[j]
        j += 1
        k += 1
```

Using <= (less than or equal) instead of < is essential for stability. When elements are equal, we take from the LEFT half first. Since the left half's elements came before the right half's in the original array, this preserves their relative order.
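Assuming the merge function above is in scope, a quick sanity check on the example from the visualization (mid = 2 separates the sorted runs [27, 38, 43] and [3, 9, 82]):

```python
arr = [27, 38, 43, 3, 9, 82]  # arr[0..2] and arr[3..5] are each sorted
merge(arr, 0, 2, 5)           # left=0, mid=2, right=5
print(arr)                    # [3, 9, 27, 38, 43, 82]
```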
To prove the merge operation correct, we use loop invariants—conditions that remain true before and after each loop iteration. A correct invariant, combined with proper initialization and termination, proves algorithm correctness.
Main Loop Invariant:
At the start of each iteration of the main while loop:
- arr[left..k-1] contains the k-left smallest elements from the original two halves, in sorted order
- left_half[i] and right_half[j] are the smallest unprocessed elements from their respective halves
- Every element in arr[left..k-1] is ≤ left_half[i] and ≤ right_half[j]
Initialization (before the loop): i = 0, j = 0, and k = left, so arr[left..k-1] is empty. An empty prefix is trivially sorted, contains the zero smallest elements, and every claim about its elements holds vacuously, so the invariant is true before the first iteration.
Loop invariants are powerful because they reduce a complex algorithm to a simple claim that we verify three times: at initialization, through one iteration (maintenance), and at termination. If all three hold, correctness follows by induction over iterations.
Maintenance (during the loop):
Assume the invariant holds at the start of an iteration where both i < len(left_half) and j < len(right_half).
Case 1: left_half[i] <= right_half[j]. Then left_half[i] is the smallest unprocessed element overall. Copying it into arr[k] keeps arr[left..k] sorted (by part 3 of the invariant it is ≥ everything already placed), and arr[left..k] now holds the k-left+1 smallest elements. Incrementing i and k restores the invariant.

Case 2: left_half[i] > right_half[j]. Symmetric: right_half[j] is the smallest unprocessed element, so copying it into arr[k] and advancing j and k preserves the invariant.
Termination (after the loop):
The loop terminates when i = len(left_half) OR j = len(right_half). At that point arr[left..k-1] holds, in sorted order, the k-left smallest elements, and every remaining element in the non-exhausted half is ≥ arr[k-1]. The cleanup loops copy those remaining (already sorted) elements into arr[k..right], so the whole range arr[left..right] ends up sorted.
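If you want to watch these conditions hold on real data, here's an illustrative variant (a study aid, not the implementation above) that asserts the invariant at the top of every iteration of the main loop:

```python
def merge_checked(arr, left, mid, right):
    """Merge that asserts the loop invariant before each iteration (for study only)."""
    left_half = arr[left:mid + 1]
    right_half = arr[mid + 1:right + 1]
    all_elems = sorted(left_half + right_half)
    i, j, k = 0, 0, left

    while i < len(left_half) and j < len(right_half):
        placed = arr[left:k]
        # Invariant parts 1 and 2: the prefix is exactly the k - left smallest, in order
        assert placed == all_elems[:k - left]
        # Invariant part 3: everything placed is <= both frontier elements
        assert all(x <= left_half[i] and x <= right_half[j] for x in placed)

        if left_half[i] <= right_half[j]:
            arr[k] = left_half[i]
            i += 1
        else:
            arr[k] = right_half[j]
            j += 1
        k += 1

    # Cleanup loops, as in the plain implementation
    while i < len(left_half):
        arr[k] = left_half[i]
        i += 1
        k += 1
    while j < len(right_half):
        arr[k] = right_half[j]
        j += 1
        k += 1


data = [27, 38, 43, 3, 9, 82]
merge_checked(data, 0, 2, 5)  # no assertion fires
print(data)                   # [3, 9, 27, 38, 43, 82]
```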
Here is a complete trace of merging left_half = [27, 38, 43] (L) with right_half = [3, 9, 82] (R):

| Step | i | j | k | arr state | Next action |
|---|---|---|---|---|---|
| Init | 0 | 0 | 0 | [_, _, _, _, _, _] | Compare L[0]=27 vs R[0]=3 |
| 1 | 0 | 1 | 1 | [3, _, _, _, _, _] | Compare L[0]=27 vs R[1]=9 |
| 2 | 0 | 2 | 2 | [3, 9, _, _, _, _] | Compare L[0]=27 vs R[2]=82 |
| 3 | 1 | 2 | 3 | [3, 9, 27, _, _, _] | Compare L[1]=38 vs R[2]=82 |
| 4 | 2 | 2 | 4 | [3, 9, 27, 38, _, _] | Compare L[2]=43 vs R[2]=82 |
| 5 | 3 | 2 | 5 | [3, 9, 27, 38, 43, _] | L exhausted, copy R remaining |
| 6 | 3 | 3 | 6 | [3, 9, 27, 38, 43, 82] | Done! |
After the main merge loop, one of the two halves will be exhausted while the other may still have elements. We must copy these remaining elements to complete the merge.
Key Insight: At Most One Cleanup Loop Runs
The main loop terminates when i >= len(left_half) OR j >= len(right_half). So: if the left half is exhausted first, only the right-half cleanup loop does any copying; if the right half is exhausted first, only the left-half cleanup loop does. The two loops never both copy elements in the same merge.
Why the Remaining Elements Are Already Sorted
The remaining elements from one half are already sorted (they are a suffix of a sorted half), and each of them is ≥ every element already placed, since otherwise the main loop would have chosen it earlier. They can therefore be copied straight to the end of the output with no further comparisons.
```python
# After the main merge loop, exactly one of these two loops copies elements

# Cleanup 1: Copy remaining elements from left half
while i < len(left_half):
    arr[k] = left_half[i]
    i += 1
    k += 1

# Cleanup 2: Copy remaining elements from right half
while j < len(right_half):
    arr[k] = right_half[j]
    j += 1
    k += 1

# Example scenarios:
#
# Scenario A: Left exhausted first
# left_half = [1, 2]   right_half = [3, 4, 5, 6]
# After main loop: i=2, j=0, arr = [1, 2, _, _, _, _]
# Right cleanup: copies 3, 4, 5, 6 → arr = [1, 2, 3, 4, 5, 6]
#
# Scenario B: Right exhausted first
# left_half = [5, 6, 7, 8]   right_half = [1, 2]
# After main loop: i=0, j=2, arr = [1, 2, _, _, _, _]
# Left cleanup: copies 5, 6, 7, 8 → arr = [1, 2, 5, 6, 7, 8]
#
# Scenario C: Interleaved input
# left_half = [1, 3]   right_half = [2, 4]
# After main loop: i=2, j=1, arr = [1, 2, 3, _]
# Right cleanup: copies 4 → arr = [1, 2, 3, 4]
```

Some implementations use a single unified cleanup loop or a memcpy-style operation. In languages with efficient array slicing (like Python), you can use arr[k:right+1] = left_half[i:] + right_half[j:]. However, the explicit loops are clearer for understanding.
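As a tiny illustration of that slicing form, here's Scenario A replayed with the one-line cleanup (the variable values below are just the hypothetical post-main-loop state):

```python
# Hypothetical state right after the main loop in Scenario A
arr = [1, 2, 0, 0, 0, 0]               # zeros mark not-yet-filled slots
left_half, right_half = [1, 2], [3, 4, 5, 6]
i, j, k, right = 2, 0, 2, 5            # left half exhausted

# One slice assignment replaces both cleanup loops
arr[k:right + 1] = left_half[i:] + right_half[j:]
print(arr)  # [1, 2, 3, 4, 5, 6]
```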
The merge operation is the source of merge sort's O(n) space complexity. Let's analyze why auxiliary space is needed and how much.
Why We Need Auxiliary Space
Consider merging in-place without extra space:
Original: [4, 5, | 1, 2]
left right
Step 1: 1 is smallest, should go to position 0
But [4, 5] are in the way!
To move 1 to position 0, we need to shift [4, 5] right
That's O(n) work just for one element!
In-place merging is possible but requires O(n²) time in the worst case, destroying the O(n log n) benefit of divide-and-conquer. The O(n) auxiliary space is a trade-off for O(n) merge time.
Using O(n) extra space to achieve O(n) merge time is a fundamental trade-off. You can have O(1) space with O(n²) time (in-place merge), or O(n) space with O(n) time. Most practical implementations choose the latter because time is typically more precious than space.
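For reference, here's a sketch of the straightforward in-place alternative: each out-of-place right-half element is rotated into position by shifting the remaining left block one slot, which is exactly where the O(n²) worst case comes from (a minimal illustration, not a production in-place merge):

```python
def merge_in_place(arr, left, mid, right):
    """In-place merge by shifting: O(1) extra space, but O(n^2) time in the worst case."""
    i, j = left, mid + 1
    while i <= mid and j <= right:
        if arr[i] <= arr[j]:
            i += 1                 # already in the right spot
        else:
            value = arr[j]
            # Shift arr[i..j-1] right by one slot: O(n) work for a single element
            for t in range(j, i, -1):
                arr[t] = arr[t - 1]
            arr[i] = value
            i += 1
            mid += 1               # the left block slid one position to the right
            j += 1


# Worst-case shape: every right-half element must be shifted past the whole left half
arr = [4, 5, 1, 2]
merge_in_place(arr, 0, 1, 3)
print(arr)  # [1, 2, 4, 5]
```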
Detailed Space Analysis
Naive Implementation (new arrays at each level):
Total space allocated across all merges = O(n) at level 0 + O(n/2 + n/2) at level 1 + ... over log₂(n) levels
= O(n log n)
This is wasteful—we're using space that could be reused.
Optimized Implementation (single auxiliary array):
Total space = O(n) for one auxiliary array, reused at all levels
The key optimization: allocate one auxiliary array of size n at the start, and reuse it for all merge operations. Since merge operations at the same level don't overlap, the same auxiliary space can serve different merges.
```python
def merge_sort_optimized(arr):
    """
    Space-optimized merge sort using a single auxiliary array.
    """
    n = len(arr)
    aux = [0] * n  # Allocate once, reuse for all merges
    _merge_sort_helper(arr, aux, 0, n - 1)


def _merge_sort_helper(arr, aux, left, right):
    if left >= right:
        return
    mid = left + (right - left) // 2
    _merge_sort_helper(arr, aux, left, mid)
    _merge_sort_helper(arr, aux, mid + 1, right)
    _merge_with_aux(arr, aux, left, mid, right)


def _merge_with_aux(arr, aux, left, mid, right):
    """
    Merge using pre-allocated auxiliary array.
    """
    # Copy both halves to auxiliary array
    for i in range(left, right + 1):
        aux[i] = arr[i]

    # Merge from aux back to arr
    i = left     # Pointer for left half in aux
    j = mid + 1  # Pointer for right half in aux
    k = left     # Pointer for output in arr

    while i <= mid and j <= right:
        if aux[i] <= aux[j]:
            arr[k] = aux[i]
            i += 1
        else:
            arr[k] = aux[j]
            j += 1
        k += 1

    # Copy remaining (only left needs copying; right is already in place)
    while i <= mid:
        arr[k] = aux[i]
        i += 1
        k += 1
```

| Implementation | Auxiliary Space | Total Space | Notes |
|---|---|---|---|
| Naive (new arrays each merge) | O(n log n) | O(n log n) | Simple but wasteful |
| Single auxiliary array | O(n) | O(n) | Standard optimized approach |
| In-place merge (complex) | O(1) | O(1) | Exists but O(n²) time per merge |
| Linked list merge sort | O(log n) stack | O(log n) | No array copying needed |
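The last row deserves a short illustration: with linked lists, two sorted runs can be merged by relinking nodes rather than copying elements, so only the recursion stack costs space. A minimal sketch (the Node class and helper below are hypothetical, not part of this page's array-based code):

```python
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def merge_lists(a, b):
    """Merge two sorted linked lists by relinking nodes -- no element copies."""
    dummy = tail = Node(None)
    while a and b:
        if a.value <= b.value:      # <= keeps the merge stable
            tail.next, a = a, a.next
        else:
            tail.next, b = b, b.next
        tail = tail.next
    tail.next = a if a else b       # splice on whichever run remains
    return dummy.next


def from_values(values):
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


node = merge_lists(from_values([27, 38, 43]), from_values([3, 9, 82]))
result = []
while node:
    result.append(node.value)
    node = node.next
print(result)  # [3, 9, 27, 38, 43, 82]
```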
Stability means that equal elements maintain their original relative order after sorting. Merge sort is stable, but only if the merge operation is implemented correctly.
Why Stability Matters
Suppose you're sorting employee records:
Original: [{name: "Bob", dept: "Sales"},
{name: "Alice", dept: "Engineering"},
{name: "Carol", dept: "Sales"}]
Sort by department:
Stable: [{name: "Bob", dept: "Sales"}, ← Bob before Carol (original order)
{name: "Carol", dept: "Sales"},
{name: "Alice", dept: "Engineering"}]
Unstable: [{name: "Carol", dept: "Sales"}, ← Carol before Bob (shuffled)
{name: "Bob", dept: "Sales"},
{name: "Alice", dept: "Engineering"}]
Stability enables multi-key sorting: sort by secondary key first, then by primary key. The stable nature preserves the secondary-key order within primary-key groups.
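A quick demonstration of that pattern with Python's built-in sorted (stable because it uses Timsort, a merge sort variant mentioned later on this page); the records mirror the example above:

```python
employees = [
    {"name": "Bob",   "dept": "Sales"},
    {"name": "Alice", "dept": "Engineering"},
    {"name": "Carol", "dept": "Sales"},
]

# Secondary key first, then primary key: stability keeps the name order within each dept
by_name = sorted(employees, key=lambda e: e["name"])
by_dept_then_name = sorted(by_name, key=lambda e: e["dept"])

for e in by_dept_then_name:
    print(e["dept"], e["name"])
# Engineering Alice
# Sales Bob
# Sales Carol
```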
When comparing elements, use <= (less than or equal), not < (less than). When elements are equal, prefer the left half's element. Since elements in the left half originally came before elements in the right half, this preserves relative order.
```python
# STABLE merge (correct)
if left_half[i] <= right_half[j]:   # Use <=
    arr[k] = left_half[i]           # Prefer left when equal
    i += 1
else:
    arr[k] = right_half[j]
    j += 1

# UNSTABLE merge (wrong for stability)
if left_half[i] < right_half[j]:    # Using < instead of <=
    arr[k] = left_half[i]
    i += 1
else:
    arr[k] = right_half[j]          # Takes right when equal!
    j += 1

# Example with equal elements:
# left_half  = [3a, 5]  (3a means "3" that was originally at position a)
# right_half = [3b, 7]  (3b means "3" that was originally at position b)
#
# With <=: When comparing 3a and 3b, we take 3a first → stable
# With <:  When comparing 3a and 3b, we take 3b first → unstable
```

Proving Stability
Let's prove merge sort is stable:
Base case: Single elements can't have unstable comparisons (nothing to compare).
Recursive case: Assume the recursive calls are stable. Consider equal elements x and y where x originally appeared before y. There are three possibilities:
- Both land in the left half: the recursive sort keeps x before y, and the merge copies them out in that order.
- Both land in the right half: the same argument applies.
- x lands in the left half and y in the right half (the split preserves order, so the reverse cannot happen): when the merge compares them, the tie goes to the left half (because of <=), so x is placed before y.

Conclusion: In all cases, x ends up before y in the output. ✓
Production merge implementations include several optimizations beyond the basic two-pointer approach:
- Skip the merge entirely when left_half[-1] <= right_half[0]: the two halves are already in order, so no merge is needed. This optimization makes merge sort O(n) for already-sorted input.
- Copy only the left half into the auxiliary array and merge against the right half where it sits, roughly halving the copying (Optimizations 2 and 3 in the code below).
```python
def merge_optimized(arr, aux, left, mid, right):
    """
    Optimized merge with early-exit and reduced copying.
    """
    # Optimization 1: Skip merge if already sorted
    if arr[mid] <= arr[mid + 1]:
        return  # Already in order, nothing to do!

    # Optimization 2: Only copy left half to auxiliary
    for i in range(left, mid + 1):
        aux[i] = arr[i]

    # Merge: compare aux[i] (left half) with arr[j] (right half in-place)
    i = left     # Pointer in aux (left half)
    j = mid + 1  # Pointer in arr (right half)
    k = left     # Output pointer in arr

    while i <= mid and j <= right:
        if aux[i] <= arr[j]:
            arr[k] = aux[i]
            i += 1
        else:
            arr[k] = arr[j]
            j += 1
        k += 1

    # Optimization 3: Only need to copy remaining from aux
    # If right half has remaining, they're already in place!
    while i <= mid:
        arr[k] = aux[i]
        i += 1
        k += 1
```

Python's built-in sort and Java's Arrays.sort for objects use Timsort, a sophisticated merge sort variant. Timsort includes all the above optimizations plus "run" detection (finding pre-sorted subsequences) and careful tuning of thresholds. It achieves O(n) for many real-world patterns while maintaining O(n log n) worst case.
The merge operation is where most merge sort bugs occur. Here are the most common pitfalls:
- Off-by-one boundaries: handling mid vs mid+1 incorrectly when copying the halves. This and four other frequent mistakes are shown in the snippets below.
```python
# BUG 1: Off-by-one in half boundaries
left_half = arr[left:mid]      # WRONG: misses arr[mid]
left_half = arr[left:mid + 1]  # CORRECT: includes arr[mid]

# BUG 2: Starting output at wrong position
k = 0     # WRONG: should start at 'left', not 0
k = left  # CORRECT: output starts at 'left'

# BUG 3: Forgetting cleanup loops
while i < len(left_half) and j < len(right_half):
    # ... merge logic ...
    pass
# Missing: copy remaining from left_half
# Missing: copy remaining from right_half
# Result: last elements are garbage!

# BUG 4: Wrong condition in cleanup
while i <= len(left_half):  # WRONG: <= causes index out of bounds
while i < len(left_half):   # CORRECT: strict <

# BUG 5: Overwriting data still being read
# If using in-place without auxiliary array:
arr[k] = arr[i]  # May overwrite arr[j] if k == j!
# Solution: always use auxiliary array
```

Test merge and merge sort with: empty arrays, single elements, two elements, already sorted input, reverse-sorted input, all equal elements, and alternating patterns. These cases catch most boundary bugs.
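A small harness along those lines, assuming the merge_sort_optimized function defined earlier on this page is available:

```python
import random


def test_merge_sort():
    cases = [
        [],                       # empty array
        [42],                     # single element
        [2, 1],                   # two elements
        [1, 2, 3, 4, 5],          # already sorted
        [5, 4, 3, 2, 1],          # reverse sorted
        [7, 7, 7, 7],             # all equal
        [1, 9, 1, 9, 1, 9],       # alternating pattern
        [random.randint(0, 100) for _ in range(1000)],  # random
    ]
    for case in cases:
        arr = list(case)
        merge_sort_optimized(arr)  # defined earlier on this page
        assert arr == sorted(case), f"failed on {case[:10]}"
    print("all cases passed")


test_merge_sort()
```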
Let's consolidate the complexity analysis for the merge operation and the overall merge sort algorithm:
| Metric | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each element is copied and compared at most once |
| Comparisons | ≤ n-1 | One comparison resolves one element; n elements, n-1 max comparisons |
| Array copies | ~2n | n copies to auxiliary, n copies back (or n+k with optimization) |
| Space | O(n) | Auxiliary array to hold elements during merge |
| Metric | Best | Average | Worst | Notes |
|---|---|---|---|---|
| Time | O(n log n) | O(n log n) | O(n log n) | Guaranteed regardless of input |
| Space | O(n) | O(n) | O(n) | Dominated by auxiliary array |
| Comparisons | ~0.5n log n | ~n log n | ~n log n | Best case with sorted input optimization |
| Stability | Yes | Yes | Yes | With correct <= comparison |
Why O(n log n) Total
The recurrence relation tells the full story:
T(n) = 2T(n/2) + O(n)
= 2T(n/2) + cn (for some constant c)
Expanding:
T(n) = 2(2T(n/4) + cn/2) + cn
= 4T(n/4) + cn + cn
= 4T(n/4) + 2cn
= ...
= nT(1) + cn log₂(n)
= O(n) + O(n log n)
= O(n log n)
Each of the log₂(n) levels does O(n) work (the merges), giving O(n log n) total.
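To check this empirically, here's a rough standalone sketch that counts comparisons in a simple top-down merge sort and prints them next to n·log₂(n) (the exact constant depends on the input):

```python
import math
import random


def merge_sort_counting(values):
    """Return (sorted list, total comparisons) for a simple top-down merge sort."""
    if len(values) <= 1:
        return values, 0
    mid = len(values) // 2
    left, c_left = merge_sort_counting(values[:mid])
    right, c_right = merge_sort_counting(values[mid:])
    merged, i, j, comparisons = [], 0, 0, 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged += left[i:] + right[j:]
    return merged, c_left + c_right + comparisons


for n in (1_000, 10_000, 100_000):
    data = [random.random() for _ in range(n)]
    _, comparisons = merge_sort_counting(data)
    print(n, comparisons, round(n * math.log2(n)))
# The comparison counts track n * log2(n) closely for random input
```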
We've completed our deep dive into the merge operation—the combine phase that turns divide-and-conquer into actual sorting power.
Module Summary: Merge Sort as D&C
Across these four pages, we've examined merge sort through the divide-and-conquer lens: how the divide step splits the array, how the conquer step recursively sorts each half, and, on this page, how the combine step merges the sorted halves, along with its correctness argument, complexity, stability guarantees, and implementation pitfalls.
Merge sort is more than a sorting algorithm—it's a template for divide-and-conquer thinking. The same split-recurse-merge pattern appears in counting inversions, closest pair of points, Karatsuba multiplication, and many more algorithms. Master merge sort, and you've mastered a fundamental algorithmic paradigm.
You now have a complete, deep understanding of merge sort as a divide-and-conquer algorithm. From the theoretical foundations to implementation details to practical optimizations, you possess the knowledge to implement, analyze, debug, and extend merge sort. This understanding serves as the foundation for the more advanced D&C algorithms we'll explore in the remaining modules of this chapter.