Loading content...
The divide and conquer approach to maximum subarray splits the problem into three cases:
Cases 1 and 2 are handled by simple recursion—the same problem on smaller arrays. Case 3, however, requires special handling. It's not a recursive call; it's new logic specific to this problem. This cross-boundary handling is the intellectual heart of the algorithm and what distinguishes it from simpler D&C examples.
Master this technique, and you master a pattern that appears throughout algorithm design: handling elements that span partition boundaries.
This page develops complete mastery of cross-boundary handling. You'll understand why naive approaches fail, how to efficiently compute the maximum cross-boundary subarray, the correctness argument for the technique, and how this pattern generalizes to other D&C problems.
In many divide and conquer algorithms, the "combine" step merely concatenates or merges results. Consider merge sort: after sorting left and right halves, we merge them. The merge doesn't create new structure—it simply interleaves existing sorted sequences.
Maximum subarray is fundamentally different. When we recursively find:
...we cannot simply take max(L, R) and be done. Why not? Because the optimal answer for the whole array might be a subarray that wasn't examined by either recursive call—one that starts in the left half and ends in the right half.
Cross-boundary subarrays are NOT examined by either recursive call. The left recursion only looks at subarrays within the left half. The right recursion only looks at subarrays within the right half. Cross-boundary subarrays fall through the cracks unless we explicitly handle them.
Why can't we just enumerate all cross-boundary subarrays?
A cross-boundary subarray starts at some index i in the left half and ends at some index j in the right half. There are roughly (n/2) × (n/2) = n²/4 such pairs. Enumerating all of them would give O(n²) work at each level of recursion, yielding an overall complexity worse than our O(n²) brute force!
The key insight is that we don't need to examine all cross-boundary subarrays. We can find the best one in O(n) time using a clever observation about the structure of such subarrays.
A cross-boundary subarray must include:
This constraint forces a specific structure:
A cross-boundary subarray = (suffix of left half) + (prefix of right half)
Where:
The Decomposition Insight:
Since a cross-boundary subarray is the concatenation of a left suffix and a right prefix, and these two parts are independent of each other:
Maximum cross-boundary sum = (Maximum left suffix sum) + (Maximum right prefix sum)
This is profound! Instead of examining O(n²) combinations of starting and ending points, we can:
Total: O(n) time for the cross-boundary case.
The key insight is that choosing the best left suffix doesn't constrain our choice of right prefix (and vice versa). We're not solving a 2D optimization—we're solving two independent 1D optimizations. This decomposition from O(n²) combinations to O(n) scans is the fundamental technique.
Let's develop the algorithms for finding maximum suffix (for left half) and maximum prefix (for right half) sums.
Maximum Suffix Sum (Left Half):
We scan from the midpoint leftward to the start of the left half, maintaining a running sum. At each step, we update our maximum if the current running sum exceeds it.
Left half: [A, B, C, D] (D is adjacent to midpoint)
Start at D:
sum = D, maxSuffix = D
Extend to C:
sum = D + C, maxSuffix = max(maxSuffix, D + C)
Extend to B:
sum = D + C + B, maxSuffix = max(maxSuffix, D + C + B)
Extend to A:
sum = D + C + B + A, maxSuffix = max(maxSuffix, D + C + B + A)
Critically, we must include the element adjacent to the midpoint (D in this case) because any cross-boundary subarray must extend up to and including the midpoint.
Maximum Prefix Sum (Right Half):
We scan from the midpoint+1 rightward to the end of the right half, maintaining a running sum. At each step, we update our maximum.
Right half: [E, F, G, H] (E is adjacent to midpoint)
Start at E:
sum = E, maxPrefix = E
Extend to F:
sum = E + F, maxPrefix = max(maxPrefix, E + F)
Extend to G:
sum = E + F + G, maxPrefix = max(maxPrefix, E + F + G)
Extend to H:
sum = E + F + G + H, maxPrefix = max(maxPrefix, E + F + G + H)
Symmetrically, we must include the element adjacent to the midpoint (E in this case).
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970
def find_max_crossing_subarray(arr, low, mid, high): """ Find the maximum subarray that crosses the midpoint. The subarray must include arr[mid] and arr[mid+1]. Parameters: arr: the input array low: start index of the range (inclusive) mid: midpoint index (last element of left half) high: end index of the range (inclusive) Returns: tuple: (max_left_index, max_right_index, max_sum) Time: O(high - low + 1) = O(n) for this subproblem Space: O(1) """ # Find maximum suffix of left half # Starting from mid, extend leftward left_sum = float('-inf') current_sum = 0 max_left = mid # Track index for reconstruction for i in range(mid, low - 1, -1): # mid, mid-1, ..., low current_sum += arr[i] if current_sum > left_sum: left_sum = current_sum max_left = i # Find maximum prefix of right half # Starting from mid+1, extend rightward right_sum = float('-inf') current_sum = 0 max_right = mid + 1 # Track index for reconstruction for i in range(mid + 1, high + 1): # mid+1, mid+2, ..., high current_sum += arr[i] if current_sum > right_sum: right_sum = current_sum max_right = i # Return combined result return (max_left, max_right, left_sum + right_sum) # Example walkthrougharr = [-2, 1, -3, 4, -1, 2, 1, -5, 4]# Suppose we're finding cross-boundary for full array# low=0, mid=4, high=8 result = find_max_crossing_subarray(arr, 0, 4, 8)print(f"Cross-boundary: indices {result[0]} to {result[1]}, sum = {result[2]}") # Left half suffix scan from index 4 leftward:# i=4: sum=-1, max=-1# i=3: sum=-1+4=3, max=3# i=2: sum=3-3=0, max=3 (no update)# i=1: sum=0+1=1, max=3 (no update) # i=0: sum=1-2=-1, max=3 (no update)# Best left suffix starts at index 3, sum=3 # Right half prefix scan from index 5 rightward:# i=5: sum=2, max=2# i=6: sum=2+1=3, max=3# i=7: sum=3-5=-2, max=3 (no update)# i=8: sum=-2+4=2, max=3 (no update)# Best right prefix ends at index 6, sum=3 # Total cross-boundary: 3 + 3 = 6, indices 3 to 6We track max_left and max_right indices so we can report not just the maximum sum but also the subarray that achieves it. This is often required in practice—knowing the sum alone isn't useful without knowing which elements to include.
Let's rigorously prove that our cross-boundary computation is correct—that finding the maximum left suffix and maximum right prefix independently gives us the maximum cross-boundary subarray.
Theorem: The maximum sum cross-boundary subarray equals (maximum suffix sum of left half) + (maximum prefix sum of right half).
Proof:
Claim 1: Any cross-boundary subarray is a suffix + prefix.
Let A[i..j] be a cross-boundary subarray where i ≤ mid < j.
Claim 2: The maximum cross-boundary sum is achieved by independently maximizing each part.
Let OPT = A[i*..j*] be the optimal cross-boundary subarray with sum S*.
Suppose (for contradiction) that either:
Case 1: A[i*..mid] is not the maximum suffix. Then there exists some suffix A[i'..mid] with larger sum. But then A[i'..j*] would have sum greater than S*, contradicting optimality of OPT.
Case 2: A[mid+1..j*] is not the maximum prefix. Symmetric argument leads to contradiction.
Therefore, in OPT, both parts are individually optimal, and we can find OPT by independently maximizing each part. ∎
The proof hinges on the fact that choosing a longer/shorter suffix doesn't affect which prefix is optimal (and vice versa). This independence allows us to decompose a 2D optimization into two 1D optimizations—a powerful technique that recurs throughout algorithm design.
Complexity Analysis:
Total cross-boundary computation: O(n)
This is crucial for achieving O(n log n) overall complexity—if the cross-boundary step were O(n²), the entire algorithm would be O(n²).
Now we assemble the pieces into the complete algorithm. The structure follows the canonical D&C pattern with our specialized cross-boundary handling.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108
def find_maximum_subarray(arr, low, high): """ Find maximum subarray using divide and conquer. Returns: tuple: (left_idx, right_idx, sum) for the maximum subarray Time: O(n log n) Space: O(log n) for recursion stack """ # Base case: single element if low == high: return (low, high, arr[low]) # Divide: compute midpoint mid = (low + high) // 2 # Conquer: recursively solve left and right halves left_low, left_high, left_sum = find_maximum_subarray(arr, low, mid) right_low, right_high, right_sum = find_maximum_subarray(arr, mid + 1, high) # Combine: find cross-boundary maximum cross_low, cross_high, cross_sum = find_max_crossing_subarray(arr, low, mid, high) # Return the best of three possibilities if left_sum >= right_sum and left_sum >= cross_sum: return (left_low, left_high, left_sum) elif right_sum >= left_sum and right_sum >= cross_sum: return (right_low, right_high, right_sum) else: return (cross_low, cross_high, cross_sum) def find_max_crossing_subarray(arr, low, mid, high): """Find the maximum subarray crossing the midpoint.""" # Maximum suffix of left half left_sum = float('-inf') current_sum = 0 max_left = mid for i in range(mid, low - 1, -1): current_sum += arr[i] if current_sum > left_sum: left_sum = current_sum max_left = i # Maximum prefix of right half right_sum = float('-inf') current_sum = 0 max_right = mid + 1 for i in range(mid + 1, high + 1): current_sum += arr[i] if current_sum > right_sum: right_sum = current_sum max_right = i return (max_left, max_right, left_sum + right_sum) # Wrapper for clean APIdef max_subarray_dc(arr): """Find maximum subarray sum using D&C.""" if not arr: return 0 idx_low, idx_high, max_sum = find_maximum_subarray(arr, 0, len(arr) - 1) return max_sum # Complete example with tracedef find_maximum_subarray_traced(arr, low, high, depth=0): """Traced version to understand recursion flow.""" indent = " " * depth print(f"{indent}Solving arr[{low}:{high+1}] = {arr[low:high+1]}") if low == high: print(f"{indent} Base case: return ({low}, {high}, {arr[low]})") return (low, high, arr[low]) mid = (low + high) // 2 print(f"{indent} Split at mid={mid}") left = find_maximum_subarray_traced(arr, low, mid, depth + 1) right = find_maximum_subarray_traced(arr, mid + 1, high, depth + 1) cross = find_max_crossing_subarray(arr, low, mid, high) print(f"{indent} Left={left[2]}, Right={right[2]}, Cross={cross[2]}") if left[2] >= right[2] and left[2] >= cross[2]: result = left elif right[2] >= left[2] and right[2] >= cross[2]: result = right else: result = cross print(f"{indent} Return: {result}") return result # Examplearr = [-2, 1, -3, 4, -1, 2, 1, -5, 4]print("=" * 50)print(f"Array: {arr}")print("=" * 50)result = find_maximum_subarray_traced(arr, 0, len(arr) - 1)print("=" * 50)print(f"Final answer: indices {result[0]}-{result[1]}, sum = {result[2]}")print(f"Subarray: {arr[result[0]:result[1]+1]}")Notice the clean separation: find_maximum_subarray handles the D&C skeleton (divide, conquer, combine), while find_max_crossing_subarray handles the specialized cross-boundary logic. This separation of concerns makes the code easier to understand, test, and modify.
The base case and edge cases in D&C algorithms are often sources of subtle bugs. Let's examine them carefully.
123456789101112131415161718192021
# Edge case testingdef test_edge_cases(): test_cases = [ ([], 0, "Empty array"), ([5], 5, "Single positive element"), ([-5], -5, "Single negative element"), ([0], 0, "Single zero"), ([-3, -1, -2], -1, "All negative (pick least negative)"), ([1, 2, 3], 6, "All positive (pick all)"), ([1, -1, 1, -1, 1], 1, "Alternating"), ([-1, 2, -1, 2, -1], 3, "Alternating with larger positives"), ([100], 100, "Large single element"), ([-100], -100, "Large negative single element"), ] for arr, expected, description in test_cases: result = max_subarray_dc(arr) status = "✓" if result == expected else "✗" print(f"{status} {description}: got {result}, expected {expected}") test_edge_cases()Common bugs: (1) Using 'mid' vs 'mid+1' incorrectly when splitting or scanning; (2) Inclusive vs exclusive bounds in loops; (3) Forgetting that cross-boundary MUST include at least one element from each side. Always trace through examples with size 2 and 3 arrays to catch these.
Let's trace through the algorithm on our example array to see exactly how it works.
Array: [-2, 1, -3, 4, -1, 2, 1, -5, 4] (indices 0-8)
| Call | Range | Left | Right | Cross | Result |
|---|---|---|---|---|---|
| Level 0 | [0..8] | Recurse | Recurse | Calculate | max(L,R,C) |
| Level 1 L | [0..4] | Recurse | Recurse | Calculate | max(L,R,C) |
| Level 1 R | [5..8] | Recurse | Recurse | Calculate | max(L,R,C) |
| Level 2 LL | [0..2] | Recurse | Recurse | Calculate | -1 |
| Level 2 LR | [3..4] | Recurse | Recurse | Calculate | 4 |
| Level 2 RL | [5..6] | Recurse | Recurse | Calculate | 3 |
| Level 2 RR | [7..8] | Recurse | Recurse | Calculate | 4 |
| ... base cases | ... | ... | ... | ... | ... |
Detailed trace for root call [0..8]:
Solving [0..8], mid=4
├── Left [0..4] returns max_sum = 4 (subarray [3..3] = [4])
├── Right [5..8] returns max_sum = 4 (subarray [8..8] = [4])
└── Cross:
├── Left suffix from mid=4 backward:
│ i=4: sum=-1, max=-1
│ i=3: sum=-1+4=3, max=3 ★
│ i=2: sum=3-3=0
│ i=1: sum=0+1=1
│ i=0: sum=1-2=-1
│ Best suffix: [3..4] = [4,-1], sum=3
├── Right prefix from mid+1=5 forward:
│ i=5: sum=2, max=2
│ i=6: sum=2+1=3, max=3 ★
│ i=7: sum=3-5=-2
│ i=8: sum=-2+4=2
│ Best prefix: [5..6] = [2,1], sum=3
└── Cross sum: 3 + 3 = 6
max(4, 4, 6) = 6 ★
Cross-boundary wins! Answer: [3..6] = [4,-1,2,1], sum=6
The best way to internalize this algorithm is to trace it manually on small arrays. Try arrays of length 4-6 with various patterns of positive and negative numbers. Notice how the cross-boundary computation "bridges" the gap between recursive solutions.
What's Next:
With the complete algorithm understood, we'll now analyze its complexity rigorously. Using the Master Theorem, we'll prove the O(n log n) bound and develop intuition for why this is the best we can achieve with D&C for this problem.
You now understand the heart of the D&C approach to maximum subarray: the cross-boundary handling technique. You can implement the complete algorithm, understand its correctness, and trace its execution. Next, we'll analyze why this achieves O(n log n) complexity.