Maximum Subarray - Learning Module

Loading content...

0/276

D&C Approach — Cross-Boundary Handling

The Heart of the Algorithm

The divide and conquer approach to maximum subarray splits the problem into three cases:

Maximum subarray lies entirely in the left half
Maximum subarray lies entirely in the right half
Maximum subarray crosses the boundary between halves

Cases 1 and 2 are handled by simple recursion—the same problem on smaller arrays. Case 3, however, requires special handling. It's not a recursive call; it's new logic specific to this problem. This cross-boundary handling is the intellectual heart of the algorithm and what distinguishes it from simpler D&C examples.

Master this technique, and you master a pattern that appears throughout algorithm design: handling elements that span partition boundaries.

What You Will Learn

This page develops complete mastery of cross-boundary handling. You'll understand why naive approaches fail, how to efficiently compute the maximum cross-boundary subarray, the correctness argument for the technique, and how this pattern generalizes to other D&C problems.

Why Cross-Boundary Cases Are Different

In many divide and conquer algorithms, the "combine" step merely concatenates or merges results. Consider merge sort: after sorting left and right halves, we merge them. The merge doesn't create new structure—it simply interleaves existing sorted sequences.

Maximum subarray is fundamentally different. When we recursively find:

Maximum subarray of left half = some value L
Maximum subarray of right half = some value R

...we cannot simply take max(L, R) and be done. Why not? Because the optimal answer for the whole array might be a subarray that wasn't examined by either recursive call—one that starts in the left half and ends in the right half.

Cross-Boundary ExampleArray: [1, -100, 50, -5, 40, -100, 1]

Input

Output

The Fundamental Challenge

Cross-boundary subarrays are NOT examined by either recursive call. The left recursion only looks at subarrays within the left half. The right recursion only looks at subarrays within the right half. Cross-boundary subarrays fall through the cracks unless we explicitly handle them.

Why can't we just enumerate all cross-boundary subarrays?

A cross-boundary subarray starts at some index i in the left half and ends at some index j in the right half. There are roughly (n/2) × (n/2) = n²/4 such pairs. Enumerating all of them would give O(n²) work at each level of recursion, yielding an overall complexity worse than our O(n²) brute force!

The key insight is that we don't need to examine all cross-boundary subarrays. We can find the best one in O(n) time using a clever observation about the structure of such subarrays.

The Structure of Cross-Boundary Subarrays

A cross-boundary subarray must include:

At least one element from the left half
At least one element from the right half
All elements must be contiguous (no gaps)

This constraint forces a specific structure:

A cross-boundary subarray = (suffix of left half) + (prefix of right half)

Where:

"Suffix of left half" means elements from some index i to the last element of the left half (the midpoint)
"Prefix of right half" means elements from the first element of the right half (midpoint + 1) to some index j

Converting Mermaid diagram...

The Decomposition Insight:

Since a cross-boundary subarray is the concatenation of a left suffix and a right prefix, and these two parts are independent of each other:

Maximum cross-boundary sum = (Maximum left suffix sum) + (Maximum right prefix sum)

This is profound! Instead of examining O(n²) combinations of starting and ending points, we can:

Find the maximum suffix sum in the left half: O(n/2)
Find the maximum prefix sum in the right half: O(n/2)
Add them together: O(1)

Total: O(n) time for the cross-boundary case.

Why Independence Works

The key insight is that choosing the best left suffix doesn't constrain our choice of right prefix (and vice versa). We're not solving a 2D optimization—we're solving two independent 1D optimizations. This decomposition from O(n²) combinations to O(n) scans is the fundamental technique.

Computing Maximum Suffix and Prefix Sums

Let's develop the algorithms for finding maximum suffix (for left half) and maximum prefix (for right half) sums.

Maximum Suffix Sum (Left Half):

We scan from the midpoint leftward to the start of the left half, maintaining a running sum. At each step, we update our maximum if the current running sum exceeds it.

Left half: [A, B, C, D]  (D is adjacent to midpoint)

Start at D:
  sum = D,        maxSuffix = D
Extend to C:
  sum = D + C,    maxSuffix = max(maxSuffix, D + C)
Extend to B:
  sum = D + C + B, maxSuffix = max(maxSuffix, D + C + B)
Extend to A:
  sum = D + C + B + A, maxSuffix = max(maxSuffix, D + C + B + A)

Critically, we must include the element adjacent to the midpoint (D in this case) because any cross-boundary subarray must extend up to and including the midpoint.

Maximum Prefix Sum (Right Half):

We scan from the midpoint+1 rightward to the end of the right half, maintaining a running sum. At each step, we update our maximum.

Right half: [E, F, G, H]  (E is adjacent to midpoint)

Start at E:
  sum = E,        maxPrefix = E
Extend to F:
  sum = E + F,    maxPrefix = max(maxPrefix, E + F)
Extend to G:
  sum = E + F + G, maxPrefix = max(maxPrefix, E + F + G)
Extend to H:
  sum = E + F + G + H, maxPrefix = max(maxPrefix, E + F + G + H)

Symmetrically, we must include the element adjacent to the midpoint (E in this case).

find_max_crossing
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
def find_max_crossing_subarray(arr, low, mid, high):
    """
    Find the maximum subarray that crosses the midpoint.
    
    The subarray must include arr[mid] and arr[mid+1].
    
    Parameters:
        arr: the input array
        low: start index of the range (inclusive)
        mid: midpoint index (last element of left half)
        high: end index of the range (inclusive)
    
    Returns:
        tuple: (max_left_index, max_right_index, max_sum)
    
    Time: O(high - low + 1) = O(n) for this subproblem
    Space: O(1)
    """
    # Find maximum suffix of left half
    # Starting from mid, extend leftward
    left_sum = float('-inf')
    current_sum = 0
    max_left = mid  # Track index for reconstruction
    
    for i in range(mid, low - 1, -1):  # mid, mid-1, ..., low
        current_sum += arr[i]
        if current_sum > left_sum:
            left_sum = current_sum
            max_left = i
    
    # Find maximum prefix of right half
    # Starting from mid+1, extend rightward
    right_sum = float('-inf')
    current_sum = 0
    max_right = mid + 1  # Track index for reconstruction
    
    for i in range(mid + 1, high + 1):  # mid+1, mid+2, ..., high
        current_sum += arr[i]
        if current_sum > right_sum:
            right_sum = current_sum
            max_right = i
    
    # Return combined result
    return (max_left, max_right, left_sum + right_sum)
 
 
# Example walkthrough
arr = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
# Suppose we're finding cross-boundary for full array
# low=0, mid=4, high=8
 
result = find_max_crossing_subarray(arr, 0, 4, 8)
print(f"Cross-boundary: indices {result[0]} to {result[1]}, sum = {result[2]}")
 
# Left half suffix scan from index 4 leftward:
#   i=4: sum=-1, max=-1
#   i=3: sum=-1+4=3, max=3
#   i=2: sum=3-3=0, max=3 (no update)
#   i=1: sum=0+1=1, max=3 (no update) 
#   i=0: sum=1-2=-1, max=3 (no update)
# Best left suffix starts at index 3, sum=3
 
# Right half prefix scan from index 5 rightward:
#   i=5: sum=2, max=2
#   i=6: sum=2+1=3, max=3
#   i=7: sum=3-5=-2, max=3 (no update)
#   i=8: sum=-2+4=2, max=3 (no update)
# Best right prefix ends at index 6, sum=3
 
# Total cross-boundary: 3 + 3 = 6, indices 3 to 6

Index Tracking for Reconstruction

We track max_left and max_right indices so we can report not just the maximum sum but also the subarray that achieves it. This is often required in practice—knowing the sum alone isn't useful without knowing which elements to include.

Correctness Proof for Cross-Boundary Computation

Let's rigorously prove that our cross-boundary computation is correct—that finding the maximum left suffix and maximum right prefix independently gives us the maximum cross-boundary subarray.

Theorem: The maximum sum cross-boundary subarray equals (maximum suffix sum of left half) + (maximum prefix sum of right half).

Proof:

Claim 1: Any cross-boundary subarray is a suffix + prefix.

Let A[i..j] be a cross-boundary subarray where i ≤ mid < j.

Since the subarray is contiguous and must include both sides of the midpoint, it includes all elements from i to j.
The portion from i to mid is a suffix of the left half (it ends at mid).
The portion from mid+1 to j is a prefix of the right half (it starts at mid+1).
Therefore, A[i..j] = (suffix A[i..mid]) + (prefix A[mid+1..j]).

Claim 2: The maximum cross-boundary sum is achieved by independently maximizing each part.

Let OPT = A[i*..j*] be the optimal cross-boundary subarray with sum S*.

Suppose (for contradiction) that either:

A[i*..mid] is NOT the maximum suffix of the left half, OR
A[mid+1..j*] is NOT the maximum prefix of the right half

Case 1: A[i*..mid] is not the maximum suffix. Then there exists some suffix A[i'..mid] with larger sum. But then A[i'..j*] would have sum greater than S*, contradicting optimality of OPT.

Case 2: A[mid+1..j*] is not the maximum prefix. Symmetric argument leads to contradiction.

Therefore, in OPT, both parts are individually optimal, and we can find OPT by independently maximizing each part. ∎

The Independence Property

The proof hinges on the fact that choosing a longer/shorter suffix doesn't affect which prefix is optimal (and vice versa). This independence allows us to decompose a 2D optimization into two 1D optimizations—a powerful technique that recurs throughout algorithm design.

Complexity Analysis:

Left suffix scan: O(mid - low + 1) ≈ O(n/2)
Right prefix scan: O(high - mid) ≈ O(n/2)
Combining: O(1)

Total cross-boundary computation: O(n)

This is crucial for achieving O(n log n) overall complexity—if the cross-boundary step were O(n²), the entire algorithm would be O(n²).

The Complete Divide & Conquer Algorithm

Now we assemble the pieces into the complete algorithm. The structure follows the canonical D&C pattern with our specialized cross-boundary handling.

max_subarray_dc
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
def find_maximum_subarray(arr, low, high):
    """
    Find maximum subarray using divide and conquer.
    
    Returns:
        tuple: (left_idx, right_idx, sum) for the maximum subarray
    
    Time: O(n log n)
    Space: O(log n) for recursion stack
    """
    # Base case: single element
    if low == high:
        return (low, high, arr[low])
    
    # Divide: compute midpoint
    mid = (low + high) // 2
    
    # Conquer: recursively solve left and right halves
    left_low, left_high, left_sum = find_maximum_subarray(arr, low, mid)
    right_low, right_high, right_sum = find_maximum_subarray(arr, mid + 1, high)
    
    # Combine: find cross-boundary maximum
    cross_low, cross_high, cross_sum = find_max_crossing_subarray(arr, low, mid, high)
    
    # Return the best of three possibilities
    if left_sum >= right_sum and left_sum >= cross_sum:
        return (left_low, left_high, left_sum)
    elif right_sum >= left_sum and right_sum >= cross_sum:
        return (right_low, right_high, right_sum)
    else:
        return (cross_low, cross_high, cross_sum)
 
 
def find_max_crossing_subarray(arr, low, mid, high):
    """Find the maximum subarray crossing the midpoint."""
    # Maximum suffix of left half
    left_sum = float('-inf')
    current_sum = 0
    max_left = mid
    
    for i in range(mid, low - 1, -1):
        current_sum += arr[i]
        if current_sum > left_sum:
            left_sum = current_sum
            max_left = i
    
    # Maximum prefix of right half
    right_sum = float('-inf')
    current_sum = 0
    max_right = mid + 1
    
    for i in range(mid + 1, high + 1):
        current_sum += arr[i]
        if current_sum > right_sum:
            right_sum = current_sum
            max_right = i
    
    return (max_left, max_right, left_sum + right_sum)
 
 
# Wrapper for clean API
def max_subarray_dc(arr):
    """Find maximum subarray sum using D&C."""
    if not arr:
        return 0
    idx_low, idx_high, max_sum = find_maximum_subarray(arr, 0, len(arr) - 1)
    return max_sum
 
 
# Complete example with trace
def find_maximum_subarray_traced(arr, low, high, depth=0):
    """Traced version to understand recursion flow."""
    indent = "  " * depth
    print(f"{indent}Solving arr[{low}:{high+1}] = {arr[low:high+1]}")
    
    if low == high:
        print(f"{indent}  Base case: return ({low}, {high}, {arr[low]})")
        return (low, high, arr[low])
    
    mid = (low + high) // 2
    print(f"{indent}  Split at mid={mid}")
    
    left = find_maximum_subarray_traced(arr, low, mid, depth + 1)
    right = find_maximum_subarray_traced(arr, mid + 1, high, depth + 1)
    cross = find_max_crossing_subarray(arr, low, mid, high)
    
    print(f"{indent}  Left={left[2]}, Right={right[2]}, Cross={cross[2]}")
    
    if left[2] >= right[2] and left[2] >= cross[2]:
        result = left
    elif right[2] >= left[2] and right[2] >= cross[2]:
        result = right
    else:
        result = cross
    
    print(f"{indent}  Return: {result}")
    return result
 
 
# Example
arr = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
print("=" * 50)
print(f"Array: {arr}")
print("=" * 50)
result = find_maximum_subarray_traced(arr, 0, len(arr) - 1)
print("=" * 50)
print(f"Final answer: indices {result[0]}-{result[1]}, sum = {result[2]}")
print(f"Subarray: {arr[result[0]:result[1]+1]}")

Algorithm Structure

Notice the clean separation: find_maximum_subarray handles the D&C skeleton (divide, conquer, combine), while find_max_crossing_subarray handles the specialized cross-boundary logic. This separation of concerns makes the code easier to understand, test, and modify.

Base Case Nuances and Edge Cases

The base case and edge cases in D&C algorithms are often sources of subtle bugs. Let's examine them carefully.

Critical Edge Cases

•Single element array: Base case returns (0, 0, arr[0]). Whether this is correct depends on the problem formulation—we're assuming at least one element must be in the subarray.
•All negative elements: The algorithm correctly returns the least negative element, since every subarray has negative sum and we pick the "best" one.
•All positive elements: The algorithm correctly returns the entire array, since including more positive elements always helps.
•Empty array: We handle this in the wrapper function, not the recursive function. Convention varies (return 0 or error).
•Single element in one half: When array size is odd, one half has one more element. The mid calculation ensures this works correctly.

edge_cases
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
# Edge case testing
def test_edge_cases():
    test_cases = [
        ([], 0, "Empty array"),
        ([5], 5, "Single positive element"),
        ([-5], -5, "Single negative element"),
        ([0], 0, "Single zero"),
        ([-3, -1, -2], -1, "All negative (pick least negative)"),
        ([1, 2, 3], 6, "All positive (pick all)"),
        ([1, -1, 1, -1, 1], 1, "Alternating"),
        ([-1, 2, -1, 2, -1], 3, "Alternating with larger positives"),
        ([100], 100, "Large single element"),
        ([-100], -100, "Large negative single element"),
    ]
    
    for arr, expected, description in test_cases:
        result = max_subarray_dc(arr)
        status = "✓" if result == expected else "✗"
        print(f"{status} {description}: got {result}, expected {expected}")
 
test_edge_cases()

Off-by-One Errors

Common bugs: (1) Using 'mid' vs 'mid+1' incorrectly when splitting or scanning; (2) Inclusive vs exclusive bounds in loops; (3) Forgetting that cross-boundary MUST include at least one element from each side. Always trace through examples with size 2 and 3 arrays to catch these.

Detailed Execution Trace

Let's trace through the algorithm on our example array to see exactly how it works.

Array: [-2, 1, -3, 4, -1, 2, 1, -5, 4] (indices 0-8)

Recursion Tree for Maximum Subarray
Call	Range	Left	Right	Cross	Result
Level 0	[0..8]	Recurse	Recurse	Calculate	max(L,R,C)
Level 1 L	[0..4]	Recurse	Recurse	Calculate	max(L,R,C)
Level 1 R	[5..8]	Recurse	Recurse	Calculate	max(L,R,C)
Level 2 LL	[0..2]	Recurse	Recurse	Calculate	-1
Level 2 LR	[3..4]	Recurse	Recurse	Calculate	4
Level 2 RL	[5..6]	Recurse	Recurse	Calculate	3
Level 2 RR	[7..8]	Recurse	Recurse	Calculate	4
... base cases	...	...	...	...	...

Detailed trace for root call [0..8]:

Solving [0..8], mid=4
├── Left [0..4] returns max_sum = 4 (subarray [3..3] = [4])
├── Right [5..8] returns max_sum = 4 (subarray [8..8] = [4])
└── Cross:
    ├── Left suffix from mid=4 backward:
    │   i=4: sum=-1, max=-1
    │   i=3: sum=-1+4=3, max=3 ★
    │   i=2: sum=3-3=0
    │   i=1: sum=0+1=1
    │   i=0: sum=1-2=-1
    │   Best suffix: [3..4] = [4,-1], sum=3
    ├── Right prefix from mid+1=5 forward:
    │   i=5: sum=2, max=2
    │   i=6: sum=2+1=3, max=3 ★
    │   i=7: sum=3-5=-2
    │   i=8: sum=-2+4=2
    │   Best prefix: [5..6] = [2,1], sum=3
    └── Cross sum: 3 + 3 = 6

max(4, 4, 6) = 6 ★
Cross-boundary wins! Answer: [3..6] = [4,-1,2,1], sum=6

Trace Your Own Examples

The best way to internalize this algorithm is to trace it manually on small arrays. Try arrays of length 4-6 with various patterns of positive and negative numbers. Notice how the cross-boundary computation "bridges" the gap between recursive solutions.

Summary: Mastering Cross-Boundary Handling

Key Takeaways

•Cross-boundary handling is non-trivial: Unlike merge sort's merge, we can't just combine results—we must find new subarrays not seen by either recursive call.
•Structure exploitation: Cross-boundary subarrays = suffix + prefix, allowing independent optimization.
•O(n) cross-boundary computation: Scan left from mid and right from mid+1, tracking running maxima.
•Correctness through independence: Choosing the best suffix doesn't affect which prefix is optimal.
•Three-way comparison: Return max(recursively-left, recursively-right, cross-boundary).
•Base case: single element: Return that element as both sum and indices.
•This pattern generalizes: Closest pair of points, counting inversions, and more use similar cross-boundary techniques.

What's Next:

With the complete algorithm understood, we'll now analyze its complexity rigorously. Using the Master Theorem, we'll prove the O(n log n) bound and develop intuition for why this is the best we can achieve with D&C for this problem.

Page Complete

You now understand the heart of the D&C approach to maximum subarray: the cross-boundary handling technique. You can implement the complete algorithm, understand its correctness, and trace its execution. Next, we'll analyze why this achieves O(n log n) complexity.