We've developed the divide and conquer algorithm for maximum subarray. But how efficient is it really? Claiming "it's faster than brute force" isn't enough—we need precise bounds that let us predict performance and compare with alternatives.
In this page, we'll rigorously derive the O(n log n) time complexity. Along the way, you'll master the technique of recurrence relations and see how the Master Theorem provides a powerful shortcut for analyzing D&C algorithms.
By the end of this page, you'll be able to:
1. Write the recurrence relation for the D&C maximum subarray algorithm.
2. Apply the Master Theorem to solve it.
3. Develop intuition for why O(n log n) arises.
4. Understand the space complexity implications.
5. Compare with brute force and anticipate where further improvement might come from.
A recurrence relation expresses the running time of an algorithm in terms of the running time on smaller inputs. For divide and conquer algorithms, this naturally captures the recursive structure.
Let T(n) denote the running time of our algorithm on an input of size n. We analyze what happens at each level of recursion.
Analyze each step:
1. Divide (compute midpoint): `mid = (low + high) // 2` takes constant time, O(1).
2. Conquer (recursive calls): two recursive calls, each on a half of size n/2, contribute 2T(n/2).
3. Combine (cross-boundary + comparison): finding the maximum crossing subarray scans the full range in O(n), and the final three-way comparison adds O(1).
4. Base case: a single-element subarray is returned directly in O(1) time.
The Recurrence:
Combining these observations:
$$T(n) = \begin{cases} O(1) & \text{if } n = 1 \\ 2T(n/2) + O(n) & \text{if } n > 1 \end{cases}$$
Or more precisely, for a constant c > 0:
$$T(n) = 2T(n/2) + cn$$
where c is a constant: the base case costs at most c, and the divide-and-combine work on an input of size n is at most cn.
This is exactly the form amenable to the Master Theorem.
The O(n) cross-boundary work is what determines the overall complexity. If the cross-boundary step took Θ(n²), the recurrence would solve to Θ(n²), no better than the quadratic brute force. The efficient linear-time cross-boundary computation is what makes D&C competitive.
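To see the recurrence concretely, here is a minimal sketch (a hypothetical helper, taking c = 1) that expands T(n) = 2T(n/2) + cn numerically and compares the result with n log₂ n; for powers of two it produces exactly n(log₂ n + 1):

```python
import math

def recurrence_work(n):
    """Total work generated by T(n) = 2T(n/2) + n with T(1) = 1 (i.e. c = 1)."""
    if n <= 1:
        return 1  # base case cost
    half = n // 2
    # two subproblems of (roughly) half the size, plus linear combine work
    return recurrence_work(half) + recurrence_work(n - half) + n

for n in [2 ** k for k in range(4, 14)]:
    print(f"n = {n:5d}   T(n) = {recurrence_work(n):7d}   n*log2(n) ≈ {n * math.log2(n):9.0f}")
```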
The Master Theorem provides a direct formula for solving recurrences of the form:
$$T(n) = aT(n/b) + f(n)$$
where a ≥ 1 is the number of subproblems, b > 1 is the factor by which the input size shrinks in each subproblem, and f(n) is the cost of dividing the problem and combining the results.
The theorem compares f(n) with n^(log_b(a)) to determine the overall complexity.
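For reference, the standard statement of the three cases (here ε denotes some constant greater than zero, and Case 3 additionally requires the regularity condition a·f(n/b) ≤ k·f(n) for some constant k < 1):

$$T(n) = \begin{cases} \Theta(n^{\log_b a}) & \text{if } f(n) = O(n^{\log_b a - \epsilon}) \\ \Theta(n^{\log_b a} \log n) & \text{if } f(n) = \Theta(n^{\log_b a}) \\ \Theta(f(n)) & \text{if } f(n) = \Omega(n^{\log_b a + \epsilon}) \end{cases}$$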
Applying to Maximum Subarray:
Our recurrence: T(n) = 2T(n/2) + cn
Parameters: a = 2 (two recursive calls), b = 2 (each on half the input), f(n) = cn (the combine work).
Compute the critical exponent: n^(log_b(a)) = n^(log₂2) = n¹ = n.
Compare f(n) with n^(log_b(a)): f(n) = cn = Θ(n) = Θ(n^(log_b(a))), so the combine work and the critical term grow at the same rate.
This is Case 2 of the Master Theorem.
By Case 2 of the Master Theorem: T(n) = Θ(n^(log_b(a)) × log n) = Θ(n × log n) = Θ(n log n). The D&C maximum subarray algorithm runs in O(n log n) time.
The Master Theorem gives us the answer, but let's develop deep intuition for why this complexity arises. Understanding the "why" helps you predict complexity for new algorithms.
The Recursion Tree Perspective:
Visualize the algorithm as a tree where each node represents a subproblem:
```text
                       [n]                 Level 0: 1 problem of size n
                      /   \
               [n/2]         [n/2]         Level 1: 2 problems of size n/2
               /   \         /   \
           [n/4] [n/4]   [n/4] [n/4]       Level 2: 4 problems of size n/4
             ⋮     ⋮       ⋮     ⋮
         [1][1][1]  ...  [1][1][1]         Level log₂n: n problems of size 1
```
Work at each level:
Key observation: Every level does exactly cn work!
| Level | Problems | Size Each | Work per Problem | Total Work |
|---|---|---|---|---|
| 0 | 1 | n | cn | cn |
| 1 | 2 | n/2 | c(n/2) | cn |
| 2 | 4 | n/4 | c(n/4) | cn |
| 3 | 8 | n/8 | c(n/8) | cn |
| ... | ... | ... | ... | cn |
| log₂n | n | 1 | c | cn |
Total work: cn per level × (1 + log₂n) levels = Θ(n log n).
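For n a power of two, the same count written as a sum over the levels of the recursion tree is:

$$T(n) = \sum_{i=0}^{\log_2 n} cn = cn(\log_2 n + 1) = \Theta(n \log n)$$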
This is the essence of Master Theorem Case 2: when the work at each level is the same, we multiply the per-level work by the number of levels.
Case 2 occurs at a "balance point" where the reduction in problem size exactly compensates for the increase in the number of subproblems. This balance (a = b together with linear combine work, in our case) is characteristic of 'n log n' algorithms like merge sort.
For completeness, let's solve the recurrence directly using the substitution method. This builds skills for recurrences where the Master Theorem doesn't apply.
Recurrence: T(n) = 2T(n/2) + cn, T(1) = c
Guess: T(n) = O(n log n)
Proof by strong induction:
Inductive Hypothesis: Assume T(k) ≤ dk log k for all k < n, for some constant d to be determined.
Inductive Step: Prove T(n) ≤ dn log n.
$$T(n) = 2T(n/2) + cn$$ $$\leq 2 \cdot d(n/2)\log(n/2) + cn \quad \text{(by IH)}$$ $$= dn \cdot \log(n/2) + cn$$ $$= dn \cdot (\log n - \log 2) + cn$$ $$= dn \cdot (\log n - 1) + cn$$ $$= dn \log n - dn + cn$$ $$= dn \log n - (d - c)n$$ $$\leq dn \log n \quad \text{(provided } d \geq c \text{)}$$
Base Case: For small n, T(n) is bounded by a constant, so the inequality can be satisfied by choosing d large enough (in practice one takes n = 2 and n = 3 as base cases, since log 1 = 0).
Conclusion: T(n) = O(n log n). ∎
A symmetric argument shows T(n) = Ω(n log n). Together, this gives T(n) = Θ(n log n)—the algorithm is asymptotically tight at this complexity; we can't do better with D&C and O(n) combine work.
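For the matching lower bound, the inductive step mirrors the one above: assuming T(k) ≥ d'k log k for all k < n with a constant d' chosen small enough,

$$T(n) = 2T(n/2) + cn \geq 2 \cdot d'(n/2)\log(n/2) + cn = d'n\log n - d'n + cn \geq d'n\log n \quad \text{(provided } d' \leq c\text{)}$$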
Time complexity isn't the only resource to consider. Let's analyze the space used by our algorithm.
Sources of space usage:
1. Input Storage: O(n)
2. Recursion Stack: O(log n)
3. Cross-Boundary Variables: O(1)
Total Additional Space: O(log n)
This compares favorably to algorithms that require O(n) auxiliary space (like merge sort, which needs an auxiliary array for merging).
| Component | Space | Notes |
|---|---|---|
| Input array | O(n) | Required regardless of algorithm |
| Recursion stack | O(log n) | log₂n depth × O(1) per frame |
| Cross-boundary vars | O(1) | Just counters and running sums |
| Total extra space | O(log n) | Dominated by recursion depth |
The O(log n) stack space comes from recursion. An iterative bottom-up D&C implementation could reduce this to O(1) extra space, though it's more complex to implement. Kadane's algorithm (which we'll see next) achieves O(1) space naturally.
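As a quick check on the stack usage, here is a small sketch (a hypothetical helper, not part of the algorithm) that follows the same splitting pattern and reports the deepest recursion level reached, which matches ⌈log₂n⌉:

```python
import math

def max_recursion_depth(n, depth=0):
    """Deepest level reached when a problem of size n is split as in the D&C algorithm."""
    if n <= 1:
        return depth  # base case: no further recursion
    half = n // 2
    # the two recursive calls mirror the left/right halves of the array
    return max(max_recursion_depth(half, depth + 1),
               max_recursion_depth(n - half, depth + 1))

for n in [10, 100, 1_000, 10_000]:
    print(f"n = {n:6d}   depth = {max_recursion_depth(n):2d}   ceil(log2 n) = {math.ceil(math.log2(n))}")
```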
Let's place the D&C algorithm in context by comparing it with other approaches to maximum subarray.
| Algorithm | Time | Space | Key Insight |
|---|---|---|---|
| Brute Force (cubic) | O(n³) | O(1) | Enumerate all pairs, sum each |
| Brute Force (quadratic) | O(n²) | O(1) | Running sum optimization |
| Divide & Conquer | O(n log n) | O(log n) | Cross-boundary decomposition |
| Kadane's Algorithm | O(n) | O(1) | DP / greedy insight |
Practical performance differences:
For n = 1,000,000, the rough operation counts are:
- O(n³) brute force: about 10¹⁸ operations, completely infeasible (decades even at a billion operations per second)
- O(n²) brute force: about 10¹² operations, on the order of minutes to hours
- O(n log n) divide & conquer: about 2 × 10⁷ operations, tens of milliseconds or less
- O(n) Kadane's algorithm: about 10⁶ operations, around a millisecond or less
The D&C algorithm is practical for large inputs, though Kadane's algorithm still does roughly log₂n ≈ 20 times less work at this size, and that gap keeps growing as n increases.
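The figures above can be reproduced directly; note these are order-of-magnitude operation counts implied by the asymptotic formulas, not measured running times:

```python
import math

n = 1_000_000
print(f"O(n^3)      ≈ {n ** 3:.1e} operations")            # cubic brute force
print(f"O(n^2)      ≈ {n ** 2:.1e} operations")            # quadratic brute force
print(f"O(n log n)  ≈ {n * math.log2(n):.1e} operations")  # divide & conquer
print(f"O(n)        ≈ {n:.1e} operations")                 # Kadane's algorithm
```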
Why isn't D&C optimal here?
The fundamental issue is that D&C solves a problem that's "easier" than the algorithm realizes:
D&C doesn't share information across subproblems: Each recursive call is independent; information discovered in the left half doesn't help solve the right half.
The problem has "overlapping subproblems": Many subproblems share common structure that D&C recomputes.
There's a simpler recurrence: Kadane's algorithm exploits the fact that max subarray ending at position i depends only on max subarray ending at position i-1.
D&C is like using a sledgehammer when a screwdriver suffices—it works, but it's not the right tool.
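The "simpler recurrence" mentioned in the last point, which the next page develops into Kadane's algorithm, can be written as (using best(i) for the maximum sum of a subarray ending at position i and A for the input array):

$$\text{best}(i) = \max\big(A[i],\ \text{best}(i-1) + A[i]\big)$$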
D&C isn't always optimal even when it works. For maximum subarray, the O(n log n) D&C solution is beaten by Kadane's O(n) algorithm. The value of studying D&C here is learning the technique for problems where it IS optimal (like closest pair of points).
To appreciate the D&C approach fully, let's understand when O(n log n) is actually optimal for D&C algorithms.
What makes problems like merge sort or closest pair of points different from maximum subarray?
In problems where D&C is optimal, there's typically:
1. No simpler recurrence relating adjacent positions or prefixes of the input.
2. Genuine independence between the halves: information discovered in one half doesn't shortcut the work in the other.
Maximum subarray lacks these properties: Kadane's algorithm shows a simpler recurrence exists, and information (the running maximum) does propagate usefully across positions.
The fact that D&C isn't optimal for maximum subarray is part of its pedagogical value. It teaches you to always ask: 'Is there a simpler approach?' rather than reflexively applying D&C because the problem can be divided.
Asymptotic complexity tells only part of the story. In practice, constant factors, cache behavior, and recursion overhead matter.
```python
def find_maximum_subarray_optimized(arr, low, high, threshold=32):
    """
    Optimized D&C with threshold for switching to brute force.
    For small subproblems, brute force has lower overhead.
    """
    # Switch to brute force for small subproblems
    if high - low + 1 <= threshold:
        return brute_force_max_subarray(arr, low, high)

    # Standard D&C for larger problems
    mid = (low + high) // 2
    left = find_maximum_subarray_optimized(arr, low, mid, threshold)
    right = find_maximum_subarray_optimized(arr, mid + 1, high, threshold)
    cross = find_max_crossing_subarray(arr, low, mid, high)

    if left[2] >= right[2] and left[2] >= cross[2]:
        return left
    elif right[2] >= cross[2]:
        return right
    else:
        return cross


def brute_force_max_subarray(arr, low, high):
    """O(n²) brute force for small ranges."""
    max_sum = arr[low]
    max_left, max_right = low, low
    for i in range(low, high + 1):
        current_sum = 0
        for j in range(i, high + 1):
            current_sum += arr[j]
            if current_sum > max_sum:
                max_sum = current_sum
                max_left, max_right = i, j
    return (max_left, max_right, max_sum)


# Empirical measurement
import time

def benchmark(func, arr, iterations=100):
    """Benchmark a function."""
    start = time.perf_counter()
    for _ in range(iterations):
        func(arr, 0, len(arr) - 1)
    return (time.perf_counter() - start) / iterations * 1000  # ms


# Compare pure D&C vs hybrid
import random

arr = list(range(-500, 500))  # 1000 elements
random.shuffle(arr)

# Results typically show: hybrid is 10-30% faster for moderate n
```

The optimal threshold depends on hardware and implementation. Common values range from 16 to 64. Profile on your target hardware to tune this parameter. Most standard library implementations (e.g., for sorting) use similar hybrid approaches.
What's Next:
We've established that D&C gives O(n log n) for maximum subarray. But an O(n) algorithm exists! In the next page, we'll explore Kadane's algorithm and understand why D&C, while correct and reasonable, isn't the best approach for this particular problem.
You now understand the rigorous complexity analysis of the D&C maximum subarray algorithm. You can derive O(n log n) using both the Master Theorem and direct substitution, understand why this bound arises, and appreciate its practical implications. Next, we'll see why D&C isn't the final word on this problem.