In algorithm analysis, there's a profound difference between "O(n log n) on average" and "O(n log n) guaranteed." The former means most inputs behave well; the latter means ALL inputs behave well. Heap sort belongs to the second, more exclusive category.
This distinction matters enormously in practice. Systems that must handle adversarial input, real-time constraints, or high-stakes computations cannot afford the occasional O(n²) performance spike. Heap sort's guaranteed O(n log n) performance makes it suitable for contexts where predictability is paramount.
By the end of this page, you will understand exactly why heap sort achieves O(n log n) in all cases, how to prove this bound rigorously, why no input pattern can degrade heap sort's performance, and when guaranteed performance matters in practice.
To understand why heap sort's O(n log n) bound is guaranteed, we need to examine what makes some algorithms have variable performance and why heaps avoid this variability.
Why Quick Sort Has Variable Performance:
Quick sort's performance depends on pivot selection:
- A pivot near the median splits the array into two nearly equal halves, giving O(n log n) overall.
- A pivot near the minimum or maximum produces splits of size 0 and n − 1; repeated bad pivots degrade the sort to O(n²).
- Pivot quality depends on the input's values, so certain inputs (e.g., sorted data with a first-element pivot) reliably trigger the bad case.
Why Heap Sort's Performance Is Fixed:
Heap sort's operations don't depend on input patterns:
- Build-heap always processes the same ⌊n/2⌋ internal nodes in the same order.
- Each sift-down is bounded by the height of the tree, ⌊log₂ n⌋, no matter what values it encounters.
- Extraction always swaps the root with the last element and sifts down from index 0, exactly n − 1 times.
The key insight is that heap shape is value-independent. A heap of n elements always has the same structure—height ⌊log₂ n⌋, same parent-child relationships—regardless of the values stored.
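To make this concrete, here is a small sketch (the helper names are illustrative, not from any particular library) showing that every structural fact about an array-backed heap is pure index arithmetic on n:

```python
def heap_shape(n: int) -> dict:
    """Structural facts about an n-element binary heap.

    No element values appear anywhere in this function: the shape
    is determined by n alone, which is why no input pattern can
    change heapify's worst-case path length.
    """
    return {
        "height": n.bit_length() - 1,  # = floor(log2 n) for n >= 1
        "internal_nodes": n // 2,      # nodes with at least one child
        "leaves": n - n // 2,          # indices n//2 .. n-1 are leaves
    }

def parent(i: int) -> int:
    return (i - 1) // 2  # value-independent index arithmetic

def left_child(i: int) -> int:
    return 2 * i + 1

if __name__ == "__main__":
    # A 15-element heap always has height 3 and 8 leaves,
    # whether it stores [1..15] or any shuffled values.
    print(heap_shape(15))  # {'height': 3, 'internal_nodes': 7, 'leaves': 8}
```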
| Algorithm | Performance Depends On | Can Be Adversarial? |
|---|---|---|
| Quick Sort | Pivot quality | Yes — adversary can force O(n²) |
| Insertion Sort | Initial order | Yes — reverse order is O(n²) |
| Merge Sort | Nothing (fixed pattern) | No — always O(n log n) |
| Heap Sort | Nothing (fixed pattern) | No — always O(n log n) |
| Tim Sort | Initial order | No — worst case is O(n log n) |
Heap sort is 'comparison-oblivious' in a loose sense: its high-level schedule doesn't depend on comparison outcomes. It always builds the heap bottom-up, always extracts from the root, and always heapifies along some root-to-leaf path. Comparison outcomes determine WHICH path (and whether it terminates early), but never a path's maximum LENGTH.
Let's prove heap sort's O(n log n) bound with mathematical rigor. We'll analyze both phases separately, then combine.
Phase 1: Build-Heap is O(n)
We've covered this in the previous page, but let's formalize:
Let h = ⌊log₂ n⌋ be the height of the heap.
At height k (leaves are height 0):
- there are at most ⌈n/2^(k+1)⌉ nodes, and
- heapifying each one costs O(k), since a sift-down from height k moves at most k levels.
Total work: $$T_{build}(n) = \sum_{k=0}^{h} \left\lceil \frac{n}{2^{k+1}} \right\rceil \cdot O(k)$$
Simplifying (dropping floors/ceilings for asymptotic analysis): $$T_{build}(n) = O\left( n \sum_{k=0}^{h} \frac{k}{2^{k+1}} \right) = O\left( n \cdot \frac{1}{2} \sum_{k=0}^{\infty} \frac{k}{2^k} \right)$$
The infinite sum converges: $\sum_{k=0}^{\infty} \frac{k}{2^k} = 2$
Therefore: $T_{build}(n) = O(n \cdot 1) = O(n)$
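As a sanity check (not a proof), we can evaluate the exact sum from the analysis and watch the per-element cost stay bounded. This small script counts "levels moved" as the unit of work:

```python
import math

def build_heap_work_bound(n: int) -> int:
    """Exact value of sum over heights k of ceil(n / 2^(k+1)) * k,
    the node-count-times-height bound from the build-heap analysis."""
    h = math.floor(math.log2(n))
    return sum(math.ceil(n / 2 ** (k + 1)) * k for k in range(h + 1))

for n in [2 ** 10, 2 ** 15, 2 ** 20]:
    work = build_heap_work_bound(n)
    # The ratio stays near 1, i.e., build-heap does O(n) total work.
    print(f"n = {n:>9,}: bound = {work:>9,}, bound / n = {work / n:.3f}")
```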
Phase 2: Extraction is O(n log n)
The extraction phase performs n - 1 extract-max operations:
- the k-th-from-last extraction runs on a heap of k elements, whose height is ⌊log₂ k⌋;
- each extraction swaps the root with the last element and sifts down, costing O(log k).
Total work: $$T_{extract}(n) = \sum_{k=2}^{n} O(\log k) = O\left( \sum_{k=2}^{n} \log k \right) = O(\log n!)$$
Using Stirling's approximation: $\log n! = \Theta(n \log n)$
Therefore: $T_{extract}(n) = O(n \log n)$
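Again we can check numerically: the sketch below sums log₂ k directly and compares it against n log₂ n; the ratio creeps toward 1 as n grows, just as Stirling's approximation predicts:

```python
import math

def extraction_work_bound(n: int) -> float:
    """Sum of log2(k) for k = 2..n, bounding total sift-down work
    across the n - 1 extractions (this is exactly log2(n!))."""
    return sum(math.log2(k) for k in range(2, n + 1))

for n in [1_000, 10_000, 100_000]:
    total = extraction_work_bound(n)
    ratio = total / (n * math.log2(n))
    # Ratio climbs toward 1 as n grows, confirming Theta(n log n).
    print(f"n = {n:>7,}: log2(n!) = {total:>12,.0f}, ratio to n*log2(n) = {ratio:.3f}")
```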
Combined:
$$T_{total}(n) = T_{build}(n) + T_{extract}(n) = O(n) + O(n \log n) = O(n \log n)$$
Since $O(n) \subseteq O(n \log n)$, the extraction phase dominates, giving the final bound.
Lower Bound:
Every comparison-based sort requires Ω(n log n) comparisons. Heap sort achieves this optimal bound (within constant factors), making it asymptotically optimal among comparison sorts.
The sum $\sum_{k=0}^{\infty} \frac{k}{2^k} = 2$ comes from differentiating the geometric series. If $S(x) = \sum_{k=0}^{\infty} x^k = \frac{1}{1-x}$ for $|x| < 1$, then $S'(x) = \sum_{k=1}^{\infty} k x^{k-1} = \frac{1}{(1-x)^2}$. Setting $x = 1/2$ gives $\sum_{k=1}^{\infty} k (1/2)^{k-1} = 4$, so $\sum_{k=1}^{\infty} k (1/2)^k = 2$. This mathematical identity is the foundation of the O(n) build-heap analysis.
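A quick numeric check of the identity, sketched in a few lines (the partial sums approach 2 rapidly):

```python
# Partial sums of k / 2^k converge to 2 quickly:
for terms in [5, 10, 20, 40]:
    partial = sum(k / 2 ** k for k in range(terms + 1))
    print(f"sum of first {terms:>2} terms = {partial:.10f}")
```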
Understanding why heap sort has no pathological inputs deepens our understanding of algorithm analysis.
The Adversary Argument:
Imagine an adversary who knows your sorting algorithm and wants to create the worst possible input. For quick sort:
- knowing the pivot rule, the adversary arranges values so every chosen pivot is an extreme element;
- every partition then splits into sizes 0 and n − 1, and the recursion depth becomes n − 1;
- the result is Θ(n²) comparisons (for a first-element pivot, already-sorted input does the job).
For heap sort, the adversary hits a wall:
- the heap's shape is fixed by n alone, so no arrangement of values changes the tree's height;
- every sift-down is bounded by that height, ⌊log₂ n⌋, regardless of comparison outcomes;
- there is simply no decision point the adversary can bias into extra work.
What Values CAN Affect:
Values affect which child heapify follows (left or right), but:
- both children sit at the same depth, so either choice moves exactly one level down;
- every root-to-leaf path has length at most ⌊log₂ n⌋, so the chosen path is bounded either way.
Values also affect how deep heapify needs to go:
- if the heap property is restored early, the sift-down stops before reaching a leaf;
- early termination only makes heap sort faster, never slower.
But even if we construct an adversarial input where every heapify goes to maximum depth, we still get O(n log n)—that's what our proof assumed!
"""Attempting to construct adversarial inputs for heap sort.Spoiler: It doesn't work because heap structure is value-independent."""import randomimport time def heap_sort(arr: list) -> int: """Heap sort that counts comparisons.""" comparisons = 0 n = len(arr) def heapify(size: int, i: int) -> None: nonlocal comparisons while True: largest = i left = 2 * i + 1 right = 2 * i + 2 if left < size: comparisons += 1 if arr[left] > arr[largest]: largest = left if right < size: comparisons += 1 if arr[right] > arr[largest]: largest = right if largest == i: break arr[i], arr[largest] = arr[largest], arr[i] i = largest # Build heap for i in range((n // 2) - 1, -1, -1): heapify(n, i) # Extract for i in range(n - 1, 0, -1): arr[0], arr[i] = arr[i], arr[0] heapify(i, 0) return comparisons def try_adversarial_inputs(n: int) -> dict: """ Try various input patterns that might be 'adversarial'. None of them significantly affect heap sort's behavior. """ results = {} # Random input random_arr = list(range(n)) random.shuffle(random_arr) results["Random"] = heap_sort(random_arr.copy()) # Already sorted (ascending) sorted_arr = list(range(n)) results["Sorted Ascending"] = heap_sort(sorted_arr.copy()) # Reverse sorted reverse_arr = list(range(n, 0, -1)) results["Sorted Descending"] = heap_sort(reverse_arr.copy()) # All same elements same_arr = [42] * n results["All Same"] = heap_sort(same_arr.copy()) # Almost sorted (one out of place) almost_arr = list(range(n)) almost_arr[0], almost_arr[-1] = almost_arr[-1], almost_arr[0] results["Almost Sorted"] = heap_sort(almost_arr.copy()) # Saw-tooth pattern sawtooth_arr = [] for i in range(n // 10 + 1): sawtooth_arr.extend(range(i * 10, min((i + 1) * 10, n))) sawtooth_arr = sawtooth_arr[:n] results["Sawtooth"] = heap_sort(sawtooth_arr.copy()) # Alternating high-low alternating = [] for i in range(n): alternating.append(n - i if i % 2 == 0 else i) results["Alternating"] = heap_sort(alternating.copy()) # Median at ends (try to make pivots bad for quicksort) median_ends = list(range(n)) mid = n // 2 median_ends[0], median_ends[mid] = median_ends[mid], median_ends[0] results["Median at Ends"] = heap_sort(median_ends.copy()) return results def analyze_variance(n: int, trials: int = 100) -> dict: """ Analyze how much comparison count varies across random inputs. For heap sort, variance should be low. 
""" counts = [] for _ in range(trials): arr = list(range(n)) random.shuffle(arr) counts.append(heap_sort(arr.copy())) avg = sum(counts) / len(counts) variance = sum((c - avg) ** 2 for c in counts) / len(counts) std_dev = variance ** 0.5 return { "n": n, "trials": trials, "min_comparisons": min(counts), "max_comparisons": max(counts), "avg_comparisons": avg, "std_deviation": std_dev, "range_percent": (max(counts) - min(counts)) / avg * 100 } if __name__ == "__main__": n = 1000 print("Adversarial Input Attempts (n = 1000)") print("=" * 50) results = try_adversarial_inputs(n) for pattern, comps in results.items(): print(f" {pattern:20s}: {comps:6d} comparisons") min_comps = min(results.values()) max_comps = max(results.values()) variation = (max_comps - min_comps) / min_comps * 100 print(f"\nVariation between patterns: {variation:.1f}%") print(f"(Compare to quick sort which can have 1000x+ variation)") print("\n" + "=" * 50) print("Random Input Variance Analysis") print("=" * 50) analysis = analyze_variance(1000, 100) print(f" Over {analysis['trials']} random inputs:") print(f" Min comparisons: {analysis['min_comparisons']}") print(f" Max comparisons: {analysis['max_comparisons']}") print(f" Avg comparisons: {analysis['avg_comparisons']:.1f}") print(f" Std deviation: {analysis['std_deviation']:.1f}") print(f" Range: {analysis['range_percent']:.1f}%")When you run the adversarial analysis above, you'll find that all input patterns produce similar comparison counts for heap sort—typically within 10-15% of each other. This is the empirical manifestation of heap sort's guaranteed performance. No matter how hard an adversary tries, they cannot force heap sort into slow behavior.
While heap sort achieves optimal O(n log n) asymptotic performance, its constant factors are higher than quick sort's. Understanding these constants helps explain real-world performance differences.
Comparison Count Analysis:
For heap sort on n elements:
- roughly 2n log₂ n comparisons in the worst case: each sift-down step examines both children (2 comparisons per level), over up to log₂ n levels, across n extractions.

For quick sort (average case):
- roughly 1.39n log₂ n comparisons: the average case makes about 2n ln n comparisons, and 2 ln 2 ≈ 1.39.

For merge sort:
- roughly n log₂ n comparisons: at most n per merge level, over log₂ n levels.
The 2x Factor in Heap Sort:
Every heapify comparison checks two children:
```python
if left < n and arr[left] > arr[largest]:
    largest = left
if right < n and arr[right] > arr[largest]:
    largest = right
```
This is 2 comparisons per level traversed, while:
- merge sort makes roughly 1 comparison per element per merge level;
- quick sort makes 1 comparison per element per partition pass.
The factor of 2 in heap sort's comparisons comes directly from the binary tree structure—we must examine both children to find the maximum.
| Algorithm | Approximate Comparisons | For n = 1,000,000 |
|---|---|---|
| Merge Sort | n log₂ n | ~20 million |
| Quick Sort (avg) | 1.39 n log₂ n | ~28 million |
| Heap Sort | 2 n log₂ n | ~40 million |
| Quick Sort (worst) | n²/2 | ~500 billion! |
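The table's right column is simple arithmetic; here is a quick script to reproduce the estimates (the constants are the approximate ones quoted above, not exact comparison counts):

```python
import math

n = 1_000_000
log_n = math.log2(n)  # about 19.93

estimates = {
    "Merge Sort (n log2 n)":          n * log_n,
    "Quick Sort avg (1.39 n log2 n)": 1.39 * n * log_n,
    "Heap Sort (2 n log2 n)":         2 * n * log_n,
    "Quick Sort worst (n^2 / 2)":     n * n / 2,
}
for name, comps in estimates.items():
    print(f"{name:32s} ~{comps:>18,.0f} comparisons")
```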
Beyond Comparisons: Memory Access Patterns
Comparison counts don't tell the whole story. Cache behavior significantly impacts real-world performance:
Merge Sort:
- streams through its input and output arrays sequentially, which prefetchers and cache lines handle ideally (its main cost is the auxiliary array).

Quick Sort:
- partitioning sweeps contiguous ranges with moving pointers, so almost all accesses hit cache.

Heap Sort:
- heapify jumps from index i to 2i + 1, so successive accesses spread exponentially far apart; once the heap exceeds cache, most of those jumps are misses.
This cache behavior is why heap sort's ~2x comparison disadvantage often translates to a 2-3x wall-clock time disadvantage against quick sort.
Modern CPUs have complex memory hierarchies. An algorithm that's optimal in terms of comparisons might be suboptimal in terms of cache misses. Heap sort trades good cache behavior for guaranteed worst-case performance. This is often an acceptable trade, but it's not free.
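One way to see the locality problem is to print the indices a worst-case sift-down visits: the strides double at each level, so in a heap larger than cache most jumps land on a cold line. A minimal sketch (arbitrarily following the left child):

```python
def sift_down_path(n: int, start: int = 0) -> list[int]:
    """Indices visited by a full-depth sift-down that always descends
    (here we arbitrarily follow the left child). The stride between
    consecutive accesses roughly doubles, defeating spatial locality."""
    path, i = [start], start
    while 2 * i + 1 < n:
        i = 2 * i + 1
        path.append(i)
    return path

path = sift_down_path(1_000_000)
strides = [b - a for a, b in zip(path, path[1:])]
print("indices:", path)     # 0, 1, 3, 7, 15, ...
print("strides:", strides)  # 1, 2, 4, 8, ... — soon far beyond one cache line
```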
Heap sort's guaranteed O(n log n) performance is not just a theoretical nicety—it's crucial in specific application domains. Understanding when to choose heap sort over faster-on-average alternatives is a key engineering skill.
Real-Time Systems:
In real-time systems, meeting deadlines is often more important than average-case speed:
- hard real-time software (flight control, industrial controllers, medical devices) must bound the execution time of every operation;
- a missed deadline is a failure, not just a slowdown;
- worst-case execution time analysis needs algorithms whose worst case, not average case, is acceptable.
For these systems, quick sort's O(n²) worst case is unacceptable even if rare.
Adversarial Environments:
When input comes from untrusted sources:
- network requests, user-supplied files, and public API parameters can all be crafted by an attacker;
- deliberately triggering an algorithm's worst case is a known denial-of-service technique (an "algorithmic complexity attack").
An adversary who knows you use quick sort can craft O(n²) inputs. Heap sort is immune.
Service Level Agreements (SLAs):
Modern cloud services often have SLAs specifying latency percentiles:
- for example, a guarantee that 99% of requests complete within some latency budget;
- tail percentiles (p99, p99.9) are dominated by rare slow operations, which is exactly where a worst-case blowup surfaces.
If sorting is on the critical path, a rare O(n²) quick sort can blow these SLAs.
Heap sort's O(n log n) performance is not just good—it's asymptotically optimal for comparison-based sorting. Let's understand why no comparison sort can do better.
The Decision Tree Argument:
Any comparison-based sort can be modeled as a binary decision tree:
- each internal node is a comparison between two elements;
- each of the two branches is one comparison outcome;
- each leaf is a finished output ordering, and one execution of the algorithm traces a root-to-leaf path.
For n elements:
- there are n! possible input orderings, so the tree needs at least n! leaves;
- a binary tree with n! leaves has height at least log₂(n!);
- that height is the worst-case number of comparisons.
By Stirling's approximation: $$\log_2(n!) = \Theta(n \log n)$$
Therefore, any comparison sort must make at least Ω(n log n) comparisons in the worst case.
Heap Sort Achieves This Bound:
Heap sort performs O(n log n) comparisons, matching the lower bound (up to constant factors). It is therefore asymptotically optimal among comparison sorts.
What About Non-Comparison Sorts?
Counting sort, radix sort, and bucket sort can achieve O(n) time, but with restrictions (a sketch follows this list):
- counting sort needs integer keys drawn from a known, small range;
- radix sort needs fixed-length keys that decompose into digits;
- bucket sort needs input that is roughly uniformly distributed.
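To make the restrictions concrete, here is a minimal counting sort sketch; it achieves O(n + k) only by assuming keys are small non-negative integers (k, the key range, is an assumption a comparison sort never needs):

```python
def counting_sort(arr: list[int], k: int) -> list[int]:
    """Sort non-negative integers in the range [0, k) in O(n + k) time.

    This beats the Omega(n log n) bound only because it never compares
    elements -- it exploits the keys' integer structure, an assumption
    general-purpose sorting cannot make.
    """
    counts = [0] * k
    for x in arr:
        counts[x] += 1
    out = []
    for value, count in enumerate(counts):
        out.extend([value] * count)
    return out

print(counting_sort([3, 1, 4, 1, 5, 9, 2, 6], k=10))  # [1, 1, 2, 3, 4, 5, 6, 9]
```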
For general comparison-based sorting, O(n log n) is the best possible.
"""Understanding the Ω(n log n) lower bound for comparison sorts. The key insight: with n! possible orderings, we need at leastlog₂(n!) bits of information to distinguish them. Each comparisongives us 1 bit, so we need Ω(log(n!)) = Ω(n log n) comparisons."""import math def calculate_lower_bound(n: int) -> dict: """ Calculate the theoretical lower bound for comparison-based sorting. """ # Number of permutations num_permutations = math.factorial(n) # Information-theoretic lower bound # log₂(n!) bits of information needed # Stirling: log(n!) ≈ n log n - n log e + O(log n) log_n_factorial = math.log2(num_permutations) if n > 0 else 0 # Stirling approximation: log₂(n!) ≈ n log₂(n) - n/ln(2) + O(log n) stirling_approx = ( n * math.log2(n) - n / math.log(2) + 0.5 * math.log2(2 * math.pi * n) if n > 0 else 0 ) return { "n": n, "num_permutations": num_permutations if n <= 20 else "very large", "exact_log2_n_factorial": log_n_factorial, "stirling_approximation": stirling_approx, "relative_error_percent": abs(log_n_factorial - stirling_approx) / log_n_factorial * 100 if log_n_factorial > 0 else 0, "lower_bound_comparisons": math.ceil(log_n_factorial), "heap_sort_upper_bound": int(2 * n * math.log2(n)) if n > 1 else 0, "heap_sort_ratio": (2 * n * math.log2(n)) / log_n_factorial if log_n_factorial > 0 else float('inf'), } def demonstrate_optimality(): """ Show that heap sort is within constant factor of optimal. """ print("Comparison Sort Lower Bound Analysis") print("=" * 60) for n in [10, 100, 1000, 10000, 100000]: bounds = calculate_lower_bound(n) print(f"\nn = {n:,}") print(f" Lower bound (log₂(n!)): {bounds['lower_bound_comparisons']:>12,}") print(f" Heap sort upper (2n log n): {bounds['heap_sort_upper_bound']:>12,}") print(f" Ratio (heap/lower): {bounds['heap_sort_ratio']:>12.2f}x") print("\n" + "=" * 60) print("Conclusion: Heap sort is within ~3x of the theoretical minimum.") print("This constant factor is the price of guaranteed worst-case O(n log n).") def information_theory_perspective(): """ Illustrate the information theory behind the lower bound. """ print("\nInformation Theory Perspective") print("=" * 60) print("\nSorting is fundamentally about gaining information.") print("Before sorting: n! equally likely orderings") print("After sorting: exactly 1 ordering known") print("Information gained: log₂(n!) bits") print("\nEach comparison provides at most 1 bit of information.") print("Therefore, we need at least log₂(n!) comparisons.") print("\nBy Stirling's approximation:") print(" log₂(n!) ≈ n log₂(n) - 1.44n + O(log n)") print(" = Θ(n log n)") # Show the bit counts print("\nBits of information needed (exact):") for n in [3, 5, 10, 20]: bits = math.log2(math.factorial(n)) print(f" n = {n:2d}: {bits:.1f} bits (n! = {math.factorial(n):,})") if __name__ == "__main__": demonstrate_optimality() information_theory_perspective()Being asymptotically optimal (matching the lower bound) doesn't mean being practically fastest. Heap sort is optimal at Θ(n log n), but quick sort's average-case Θ(n log n) has smaller constants. The lower bound tells us the shape of the function, not the coefficients.
Theory predicts O(n log n) behavior with consistent performance across input patterns. Let's verify this experimentally.
Measuring Growth Rate:
If T(n) = c·n log n, then T(n) / (n log n) should be approximately constant as n grows. This ratio gives us the leading constant c.
Measuring Variance:
For guaranteed performance, variance across different input patterns should be low. We should see similar times for sorted, reverse-sorted, random, and pathological inputs.
Comparing with Quick Sort:
Quick sort's average case should be faster, but its worst case (sorted input without randomization) should be dramatically slower.
"""Experimental verification of heap sort's O(n log n) guaranteed performance."""import randomimport timeimport math def heap_sort_timed(arr: list) -> tuple[list, float, int]: """Heap sort with timing and comparison counting.""" comparisons = [0] # Use list for nonlocal modification def heapify(arr, n, i): while True: largest = i left = 2 * i + 1 right = 2 * i + 2 if left < n: comparisons[0] += 1 if arr[left] > arr[largest]: largest = left if right < n: comparisons[0] += 1 if arr[right] > arr[largest]: largest = right if largest == i: break arr[i], arr[largest] = arr[largest], arr[i] i = largest n = len(arr) start_time = time.perf_counter() # Build heap for i in range((n // 2) - 1, -1, -1): heapify(arr, n, i) # Extract for i in range(n - 1, 0, -1): arr[0], arr[i] = arr[i], arr[0] heapify(arr, i, 0) elapsed = time.perf_counter() - start_time return arr, elapsed, comparisons[0] def verify_n_log_n_growth(): """Verify that time grows as n log n.""" print("\n" + "=" * 60) print("Verifying O(n log n) Growth Rate") print("=" * 60) sizes = [1000, 2000, 5000, 10000, 20000, 50000, 100000] print(f"{'n':>10} {'Time (ms)':>12} {'Comparisons':>14} {'T/(n log n)':>14}") print("-" * 52) for n in sizes: arr = list(range(n)) random.shuffle(arr) _, elapsed, comparisons = heap_sort_timed(arr.copy()) n_log_n = n * math.log2(n) ratio = (elapsed * 1000) / n_log_n * 10000 # Normalize print(f"{n:>10,} {elapsed*1000:>12.2f} {comparisons:>14,} {ratio:>14.4f}") print("\nIf ratio column is roughly constant, growth is O(n log n) ✓") def verify_input_independence(): """Verify consistent performance across input patterns.""" print("\n" + "=" * 60) print("Verifying Input Pattern Independence") print("=" * 60) n = 50000 patterns = { "Random": lambda: random.sample(range(n), n), "Sorted": lambda: list(range(n)), "Reverse Sorted": lambda: list(range(n, 0, -1)), "All Same": lambda: [42] * n, "Many Duplicates": lambda: [random.randint(0, 10) for _ in range(n)], "Sawtooth": lambda: [i % 100 for i in range(n)], "Few Unique": lambda: [random.randint(0, 3) for _ in range(n)], } results = {} for name, gen_func in patterns.items(): arr = gen_func() _, elapsed, comparisons = heap_sort_timed(arr.copy()) results[name] = (elapsed * 1000, comparisons) print(f"{'Pattern':>18} {'Time (ms)':>12} {'Comparisons':>14}") print("-" * 46) for name, (ms, comps) in results.items(): print(f"{name:>18} {ms:>12.2f} {comps:>14,}") times = [r[0] for r in results.values()] max_variation = (max(times) - min(times)) / min(times) * 100 print(f"\nMax variation in timing: {max_variation:.1f}%") print("Low variation confirms guaranteed performance ✓") def compare_with_quicksort(): """Compare with quick sort to show the tradeoff.""" print("\n" + "=" * 60) print("Comparison: Heap Sort vs Quick Sort (Worst Case)") print("=" * 60) def quicksort_deterministic(arr, low, high, comparisons): """Deterministic quick sort (first element pivot) - bad for sorted.""" if low < high: # Partition pivot = arr[low] i = low + 1 for j in range(low + 1, high + 1): comparisons[0] += 1 if arr[j] < pivot: arr[i], arr[j] = arr[j], arr[i] i += 1 arr[low], arr[i-1] = arr[i-1], arr[low] quicksort_deterministic(arr, low, i - 2, comparisons) quicksort_deterministic(arr, i, high, comparisons) import sys old_limit = sys.getrecursionlimit() sys.setrecursionlimit(10000) # Test on sorted input (worst case for naive quick sort) n = 3000 # Smaller n to avoid stack overflow print(f"\nSorting n = {n} (sorted input - worst case for quick sort)") # Heap sort on sorted input sorted_arr = list(range(n)) 
_, heap_time, heap_comps = heap_sort_timed(sorted_arr.copy()) # Quick sort on sorted input sorted_arr = list(range(n)) qs_comps = [0] start = time.perf_counter() try: quicksort_deterministic(sorted_arr, 0, n - 1, qs_comps) qs_time = time.perf_counter() - start qs_status = "completed" except RecursionError: qs_time = float('inf') qs_status = "stack overflow!" print(f"\n{'Algorithm':>12} {'Time (ms)':>12} {'Comparisons':>14} {'Status':>20}") print("-" * 62) print(f"{'Heap Sort':>12} {heap_time*1000:>12.2f} {heap_comps:>14,} {'O(n log n)':>20}") if qs_time < float('inf'): print(f"{'Quick Sort':>12} {qs_time*1000:>12.2f} {qs_comps[0]:>14,} {'O(n²) worst case':>20}") else: print(f"{'Quick Sort':>12} {'N/A':>12} {'N/A':>14} {qs_status:>20}") print("\nHeap sort's consistent behavior is the payoff for constant factor overhead.") sys.setrecursionlimit(old_limit) if __name__ == "__main__": verify_n_log_n_growth() verify_input_independence() compare_with_quicksort()Running these experiments confirms our theoretical analysis: heap sort's time grows proportionally to n log n, performance barely varies across input patterns, and while slower than optimized quick sort on average, heap sort avoids the catastrophic O(n²) that can hit quick sort on adversarial inputs.
We've thoroughly analyzed heap sort's time complexity guarantees. Let's consolidate the key insights:
- Build-heap runs in O(n); the extraction phase runs in O(n log n) and dominates the total.
- The bound holds for every input: heap structure is value-independent, so no adversary can lengthen a sift-down beyond the tree's height.
- Heap sort matches the Ω(n log n) comparison-sort lower bound, making it asymptotically optimal.
- The price is roughly 2x the comparisons of quick sort's average case and poor cache locality, which is why quick sort usually wins on wall-clock time.
Looking Ahead:
In the final page of this module, we'll explore in-place sorting with heaps—understanding how heap sort achieves O(1) extra space, manipulating heaps destructively for sorting, and the implementation details that make truly in-place heap sort possible.
You now understand why heap sort's O(n log n) guarantee is ironclad, how the heap's value-independent structure prevents adversarial inputs, and when this guarantee matters in practice. Next, we'll explore how heap sort achieves its space efficiency through in-place sorting techniques.