Quick Sort as D&C - Learning Module

Loading content...

0/276

Divide: Partition Around Pivot

The Engine of Quick Sort

If quick sort's D&C structure is its skeleton, then partitioning is its beating heart. The partition operation is where quick sort diverges from every other sorting algorithm—where it does the real work of ordering elements. Understanding partition deeply is not optional for mastering quick sort; it is essential.

The partition operation must accomplish something remarkable in a single linear-time pass: take an unsorted array segment and transform it so that one element (the pivot) reaches its final sorted position forever. Every partition call permanently places one element, and the subproblems it creates never touch that element again.

This page dissects the partition operation with surgical precision. We'll examine multiple partitioning schemes, their invariants, their subtle differences, and the critical question of pivot selection. By the end, you will not just understand partition—you will be able to implement it from first principles, debug it with confidence, and reason about its performance implications.

What You Will Master

By the end of this page, you will understand both the Lomuto and Hoare partition schemes at a deep mechanical level. You will know their invariants, their trade-offs, and when to choose each. You will understand why pivot selection is critical and how different strategies affect performance. Most importantly, you will see partition as the key that unlocks quick sort's practical dominance.

The Partition Problem

Before examining specific partition algorithms, we must precisely define what partition must accomplish.

Formal Problem Statement:

Given an array A[low...high] and a pivot element p selected from within this range:

Rearrange elements so that:
- All elements less than p appear to the left of p's final position
- All elements greater than p appear to the right of p's final position
- The pivot p is at some final index q
Return the index q where the pivot now resides

Post-condition (The Partition Invariant):

After partition(A, low, high) returns q:

∀i ∈ [low, q-1]: A[i] < A[q]
∀i ∈ [q+1, high]: A[i] ≥ A[q]
A[q] = pivot (in its final sorted position)

This invariant is remarkably strong. It guarantees that the pivot is correctly placed for all time—no future recursive call will ever move it. This permanence is what enables quick sort's in-place operation and trivial combine phase.

Why Partition is Non-Trivial:

Partition seems simple in concept, but the implementation must satisfy multiple constraints simultaneously:

Linear time: Must process each element exactly once (O(n))
In-place: Should use O(1) auxiliary space (no temporary arrays)
Maintain invariant: Must correctly separate less-than and greater-than regions throughout execution
Handle equal elements: Must correctly handle elements equal to pivot (stability considerations)
Handle edge cases: Must work for arrays with all identical elements, single elements, or already-sorted data

Different partition schemes prioritize these constraints differently, leading to subtle but important behavioral differences.

Partition Requirements
Requirement	Importance	Challenge
Linear time O(n)	Critical—defines D&C complexity	Must avoid nested loops or rescanning
In-place operation	Enables O(log n) space	Must track regions without auxiliary arrays
Correct invariant	Correctness of entire algorithm	Must maintain through all swaps
Handle duplicates	Prevents worst-case on repeated values	Different schemes handle differently
Minimal swaps	Practical performance	Trade-off with code simplicity

The Lomuto Partition Scheme

The Lomuto partition scheme, popularized by Nico Lomuto, is the more commonly taught version due to its simplicity and clarity. It selects the last element as the pivot and maintains a clear two-region invariant.

Algorithm Overview:

Select the last element A[high] as the pivot
Maintain two regions using index i (boundary) and j (scanner)
Everything in A[low...i] is less than pivot
Everything in A[i+1...j-1] is greater than or equal to pivot
A[j...high-1] is the unprocessed region
Final swap places pivot at position i+1

lomuto_partition.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
def lomuto_partition(arr, low, high):
    """
    Lomuto Partition Scheme
    
    Invariant maintained during execution:
    - arr[low...i]:      Elements < pivot
    - arr[i+1...j-1]:    Elements >= pivot
    - arr[j...high-1]:   Unprocessed elements
    - arr[high]:         Pivot (not yet in final position)
    
    Visual representation of the invariant:
    +------------------+------------------+------------------+-------+
    |    < pivot       |    >= pivot      |   unprocessed    | pivot |
    +------------------+------------------+------------------+-------+
    low              i                 j-1                 high-1   high
    """
    pivot = arr[high]  # Last element as pivot
    i = low - 1        # Boundary of "less than" region (starts before array)
    
    for j in range(low, high):  # Scan through array (excluding pivot)
        if arr[j] < pivot:
            # Element belongs in "less than" region
            i += 1              # Expand "less than" region
            arr[i], arr[j] = arr[j], arr[i]  # Swap element into place
    
    # Place pivot in its final position (right after "less than" region)
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    
    return i + 1  # Return pivot's final index
 
 
# Step-by-step trace for [5, 2, 8, 1, 9, 4, 7, 3, 6]
# Pivot = 6, i = -1
#
# j=0: arr[0]=5 < 6? Yes. i=0. Swap arr[0]↔arr[0]. Array: [5,2,8,1,9,4,7,3,6]
# j=1: arr[1]=2 < 6? Yes. i=1. Swap arr[1]↔arr[1]. Array: [5,2,8,1,9,4,7,3,6]
# j=2: arr[2]=8 < 6? No. Continue.
# j=3: arr[3]=1 < 6? Yes. i=2. Swap arr[2]↔arr[3]. Array: [5,2,1,8,9,4,7,3,6]
# j=4: arr[4]=9 < 6? No. Continue.
# j=5: arr[5]=4 < 6? Yes. i=3. Swap arr[3]↔arr[5]. Array: [5,2,1,4,9,8,7,3,6]
# j=6: arr[6]=7 < 6? No. Continue.
# j=7: arr[7]=3 < 6? Yes. i=4. Swap arr[4]↔arr[7]. Array: [5,2,1,4,3,8,7,9,6]
# 
# Final: Swap arr[5]↔arr[8] (pivot). Array: [5,2,1,4,3,6,7,9,8]
# Return 5 (pivot's final position)

Lomuto's Invariant Analysis:

The genius of Lomuto's scheme lies in its simple loop invariant. At the end of each iteration of the for loop:

arr[low...i] contains only elements strictly less than pivot
arr[i+1...j] contains only elements greater than or equal to pivot
arr[j+1...high-1] has not been examined yet
arr[high] is the pivot, untouched until the end

This invariant is easy to verify:

Initialization: Before the loop, i = low-1, so both "less than" and "greater-or-equal" regions are empty.
Maintenance: Each iteration either extends the "less than" region (by incrementing i and swapping) or extends the "greater-or-equal" region (by just advancing j).
Termination: When j = high, all elements except pivot have been classified and placed correctly.

Why Lomuto is Taught First

Lomuto partition has a single, forward-moving pointer j that examines each element exactly once. The boundary pointer i only moves during swaps. This unidirectional flow is easy to visualize and reason about. The invariant is simple to state and verify. For educational purposes, Lomuto is ideal—even if Hoare's scheme is more efficient in practice.

The Hoare Partition Scheme

The Hoare partition scheme, invented by Tony Hoare (quick sort's creator himself), predates Lomuto and is more efficient in practice. It uses two pointers that move toward each other from opposite ends of the array.

Algorithm Overview:

Select a pivot (typically the first element or middle element)
Initialize left pointer i at low-1, right pointer j at high+1
Move i right until finding an element ≥ pivot
Move j left until finding an element ≤ pivot
If i < j, swap arr[i] and arr[j] and repeat
When pointers cross, j marks the partition boundary

Critical Difference from Lomuto: Hoare's scheme doesn't place the pivot in its final position! It only guarantees that elements in arr[low...j] are ≤ pivot and elements in arr[j+1...high] are ≥ pivot. The recursive calls must be adjusted accordingly.

hoare_partition.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
def hoare_partition(arr, low, high):
    """
    Hoare Partition Scheme
    
    Invariant maintained:
    - Elements at indices < i are <= pivot
    - Elements at indices > j are >= pivot
    - When i >= j, partition is complete
    
    KEY DIFFERENCE: Pivot does NOT end up in its final sorted position!
    The partition point j defines the boundary, not the pivot location.
    
    Visual representation:
    +------------------------+------------------------+
    |      <= pivot          |      >= pivot          |
    +------------------------+------------------------+
    low                    j  j+1                   high
    """
    pivot = arr[low]  # First element as pivot (common choice for Hoare)
    i = low - 1       # Left pointer (will move right)
    j = high + 1      # Right pointer (will move left)
    
    while True:
        # Move i right, stopping at element >= pivot
        while True:
            i += 1
            if arr[i] >= pivot:
                break
        
        # Move j left, stopping at element <= pivot
        while True:
            j -= 1
            if arr[j] <= pivot:
                break
        
        # If pointers have crossed, we're done
        if i >= j:
            return j  # j is the partition boundary
        
        # Swap elements that are on wrong sides
        arr[i], arr[j] = arr[j], arr[i]
 
 
def quick_sort_hoare(arr, low, high):
    """
    Quick sort using Hoare partition.
    NOTE: Recursive calls differ from Lomuto version!
    
    With Hoare: we recur on [low...p] and [p+1...high]
    NOT on [low...p-1] and [p+1...high] like Lomuto
    """
    if low < high:
        p = hoare_partition(arr, low, high)
        quick_sort_hoare(arr, low, p)      # Include p in left partition
        quick_sort_hoare(arr, p + 1, high)
 
 
# Step-by-step trace for [5, 2, 8, 1, 9, 4, 7, 3, 6]
# Pivot = 5, i = -1, j = 9
#
# Iteration 1:
#   Move i right: i=0, arr[0]=5 >= 5? Yes. Stop at i=0.
#   Move j left: j=8, arr[8]=6 <= 5? No. j=7, arr[7]=3 <= 5? Yes. Stop at j=7.
#   i=0 < j=7? Yes. Swap arr[0]↔arr[7]. Array: [3,2,8,1,9,4,7,5,6]
#
# Iteration 2:
#   Move i right: i=1, arr[1]=2 >= 5? No. i=2, arr[2]=8 >= 5? Yes. Stop at i=2.
#   Move j left: j=6, arr[6]=7 <= 5? No. j=5, arr[5]=4 <= 5? Yes. Stop at j=5.
#   i=2 < j=5? Yes. Swap arr[2]↔arr[5]. Array: [3,2,4,1,9,8,7,5,6]
#
# Iteration 3:
#   Move i right: i=3, arr[3]=1 >= 5? No. i=4, arr[4]=9 >= 5? Yes. Stop at i=4.
#   Move j left: j=4, arr[4]=9 <= 5? No. j=3, arr[3]=1 <= 5? Yes. Stop at j=3.
#   i=4 >= j=3? Yes. Return j=3.
#
# After partition: [3,2,4,1 | 9,8,7,5,6]
# Left partition (indices 0-3): all elements <= 5
# Right partition (indices 4-8): all elements >= 5

Critical Difference: Pivot Position

Unlike Lomuto, Hoare's partition does NOT place the pivot in its final sorted position. The pivot value may end up anywhere within the array. What Hoare guarantees is weaker: all elements in the left partition are ≤ pivot, all elements in the right partition are ≥ pivot. This affects how recursive calls are structured.

Why Hoare is More Efficient:

Hoare's scheme performs approximately 3x fewer swaps than Lomuto on average. This difference becomes significant for large arrays:

Aspect	Lomuto	Hoare
Average swaps	n/2	n/6
Pointer movements	n (single direction)	n (both directions)
Comparisons	n	n
Pivot placement	Final position	Not final position
Equal elements	Degrades to O(n²)	Handles well

The swap reduction comes from Hoare's bidirectional approach: instead of moving one element at a time into the correct partition, it finds pairs of misplaced elements and swaps them simultaneously.

Handling Equal Elements:

Lomuto's "< pivot" condition causes problems with many duplicate elements—all equal elements end up in the right partition, causing imbalance. Hoare's "<= pivot" and ">= pivot" conditions distribute equal elements more evenly between partitions.

Comparing Partition Schemes

Understanding when to use each partition scheme requires analyzing their trade-offs across multiple dimensions.

Lomuto Advantages

•Simpler invariant: Single forward-moving pointer makes logic easier to follow and debug
•Pivot placement: Pivot ends up in its final sorted position, simplifying recursive structure
•Easier to modify: Three-way partitioning and other variants are simpler to derive from Lomuto
•Tail recursion: Can be optimized to process smaller partition with recursion, larger with iteration
•Educational value: Preferred for teaching due to clarity of execution flow

Hoare Advantages

•Fewer swaps: Approximately 3x fewer swaps on average, reducing memory write operations
•Better cache behavior: Bidirectional scanning can be more cache-friendly for large arrays
•Handles duplicates better: Equal elements distributed across partitions rather than clustered
•Historical significance: Original scheme designed by Hoare alongside quick sort itself
•Practical performance: Generally faster in benchmarks for most input distributions

The Duplicate Element Problem:

One of the most significant practical differences emerges with arrays containing many duplicate elements:

Lomuto with duplicates: Consider partitioning [3, 3, 3, 3, 3] with pivot 3.

The condition arr[j] < pivot is never true
All elements go into the "≥ pivot" partition
Partition returns index 0 (pivot at position 0)
Subproblems: [] and [3, 3, 3, 3]
This creates maximally imbalanced partitions: T(n) = T(0) + T(n-1) + n = O(n²)

Hoare with duplicates: Consider the same array [3, 3, 3, 3, 3].

First arr[i] >= pivot at i=0, first arr[j] <= pivot at j=4
Swap and continue until pointers meet near middle
Partition returns approximately middle index
Subproblems are roughly balanced: T(n) ≈ 2T(n/2) + n = O(n log n)

When to Use Which

Use Lomuto when: teaching, implementing three-way partitioning, or when simplicity matters more than raw speed. Use Hoare when: maximum performance is needed, dealing with potentially repetitive data, or implementing for production systems. Modern standard libraries typically use Hoare-style schemes or sophisticated hybrids.

Pivot Selection Strategies

The choice of pivot element is the single most important factor in quick sort's performance. A poor pivot can degrade O(n log n) to O(n²), while a good pivot ensures balanced partitions. Let's examine the major strategies.

Strategy 1: First/Last Element

The simplest approach—select the first or last element as pivot.

pivot_strategies.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
# Strategy 1: Fixed position (first or last)
def partition_last(arr, low, high):
    pivot = arr[high]  # Last element
    # ... Lomuto partition
    
# PROBLEM: Worst case on sorted/reverse-sorted input!
# If arr = [1,2,3,4,5], pivot = 5 (largest)
# All elements go left, partition is maximally unbalanced
# Recurrence: T(n) = T(n-1) + O(n) = O(n²)
 
 
# Strategy 2: Random selection
import random
 
def partition_random(arr, low, high):
    # Select random element and swap to end
    rand_idx = random.randint(low, high)
    arr[rand_idx], arr[high] = arr[high], arr[rand_idx]
    pivot = arr[high]
    # ... Lomuto partition
    
# ADVANTAGE: Expected O(n log n) for ANY input
# No adversary can construct worst-case input
# Probability of worst case: 1/n! for sorted input
 
 
# Strategy 3: Median-of-Three
def median_of_three(arr, low, high):
    """Select median of first, middle, and last elements."""
    mid = (low + high) // 2
    
    # Sort the three candidates
    if arr[low] > arr[mid]:
        arr[low], arr[mid] = arr[mid], arr[low]
    if arr[mid] > arr[high]:
        arr[mid], arr[high] = arr[high], arr[mid]
    if arr[low] > arr[mid]:
        arr[low], arr[mid] = arr[mid], arr[low]
    
    # Now: arr[low] <= arr[mid] <= arr[high]
    # Median is at mid, swap to high-1 for partition
    arr[mid], arr[high - 1] = arr[high - 1], arr[mid]
    return arr[high - 1]
 
# ADVANTAGE: Handles sorted input well (median is middle element)
# Provides reasonable pivot for most distributions
# O(1) overhead, no random number generation
 
 
# Strategy 4: Ninther (Median of Medians of Three)
def ninther(arr, low, high):
    """
    Select ninther for very large arrays.
    Take median-of-three from three groups:
    - First third
    - Middle third  
    - Last third
    Then take median of those three medians.
    """
    n = high - low + 1
    if n < 9:
        return median_of_three(arr, low, high)
    
    third = n // 3
    m1 = median_of_three_value(arr, low, low + third - 1)
    m2 = median_of_three_value(arr, low + third, high - third)
    m3 = median_of_three_value(arr, high - third + 1, high)
    
    return median_value(m1, m2, m3)
 
# ADVANTAGE: Better approximation of true median
# Reduces probability of worst case further
# Used in production implementations (pdqsort, introsort)

Pivot Selection Strategy Comparison
Strategy	Best Case	Worst Case	Average	Overhead
First/Last	O(n log n)	O(n²) on sorted	O(n log n)	O(1)
Random	O(n log n)	O(n²) very rare	O(n log n)	O(1) + RNG
Median-of-3	O(n log n)	O(n²) on killers	O(n log n)	O(1) + 3 swaps
Ninther	O(n log n)	O(n²) extremely rare	O(n log n)	O(1) + 9 comparisons
True Median	O(n log n)	O(n log n)	O(n log n)	O(n)—too expensive

Median-of-Three Killers

Even median-of-three selection is vulnerable to specially crafted 'killer' sequences that create worst-case behavior. These sequences are constructed by analyzing the median-of-three selection algorithm and placing elements to consistently produce unbalanced partitions. This is why production implementations combine multiple strategies—randomization defeats killers, median-of-three handles common cases efficiently.

Three-Way Partitioning (Dutch National Flag)

When arrays contain many duplicate elements, standard two-way partitioning can perform poorly. Three-way partitioning (also known as the Dutch National Flag algorithm, after Dijkstra's formulation) solves this by creating three regions: less than pivot, equal to pivot, and greater than pivot.

Why Three-Way Matters:

Consider sorting [2, 1, 3, 2, 2, 3, 1, 2, 2] where 2 appears 5 times.

With two-way partitioning:

Each 2 is processed individually in recursive calls
Total calls proportional to n
May still hit O(n²) if unlucky with pivot selection

With three-way partitioning:

All 2s are grouped together in one partition
Equal elements are excluded from recursive calls entirely
Achieves O(n) for arrays with all identical elements

three_way_partition.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
def three_way_partition(arr, low, high):
    """
    Three-Way Partitioning (Dutch National Flag)
    
    Creates three regions:
    1. arr[low...lt-1]:   Elements < pivot
    2. arr[lt...gt]:      Elements == pivot (final positions!)
    3. arr[gt+1...high]:  Elements > pivot
    
    Returns: (lt, gt) - boundaries of the equal region
    
    Visual representation:
    +------------------+------------------+------------------+
    |    < pivot       |    == pivot      |    > pivot       |
    +------------------+------------------+------------------+
    low              lt-1  lt          gt  gt+1           high
    
    After partition:
    - All elements equal to pivot are in FINAL positions
    - Recursive calls only on < pivot and > pivot regions
    """
    pivot = arr[low]  # Use first element as pivot
    lt = low          # Everything < lt is strictly less than pivot
    gt = high         # Everything > gt is strictly greater than pivot
    i = low           # Current element being examined
    
    while i <= gt:
        if arr[i] < pivot:
            # Element belongs in "less than" region
            arr[lt], arr[i] = arr[i], arr[lt]
            lt += 1
            i += 1
        elif arr[i] > pivot:
            # Element belongs in "greater than" region
            arr[gt], arr[i] = arr[i], arr[gt]
            gt -= 1
            # Note: don't increment i (need to examine swapped element)
        else:
            # Element equals pivot, already in correct region
            i += 1
    
    return lt, gt
 
 
def quick_sort_three_way(arr, low, high):
    """Quick sort using three-way partitioning."""
    if low >= high:
        return
    
    lt, gt = three_way_partition(arr, low, high)
    
    # Elements in arr[lt...gt] are already sorted (all equal)
    # Only recurse on the strictly less and strictly greater regions
    quick_sort_three_way(arr, low, lt - 1)   # Elements < pivot
    quick_sort_three_way(arr, gt + 1, high)  # Elements > pivot
 
 
# Example trace for [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
# Let's say pivot = 5
#
# After three-way partition:
# [3, 1, 4, 1, 2, 3] [5, 5, 5] [9, 6]
#        < 5            = 5      > 5
#
# The three 5s will NEVER be touched again
# Recursive calls only on the non-equal regions

When to Use Three-Way

Three-way partitioning shines when data has many duplicates—arrays with limited distinct values, data with heavy repetition patterns, or cases where elements are complex objects compared by a single key. Modern implementations detect duplicate patterns and switch to three-way partitioning adaptively.

Partition Invariants: The Foundation of Correctness

The correctness of quick sort rests entirely on partition invariants. Understanding these invariants deeply enables you to implement partition correctly, debug failures, and design new variations.

Loop Invariant Proof Structure:

For any partition algorithm, we must prove:

Initialization: The invariant holds before the first iteration
Maintenance: If the invariant holds before an iteration, it holds after
Termination: When the loop ends, the invariant plus the exit condition give us the postcondition

Lomuto Invariant Proof:

lomuto_invariant_proof.txt
LOMUTO PARTITION LOOP INVARIANT:
At the start of each iteration of the for loop:
 
1. arr[low...i] contains elements strictly LESS than pivot
2. arr[i+1...j-1] contains elements GREATER OR EQUAL to pivot
3. arr[high] is the pivot (unchanged)
 
INITIALIZATION (before first iteration, j = low):
- i = low - 1, so arr[low...i] = arr[low...low-1] = empty ✓
- arr[i+1...j-1] = arr[low...low-1] = empty ✓
- arr[high] = pivot ✓
 
MAINTENANCE (assuming invariant holds at start of iteration):
Case 1: arr[j] >= pivot
  - j increments, arr[i+1...j] now includes arr[j]
  - arr[j] >= pivot, so property 2 maintained ✓
  - i unchanged, property 1 unchanged ✓
 
Case 2: arr[j] < pivot
  - i increments
  - swap arr[i] with arr[j]
  - Before swap: arr[j] < pivot (condition we checked)
  - Before swap: arr[i] >= pivot (from invariant 2, since i+1 was in >= region)
  - After swap: arr[i] < pivot (satisfies property 1) ✓
  - After swap: arr[j] >= pivot (satisfies property 2) ✓
  - j increments, ready for next iteration
 
TERMINATION (j = high):
- arr[low...i] < pivot (from invariant 1)
- arr[i+1...high-1] >= pivot (from invariant 2)  
- arr[high] = pivot
 
Final swap: arr[i+1] ↔ arr[high]
- arr[i+1] >= pivot (from invariant 2)
- After swap: pivot at arr[i+1], the >= element at arr[high]
 
POSTCONDITION:
- arr[low...i] < arr[i+1] (pivot) ≤ arr[i+2...high]
- Partition complete with pivot at its final sorted position ✓

Debugging Using Invariants:

When partition fails, check the invariant at each step:

Print array state before and after each swap
Verify invariant properties at each iteration
Identify the first iteration where an invariant is violated
Examine what condition led to the violation

Common bugs and their symptoms:

Bug	Symptom	Invariant Violated
Wrong comparison (`<=` vs `<`)	Infinite loop or incorrect partition	Elements in wrong region
Off-by-one in loop bound	Missing element or array overflow	Termination condition
Wrong index in swap	Elements duplicated or lost	Both regions corrupted
Pivot position incorrect	Pivot not in final position	Postcondition

By tracing invariants, you can identify exactly where the algorithm deviates from correctness.

Invariants as Design Tools

Invariants aren't just proof techniques—they're design tools. When implementing partition, start by writing down the invariant you want to maintain. Then write code that maintains it. The invariant guides implementation and makes debugging systematic rather than ad-hoc.

Summary: Partition as Quick Sort's Divide

We've thoroughly examined the partition operation—quick sort's divide phase. Let's consolidate the key insights:

Key Takeaways

•Partition is the heart of quick sort—it performs the ordering work that makes combination trivial.
•Lomuto partition is simpler but performs more swaps; it places the pivot in its final position.
•Hoare partition is more efficient (fewer swaps) but doesn't place the pivot in its final position; it requires adjusted recursive calls.
•Pivot selection critically affects performance—random or median-of-three strategies prevent worst-case on common inputs.
•Three-way partitioning handles duplicate elements efficiently by grouping equals together and excluding them from recursion.
•Loop invariants are the foundation of partition correctness; understanding them enables debugging and design of new variants.

What's Next:

With partition understood as the divide phase, we move to the conquer phase: sorting partitions recursively. We'll examine the recursive structure of quick sort, understand its call stack behavior, and explore optimizations like tail recursion elimination.

Page Complete

You now understand partition at a deep level—multiple schemes, their invariants, pivot selection strategies, and three-way partitioning for duplicates. You can implement partition from first principles, choose the appropriate scheme for your context, and debug partition failures systematically. Next, we'll explore the conquer phase of quick sort.

Divide: Partition Around Pivot

The Engine of Quick Sort

What You Will Master

The Partition Problem

Before examining specific partition algorithms, we must precisely define what partition must accomplish.

Formal Problem Statement:

Given an array A[low...high] and a pivot element p selected from within this range:

Rearrange elements so that:
- All elements less than p appear to the left of p's final position
- All elements greater than p appear to the right of p's final position
- The pivot p is at some final index q
Return the index q where the pivot now resides

Post-condition (The Partition Invariant):

After partition(A, low, high) returns q:

∀i ∈ [low, q-1]: A[i] < A[q]
∀i ∈ [q+1, high]: A[i] ≥ A[q]
A[q] = pivot (in its final sorted position)

Why Partition is Non-Trivial:

Partition seems simple in concept, but the implementation must satisfy multiple constraints simultaneously:

Linear time: Must process each element exactly once (O(n))
In-place: Should use O(1) auxiliary space (no temporary arrays)
Maintain invariant: Must correctly separate less-than and greater-than regions throughout execution
Handle equal elements: Must correctly handle elements equal to pivot (stability considerations)
Handle edge cases: Must work for arrays with all identical elements, single elements, or already-sorted data

Different partition schemes prioritize these constraints differently, leading to subtle but important behavioral differences.

Partition Requirements
Requirement	Importance	Challenge
Linear time O(n)	Critical—defines D&C complexity	Must avoid nested loops or rescanning
In-place operation	Enables O(log n) space	Must track regions without auxiliary arrays
Correct invariant	Correctness of entire algorithm	Must maintain through all swaps
Handle duplicates	Prevents worst-case on repeated values	Different schemes handle differently
Minimal swaps	Practical performance	Trade-off with code simplicity

The Lomuto Partition Scheme

Algorithm Overview:

Select the last element A[high] as the pivot
Maintain two regions using index i (boundary) and j (scanner)
Everything in A[low...i] is less than pivot
Everything in A[i+1...j-1] is greater than or equal to pivot
A[j...high-1] is the unprocessed region
Final swap places pivot at position i+1

lomuto_partition.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
def lomuto_partition(arr, low, high):
    """
    Lomuto Partition Scheme
    
    Invariant maintained during execution:
    - arr[low...i]:      Elements < pivot
    - arr[i+1...j-1]:    Elements >= pivot
    - arr[j...high-1]:   Unprocessed elements
    - arr[high]:         Pivot (not yet in final position)
    
    Visual representation of the invariant:
    +------------------+------------------+------------------+-------+
    |    < pivot       |    >= pivot      |   unprocessed    | pivot |
    +------------------+------------------+------------------+-------+
    low              i                 j-1                 high-1   high
    """
    pivot = arr[high]  # Last element as pivot
    i = low - 1        # Boundary of "less than" region (starts before array)
    
    for j in range(low, high):  # Scan through array (excluding pivot)
        if arr[j] < pivot:
            # Element belongs in "less than" region
            i += 1              # Expand "less than" region
            arr[i], arr[j] = arr[j], arr[i]  # Swap element into place
    
    # Place pivot in its final position (right after "less than" region)
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    
    return i + 1  # Return pivot's final index
 
 
# Step-by-step trace for [5, 2, 8, 1, 9, 4, 7, 3, 6]
# Pivot = 6, i = -1
#
# j=0: arr[0]=5 < 6? Yes. i=0. Swap arr[0]↔arr[0]. Array: [5,2,8,1,9,4,7,3,6]
# j=1: arr[1]=2 < 6? Yes. i=1. Swap arr[1]↔arr[1]. Array: [5,2,8,1,9,4,7,3,6]
# j=2: arr[2]=8 < 6? No. Continue.
# j=3: arr[3]=1 < 6? Yes. i=2. Swap arr[2]↔arr[3]. Array: [5,2,1,8,9,4,7,3,6]
# j=4: arr[4]=9 < 6? No. Continue.
# j=5: arr[5]=4 < 6? Yes. i=3. Swap arr[3]↔arr[5]. Array: [5,2,1,4,9,8,7,3,6]
# j=6: arr[6]=7 < 6? No. Continue.
# j=7: arr[7]=3 < 6? Yes. i=4. Swap arr[4]↔arr[7]. Array: [5,2,1,4,3,8,7,9,6]
# 
# Final: Swap arr[5]↔arr[8] (pivot). Array: [5,2,1,4,3,6,7,9,8]
# Return 5 (pivot's final position)

Lomuto's Invariant Analysis:

The genius of Lomuto's scheme lies in its simple loop invariant. At the end of each iteration of the for loop:

arr[low...i] contains only elements strictly less than pivot
arr[i+1...j] contains only elements greater than or equal to pivot
arr[j+1...high-1] has not been examined yet
arr[high] is the pivot, untouched until the end

This invariant is easy to verify:

Initialization: Before the loop, i = low-1, so both "less than" and "greater-or-equal" regions are empty.
Maintenance: Each iteration either extends the "less than" region (by incrementing i and swapping) or extends the "greater-or-equal" region (by just advancing j).
Termination: When j = high, all elements except pivot have been classified and placed correctly.

Why Lomuto is Taught First

The Hoare Partition Scheme

Algorithm Overview:

Select a pivot (typically the first element or middle element)
Initialize left pointer i at low-1, right pointer j at high+1
Move i right until finding an element ≥ pivot
Move j left until finding an element ≤ pivot
If i < j, swap arr[i] and arr[j] and repeat
When pointers cross, j marks the partition boundary

hoare_partition.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
def hoare_partition(arr, low, high):
    """
    Hoare Partition Scheme
    
    Invariant maintained:
    - Elements at indices < i are <= pivot
    - Elements at indices > j are >= pivot
    - When i >= j, partition is complete
    
    KEY DIFFERENCE: Pivot does NOT end up in its final sorted position!
    The partition point j defines the boundary, not the pivot location.
    
    Visual representation:
    +------------------------+------------------------+
    |      <= pivot          |      >= pivot          |
    +------------------------+------------------------+
    low                    j  j+1                   high
    """
    pivot = arr[low]  # First element as pivot (common choice for Hoare)
    i = low - 1       # Left pointer (will move right)
    j = high + 1      # Right pointer (will move left)
    
    while True:
        # Move i right, stopping at element >= pivot
        while True:
            i += 1
            if arr[i] >= pivot:
                break
        
        # Move j left, stopping at element <= pivot
        while True:
            j -= 1
            if arr[j] <= pivot:
                break
        
        # If pointers have crossed, we're done
        if i >= j:
            return j  # j is the partition boundary
        
        # Swap elements that are on wrong sides
        arr[i], arr[j] = arr[j], arr[i]
 
 
def quick_sort_hoare(arr, low, high):
    """
    Quick sort using Hoare partition.
    NOTE: Recursive calls differ from Lomuto version!
    
    With Hoare: we recur on [low...p] and [p+1...high]
    NOT on [low...p-1] and [p+1...high] like Lomuto
    """
    if low < high:
        p = hoare_partition(arr, low, high)
        quick_sort_hoare(arr, low, p)      # Include p in left partition
        quick_sort_hoare(arr, p + 1, high)
 
 
# Step-by-step trace for [5, 2, 8, 1, 9, 4, 7, 3, 6]
# Pivot = 5, i = -1, j = 9
#
# Iteration 1:
#   Move i right: i=0, arr[0]=5 >= 5? Yes. Stop at i=0.
#   Move j left: j=8, arr[8]=6 <= 5? No. j=7, arr[7]=3 <= 5? Yes. Stop at j=7.
#   i=0 < j=7? Yes. Swap arr[0]↔arr[7]. Array: [3,2,8,1,9,4,7,5,6]
#
# Iteration 2:
#   Move i right: i=1, arr[1]=2 >= 5? No. i=2, arr[2]=8 >= 5? Yes. Stop at i=2.
#   Move j left: j=6, arr[6]=7 <= 5? No. j=5, arr[5]=4 <= 5? Yes. Stop at j=5.
#   i=2 < j=5? Yes. Swap arr[2]↔arr[5]. Array: [3,2,4,1,9,8,7,5,6]
#
# Iteration 3:
#   Move i right: i=3, arr[3]=1 >= 5? No. i=4, arr[4]=9 >= 5? Yes. Stop at i=4.
#   Move j left: j=4, arr[4]=9 <= 5? No. j=3, arr[3]=1 <= 5? Yes. Stop at j=3.
#   i=4 >= j=3? Yes. Return j=3.
#
# After partition: [3,2,4,1 | 9,8,7,5,6]
# Left partition (indices 0-3): all elements <= 5
# Right partition (indices 4-8): all elements >= 5

Critical Difference: Pivot Position

Why Hoare is More Efficient:

Hoare's scheme performs approximately 3x fewer swaps than Lomuto on average. This difference becomes significant for large arrays:

Aspect	Lomuto	Hoare
Average swaps	n/2	n/6
Pointer movements	n (single direction)	n (both directions)
Comparisons	n	n
Pivot placement	Final position	Not final position
Equal elements	Degrades to O(n²)	Handles well

The swap reduction comes from Hoare's bidirectional approach: instead of moving one element at a time into the correct partition, it finds pairs of misplaced elements and swaps them simultaneously.

Handling Equal Elements:

Comparing Partition Schemes

Understanding when to use each partition scheme requires analyzing their trade-offs across multiple dimensions.

Lomuto Advantages

•Simpler invariant: Single forward-moving pointer makes logic easier to follow and debug
•Pivot placement: Pivot ends up in its final sorted position, simplifying recursive structure
•Easier to modify: Three-way partitioning and other variants are simpler to derive from Lomuto
•Tail recursion: Can be optimized to process smaller partition with recursion, larger with iteration
•Educational value: Preferred for teaching due to clarity of execution flow

Hoare Advantages

•Fewer swaps: Approximately 3x fewer swaps on average, reducing memory write operations
•Better cache behavior: Bidirectional scanning can be more cache-friendly for large arrays
•Handles duplicates better: Equal elements distributed across partitions rather than clustered
•Historical significance: Original scheme designed by Hoare alongside quick sort itself
•Practical performance: Generally faster in benchmarks for most input distributions

The Duplicate Element Problem:

One of the most significant practical differences emerges with arrays containing many duplicate elements:

Lomuto with duplicates: Consider partitioning [3, 3, 3, 3, 3] with pivot 3.

The condition arr[j] < pivot is never true
All elements go into the "≥ pivot" partition
Partition returns index 0 (pivot at position 0)
Subproblems: [] and [3, 3, 3, 3]
This creates maximally imbalanced partitions: T(n) = T(0) + T(n-1) + n = O(n²)

Hoare with duplicates: Consider the same array [3, 3, 3, 3, 3].

First arr[i] >= pivot at i=0, first arr[j] <= pivot at j=4
Swap and continue until pointers meet near middle
Partition returns approximately middle index
Subproblems are roughly balanced: T(n) ≈ 2T(n/2) + n = O(n log n)

When to Use Which

Pivot Selection Strategies

Strategy 1: First/Last Element

The simplest approach—select the first or last element as pivot.

pivot_strategies.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
# Strategy 1: Fixed position (first or last)
def partition_last(arr, low, high):
    pivot = arr[high]  # Last element
    # ... Lomuto partition
    
# PROBLEM: Worst case on sorted/reverse-sorted input!
# If arr = [1,2,3,4,5], pivot = 5 (largest)
# All elements go left, partition is maximally unbalanced
# Recurrence: T(n) = T(n-1) + O(n) = O(n²)
 
 
# Strategy 2: Random selection
import random
 
def partition_random(arr, low, high):
    # Select random element and swap to end
    rand_idx = random.randint(low, high)
    arr[rand_idx], arr[high] = arr[high], arr[rand_idx]
    pivot = arr[high]
    # ... Lomuto partition
    
# ADVANTAGE: Expected O(n log n) for ANY input
# No adversary can construct worst-case input
# Probability of worst case: 1/n! for sorted input
 
 
# Strategy 3: Median-of-Three
def median_of_three(arr, low, high):
    """Select median of first, middle, and last elements."""
    mid = (low + high) // 2
    
    # Sort the three candidates
    if arr[low] > arr[mid]:
        arr[low], arr[mid] = arr[mid], arr[low]
    if arr[mid] > arr[high]:
        arr[mid], arr[high] = arr[high], arr[mid]
    if arr[low] > arr[mid]:
        arr[low], arr[mid] = arr[mid], arr[low]
    
    # Now: arr[low] <= arr[mid] <= arr[high]
    # Median is at mid, swap to high-1 for partition
    arr[mid], arr[high - 1] = arr[high - 1], arr[mid]
    return arr[high - 1]
 
# ADVANTAGE: Handles sorted input well (median is middle element)
# Provides reasonable pivot for most distributions
# O(1) overhead, no random number generation
 
 
# Strategy 4: Ninther (Median of Medians of Three)
def ninther(arr, low, high):
    """
    Select ninther for very large arrays.
    Take median-of-three from three groups:
    - First third
    - Middle third  
    - Last third
    Then take median of those three medians.
    """
    n = high - low + 1
    if n < 9:
        return median_of_three(arr, low, high)
    
    third = n // 3
    m1 = median_of_three_value(arr, low, low + third - 1)
    m2 = median_of_three_value(arr, low + third, high - third)
    m3 = median_of_three_value(arr, high - third + 1, high)
    
    return median_value(m1, m2, m3)
 
# ADVANTAGE: Better approximation of true median
# Reduces probability of worst case further
# Used in production implementations (pdqsort, introsort)

Pivot Selection Strategy Comparison
Strategy	Best Case	Worst Case	Average	Overhead
First/Last	O(n log n)	O(n²) on sorted	O(n log n)	O(1)
Random	O(n log n)	O(n²) very rare	O(n log n)	O(1) + RNG
Median-of-3	O(n log n)	O(n²) on killers	O(n log n)	O(1) + 3 swaps
Ninther	O(n log n)	O(n²) extremely rare	O(n log n)	O(1) + 9 comparisons
True Median	O(n log n)	O(n log n)	O(n log n)	O(n)—too expensive

Median-of-Three Killers

Three-Way Partitioning (Dutch National Flag)

Why Three-Way Matters:

Consider sorting [2, 1, 3, 2, 2, 3, 1, 2, 2] where 2 appears 5 times.

With two-way partitioning:

Each 2 is processed individually in recursive calls
Total calls proportional to n
May still hit O(n²) if unlucky with pivot selection

With three-way partitioning:

All 2s are grouped together in one partition
Equal elements are excluded from recursive calls entirely
Achieves O(n) for arrays with all identical elements

three_way_partition.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
def three_way_partition(arr, low, high):
    """
    Three-Way Partitioning (Dutch National Flag)
    
    Creates three regions:
    1. arr[low...lt-1]:   Elements < pivot
    2. arr[lt...gt]:      Elements == pivot (final positions!)
    3. arr[gt+1...high]:  Elements > pivot
    
    Returns: (lt, gt) - boundaries of the equal region
    
    Visual representation:
    +------------------+------------------+------------------+
    |    < pivot       |    == pivot      |    > pivot       |
    +------------------+------------------+------------------+
    low              lt-1  lt          gt  gt+1           high
    
    After partition:
    - All elements equal to pivot are in FINAL positions
    - Recursive calls only on < pivot and > pivot regions
    """
    pivot = arr[low]  # Use first element as pivot
    lt = low          # Everything < lt is strictly less than pivot
    gt = high         # Everything > gt is strictly greater than pivot
    i = low           # Current element being examined
    
    while i <= gt:
        if arr[i] < pivot:
            # Element belongs in "less than" region
            arr[lt], arr[i] = arr[i], arr[lt]
            lt += 1
            i += 1
        elif arr[i] > pivot:
            # Element belongs in "greater than" region
            arr[gt], arr[i] = arr[i], arr[gt]
            gt -= 1
            # Note: don't increment i (need to examine swapped element)
        else:
            # Element equals pivot, already in correct region
            i += 1
    
    return lt, gt
 
 
def quick_sort_three_way(arr, low, high):
    """Quick sort using three-way partitioning."""
    if low >= high:
        return
    
    lt, gt = three_way_partition(arr, low, high)
    
    # Elements in arr[lt...gt] are already sorted (all equal)
    # Only recurse on the strictly less and strictly greater regions
    quick_sort_three_way(arr, low, lt - 1)   # Elements < pivot
    quick_sort_three_way(arr, gt + 1, high)  # Elements > pivot
 
 
# Example trace for [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
# Let's say pivot = 5
#
# After three-way partition:
# [3, 1, 4, 1, 2, 3] [5, 5, 5] [9, 6]
#        < 5            = 5      > 5
#
# The three 5s will NEVER be touched again
# Recursive calls only on the non-equal regions

When to Use Three-Way

Partition Invariants: The Foundation of Correctness

The correctness of quick sort rests entirely on partition invariants. Understanding these invariants deeply enables you to implement partition correctly, debug failures, and design new variations.

Loop Invariant Proof Structure:

For any partition algorithm, we must prove:

Initialization: The invariant holds before the first iteration
Maintenance: If the invariant holds before an iteration, it holds after
Termination: When the loop ends, the invariant plus the exit condition give us the postcondition

Lomuto Invariant Proof:

lomuto_invariant_proof.txt
LOMUTO PARTITION LOOP INVARIANT:
At the start of each iteration of the for loop:
 
1. arr[low...i] contains elements strictly LESS than pivot
2. arr[i+1...j-1] contains elements GREATER OR EQUAL to pivot
3. arr[high] is the pivot (unchanged)
 
INITIALIZATION (before first iteration, j = low):
- i = low - 1, so arr[low...i] = arr[low...low-1] = empty ✓
- arr[i+1...j-1] = arr[low...low-1] = empty ✓
- arr[high] = pivot ✓
 
MAINTENANCE (assuming invariant holds at start of iteration):
Case 1: arr[j] >= pivot
  - j increments, arr[i+1...j] now includes arr[j]
  - arr[j] >= pivot, so property 2 maintained ✓
  - i unchanged, property 1 unchanged ✓
 
Case 2: arr[j] < pivot
  - i increments
  - swap arr[i] with arr[j]
  - Before swap: arr[j] < pivot (condition we checked)
  - Before swap: arr[i] >= pivot (from invariant 2, since i+1 was in >= region)
  - After swap: arr[i] < pivot (satisfies property 1) ✓
  - After swap: arr[j] >= pivot (satisfies property 2) ✓
  - j increments, ready for next iteration
 
TERMINATION (j = high):
- arr[low...i] < pivot (from invariant 1)
- arr[i+1...high-1] >= pivot (from invariant 2)  
- arr[high] = pivot
 
Final swap: arr[i+1] ↔ arr[high]
- arr[i+1] >= pivot (from invariant 2)
- After swap: pivot at arr[i+1], the >= element at arr[high]
 
POSTCONDITION:
- arr[low...i] < arr[i+1] (pivot) ≤ arr[i+2...high]
- Partition complete with pivot at its final sorted position ✓

Debugging Using Invariants:

When partition fails, check the invariant at each step:

Print array state before and after each swap
Verify invariant properties at each iteration
Identify the first iteration where an invariant is violated
Examine what condition led to the violation

Common bugs and their symptoms:

Bug	Symptom	Invariant Violated
Wrong comparison (`<=` vs `<`)	Infinite loop or incorrect partition	Elements in wrong region
Off-by-one in loop bound	Missing element or array overflow	Termination condition
Wrong index in swap	Elements duplicated or lost	Both regions corrupted
Pivot position incorrect	Pivot not in final position	Postcondition

By tracing invariants, you can identify exactly where the algorithm deviates from correctness.

Invariants as Design Tools

Summary: Partition as Quick Sort's Divide

We've thoroughly examined the partition operation—quick sort's divide phase. Let's consolidate the key insights:

Key Takeaways

•Partition is the heart of quick sort—it performs the ordering work that makes combination trivial.
•Lomuto partition is simpler but performs more swaps; it places the pivot in its final position.
•Hoare partition is more efficient (fewer swaps) but doesn't place the pivot in its final position; it requires adjusted recursive calls.
•Pivot selection critically affects performance—random or median-of-three strategies prevent worst-case on common inputs.
•Three-way partitioning handles duplicate elements efficiently by grouping equals together and excluding them from recursion.
•Loop invariants are the foundation of partition correctness; understanding them enables debugging and design of new variants.

What's Next:

Page Complete