Loading content...
If quick sort's D&C structure is its skeleton, then partitioning is its beating heart. The partition operation is where quick sort diverges from every other sorting algorithm—where it does the real work of ordering elements. Understanding partition deeply is not optional for mastering quick sort; it is essential.
The partition operation must accomplish something remarkable in a single linear-time pass: take an unsorted array segment and transform it so that one element (the pivot) reaches its final sorted position forever. Every partition call permanently places one element, and the subproblems it creates never touch that element again.
This page dissects the partition operation with surgical precision. We'll examine multiple partitioning schemes, their invariants, their subtle differences, and the critical question of pivot selection. By the end, you will not just understand partition—you will be able to implement it from first principles, debug it with confidence, and reason about its performance implications.
By the end of this page, you will understand both the Lomuto and Hoare partition schemes at a deep mechanical level. You will know their invariants, their trade-offs, and when to choose each. You will understand why pivot selection is critical and how different strategies affect performance. Most importantly, you will see partition as the key that unlocks quick sort's practical dominance.
Before examining specific partition algorithms, we must precisely define what partition must accomplish.
Formal Problem Statement:
Given an array A[low...high] and a pivot element p selected from within this range:
Rearrange elements so that:
Return the index q where the pivot now resides
Post-condition (The Partition Invariant):
After partition(A, low, high) returns q:
∀i ∈ [low, q-1]: A[i] < A[q]
∀i ∈ [q+1, high]: A[i] ≥ A[q]
A[q] = pivot (in its final sorted position)
This invariant is remarkably strong. It guarantees that the pivot is correctly placed for all time—no future recursive call will ever move it. This permanence is what enables quick sort's in-place operation and trivial combine phase.
Why Partition is Non-Trivial:
Partition seems simple in concept, but the implementation must satisfy multiple constraints simultaneously:
Different partition schemes prioritize these constraints differently, leading to subtle but important behavioral differences.
| Requirement | Importance | Challenge |
|---|---|---|
| Linear time O(n) | Critical—defines D&C complexity | Must avoid nested loops or rescanning |
| In-place operation | Enables O(log n) space | Must track regions without auxiliary arrays |
| Correct invariant | Correctness of entire algorithm | Must maintain through all swaps |
| Handle duplicates | Prevents worst-case on repeated values | Different schemes handle differently |
| Minimal swaps | Practical performance | Trade-off with code simplicity |
The Lomuto partition scheme, popularized by Nico Lomuto, is the more commonly taught version due to its simplicity and clarity. It selects the last element as the pivot and maintains a clear two-region invariant.
Algorithm Overview:
123456789101112131415161718192021222324252627282930313233343536373839404142434445
def lomuto_partition(arr, low, high): """ Lomuto Partition Scheme Invariant maintained during execution: - arr[low...i]: Elements < pivot - arr[i+1...j-1]: Elements >= pivot - arr[j...high-1]: Unprocessed elements - arr[high]: Pivot (not yet in final position) Visual representation of the invariant: +------------------+------------------+------------------+-------+ | < pivot | >= pivot | unprocessed | pivot | +------------------+------------------+------------------+-------+ low i j-1 high-1 high """ pivot = arr[high] # Last element as pivot i = low - 1 # Boundary of "less than" region (starts before array) for j in range(low, high): # Scan through array (excluding pivot) if arr[j] < pivot: # Element belongs in "less than" region i += 1 # Expand "less than" region arr[i], arr[j] = arr[j], arr[i] # Swap element into place # Place pivot in its final position (right after "less than" region) arr[i + 1], arr[high] = arr[high], arr[i + 1] return i + 1 # Return pivot's final index # Step-by-step trace for [5, 2, 8, 1, 9, 4, 7, 3, 6]# Pivot = 6, i = -1## j=0: arr[0]=5 < 6? Yes. i=0. Swap arr[0]↔arr[0]. Array: [5,2,8,1,9,4,7,3,6]# j=1: arr[1]=2 < 6? Yes. i=1. Swap arr[1]↔arr[1]. Array: [5,2,8,1,9,4,7,3,6]# j=2: arr[2]=8 < 6? No. Continue.# j=3: arr[3]=1 < 6? Yes. i=2. Swap arr[2]↔arr[3]. Array: [5,2,1,8,9,4,7,3,6]# j=4: arr[4]=9 < 6? No. Continue.# j=5: arr[5]=4 < 6? Yes. i=3. Swap arr[3]↔arr[5]. Array: [5,2,1,4,9,8,7,3,6]# j=6: arr[6]=7 < 6? No. Continue.# j=7: arr[7]=3 < 6? Yes. i=4. Swap arr[4]↔arr[7]. Array: [5,2,1,4,3,8,7,9,6]# # Final: Swap arr[5]↔arr[8] (pivot). Array: [5,2,1,4,3,6,7,9,8]# Return 5 (pivot's final position)Lomuto's Invariant Analysis:
The genius of Lomuto's scheme lies in its simple loop invariant. At the end of each iteration of the for loop:
This invariant is easy to verify:
Lomuto partition has a single, forward-moving pointer j that examines each element exactly once. The boundary pointer i only moves during swaps. This unidirectional flow is easy to visualize and reason about. The invariant is simple to state and verify. For educational purposes, Lomuto is ideal—even if Hoare's scheme is more efficient in practice.
The Hoare partition scheme, invented by Tony Hoare (quick sort's creator himself), predates Lomuto and is more efficient in practice. It uses two pointers that move toward each other from opposite ends of the array.
Algorithm Overview:
Critical Difference from Lomuto: Hoare's scheme doesn't place the pivot in its final position! It only guarantees that elements in arr[low...j] are ≤ pivot and elements in arr[j+1...high] are ≥ pivot. The recursive calls must be adjusted accordingly.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778
def hoare_partition(arr, low, high): """ Hoare Partition Scheme Invariant maintained: - Elements at indices < i are <= pivot - Elements at indices > j are >= pivot - When i >= j, partition is complete KEY DIFFERENCE: Pivot does NOT end up in its final sorted position! The partition point j defines the boundary, not the pivot location. Visual representation: +------------------------+------------------------+ | <= pivot | >= pivot | +------------------------+------------------------+ low j j+1 high """ pivot = arr[low] # First element as pivot (common choice for Hoare) i = low - 1 # Left pointer (will move right) j = high + 1 # Right pointer (will move left) while True: # Move i right, stopping at element >= pivot while True: i += 1 if arr[i] >= pivot: break # Move j left, stopping at element <= pivot while True: j -= 1 if arr[j] <= pivot: break # If pointers have crossed, we're done if i >= j: return j # j is the partition boundary # Swap elements that are on wrong sides arr[i], arr[j] = arr[j], arr[i] def quick_sort_hoare(arr, low, high): """ Quick sort using Hoare partition. NOTE: Recursive calls differ from Lomuto version! With Hoare: we recur on [low...p] and [p+1...high] NOT on [low...p-1] and [p+1...high] like Lomuto """ if low < high: p = hoare_partition(arr, low, high) quick_sort_hoare(arr, low, p) # Include p in left partition quick_sort_hoare(arr, p + 1, high) # Step-by-step trace for [5, 2, 8, 1, 9, 4, 7, 3, 6]# Pivot = 5, i = -1, j = 9## Iteration 1:# Move i right: i=0, arr[0]=5 >= 5? Yes. Stop at i=0.# Move j left: j=8, arr[8]=6 <= 5? No. j=7, arr[7]=3 <= 5? Yes. Stop at j=7.# i=0 < j=7? Yes. Swap arr[0]↔arr[7]. Array: [3,2,8,1,9,4,7,5,6]## Iteration 2:# Move i right: i=1, arr[1]=2 >= 5? No. i=2, arr[2]=8 >= 5? Yes. Stop at i=2.# Move j left: j=6, arr[6]=7 <= 5? No. j=5, arr[5]=4 <= 5? Yes. Stop at j=5.# i=2 < j=5? Yes. Swap arr[2]↔arr[5]. Array: [3,2,4,1,9,8,7,5,6]## Iteration 3:# Move i right: i=3, arr[3]=1 >= 5? No. i=4, arr[4]=9 >= 5? Yes. Stop at i=4.# Move j left: j=4, arr[4]=9 <= 5? No. j=3, arr[3]=1 <= 5? Yes. Stop at j=3.# i=4 >= j=3? Yes. Return j=3.## After partition: [3,2,4,1 | 9,8,7,5,6]# Left partition (indices 0-3): all elements <= 5# Right partition (indices 4-8): all elements >= 5Unlike Lomuto, Hoare's partition does NOT place the pivot in its final sorted position. The pivot value may end up anywhere within the array. What Hoare guarantees is weaker: all elements in the left partition are ≤ pivot, all elements in the right partition are ≥ pivot. This affects how recursive calls are structured.
Why Hoare is More Efficient:
Hoare's scheme performs approximately 3x fewer swaps than Lomuto on average. This difference becomes significant for large arrays:
| Aspect | Lomuto | Hoare |
|---|---|---|
| Average swaps | n/2 | n/6 |
| Pointer movements | n (single direction) | n (both directions) |
| Comparisons | n | n |
| Pivot placement | Final position | Not final position |
| Equal elements | Degrades to O(n²) | Handles well |
The swap reduction comes from Hoare's bidirectional approach: instead of moving one element at a time into the correct partition, it finds pairs of misplaced elements and swaps them simultaneously.
Handling Equal Elements:
Lomuto's "< pivot" condition causes problems with many duplicate elements—all equal elements end up in the right partition, causing imbalance. Hoare's "<= pivot" and ">= pivot" conditions distribute equal elements more evenly between partitions.
Understanding when to use each partition scheme requires analyzing their trade-offs across multiple dimensions.
The Duplicate Element Problem:
One of the most significant practical differences emerges with arrays containing many duplicate elements:
Lomuto with duplicates: Consider partitioning [3, 3, 3, 3, 3] with pivot 3.
arr[j] < pivot is never trueHoare with duplicates: Consider the same array [3, 3, 3, 3, 3].
arr[i] >= pivot at i=0, first arr[j] <= pivot at j=4Use Lomuto when: teaching, implementing three-way partitioning, or when simplicity matters more than raw speed. Use Hoare when: maximum performance is needed, dealing with potentially repetitive data, or implementing for production systems. Modern standard libraries typically use Hoare-style schemes or sophisticated hybrids.
The choice of pivot element is the single most important factor in quick sort's performance. A poor pivot can degrade O(n log n) to O(n²), while a good pivot ensures balanced partitions. Let's examine the major strategies.
Strategy 1: First/Last Element
The simplest approach—select the first or last element as pivot.
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273
# Strategy 1: Fixed position (first or last)def partition_last(arr, low, high): pivot = arr[high] # Last element # ... Lomuto partition # PROBLEM: Worst case on sorted/reverse-sorted input!# If arr = [1,2,3,4,5], pivot = 5 (largest)# All elements go left, partition is maximally unbalanced# Recurrence: T(n) = T(n-1) + O(n) = O(n²) # Strategy 2: Random selectionimport random def partition_random(arr, low, high): # Select random element and swap to end rand_idx = random.randint(low, high) arr[rand_idx], arr[high] = arr[high], arr[rand_idx] pivot = arr[high] # ... Lomuto partition # ADVANTAGE: Expected O(n log n) for ANY input# No adversary can construct worst-case input# Probability of worst case: 1/n! for sorted input # Strategy 3: Median-of-Threedef median_of_three(arr, low, high): """Select median of first, middle, and last elements.""" mid = (low + high) // 2 # Sort the three candidates if arr[low] > arr[mid]: arr[low], arr[mid] = arr[mid], arr[low] if arr[mid] > arr[high]: arr[mid], arr[high] = arr[high], arr[mid] if arr[low] > arr[mid]: arr[low], arr[mid] = arr[mid], arr[low] # Now: arr[low] <= arr[mid] <= arr[high] # Median is at mid, swap to high-1 for partition arr[mid], arr[high - 1] = arr[high - 1], arr[mid] return arr[high - 1] # ADVANTAGE: Handles sorted input well (median is middle element)# Provides reasonable pivot for most distributions# O(1) overhead, no random number generation # Strategy 4: Ninther (Median of Medians of Three)def ninther(arr, low, high): """ Select ninther for very large arrays. Take median-of-three from three groups: - First third - Middle third - Last third Then take median of those three medians. """ n = high - low + 1 if n < 9: return median_of_three(arr, low, high) third = n // 3 m1 = median_of_three_value(arr, low, low + third - 1) m2 = median_of_three_value(arr, low + third, high - third) m3 = median_of_three_value(arr, high - third + 1, high) return median_value(m1, m2, m3) # ADVANTAGE: Better approximation of true median# Reduces probability of worst case further# Used in production implementations (pdqsort, introsort)| Strategy | Best Case | Worst Case | Average | Overhead |
|---|---|---|---|---|
| First/Last | O(n log n) | O(n²) on sorted | O(n log n) | O(1) |
| Random | O(n log n) | O(n²) very rare | O(n log n) | O(1) + RNG |
| Median-of-3 | O(n log n) | O(n²) on killers | O(n log n) | O(1) + 3 swaps |
| Ninther | O(n log n) | O(n²) extremely rare | O(n log n) | O(1) + 9 comparisons |
| True Median | O(n log n) | O(n log n) | O(n log n) | O(n)—too expensive |
Even median-of-three selection is vulnerable to specially crafted 'killer' sequences that create worst-case behavior. These sequences are constructed by analyzing the median-of-three selection algorithm and placing elements to consistently produce unbalanced partitions. This is why production implementations combine multiple strategies—randomization defeats killers, median-of-three handles common cases efficiently.
When arrays contain many duplicate elements, standard two-way partitioning can perform poorly. Three-way partitioning (also known as the Dutch National Flag algorithm, after Dijkstra's formulation) solves this by creating three regions: less than pivot, equal to pivot, and greater than pivot.
Why Three-Way Matters:
Consider sorting [2, 1, 3, 2, 2, 3, 1, 2, 2] where 2 appears 5 times.
With two-way partitioning:
With three-way partitioning:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566
def three_way_partition(arr, low, high): """ Three-Way Partitioning (Dutch National Flag) Creates three regions: 1. arr[low...lt-1]: Elements < pivot 2. arr[lt...gt]: Elements == pivot (final positions!) 3. arr[gt+1...high]: Elements > pivot Returns: (lt, gt) - boundaries of the equal region Visual representation: +------------------+------------------+------------------+ | < pivot | == pivot | > pivot | +------------------+------------------+------------------+ low lt-1 lt gt gt+1 high After partition: - All elements equal to pivot are in FINAL positions - Recursive calls only on < pivot and > pivot regions """ pivot = arr[low] # Use first element as pivot lt = low # Everything < lt is strictly less than pivot gt = high # Everything > gt is strictly greater than pivot i = low # Current element being examined while i <= gt: if arr[i] < pivot: # Element belongs in "less than" region arr[lt], arr[i] = arr[i], arr[lt] lt += 1 i += 1 elif arr[i] > pivot: # Element belongs in "greater than" region arr[gt], arr[i] = arr[i], arr[gt] gt -= 1 # Note: don't increment i (need to examine swapped element) else: # Element equals pivot, already in correct region i += 1 return lt, gt def quick_sort_three_way(arr, low, high): """Quick sort using three-way partitioning.""" if low >= high: return lt, gt = three_way_partition(arr, low, high) # Elements in arr[lt...gt] are already sorted (all equal) # Only recurse on the strictly less and strictly greater regions quick_sort_three_way(arr, low, lt - 1) # Elements < pivot quick_sort_three_way(arr, gt + 1, high) # Elements > pivot # Example trace for [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]# Let's say pivot = 5## After three-way partition:# [3, 1, 4, 1, 2, 3] [5, 5, 5] [9, 6]# < 5 = 5 > 5## The three 5s will NEVER be touched again# Recursive calls only on the non-equal regionsThree-way partitioning shines when data has many duplicates—arrays with limited distinct values, data with heavy repetition patterns, or cases where elements are complex objects compared by a single key. Modern implementations detect duplicate patterns and switch to three-way partitioning adaptively.
The correctness of quick sort rests entirely on partition invariants. Understanding these invariants deeply enables you to implement partition correctly, debug failures, and design new variations.
Loop Invariant Proof Structure:
For any partition algorithm, we must prove:
Lomuto Invariant Proof:
LOMUTO PARTITION LOOP INVARIANT:At the start of each iteration of the for loop: 1. arr[low...i] contains elements strictly LESS than pivot2. arr[i+1...j-1] contains elements GREATER OR EQUAL to pivot3. arr[high] is the pivot (unchanged) INITIALIZATION (before first iteration, j = low):- i = low - 1, so arr[low...i] = arr[low...low-1] = empty ✓- arr[i+1...j-1] = arr[low...low-1] = empty ✓- arr[high] = pivot ✓ MAINTENANCE (assuming invariant holds at start of iteration):Case 1: arr[j] >= pivot - j increments, arr[i+1...j] now includes arr[j] - arr[j] >= pivot, so property 2 maintained ✓ - i unchanged, property 1 unchanged ✓ Case 2: arr[j] < pivot - i increments - swap arr[i] with arr[j] - Before swap: arr[j] < pivot (condition we checked) - Before swap: arr[i] >= pivot (from invariant 2, since i+1 was in >= region) - After swap: arr[i] < pivot (satisfies property 1) ✓ - After swap: arr[j] >= pivot (satisfies property 2) ✓ - j increments, ready for next iteration TERMINATION (j = high):- arr[low...i] < pivot (from invariant 1)- arr[i+1...high-1] >= pivot (from invariant 2) - arr[high] = pivot Final swap: arr[i+1] ↔ arr[high]- arr[i+1] >= pivot (from invariant 2)- After swap: pivot at arr[i+1], the >= element at arr[high] POSTCONDITION:- arr[low...i] < arr[i+1] (pivot) ≤ arr[i+2...high]- Partition complete with pivot at its final sorted position ✓Debugging Using Invariants:
When partition fails, check the invariant at each step:
Common bugs and their symptoms:
| Bug | Symptom | Invariant Violated |
|---|---|---|
Wrong comparison (<= vs <) | Infinite loop or incorrect partition | Elements in wrong region |
| Off-by-one in loop bound | Missing element or array overflow | Termination condition |
| Wrong index in swap | Elements duplicated or lost | Both regions corrupted |
| Pivot position incorrect | Pivot not in final position | Postcondition |
By tracing invariants, you can identify exactly where the algorithm deviates from correctness.
Invariants aren't just proof techniques—they're design tools. When implementing partition, start by writing down the invariant you want to maintain. Then write code that maintains it. The invariant guides implementation and makes debugging systematic rather than ad-hoc.
We've thoroughly examined the partition operation—quick sort's divide phase. Let's consolidate the key insights:
What's Next:
With partition understood as the divide phase, we move to the conquer phase: sorting partitions recursively. We'll examine the recursive structure of quick sort, understand its call stack behavior, and explore optimizations like tail recursion elimination.
You now understand partition at a deep level—multiple schemes, their invariants, pivot selection strategies, and three-way partitioning for duplicates. You can implement partition from first principles, choose the appropriate scheme for your context, and debug partition failures systematically. Next, we'll explore the conquer phase of quick sort.