Unlike algorithms with fixed complexity regardless of input (like Selection Sort's unwavering O(n²)), Insertion Sort's performance varies dramatically based on the input's initial order. It can be as fast as O(n) or as slow as O(n²), with everything in between possible.
This variability isn't a weakness—it's Insertion Sort's adaptive superpower. Understanding when each case occurs, and precisely how to derive these bounds, is essential for knowing when to deploy Insertion Sort and when to choose alternatives.
This page derives Insertion Sort's time complexity rigorously: best case O(n), worst case O(n²), and average case O(n²). You'll understand exactly which inputs trigger which behavior, the mathematical proofs behind each bound, and the practical implications of this complexity profile.
Before analyzing complexity, we must define what operations we're counting. For sorting algorithms, we typically focus on two fundamental operations:
- Comparisons: the A[j] > key check that determines whether to shift. This is the dominant operation in comparison-based sorting.
- Shifts: the A[j+1] = A[j] operation that moves elements right. Also includes the final A[j+1] = key placement.

The outer loop:
for i in range(1, n): # Runs n-1 times
The outer loop executes exactly n-1 times regardless of input—we must process each element from index 1 to n-1. This contributes O(n) to the complexity.
The inner loop:
while j >= 0 and A[j] > key: # Runs variable number of times
A[j+1] = A[j]
j -= 1
The inner loop's iteration count varies based on key's relative value and the composition of the sorted region. The total complexity therefore hinges on the sum of inner-loop iterations across all outer-loop iterations.
The outer loop contributes a constant O(n) factor. The algorithm's characteristic behavior comes from the inner loop. If the inner loop averages O(1) iterations per outer iteration, total is O(n). If it averages O(n) iterations, total is O(n²).
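To keep the two counted operations visible in one place, here is a minimal reference implementation (an illustrative sketch consistent with the fragments above) with the comparison and shift lines annotated:

```python
def insertion_sort(A):
    """Sort A in place; comments mark the two operations we count."""
    n = len(A)
    for i in range(1, n):              # outer loop: always n-1 iterations
        key = A[i]
        j = i - 1
        while j >= 0 and A[j] > key:   # COMPARISON: A[j] > key
            A[j + 1] = A[j]            # SHIFT: move A[j] one slot right
            j -= 1
        A[j + 1] = key                 # final placement of key
    return A
```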
Best Case Input: An already-sorted array, e.g., [1, 2, 3, 4, 5]. Best Case Complexity: O(n) comparisons, 0 shifts.
Why sorted arrays are fast:
Consider inserting element at index i into the sorted subarray A[0..i-1]:
1. Set key = A[i].
2. Compare key with A[i-1] (the largest element in the sorted region).
3. Since A[i-1] ≤ key, the inner loop condition A[j] > key is immediately false.
4. Place key at A[i] (where it already was).

For a sorted array, A[i] ≥ A[i-1] for all i, so every insertion terminates after a single comparison.
Formal derivation:
Let C(n) be the number of comparisons for a sorted array of size n. Each of the n-1 insertions performs exactly one comparison (which fails immediately) and zero shifts.
Total comparisons: $$C(n) = \sum_{i=1}^{n-1} 1 = n - 1 = O(n)$$
Total shifts: $$S(n) = 0$$
Total time: O(n) comparisons + O(n) loop overhead = O(n)
Trace for the sorted array [1, 2, 3, 4, 5]:

| i | key | Compare A[i-1] > key? | Inner Loop Runs? | Shifts |
|---|---|---|---|---|
| 1 | 2 | 1 > 2? No | No | 0 |
| 2 | 3 | 2 > 3? No | No | 0 |
| 3 | 4 | 3 > 4? No | No | 0 |
| 4 | 5 | 4 > 5? No | No | 0 |
Total: 4 comparisons, 0 shifts for n=5. This is O(n) behavior.
Key insight: The best case occurs when no element needs to be shifted—each element is already at least as large as all elements before it. This is precisely the definition of a sorted array.
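As a quick sanity check of the O(n) bound, the sketch below defines a hypothetical comparisons_and_shifts helper (an instrumented variant of the code above, not part of the original pseudocode) and runs it on a sorted array, which reports exactly n-1 comparisons and 0 shifts:

```python
def comparisons_and_shifts(A):
    """Instrumented Insertion Sort: sorts A in place, returns (comparisons, shifts)."""
    comps = shifts = 0
    for i in range(1, len(A)):
        key, j = A[i], i - 1
        while j >= 0:
            comps += 1                 # one comparison of A[j] against key
            if A[j] <= key:
                break                  # insertion point found
            A[j + 1] = A[j]            # shift A[j] one position right
            shifts += 1
            j -= 1
        A[j + 1] = key
    return comps, shifts

print(comparisons_and_shifts([1, 2, 3, 4, 5]))   # (4, 0): n-1 comparisons, no shifts
```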
Worst Case Input: A reverse-sorted array, e.g., [5, 4, 3, 2, 1]. Worst Case Complexity: O(n²) comparisons, O(n²) shifts.
Why reverse-sorted arrays are slow:
Consider inserting element at index i into the sorted subarray A[0..i-1]:
- A[i] is smaller than all elements in A[0..i-1].
- key ends up at position 0 every time.

For the element at index i, we perform i comparisons and i shifts.
Formal derivation:
Let C(n) be comparisons, S(n) be shifts for a reverse-sorted array of size n.
For iteration i (inserting the element at index i), key must travel past all i elements of the sorted region, costing i comparisons and i shifts.
Total comparisons: $$C(n) = \sum_{i=1}^{n-1} i = 1 + 2 + 3 + \cdots + (n-1) = \frac{(n-1)n}{2} = \frac{n^2 - n}{2} = O(n^2)$$
Total shifts: $$S(n) = \sum_{i=1}^{n-1} i = \frac{(n-1)n}{2} = O(n^2)$$
Total time: O(n²) comparisons + O(n²) shifts = O(n²)
Trace for the reverse-sorted array [5, 4, 3, 2, 1]:

| i | key | Comparisons | Shifts | Result |
|---|---|---|---|---|
| 1 | 4 | 1 (with 5) | 1 | [4, 5, 3, 2, 1] |
| 2 | 3 | 2 (with 5, 4) | 2 | [3, 4, 5, 2, 1] |
| 3 | 2 | 3 (with 5, 4, 3) | 3 | [2, 3, 4, 5, 1] |
| 4 | 1 | 4 (with 5, 4, 3, 2) | 4 | [1, 2, 3, 4, 5] |
Total: 1+2+3+4 = 10 comparisons and 10 shifts for n=5.
Using the formula: (5-1)×5/2 = 10. ✓
The pattern: Each new element must travel all the way to position 0, pushing every previously sorted element one position right.
For n=1,000, worst case requires ~500,000 operations. For n=1,000,000, it's ~500,000,000,000 operations. This is why Insertion Sort is impractical for large, arbitrarily-ordered datasets.
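Reusing the comparisons_and_shifts sketch from the best-case check above (assumed to be in scope), a reverse-sorted array of five elements reproduces the 10 comparisons and 10 shifts from the trace, and the closed form n(n-1)/2 gives the large-n figures quoted above:

```python
# Assumes comparisons_and_shifts() from the earlier best-case sketch is in scope.
print(comparisons_and_shifts([5, 4, 3, 2, 1]))   # (10, 10) for n = 5

for n in (1_000, 1_000_000):
    print(n, n * (n - 1) // 2)   # 499500 and 499999500000 (~5 * 10**11)
```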
Average Case Input: A uniformly random permutation of elements. Average Case Complexity: O(n²) comparisons and shifts, specifically ~n²/4.
The probabilistic setup:
For average case analysis, we assume:

- All n! input permutations are equally likely (a uniformly random permutation).
- All elements are distinct.
The key insight: On average, each element A[i] is larger than half the elements in A[0..i-1] and smaller than the other half. So it needs to move approximately to the middle—shifting about i/2 elements.
Formal derivation:
When inserting A[i] into A[0..i-1], the key can end up at any of the i+1 positions (0, 1, ..., i) with equal probability 1/(i+1).
Expected comparisons for inserting the element at index i: if key lands at final position p > 0, we perform i - p + 1 comparisons; if it lands at position 0, we perform only i comparisons, because the loop stops when j goes negative rather than on a failed comparison.
More precisely, the expected number of comparisons for element i is: $$E[C_i] = \frac{1}{i+1}\left(1 + 2 + 3 + \cdots + i + i\right) = \frac{1}{i+1}\left(\frac{i(i+1)}{2} + i\right) = \frac{i}{2} + 1 - \frac{1}{i+1}$$
Approximately: E[Cᵢ] ≈ i/2
Total expected comparisons: $$E[C(n)] = \sum_{i=1}^{n-1} \frac{i}{2} \approx \frac{1}{2} \cdot \frac{(n-1)n}{2} = \frac{n^2 - n}{4} \approx \frac{n^2}{4}$$
Expected shifts are similar: approximately n²/4.
Total average time: O(n²), specifically ~n²/4 operations.
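To check the ~n²/4 estimate empirically, the sketch below (again assuming the hypothetical comparisons_and_shifts helper from earlier is in scope) averages the shift count over many random permutations and compares it to n(n-1)/4, the exact expected number of inversions:

```python
import random

# Assumes comparisons_and_shifts() from the earlier best-case sketch is in scope.
n, trials = 200, 500
mean_shifts = sum(
    comparisons_and_shifts(random.sample(range(n), n))[1] for _ in range(trials)
) / trials
print(f"empirical mean shifts: {mean_shifts:.0f}")
print(f"exact expectation n(n-1)/4: {n * (n - 1) / 4:.0f}")   # 9950 for n = 200
print(f"approximation n^2/4: {n * n / 4:.0f}")                # 10000
```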
Comparison counts for each case:

| Case | n=10 | n=100 | n=1000 |
|---|---|---|---|
| Best (sorted) | 9 | 99 | 999 |
| Average (random) | ~22 | ~2,500 | ~250,000 |
| Worst (reverse) | 45 | 4,950 | 499,500 |
Practical observation: The average case is about half the worst case, but both are O(n²). For random inputs, Insertion Sort's average case is not meaningfully better than its worst case in Big-O terms.
A more elegant way to understand Insertion Sort's complexity uses inversions—the fundamental measure of array disorder we introduced earlier.
Theorem: The number of shifts in Insertion Sort exactly equals the number of inversions in the input array. Therefore, Time(Insertion Sort) = Θ(n + inversions).
Proof:
Each shift moves element A[j] from position j to position j+1, making room for key. This happens precisely when A[j] > key, where A[j] originally preceded key in the input.
Before the shift: A[j] appears before key in the array, and A[j] > key. This is an inversion.
After the shift and final placement: key is placed before A[j] in the array, and key < A[j]. The inversion is resolved.
Each shift removes exactly one inversion, and each inversion is removed by exactly one shift.
Therefore: Number of shifts = Number of inversions = I(A)
Complexity: $$T(n) = n - 1 \text{ (outer loop iterations)} + I(A) \text{ (shifts)} = \Theta(n + I(A))$$
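The theorem is easy to verify directly: count inversions by brute force and compare with the shift count reported by the instrumented sort (the comparisons_and_shifts helper sketched earlier is assumed to be in scope):

```python
import random

def count_inversions(A):
    """Brute-force O(n^2) count of pairs (i, j) with i < j and A[i] > A[j]."""
    return sum(A[i] > A[j] for i in range(len(A)) for j in range(i + 1, len(A)))

for _ in range(5):
    arr = random.sample(range(50), 50)
    inversions = count_inversions(arr)
    _, shifts = comparisons_and_shifts(arr.copy())   # copy: the sort mutates its input
    print(inversions, shifts, inversions == shifts)  # last column is always True
```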
| Input Type | Number of Inversions | Time Complexity |
|---|---|---|
| Sorted array | 0 | Θ(n + 0) = Θ(n) |
| Reverse sorted | n(n-1)/2 | Θ(n + n²/2) = Θ(n²) |
| Random permutation | ~n²/4 expected | Θ(n + n²/4) = Θ(n²) |
| Array with k unsorted elements | O(kn) | Θ(n + kn) = O(kn) |
| Each element ≤ c positions from sorted | O(cn) | Θ(n + cn) = O(n) if c is constant |
The power of inversion analysis:
This analysis precisely captures Insertion Sort's adaptivity:
It generalizes all cases: Best, worst, and average are just specific inversion counts.
It predicts performance for ANY input: Given an array, count inversions to estimate Insertion Sort's runtime.
It identifies 'nearly sorted' precisely: An array is 'nearly sorted' if I(A) = O(n), making Insertion Sort linear.
It enables hybrid algorithm design: Use Insertion Sort when inversions are expected to be low; switch to O(n log n) algorithms otherwise.
If you expect each element to be at most k positions away from its sorted position (and k is a small constant), Insertion Sort runs in O(n). This is common after incremental updates to an already-sorted dataset.
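The sketch below builds such an input by shuffling a sorted array within consecutive blocks of size k+1 (an illustrative construction that guarantees every element is at most k positions from its sorted slot) and shows the operation counts growing roughly linearly in n, far below n²/4:

```python
import random

# Assumes comparisons_and_shifts() from the earlier best-case sketch is in scope.
def nearly_sorted(n, k):
    """Sorted array of size n, shuffled within blocks of size k+1, so no
    element ends up more than k positions from its sorted slot."""
    A = list(range(n))
    for start in range(0, n, k + 1):
        block = A[start:start + k + 1]
        random.shuffle(block)
        A[start:start + k + 1] = block
    return A

k = 3
for n in (1_000, 10_000, 100_000):
    comps, shifts = comparisons_and_shifts(nearly_sorted(n, k))
    print(n, comps, shifts)   # both counts grow ~linearly in n for fixed k
```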
While time complexity varies, Insertion Sort's space complexity is constant: O(1) auxiliary space, making it an in-place sorting algorithm.
Why O(1) matters:
Embedded systems: Limited memory environments can use Insertion Sort without allocating extra buffers.
Large elements: When sorting large objects, O(1) extra space means we don't double our memory footprint (unlike Merge Sort's O(n) auxiliary array).
Cache efficiency: Operating in-place improves data locality, keeping frequently accessed elements in CPU cache.
No memory allocation overhead: No dynamic allocation means no allocation failures, no fragmentation, and predictable performance.
Contrast with other algorithms: Merge Sort allocates an O(n) auxiliary array, and a typical recursive Quicksort uses O(log n) stack space, while Heapsort, like Insertion Sort, needs only O(1) extra memory.
An in-place algorithm uses O(1) auxiliary space (or O(log n) for implicit recursion stack). Insertion Sort is strictly in-place—it needs only constant extra memory regardless of input size.
Both Insertion Sort and Selection Sort are O(n²) in the worst case. However, their complexity profiles differ in important ways.
| Metric | Insertion Sort | Selection Sort |
|---|---|---|
| Best Case Time | O(n) — sorted input | O(n²) — always |
| Worst Case Time | O(n²) — reverse sorted | O(n²) — always |
| Average Case Time | O(n²) — random input | O(n²) — always |
| Comparisons (best) | n-1 | n(n-1)/2 |
| Comparisons (worst) | n(n-1)/2 | n(n-1)/2 |
| Swaps/Shifts (best) | 0 | n-1 (or 0 with optimization) |
| Swaps/Shifts (worst) | n(n-1)/2 | n-1 |
| Adaptive? | Yes — adapts to input order | No — fixed work regardless |
Key differences explained:
Selection Sort's consistency: Selection Sort always does exactly n(n-1)/2 comparisons because it always scans the entire remaining portion to find the minimum. Input order doesn't matter—even a sorted array requires those comparisons.
Insertion Sort's adaptivity: Insertion Sort can terminate early on each insertion. If the element is already in place (or close), few comparisons are needed. This adaptivity is Insertion Sort's crucial advantage.
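To see the contrast concretely, the sketch below instruments a plain Selection Sort (illustrative code, not taken from the table above) and shows that it performs n(n-1)/2 comparisons on sorted, reverse-sorted, and random inputs alike:

```python
import random

def selection_sort_comparisons(A):
    """Selection Sort that sorts A in place and returns its comparison count."""
    comps = 0
    n = len(A)
    for i in range(n - 1):
        min_idx = i
        for j in range(i + 1, n):
            comps += 1                          # every remaining element is examined
            if A[j] < A[min_idx]:
                min_idx = j
        A[i], A[min_idx] = A[min_idx], A[i]
    return comps

n = 100
for arr in (list(range(n)), list(range(n, 0, -1)), random.sample(range(n), n)):
    print(selection_sort_comparisons(arr))      # always 4950 == n(n-1)/2
```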
When each wins:
Selection Sort wins when swaps are very expensive and the array is random or reverse-sorted. Selection Sort does only O(n) swaps regardless of input.
Insertion Sort wins for nearly-sorted data, small arrays, online processing, or when stability is required. Its O(n) best case and ~n²/4 average-case operation count (vs ~n²/2 comparisons for Selection Sort) make it faster in practice for common cases.
For general-purpose sorting of small arrays, Insertion Sort is preferred over Selection Sort. Its adaptivity means it handles common cases efficiently, and its stability is often required. Selection Sort's only advantage—minimal swaps—rarely matters in modern systems.
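One common way to exploit that preference is to fall back to Insertion Sort for small subarrays inside an O(n log n) algorithm. The sketch below shows the idea with a merge sort and an illustrative cutoff of 16; the cutoff value and function names are assumptions, not any specific library's implementation:

```python
def insertion_sort_range(A, lo, hi):
    """Sort A[lo:hi] in place with Insertion Sort."""
    for i in range(lo + 1, hi):
        key, j = A[i], i - 1
        while j >= lo and A[j] > key:
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = key

def hybrid_merge_sort(A, lo=0, hi=None, cutoff=16):
    """Merge sort that hands small subarrays to Insertion Sort."""
    if hi is None:
        hi = len(A)
    if hi - lo <= cutoff:
        insertion_sort_range(A, lo, hi)   # small subarray: quadratic cost is negligible
        return
    mid = (lo + hi) // 2
    hybrid_merge_sort(A, lo, mid, cutoff)
    hybrid_merge_sort(A, mid, hi, cutoff)
    merged, i, j = [], lo, mid            # stable merge of A[lo:mid] and A[mid:hi]
    while i < mid and j < hi:
        if A[i] <= A[j]:
            merged.append(A[i])
            i += 1
        else:
            merged.append(A[j])
            j += 1
    merged.extend(A[i:mid])
    merged.extend(A[j:hi])
    A[lo:hi] = merged

data = [5, 3, 8, 1, 9, 2, 7, 4, 6, 0] * 10
hybrid_merge_sort(data)
print(data == sorted([5, 3, 8, 1, 9, 2, 7, 4, 6, 0] * 10))   # True
```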
Theory is validated by experiment. Let's verify our complexity analysis with actual measurements.
```python
import random
import time

def insertion_sort_counted(arr):
    """Insertion Sort that counts comparisons and shifts."""
    comparisons = 0
    shifts = 0
    n = len(arr)

    for i in range(1, n):
        key = arr[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if arr[j] > key:
                arr[j + 1] = arr[j]
                shifts += 1
                j -= 1
            else:
                break
        arr[j + 1] = key

    return comparisons, shifts

# Test different input types
def benchmark(size):
    print(f"\n--- Array Size: {size} ---")

    # Best case: sorted
    sorted_arr = list(range(size))
    c, s = insertion_sort_counted(sorted_arr.copy())
    print(f"Sorted: {c:>10} comparisons, {s:>10} shifts")
    print(f" Expected: {size-1:>10} comparisons, {0:>10} shifts")

    # Worst case: reverse sorted
    reverse_arr = list(range(size, 0, -1))
    c, s = insertion_sort_counted(reverse_arr.copy())
    expected = (size - 1) * size // 2
    print(f"Reversed: {c:>10} comparisons, {s:>10} shifts")
    print(f" Expected: {expected:>10} comparisons, {expected:>10} shifts")

    # Average case: random
    random_arr = list(range(size))
    random.shuffle(random_arr)
    c, s = insertion_sort_counted(random_arr.copy())
    expected_avg = expected // 2
    print(f"Random: {c:>10} comparisons, {s:>10} shifts")
    print(f" Expected: ~{expected_avg:>8} comparisons, ~{expected_avg:>8} shifts")

for size in [10, 100, 1000]:
    benchmark(size)
```

Expected output pattern:
--- Array Size: 1000 ---
Sorted: 999 comparisons, 0 shifts
Expected: 999 comparisons, 0 shifts
Reversed: 499500 comparisons, 499500 shifts
Expected: 499500 comparisons, 499500 shifts
Random: ~250000 comparisons, ~250000 shifts
Expected: ~249750 comparisons, ~249750 shifts
The empirical results match our theoretical predictions, as summarized below.
| Metric | Best Case | Average Case | Worst Case |
|---|---|---|---|
| Time Complexity | O(n) | O(n²) | O(n²) |
| Comparisons | n - 1 | ~n²/4 | n(n-1)/2 |
| Shifts | 0 | ~n²/4 | n(n-1)/2 |
| Input Type | Sorted array | Random permutation | Reverse-sorted array |
| Inversions | 0 | ~n²/4 | n(n-1)/2 |
| Property | Value | Notes |
|---|---|---|
| Space Complexity | O(1) | Only key variable and loop indices |
| In-Place | Yes | Modifies array directly |
| Stable | Yes | Equal elements maintain relative order |
| Online | Yes | Elements can be sorted as they arrive |
| Adaptive | Yes | Runtime adapts to input disorder |
Among common quadratic sorting algorithms, Insertion Sort stands out for its O(n) best case. This makes it uniquely suited for nearly-sorted data and hybrid algorithms that use it as a finishing step.
We've rigorously analyzed Insertion Sort's time and space complexity. Here are the essential takeaways:

- Best case O(n): already-sorted input costs n-1 comparisons and 0 shifts.
- Worst case O(n²): reverse-sorted input costs n(n-1)/2 comparisons and n(n-1)/2 shifts.
- Average case O(n²): a random permutation costs roughly n²/4 comparisons and shifts.
- In general, runtime is Θ(n + inversions), which is what makes the algorithm adaptive.
- Space is O(1): Insertion Sort is in-place, stable, and online.
What's next:
We've analyzed complexity mathematically. The final page explores the practical implications: when does Insertion Sort's complexity profile make it the right choice? How do nearly-sorted arrays arise in practice? And how do production-grade sorting libraries leverage Insertion Sort's unique strengths?
You now understand Insertion Sort's complexity from multiple angles: operation counting, case analysis, and inversion theory. The O(n) best case is Insertion Sort's defining characteristic—the key to understanding when it outperforms asymptotically faster algorithms.