How similar are two rankings? If Alice ranks movies as [A, B, C, D] and Bob ranks them [D, C, B, A], they have completely opposite preferences. But if Carol ranks them [A, B, D, C], she's mostly aligned with Alice with just one swap needed. How do we quantify this difference mathematically?
The answer lies in counting inversions—a D&C problem that reveals the degree of disorder in a sequence. An inversion is a pair of elements that are "out of order" relative to their desired ordering. This seemingly simple concept has profound applications: from measuring similarity between rankings to analyzing how "far" a sequence is from being sorted.
The naive approach takes O(n²) time. But with a beautiful insight connecting inversions to merge sort, we can count all inversions in O(n log n) time.
By the end of this page, you will understand what inversions are, why they matter, and how to count them efficiently using a modified merge sort. You'll see how the merge operation naturally counts cross-inversions, achieving O(n log n) complexity. This technique is a cornerstone of D&C problem-solving.
Formal Definition:
Given an array A of n distinct elements, an inversion is a pair of indices (i, j) such that i < j and A[i] > A[j].
In other words, an inversion is a pair of elements that are "out of order" relative to ascending order.
Example:
Consider the array [2, 4, 1, 3, 5]:
| Pair (i, j) | Elements | i < j? | A[i] > A[j]? | Inversion? |
|---|---|---|---|---|
| (0, 1) | 2, 4 | ✓ | ✗ | No |
| (0, 2) | 2, 1 | ✓ | ✓ | Yes |
| (1, 2) | 4, 1 | ✓ | ✓ | Yes |
| (1, 3) | 4, 3 | ✓ | ✓ | Yes |
| (2, 3) | 1, 3 | ✓ | ✗ | No |
| ... | ... | ... | ... | ... |
Total inversions: (0,2), (1,2), (1,3) = 3 inversions
The number of inversions tells us how "far" an array is from being sorted. A sorted array has 0 inversions. A reverse-sorted array has the maximum: n(n-1)/2 inversions (every pair is inverted). The inversion count is essentially a measure of disorder—the work needed to sort the array using adjacent swaps.
Bounds on Inversions:
- Minimum: 0 (a sorted array)
- Maximum: n(n-1)/2 (a reverse-sorted array, where every pair is inverted)
For n = 5, the count ranges from 0 to 5×4/2 = 10.
Connection to Bubble Sort:
Fascinatingly, the number of inversions equals the number of swaps that bubble sort performs. Each adjacent swap in bubble sort removes exactly one inversion. This isn't coincidental—it's a deep relationship between disorder measures and sorting algorithms.
| Array | Inversions | Description |
|---|---|---|
| [1, 2, 3, 4, 5] | 0 | Sorted (no inversions) |
| [2, 1, 3, 4, 5] | 1 | One swap from sorted |
| [2, 4, 1, 3, 5] | 3 | Moderately disordered |
| [5, 4, 3, 2, 1] | 10 | Reverse sorted (maximum) |
| [3, 1, 2, 5, 4] | 3 | Moderately disordered |
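The bubble-sort connection is easy to check empirically. This sketch (function names are illustrative) counts bubble sort's adjacent swaps and compares them against a brute-force inversion count:

```python
def bubble_sort_swaps(arr):
    """Sort a copy with bubble sort; return the number of adjacent swaps."""
    a = list(arr)
    swaps = 0
    for end in range(len(a) - 1, 0, -1):
        for i in range(end):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swaps += 1
    return swaps

def inversion_count(arr):
    """Brute-force inversion count, for comparison."""
    n = len(arr)
    return sum(1 for i in range(n) for j in range(i + 1, n) if arr[i] > arr[j])

# Each adjacent swap removes exactly one inversion, so the counts agree
for arr in [[1, 2, 3, 4, 5], [2, 4, 1, 3, 5], [5, 4, 3, 2, 1]]:
    assert bubble_sort_swaps(arr) == inversion_count(arr)
```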
Counting inversions isn't just a theoretical exercise—it has practical applications across multiple domains. Understanding these applications motivates the algorithm.
Collaborative Filtering and Recommendation Systems:
When recommending movies, music, or products, systems often compare users' preference rankings. If User A and User B have similar rankings (few inversions between their orderings), they likely have similar tastes.
The Kendall Tau distance between two rankings is defined as the number of pairwise inversions between them. It's a standard metric in statistics and machine learning for measuring rank correlation.
The Kendall tau rank correlation coefficient is: τ = 1 - 4×(inversions) / n(n-1). It ranges from -1 (completely reversed) to +1 (identical rankings). When inversions = 0, τ = 1 (perfect agreement). When inversions = n(n-1)/2, τ = -1 (perfect disagreement).
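As a sketch of how τ is computed in practice, the hypothetical helper below relabels one ranking by the other's positions and counts inversions by brute force (fine for small n; the O(n log n) counter developed later in this page would replace the inner loop):

```python
def kendall_tau(rank_a, rank_b):
    """Kendall tau rank correlation between two rankings of the same items.

    Relabeling each item by its position in rank_a turns rank_b into a
    sequence whose inversions are exactly the discordant pairs.
    """
    pos = {item: i for i, item in enumerate(rank_a)}
    seq = [pos[item] for item in rank_b]
    n = len(seq)
    inv = sum(1 for i in range(n) for j in range(i + 1, n) if seq[i] > seq[j])
    return 1 - 4 * inv / (n * (n - 1))

print(kendall_tau(list("ABCD"), list("ABCD")))  # 1.0  (identical)
print(kendall_tau(list("ABCD"), list("DCBA")))  # -1.0 (fully reversed)
print(kendall_tau(list("ABCD"), list("ABDC")))  # one swap: 1 - 4/12 ≈ 0.667
```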
The straightforward approach checks all possible pairs (i, j) where i < j and counts those where A[i] > A[j]. With C(n,2) = n(n-1)/2 pairs, this takes O(n²) time.
```python
from typing import List, Tuple


def count_inversions_brute_force(arr: List[int]) -> Tuple[int, List[Tuple[int, int]]]:
    """
    Count inversions using brute force.

    An inversion is a pair (i, j) where i < j but arr[i] > arr[j].

    Time Complexity: O(n²)
    Space Complexity: O(k) where k is the number of inversions (to store pairs)

    Returns:
        Tuple of (count, list of inversion pairs)
    """
    n = len(arr)
    inversions = []
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            if arr[i] > arr[j]:
                inversions.append((i, j))
                count += 1
    return count, inversions


def count_inversions_brute_force_optimized(arr: List[int]) -> int:
    """
    Count inversions without storing pairs (space optimized).

    Time Complexity: O(n²)
    Space Complexity: O(1)
    """
    n = len(arr)
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            if arr[i] > arr[j]:
                count += 1
    return count


# Demonstration
if __name__ == "__main__":
    test_cases = [
        [1, 2, 3, 4, 5],  # Sorted: 0 inversions
        [2, 1, 3, 4, 5],  # One swap: 1 inversion
        [2, 4, 1, 3, 5],  # 3 inversions
        [5, 4, 3, 2, 1],  # Reverse: 10 inversions (max for n=5)
    ]
    for arr in test_cases:
        count, pairs = count_inversions_brute_force(arr)
        print(f"Array: {arr}")
        print(f"  Inversions: {count}")
        print(f"  Pairs: {pairs}")
        print()
```

Why O(n²) Is Unacceptable:
For recommendation systems comparing millions of user rankings or genomic analysis with thousands of genes, O(n²) becomes prohibitive:
| n | Pairs Checked | Time @ 10M pairs/sec |
|---|---|---|
| 1,000 | ~500,000 | 0.05 seconds |
| 10,000 | ~50 million | 5 seconds |
| 100,000 | ~5 billion | ~8 minutes |
| 1,000,000 | ~500 billion | ~14 hours |
When Netflix compares user preferences or researchers analyze genomic rearrangements, they need fast inversion counting. The O(n log n) D&C solution makes this practical for millions of elements.
The key insight is that inversions can be efficiently counted during merge sort. When we merge two sorted halves, we naturally encounter and count all inversions that span the boundary.
Categorizing Inversions:
For any inversion (i, j) with i < j and A[i] > A[j], exactly one of the following is true:
- Left inversion: both i and j fall in the left half
- Right inversion: both i and j fall in the right half
- Split inversion: i falls in the left half and j in the right half
This is an exhaustive partition—every inversion falls into exactly one category. This is the foundation of our D&C approach.
Left inversions are counted recursively on the left half. Right inversions are counted recursively on the right half. Split inversions—the cross-boundary ones—are counted during the merge operation. Merge sort's merge step naturally reveals all split inversions!
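The three-way partition can be checked by brute force on a small example. In this sketch (names are illustrative), `mid = len(arr) // 2` mirrors the split point the recursion will use:

```python
def categorize_inversions(arr):
    """Partition all inversions of arr into left, right, and split,
    relative to the midpoint mid = len(arr) // 2."""
    mid = len(arr) // 2
    left_inv, right_inv, split_inv = [], [], []
    n = len(arr)
    for i in range(n):
        for j in range(i + 1, n):
            if arr[i] > arr[j]:
                if j < mid:            # both indices in the left half
                    left_inv.append((i, j))
                elif i >= mid:         # both indices in the right half
                    right_inv.append((i, j))
                else:                  # i in left, j in right: split
                    split_inv.append((i, j))
    return left_inv, right_inv, split_inv

# [5, 3, 7 | 4, 2, 6]: 1 left, 1 right, 6 split inversions (8 total)
l, r, s = categorize_inversions([5, 3, 7, 4, 2, 6])
print(len(l), len(r), len(s))  # 1 1 6
```

Every pair lands in exactly one bucket, which is why the three counts can be computed independently and summed.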
How Merge Counts Split Inversions:
During merge, we combine two sorted subarrays. When we take an element from the right subarray before exhausting the left subarray, every remaining element in the left subarray forms a split inversion with this element.
Example: Merging [3, 5, 7] and [2, 4, 8]
Left: [3, 5, 7] (pointer i starts at 3)    Right: [2, 4, 8] (pointer j starts at 2)
Step 1: Compare 3, 2. Take 2 (from right).
2 < 3, so 2 has inversions with ALL remaining left elements: (3,2), (5,2), (7,2)
Count += 3 (elements remaining in left = 3)
Step 2: Compare 3, 4. Take 3 (from left).
Taking from left means no new inversions counted.
Step 3: Compare 5, 4. Take 4 (from right).
4 < 5, so inversions with remaining left: (5,4), (7,4)
Count += 2
Step 4: Compare 5, 8. Take 5 (from left). No inversions.
Step 5: Compare 7, 8. Take 7 (from left). No inversions.
Step 6: Left exhausted; take remaining 8 from right. No inversions.
Total split inversions: 3 + 2 = 5
Merged: [2, 3, 4, 5, 7, 8]
The algorithm sorts the array as a side effect. This is necessary—the merge step only correctly counts split inversions when both halves are sorted. Each element in the left half that remains when we pick from the right is guaranteed to be greater than the picked element. This is precisely the inversion condition!
Let's implement the complete algorithm, which is essentially merge sort with inversion counting integrated into the merge step.
```python
from typing import List, Tuple


def count_inversions_dc(arr: List[int]) -> int:
    """
    Count inversions using Divide and Conquer (merge sort based).

    Time Complexity: O(n log n)
    Space Complexity: O(n)

    Args:
        arr: List of comparable elements

    Returns:
        Number of inversions in the array
    """
    if len(arr) <= 1:
        return 0
    # Work on a copy to avoid modifying original
    temp = arr.copy()
    return _merge_count(temp, 0, len(temp) - 1)


def _merge_count(arr: List[int], left: int, right: int) -> int:
    """
    Recursively count inversions and sort the array.
    Returns the number of inversions in arr[left:right+1].
    """
    if left >= right:
        return 0

    mid = (left + right) // 2

    # Count inversions in left half
    left_inversions = _merge_count(arr, left, mid)

    # Count inversions in right half
    right_inversions = _merge_count(arr, mid + 1, right)

    # Count split inversions during merge
    split_inversions = _merge_and_count(arr, left, mid, right)

    return left_inversions + right_inversions + split_inversions


def _merge_and_count(arr: List[int], left: int, mid: int, right: int) -> int:
    """
    Merge two sorted subarrays and count split inversions.

    Left subarray: arr[left:mid+1] (already sorted)
    Right subarray: arr[mid+1:right+1] (already sorted)

    Key insight: When we take an element from the right subarray,
    all remaining elements in the left subarray form inversions with it.
    """
    # Copy both halves to temporary arrays
    left_arr = arr[left:mid + 1]
    right_arr = arr[mid + 1:right + 1]

    inversions = 0
    i = 0      # Index for left_arr
    j = 0      # Index for right_arr
    k = left   # Index for merged result
    n_left = len(left_arr)
    n_right = len(right_arr)

    while i < n_left and j < n_right:
        if left_arr[i] <= right_arr[j]:
            # Taking from left: no new inversions
            arr[k] = left_arr[i]
            i += 1
        else:
            # Taking from right: all remaining left elements are inversions
            # Because left_arr[i:] are all > right_arr[j]
            arr[k] = right_arr[j]
            inversions += (n_left - i)  # Count remaining left elements
            j += 1
        k += 1

    # Copy remaining elements (no new inversions in these steps)
    while i < n_left:
        arr[k] = left_arr[i]
        i += 1
        k += 1
    while j < n_right:
        arr[k] = right_arr[j]
        j += 1
        k += 1

    return inversions


def count_inversions_with_details(arr: List[int]) -> Tuple[int, List[int]]:
    """
    Count inversions and return the sorted array.
    Useful for debugging and understanding the algorithm.
    """
    working_copy = arr.copy()
    count = _merge_count(working_copy, 0, len(working_copy) - 1)
    return count, working_copy


# ============ DEMONSTRATION ============
if __name__ == "__main__":
    test_cases = [
        [1, 2, 3, 4, 5],     # 0 inversions
        [2, 1, 3, 4, 5],     # 1 inversion: (2,1)
        [2, 4, 1, 3, 5],     # 3 inversions
        [5, 4, 3, 2, 1],     # 10 inversions (maximum)
        [1, 3, 5, 2, 4, 6],  # 3 inversions: (3,2), (5,2), (5,4)
        [8, 4, 2, 1],        # 6 inversions
    ]

    print("Inversion Counting using Divide and Conquer")
    print("=" * 50)

    for arr in test_cases:
        count = count_inversions_dc(arr)
        _, sorted_arr = count_inversions_with_details(arr)
        print(f"Array: {arr}")
        print(f"  Inversions: {count}")
        print(f"  Sorted: {sorted_arr}")

        # Verify with brute force
        n = len(arr)
        bf_count = sum(1 for i in range(n) for j in range(i+1, n) if arr[i] > arr[j])
        assert count == bf_count, f"Mismatch: DC={count}, BF={bf_count}"
        print()

    print("All tests passed! ✓")
```

The magic happens in `inversions += (n_left - i)`.
When we pick from the right array, all remaining elements in the left array (from index i to the end, which is n_left - i elements) are greater than the picked element. Each of these forms a split inversion with the picked element.
The algorithm's complexity directly mirrors merge sort because it is merge sort with constant additional work per comparison.
Recurrence Relation:
Let T(n) be the time to count inversions in an array of n elements:
$$T(n) = 2T(n/2) + O(n)$$
Where:
- 2T(n/2): recursively counting inversions in the two halves
- O(n): the merge-and-count step, a single linear scan over both halves
Applying Master Theorem:
This is Case 2 (a = 2, b = 2, f(n) = Θ(n) = Θ(n^log₂2)):
$$T(n) = Θ(n \log n)$$
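One way to see the bound concretely: instrument the merge with a comparison counter and check that it never exceeds n log₂ n (a sketch with illustrative names):

```python
import math

def merge_count_ops(a):
    """Merge-sort-based inversion count that also tallies comparisons,
    to check the T(n) = 2T(n/2) + O(n) recurrence empirically."""
    ops = 0

    def rec(a):
        nonlocal ops
        if len(a) <= 1:
            return a, 0
        mid = len(a) // 2
        left, cl = rec(a[:mid])
        right, cr = rec(a[mid:])
        merged, i, j, split = [], 0, 0, 0
        while i < len(left) and j < len(right):
            ops += 1  # one comparison per merge step
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                split += len(left) - i  # remaining left elements are inversions
                merged.append(right[j]); j += 1
        merged.extend(left[i:]); merged.extend(right[j:])
        return merged, cl + cr + split

    _, inv = rec(list(a))
    return inv, ops

n = 1024
inv, ops = merge_count_ops(list(range(n, 0, -1)))  # worst case: reverse sorted
print(inv, ops, n * math.log2(n))  # comparisons stay below n log2 n
```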
| Approach | Time | Space | Practical for n=1M? |
|---|---|---|---|
| Brute Force | O(n²) | O(1) | ✗ ~14 hours |
| D&C (Merge Sort) | O(n log n) | O(n) | ✓ ~20 seconds |
| Improvement Factor | ~n / log n | — | ~50,000× faster |
Space Complexity:
The O(n) space usage is standard for merge sort: each merge copies the two halves into temporary arrays. Unlike the closest pair problem, there's no need to maintain multiple sorted views, so the space analysis is simpler.
From O(n²) to O(n log n) means:
- 1K elements: 500K → 10K ops
- 1M elements: 500B → 20M ops
This is the difference between hours and seconds.
Why This Works:
The key insight is that counting split inversions during merge is O(n) for the merge, not O(n) per element. The line inversions += (n_left - i) counts multiple inversions in O(1) time by leveraging the sorted property of both halves.
Without sorting, we'd need to compare each left element with each right element: O(n²) per level. With sorting, we get all that information from a single linear scan: O(n) per level.
This is the power of Divide and Conquer: structuring the problem so that combining solutions is cheap.
The inversion counting concept extends to various related problems and has connections to other algorithmic techniques.
| Problem | Description | Approach |
|---|---|---|
| Count Smaller Elements After Self | For each element, count smaller elements to its right | Modified merge sort tracking indices |
| Count of Range Sum | Count subarrays with sum in [lower, upper] | Merge sort on prefix sums |
| Reverse Pairs | Count (i, j) where i < j and arr[i] > 2*arr[j] | Merge sort with modified condition |
| Local Inversions vs Global | Compare adjacent inversions to all inversions | Can use O(n) insight for special cases |
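As one example from the table, Reverse Pairs reuses the merge-sort skeleton; because `arr[i] > 2*arr[j]` is a different condition from the merge comparison, it needs a separate counting pass before the merge (a sketch, not the only way to structure it):

```python
def reverse_pairs(arr):
    """Count pairs (i, j) with i < j and arr[i] > 2 * arr[j], in O(n log n)."""
    def sort_count(a):
        if len(a) <= 1:
            return a, 0
        mid = len(a) // 2
        left, cl = sort_count(a[:mid])
        right, cr = sort_count(a[mid:])
        count = cl + cr
        # Counting pass: for each left[i] (ascending), advance j while the
        # condition holds; j never moves backward, so this pass is O(n)
        j = 0
        for i in range(len(left)):
            while j < len(right) and left[i] > 2 * right[j]:
                j += 1
            count += j
        # Standard merge (condition here is plain <=, not the 2x condition)
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
        merged.extend(left[i:]); merged.extend(right[j:])
        return merged, count
    return sort_count(arr)[1]

print(reverse_pairs([1, 3, 2, 3, 1]))  # 2
```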
Inversions can also be counted using a Binary Indexed Tree (BIT). Process elements from right to left, querying the BIT for elements smaller than current (inversions with elements to the right). This is also O(n log n) and sometimes more intuitive for related problems.
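A minimal sketch of the BIT approach (coordinate compression maps values to ranks; names are illustrative):

```python
def count_inversions_bit(arr):
    """Count inversions with a Binary Indexed Tree in O(n log n)."""
    # Coordinate compression: map values to ranks 1..m
    rank = {v: r + 1 for r, v in enumerate(sorted(set(arr)))}
    m = len(rank)
    tree = [0] * (m + 1)

    def update(i):
        while i <= m:
            tree[i] += 1
            i += i & (-i)

    def query(i):  # prefix sum of counts for ranks 1..i
        s = 0
        while i > 0:
            s += tree[i]
            i -= i & (-i)
        return s

    inversions = 0
    for v in reversed(arr):               # scan right to left
        inversions += query(rank[v] - 1)  # strictly smaller values already seen
        update(rank[v])
    return inversions

print(count_inversions_bit([2, 4, 1, 3, 5]))  # 3
```

Each element asks "how many smaller values lie to my right?"—exactly the per-element view of inversions used in the Count Smaller After Self problem below.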
"""Related Problem: Count of Smaller Numbers After Self For each element, count how many elements to its right are smaller.This is essentially counting inversions per element. Example:Input: [5, 2, 6, 1]Output: [2, 1, 1, 0] Explanation:- 5: elements 2, 1 are smaller and to the right → 2- 2: element 1 is smaller and to the right → 1- 6: element 1 is smaller and to the right → 1- 1: no elements to the right → 0""" from typing import List, Tuple def count_smaller(nums: List[int]) -> List[int]: """ Count smaller elements after self using modified merge sort. We track original indices to accumulate counts for each position. """ n = len(nums) if n == 0: return [] # Pair each number with its original index indexed = [(nums[i], i) for i in range(n)] result = [0] * n def merge_count(arr: List[Tuple[int, int]], temp: List[Tuple[int, int]], left: int, right: int) -> None: if left >= right: return mid = (left + right) // 2 merge_count(arr, temp, left, mid) merge_count(arr, temp, mid + 1, right) # Merge and count i, j, k = left, mid + 1, left right_count = 0 # Count of right elements merged so far while i <= mid and j <= right: if arr[i][0] > arr[j][0]: # Element from right is smaller temp[k] = arr[j] right_count += 1 j += 1 else: # Element from left temp[k] = arr[i] # All right elements merged so far are smaller than arr[i] result[arr[i][1]] += right_count i += 1 k += 1 while i <= mid: temp[k] = arr[i] result[arr[i][1]] += right_count i += 1 k += 1 while j <= right: temp[k] = arr[j] j += 1 k += 1 arr[left:right + 1] = temp[left:right + 1] temp = [(0, 0)] * n merge_count(indexed, temp, 0, n - 1) return result # Testnums = [5, 2, 6, 1]result = count_smaller(nums)print(f"Input: {nums}")print(f"Output: {result}")# Expected: [2, 1, 1, 0]Counting inversions demonstrates a powerful D&C pattern: augmenting a known algorithm (merge sort) to solve a seemingly unrelated problem. Let's consolidate the key insights:
You now understand one of the most elegant D&C augmentations: counting inversions via merge sort. This technique exemplifies how D&C problems often involve finding the right decomposition where the combine step does double duty—both solving the original problem and computing additional information. Next, we'll explore Karatsuba multiplication, which shows D&C applied to arithmetic.