When you learned multiplication in school, you learned the "long multiplication" method: multiply each digit by each other digit and add the results with appropriate shifts. This takes O(n²) operations for n-digit numbers. For centuries, mathematicians assumed this was optimal—after all, you need to "touch" each pair of digits at least once, right?
Wrong.
In 1960, a young Soviet mathematician named Anatoly Karatsuba stunned the world by proving that multiplication could be done faster than O(n²). His insight was elegant: by cleverly reusing intermediate results, three recursive multiplications suffice where our naive approach would require four. This seemingly small improvement reduces complexity from O(n²) to O(n^1.585)—a breakthrough that opened the door to modern fast multiplication algorithms.
Karatsuba multiplication is historically significant: it was the first algorithm to break the O(n²) barrier, disproving the widely held conjecture that grade-school multiplication was optimal.
By the end of this page, you will understand the mathematical insight behind Karatsuba's algorithm, how it reduces four multiplications to three, the resulting complexity improvement, and why this matters for cryptography and big-number arithmetic. This is D&C applied to the most fundamental arithmetic operation.
Before we can appreciate Karatsuba's breakthrough, we need to understand what we're improving upon. The grade school algorithm—the one you learned as a child—is actually quite clever, but fundamentally limited.
The Algorithm:
To multiply two n-digit numbers X and Y:
1. Multiply X by each digit of Y, producing n partial products.
2. Shift each partial product left according to the place value of its digit of Y.
3. Add all the shifted partial products.
Example: 23 × 14
```
    23
  × 14
  ----
    92   (23 × 4)
   23    (23 × 1, shifted)
  ----
   322
```
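To make this concrete, here is a minimal Python sketch of the digit-by-digit method (our own illustrative code; the function name and digit-list representation are not from any particular library):

```python
def grade_school_multiply(x: int, y: int) -> int:
    """O(n^2) digit-by-digit multiplication, mirroring the pencil-and-paper method."""
    # Digits stored least-significant first, e.g. 23 -> [3, 2]
    xs = [int(d) for d in str(x)[::-1]]
    ys = [int(d) for d in str(y)[::-1]]
    result = [0] * (len(xs) + len(ys))

    for i, dy in enumerate(ys):          # one partial product per digit of y
        carry = 0
        for j, dx in enumerate(xs):      # multiply every digit of x by dy
            total = result[i + j] + dx * dy + carry
            result[i + j] = total % 10
            carry = total // 10
        result[i + len(xs)] += carry     # the "shift" is the index offset i

    # Convert the digit list back to an integer
    return int("".join(map(str, result[::-1])))

assert grade_school_multiply(23, 14) == 322
```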
Complexity Analysis:
Each of the n digits of Y multiplies each of the n digits of X, giving n² single-digit multiplications. Adding the n shifted partial products costs O(n²) more digit operations, so the total is O(n²).
Why This Seems Optimal:
The naive argument goes: "X has n digits and Y has n digits, and each digit of X must interact with each digit of Y at some point. Therefore, we need at least n² operations."
This argument is seductive but wrong. Karatsuba showed that clever decomposition can reduce redundant work.
| Number of Digits (n) | Operations (≈n²) | Time @ 1B ops/sec |
|---|---|---|
| 100 | 10,000 | 0.00001 seconds |
| 1,000 | 1,000,000 | 0.001 seconds |
| 10,000 | 100,000,000 | 0.1 seconds |
| 100,000 | 10 billion | 10 seconds |
| 1,000,000 | 1 trillion | 17 minutes |
Modern cryptography routinely works with numbers hundreds to thousands of digits long (a 2048-bit RSA modulus has ~617 decimal digits), and computational number theory pushes into the millions. At O(n²), multiplying two 10,000-digit numbers takes 100 million operations. For the millions of multiplications in cryptographic workloads, this becomes a bottleneck. We need faster algorithms.
To apply D&C to multiplication, we first need to express the multiplication of two n-digit numbers in terms of multiplications of smaller numbers.
Decomposition:
Let X and Y be n-digit numbers. Split each into a high half and a low half:
$$X = X_h \times 10^{n/2} + X_l, \qquad Y = Y_h \times 10^{n/2} + Y_l$$
Example: For X = 5678 (n=4): Xₕ = 56, Xₗ = 78, so 5678 = 56 × 10² + 78.
Similarly for Y = 1234: Yₕ = 12, Yₗ = 34, so 1234 = 12 × 10² + 34.
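In code, this split is a single divmod by 10^(n/2). A quick illustrative check:

```python
# Splitting X = 5678 at half = 2 digits: divisor = 10**2
x_high, x_low = divmod(5678, 10 ** 2)
assert (x_high, x_low) == (56, 78)
assert x_high * 10 ** 2 + x_low == 5678
```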
The Naive D&C Expansion:
Now we expand X × Y:
$$X \times Y = (X_h \times 10^{n/2} + X_l)(Y_h \times 10^{n/2} + Y_l)$$
Expanding (FOIL method):
$$= X_h Y_h \times 10^n + X_h Y_l \times 10^{n/2} + X_l Y_h \times 10^{n/2} + X_l Y_l$$
Grouping:
$$= X_h Y_h \times 10^n + (X_h Y_l + X_l Y_h) \times 10^{n/2} + X_l Y_l$$
This requires four multiplications of (n/2)-digit numbers: XₕYₕ, XₕYₗ, XₗYₕ, and XₗYₗ.
With 4 recursive calls on n/2-sized subproblems plus O(n) addition/shifting:
T(n) = 4T(n/2) + O(n)
Using Master Theorem: a=4, b=2, f(n)=O(n), log₂4 = 2
Since f(n) = O(n) = O(n^(log₂4 - 1)) → Case 1: T(n) = O(n²)
This naive D&C gives us the same O(n²) as grade school! The D&C structure alone doesn't help—we need fewer recursive calls.
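To make the baseline concrete, here is a minimal sketch of the four-multiplication D&C (our own illustrative code; nobody would use this in practice, since it is no better asymptotically than grade school and has worse constants):

```python
def naive_dc_multiply(x: int, y: int) -> int:
    """Four recursive multiplications: T(n) = 4T(n/2) + O(n) = O(n^2)."""
    if x < 10 or y < 10:
        return x * y
    half = max(len(str(x)), len(str(y))) // 2
    divisor = 10 ** half
    x_high, x_low = divmod(x, divisor)
    y_high, y_low = divmod(y, divisor)
    return (
        naive_dc_multiply(x_high, y_high) * 10 ** (2 * half)  # X_h·Y_h
        + (naive_dc_multiply(x_high, y_low)
           + naive_dc_multiply(x_low, y_high)) * 10 ** half   # middle terms
        + naive_dc_multiply(x_low, y_low)                     # X_l·Y_l
    )

assert naive_dc_multiply(5678, 1234) == 5678 * 1234
```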
Karatsuba's genius was realizing that we can compute the "middle term" (XₕYₗ + XₗYₕ) using only one additional multiplication instead of two. The trick involves a clever algebraic identity.
The Key Identity:
We need three products:
- P₁ = Xₕ × Yₕ (high × high)
- P₂ = (Xₕ + Xₗ) × (Yₕ + Yₗ) (sum × sum)
- P₃ = Xₗ × Yₗ (low × low)
The Magic:
$$X_h Y_l + X_l Y_h = P_2 - P_1 - P_3$$
Proof: $$P_2 = (X_h + X_l)(Y_h + Y_l) = X_h Y_h + X_h Y_l + X_l Y_h + X_l Y_l$$ $$P_2 - P_1 - P_3 = X_h Y_h + X_h Y_l + X_l Y_h + X_l Y_l - X_h Y_h - X_l Y_l$$ $$= X_h Y_l + X_l Y_h \text{ ✓}$$
Instead of computing XₕYₗ and XₗYₕ separately (two multiplications), we compute their sum by computing P₂ and subtracting the products we already have!
We trade multiplications for additions/subtractions. We compute P₂ on slightly larger numbers (n/2 + 1 digits due to the addition), but addition is O(n) while multiplication is expensive. This trade-off is massively profitable at scale.
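A three-line numeric check of this trade, using the halves of 5678 and 1234 (illustrative):

```python
xh, xl, yh, yl = 56, 78, 12, 34
p1, p2, p3 = xh * yh, (xh + xl) * (yh + yl), xl * yl
# One multiplication (p2) plus two subtractions replace two multiplications
assert p2 - p1 - p3 == xh * yl + xl * yh
```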
The Complete Formula:
$$X \times Y = P_1 \times 10^n + (P_2 - P_1 - P_3) \times 10^{n/2} + P_3$$
Where P₁, P₂, and P₃ are the three products defined above.
Only three multiplications!
The additions, subtractions, and shifts (multiply by 10^k) are all O(n) operations.
```python
def karatsuba(x: int, y: int) -> int:
    """
    Karatsuba multiplication algorithm.

    Demonstrates the D&C approach that achieves O(n^log₂3) ≈ O(n^1.585)
    complexity instead of O(n²).

    For production use, Python's built-in multiplication is faster
    (uses optimized algorithms at the C level). This is for education.
    """
    # Base case: single-digit multiplication
    if x < 10 or y < 10:
        return x * y

    # Determine the size (number of digits)
    n = max(len(str(x)), len(str(y)))
    half = n // 2

    # Split the numbers into high and low parts
    # x = x_high * 10^half + x_low
    divisor = 10 ** half
    x_high, x_low = divmod(x, divisor)
    y_high, y_low = divmod(y, divisor)

    # Karatsuba's three multiplications
    p1 = karatsuba(x_high, y_high)                  # High × High
    p3 = karatsuba(x_low, y_low)                    # Low × Low
    p2 = karatsuba(x_high + x_low, y_high + y_low)  # (Sum) × (Sum)

    # Combine using the key identity:
    # x * y = p1 * 10^n + (p2 - p1 - p3) * 10^(n/2) + p3
    return (
        p1 * (10 ** (2 * half))
        + (p2 - p1 - p3) * (10 ** half)
        + p3
    )


def karatsuba_verbose(x: int, y: int, depth: int = 0) -> int:
    """
    Karatsuba with verbose output to trace execution.
    """
    indent = " " * depth
    print(f"{indent}karatsuba({x}, {y})")

    if x < 10 or y < 10:
        result = x * y
        print(f"{indent}  Base case: {result}")
        return result

    n = max(len(str(x)), len(str(y)))
    half = n // 2
    divisor = 10 ** half
    x_high, x_low = divmod(x, divisor)
    y_high, y_low = divmod(y, divisor)

    print(f"{indent}  Split: x = {x_high}*10^{half} + {x_low}")
    print(f"{indent}         y = {y_high}*10^{half} + {y_low}")

    print(f"{indent}  Computing P1 = {x_high} × {y_high}:")
    p1 = karatsuba_verbose(x_high, y_high, depth + 2)

    print(f"{indent}  Computing P3 = {x_low} × {y_low}:")
    p3 = karatsuba_verbose(x_low, y_low, depth + 2)

    print(f"{indent}  Computing P2 = ({x_high}+{x_low}) × ({y_high}+{y_low}):")
    p2 = karatsuba_verbose(x_high + x_low, y_high + y_low, depth + 2)

    middle = p2 - p1 - p3
    print(f"{indent}  Middle term = P2 - P1 - P3 = {p2} - {p1} - {p3} = {middle}")

    result = p1 * (10 ** (2 * half)) + middle * (10 ** half) + p3
    print(f"{indent}  Result = {p1}*10^{2*half} + {middle}*10^{half} + {p3} = {result}")
    return result


# Demonstration
if __name__ == "__main__":
    # Simple test
    x, y = 5678, 1234
    print("=" * 60)
    print(f"Computing {x} × {y}")
    print("=" * 60)

    result = karatsuba(x, y)
    expected = x * y
    print(f"\nKaratsuba result: {result}")
    print(f"Expected (x × y): {expected}")
    print(f"Correct: {result == expected}")

    print("\n" + "=" * 60)
    print("Verbose trace for smaller example: 56 × 12")
    print("=" * 60 + "\n")
    result = karatsuba_verbose(56, 12)
    print(f"\nFinal result: {result} (expected: {56 * 12})")
```

Let's trace through a complete example to crystallize the algorithm.
Problem: Compute 5678 × 1234 using Karatsuba
Step 1: Split the numbers (n=4, half=2): Xₕ = 56, Xₗ = 78; Yₕ = 12, Yₗ = 34
Step 2: Compute the three products recursively
P₁ = Xₕ × Yₕ = 56 × 12
Recursively with n=2, half=1:
P₁₁ = 5 × 1 = 5
P₁₃ = 6 × 2 = 12
P₁₂ = (5+6) × (1+2) = 11 × 3 = 33
Middle₁ = 33 − 5 − 12 = 16
P₁ = 5×100 + 16×10 + 12 = 500 + 160 + 12 = 672
P₃ = Xₗ × Yₗ = 78 × 34
P₃₁ = 7 × 3 = 21
P₃₃ = 8 × 4 = 32
P₃₂ = (7+8) × (3+4) = 15 × 7 = 105
Middle₃ = 105 − 21 − 32 = 52
P₃ = 21×100 + 52×10 + 32 = 2100 + 520 + 32 = 2652
P₂ = (Xₕ + Xₗ) × (Yₕ + Yₗ) = (56+78) × (12+34) = 134 × 46
Now we need 134 × 46. Note that 134 has 3 digits, so n = 3 and half = ⌊3/2⌋ = 1: we split 134 = 13×10 + 4 and 46 = 4×10 + 6.
P₂₁ = 13 × 4 = 52
P₂₃ = 4 × 6 = 24
P₂₂ = (13+4) × (4+6) = 17 × 10 = 170
Middle₂ = 170 − 52 − 24 = 94
P₂ = 52×100 + 94×10 + 24 = 5200 + 940 + 24 = 6164
Step 3: Combine
Middle = P₂ - P₁ - P₃ = 6164 - 672 - 2652 = 2840
Result = P₁×10⁴ + Middle×10² + P₃ = 672×10000 + 2840×100 + 2652 = 6,720,000 + 284,000 + 2,652 = 7,006,652
Verification: 5678 × 1234 = 7,006,652 ✓
At each level, we made 3 recursive calls instead of 4. This compounds: over log n levels, we save an exponential amount of work. That's how we get from O(n²) to O(n^1.585).
Let's rigorously prove the complexity improvement using the Master Theorem.
The Recurrence Relation:
Let T(n) be the time to multiply two n-digit numbers:
$$T(n) = 3T(n/2) + O(n)$$
Where:
- 3T(n/2): the three recursive multiplications of half-size numbers
- O(n): the additions, subtractions, and shifts at each level
Applying the Master Theorem:
Compare with T(n) = aT(n/b) + f(n): a = 3, b = 2, f(n) = O(n)
Compute: log_b(a) = log₂(3) ≈ 1.585
Compare f(n) = O(n) with n^(log₂3) ≈ n^1.585: since f(n) = O(n^(log₂3 − ε)) with ε ≈ 0.585 > 0, Case 1 of the Master Theorem applies.
Therefore: $$T(n) = \Theta(n^{\log_2 3}) \approx \Theta(n^{1.585})$$
| Digits (n) | Grade School (n²) | Karatsuba (n^1.585) | Speedup |
|---|---|---|---|
| 100 | 10,000 | ~1,500 | ~7× |
| 1,000 | 1,000,000 | ~57,000 | ~18× |
| 10,000 | 100,000,000 | ~2.2 million | ~46× |
| 100,000 | 10 billion | ~84 million | ~120× |
| 1,000,000 | 1 trillion | ~3.2 billion | ~310× |
n^1.585 vs n² might not seem like a huge difference. But remember: at n = 1 million digits: • n² = 10¹² operations • n^1.585 ≈ 3.2 × 10⁹ operations
That's roughly 300× fewer operations! And the gap keeps widening: the speedup grows as n^0.415, so the exponent difference compounds dramatically at scale.
Why the Improvement?
In the naive 4-multiplication approach: the recursion tree branches 4 ways at every level, so the work at the leaves totals n² single-digit multiplications.
Reducing from 4 to 3 multiplications: the tree branches only 3 ways, so the leaf count drops to n^(log₂3) ≈ n^1.585.
That one saved multiplication cascades through all recursion levels!
Each level of recursion does 3 calls instead of 4. Over log₂n levels, we go from 4^(log₂n) = n² leaves to 3^(log₂n) = n^1.585 leaves. This is the fundamental reason for the improvement.
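You can verify these leaf counts directly with a small instrumented recursion (illustrative):

```python
def count_leaves(n: int, branches: int) -> int:
    """Base-case (single-digit) multiplications when each level
    makes `branches` recursive calls on half-size inputs."""
    if n <= 1:
        return 1
    return branches * count_leaves(n // 2, branches)

for n in [16, 256, 4096]:
    # branches=4 gives 4^(log2 n) = n^2; branches=3 gives 3^(log2 n) = n^1.585
    print(f"n={n:5}: naive={count_leaves(n, 4):10,}, karatsuba={count_leaves(n, 3):8,}")
```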
While Karatsuba's algorithm is a theoretical breakthrough, its practical use requires understanding several considerations.
When to Switch to Grade School:
Karatsuba has higher constant factors (more additions, subtractions, and bookkeeping). For small numbers, the O(n²) grade school method is actually faster. Production implementations typically use a hybrid approach: recurse with Karatsuba while the operands are large, then switch to the low-overhead method once they fall below a tuned threshold (see the sketch below).
This is a classic example of algorithm engineering: using asymptotically optimal algorithms only when the input is large enough for the asymptotics to matter.
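A minimal sketch of such a hybrid, assuming an arbitrary cutoff of 64 digits (the constant, the function name, and the use of Python's built-in `*` as the small-number fallback are all illustrative; real libraries tune the threshold empirically):

```python
CUTOFF_DIGITS = 64  # illustrative threshold, not a tuned value

def hybrid_multiply(x: int, y: int) -> int:
    """Karatsuba above the cutoff, low-overhead multiplication below it."""
    if x < 10 ** CUTOFF_DIGITS or y < 10 ** CUTOFF_DIGITS:
        return x * y  # small operands: constant factors dominate

    half = max(len(str(x)), len(str(y))) // 2
    divisor = 10 ** half
    x_high, x_low = divmod(x, divisor)
    y_high, y_low = divmod(y, divisor)

    p1 = hybrid_multiply(x_high, y_high)
    p3 = hybrid_multiply(x_low, y_low)
    p2 = hybrid_multiply(x_high + x_low, y_high + y_low)

    return p1 * 10 ** (2 * half) + (p2 - p1 - p3) * 10 ** half + p3
```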
| Algorithm | Complexity | Best For | Notes |
|---|---|---|---|
| Grade School | O(n²) | n < 50 digits | Simple, low overhead |
| Karatsuba | O(n^1.585) | 50-1000 digits | Good balance of simplicity and speed |
| Toom-Cook (Toom-3) | O(n^1.465) | 1000-10000 digits | More complex, uses 5 multiplications |
| Schönhage–Strassen | O(n log n log log n) | 10000-10^7 digits | FFT-based, complex implementation |
| Harvey–van der Hoeven | O(n log n) | Theoretical | 2019 breakthrough, nearly optimal |
Although faster algorithms exist, Karatsuba remains important:
- It is simple enough to implement correctly in an afternoon.
- Its low overhead makes it the fastest option for mid-sized numbers (roughly 50–1000 digits).
- Production big-integer implementations (e.g., GMP, CPython's int) still use it as one layer of their hybrid strategies.
- It is the conceptual gateway to Toom-Cook and FFT-based multiplication.
Karatsuba's insight generalizes. Instead of splitting numbers into 2 parts, we can split into k parts. This is the Toom-Cook family of algorithms.
The Pattern: split each number into k parts. With suitable evaluation and interpolation, the product can be recovered from only 2k − 1 multiplications of (n/k)-digit numbers, giving T(n) = (2k−1)T(n/k) + O(n) and complexity O(n^(log_k(2k−1))). Karatsuba is the k = 2 case.
As k increases: the exponent log_k(2k−1) shrinks toward 1, but the evaluation and interpolation overhead grows, so larger k pays off only for larger inputs.
The Toom-3 algorithm achieves O(n^(log₃5)) ≈ O(n^1.465), even better than Karatsuba!
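A quick computation of the Toom-k exponent for small k (illustrative):

```python
import math

# Toom-k: 2k - 1 multiplications of size n/k => exponent log_k(2k - 1)
for k in range(2, 6):
    exponent = math.log(2 * k - 1, k)
    print(f"Toom-{k}: {2 * k - 1} mults of size n/{k} -> O(n^{exponent:.3f})")
```

The output for k = 2 and k = 3 matches Karatsuba (1.585) and Toom-3 (1.465).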
"""Conceptual comparison of multiplication algorithms. This demonstrates the D&C structure, not production code.""" def grade_school_recurrence(n: int) -> int: """ T(n) = 4T(n/2) + O(n) → O(n²) 4 multiplications of half-size numbers """ if n <= 1: return 1 return 4 * grade_school_recurrence(n // 2) + n def karatsuba_recurrence(n: int) -> int: """ T(n) = 3T(n/2) + O(n) → O(n^1.585) 3 multiplications of half-size numbers """ if n <= 1: return 1 return 3 * karatsuba_recurrence(n // 2) + n def toom3_recurrence(n: int) -> int: """ T(n) = 5T(n/3) + O(n) → O(n^1.465) 5 multiplications of third-size numbers """ if n <= 1: return 1 return 5 * toom3_recurrence(n // 3) + n # Compare growth ratesprint("n | Grade School | Karatsuba | Toom-3")print("-" * 55)for n in [10, 100, 1000, 10000]: gs = grade_school_recurrence(n) kar = karatsuba_recurrence(n) toom = toom3_recurrence(n) print(f"{n:7} | {gs:12} | {kar:9} | {toom:6}")In 2019, Harvey and van der Hoeven proved that integer multiplication can be done in O(n log n) time—essentially matching the complexity of addition up to a logarithmic factor! This is believed to be optimal. However, the algorithm is highly complex and only practical for astronomically large numbers.
Karatsuba multiplication demonstrates a profound D&C principle: sometimes, algebraic restructuring can reduce the number of expensive operations, trading them for cheaper ones. Let's consolidate the key insights:
- Naive D&C alone doesn't help: four half-size multiplications still solve to O(n²).
- One algebraic identity, XₕYₗ + XₗYₕ = P₂ − P₁ − P₃, removes a multiplication at every level of recursion.
- Fewer recursive calls change the exponent: T(n) = 3T(n/2) + O(n) solves to Θ(n^(log₂3)) ≈ Θ(n^1.585).
- Constants still matter: practical implementations fall back to grade school below a tuned threshold.
- The idea generalizes: Toom-Cook splits into k parts, and FFT-based methods push toward O(n log n).
You now understand Karatsuba multiplication—a landmark algorithm that disproved a centuries-old assumption. The technique of reducing recursive calls through clever algebraic identities appears throughout algorithm design. Next, we'll explore pattern recognition for D&C problems: how to identify when this paradigm applies and how to structure your solutions.