When you learned multiplication in school, you learned the "long multiplication" method: multiply each digit by each other digit and add the results with appropriate shifts. This takes O(n²) operations for n-digit numbers. For centuries, mathematicians assumed this was optimal—after all, you need to "touch" each pair of digits at least once, right?
Wrong.
In 1960, a young Soviet mathematician named Anatoly Karatsuba stunned the world by proving that multiplication could be done faster than O(n²). His insight was elegant: by cleverly reusing intermediate results, three recursive multiplications suffice where our naive approach would require four. This seemingly small improvement reduces complexity from O(n²) to O(n^1.585)—a breakthrough that opened the door to modern fast multiplication algorithms.
Karatsuba multiplication is historically significant: it was the first algorithm to break the O(n²) barrier, disproving the widely held conjecture that grade-school multiplication was optimal.
By the end of this page, you will understand the mathematical insight behind Karatsuba's algorithm, how it reduces four multiplications to three, the resulting complexity improvement, and why this matters for cryptography and big-number arithmetic. This is D&C applied to the most fundamental arithmetic operation.
Before we can appreciate Karatsuba's breakthrough, we need to understand what we're improving upon. The grade school algorithm—the one you learned as a child—is actually quite clever, but fundamentally limited.
The Algorithm:
To multiply two n-digit numbers X and Y:
1. Multiply X by each digit of Y, producing n partial products.
2. Shift each partial product left according to the place value of its digit of Y.
3. Add all the shifted partial products.
Example: 23 × 14
```
    23
  × 14
  ----
    92   (23 × 4)
   23    (23 × 1, shifted)
  ----
   322
```
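To make this concrete, here is a minimal Python sketch of the digit-by-digit method (our own illustrative code; the function name and digit-list representation are not from any particular library):

```python
def grade_school_multiply(x: int, y: int) -> int:
    """O(n^2) digit-by-digit multiplication, mirroring the pencil-and-paper method."""
    # Digits stored least-significant first, e.g. 23 -> [3, 2]
    xs = [int(d) for d in str(x)[::-1]]
    ys = [int(d) for d in str(y)[::-1]]
    result = [0] * (len(xs) + len(ys))

    for i, dy in enumerate(ys):          # one partial product per digit of y
        carry = 0
        for j, dx in enumerate(xs):      # multiply every digit of x by dy
            total = result[i + j] + dx * dy + carry
            result[i + j] = total % 10
            carry = total // 10
        result[i + len(xs)] += carry     # the "shift" is the index offset i

    # Convert the digit list back to an integer
    return int("".join(map(str, result[::-1])))

assert grade_school_multiply(23, 14) == 322
```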
Complexity Analysis:
Each of the n digits of Y multiplies each of the n digits of X, giving n² single-digit multiplications. Adding the n shifted partial products costs O(n²) more digit operations, so the total is O(n²).
Why This Seems Optimal:
The naive argument goes: "X has n digits and Y has n digits, and each digit of X must interact with each digit of Y at some point. Therefore, we need at least n² operations."
This argument is seductive but wrong. Karatsuba showed that clever decomposition can reduce redundant work.
| Number of Digits (n) | Operations (≈n²) | Time @ 1B ops/sec |
|---|---|---|
| 100 | 10,000 | 0.00001 seconds |
| 1,000 | 1,000,000 | 0.001 seconds |
| 10,000 | 100,000,000 | 0.1 seconds |
| 100,000 | 10 billion | 10 seconds |
| 1,000,000 | 1 trillion | 17 minutes |
Modern cryptography routinely works with numbers hundreds to thousands of digits long (a 2048-bit RSA modulus has ~617 decimal digits), and computational number theory pushes into the millions. At O(n²), multiplying two 10,000-digit numbers takes 100 million operations. For the millions of multiplications in cryptographic workloads, this becomes a bottleneck. We need faster algorithms.
To apply D&C to multiplication, we first need to express the multiplication of two n-digit numbers in terms of multiplications of smaller numbers.
Decomposition:
Let X and Y be n-digit numbers. Split each into a high half and a low half:
$$X = X_h \times 10^{n/2} + X_l, \qquad Y = Y_h \times 10^{n/2} + Y_l$$
Example: For X = 5678 (n=4): Xₕ = 56, Xₗ = 78, so 5678 = 56 × 10² + 78.
Similarly for Y = 1234: Yₕ = 12, Yₗ = 34, so 1234 = 12 × 10² + 34.
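In code, this split is a single divmod by 10^(n/2). A quick illustrative check:

```python
# Splitting X = 5678 at half = 2 digits: divisor = 10**2
x_high, x_low = divmod(5678, 10 ** 2)
assert (x_high, x_low) == (56, 78)
assert x_high * 10 ** 2 + x_low == 5678
```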
The Naive D&C Expansion:
Now we expand X × Y:
$$X \times Y = (X_h \times 10^{n/2} + X_l)(Y_h \times 10^{n/2} + Y_l)$$
Expanding (FOIL method):
$$= X_h Y_h \times 10^n + X_h Y_l \times 10^{n/2} + X_l Y_h \times 10^{n/2} + X_l Y_l$$
Grouping:
$$= X_h Y_h \times 10^n + (X_h Y_l + X_l Y_h) \times 10^{n/2} + X_l Y_l$$
This requires four multiplications of (n/2)-digit numbers: XₕYₕ, XₕYₗ, XₗYₕ, and XₗYₗ.
With 4 recursive calls on n/2-sized subproblems plus O(n) addition/shifting:
T(n) = 4T(n/2) + O(n)
Using Master Theorem: a=4, b=2, f(n)=O(n), log₂4 = 2
Since f(n) = O(n) = O(n^(log₂4 - 1)) → Case 1: T(n) = O(n²)
This naive D&C gives us the same O(n²) as grade school! The D&C structure alone doesn't help—we need fewer recursive calls.
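To make the baseline concrete, here is a minimal sketch of the four-multiplication D&C (our own illustrative code; nobody would use this in practice, since it is no better asymptotically than grade school and has worse constants):

```python
def naive_dc_multiply(x: int, y: int) -> int:
    """Four recursive multiplications: T(n) = 4T(n/2) + O(n) = O(n^2)."""
    if x < 10 or y < 10:
        return x * y
    half = max(len(str(x)), len(str(y))) // 2
    divisor = 10 ** half
    x_high, x_low = divmod(x, divisor)
    y_high, y_low = divmod(y, divisor)
    return (
        naive_dc_multiply(x_high, y_high) * 10 ** (2 * half)  # X_h·Y_h
        + (naive_dc_multiply(x_high, y_low)
           + naive_dc_multiply(x_low, y_high)) * 10 ** half   # middle terms
        + naive_dc_multiply(x_low, y_low)                     # X_l·Y_l
    )

assert naive_dc_multiply(5678, 1234) == 5678 * 1234
```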
Karatsuba's genius was realizing that we can compute the "middle term" (XₕYₗ + XₗYₕ) using only one additional multiplication instead of two. The trick involves a clever algebraic identity.
The Key Identity:
We need three products:
- P₁ = Xₕ × Yₕ (high × high)
- P₂ = (Xₕ + Xₗ) × (Yₕ + Yₗ) (sum × sum)
- P₃ = Xₗ × Yₗ (low × low)
The Magic:
$$X_h Y_l + X_l Y_h = P_2 - P_1 - P_3$$
Proof: $$P_2 = (X_h + X_l)(Y_h + Y_l) = X_h Y_h + X_h Y_l + X_l Y_h + X_l Y_l$$ $$P_2 - P_1 - P_3 = X_h Y_h + X_h Y_l + X_l Y_h + X_l Y_l - X_h Y_h - X_l Y_l$$ $$= X_h Y_l + X_l Y_h \text{ ✓}$$
Instead of computing XₕYₗ and XₗYₕ separately (two multiplications), we compute their sum by computing P₂ and subtracting the products we already have!
We trade multiplications for additions/subtractions. We compute P₂ on slightly larger numbers (n/2 + 1 digits due to the addition), but addition is O(n) while multiplication is expensive. This trade-off is massively profitable at scale.
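A three-line numeric check of this trade, using the halves of 5678 and 1234 (illustrative):

```python
xh, xl, yh, yl = 56, 78, 12, 34
p1, p2, p3 = xh * yh, (xh + xl) * (yh + yl), xl * yl
# One multiplication (p2) plus two subtractions replace two multiplications
assert p2 - p1 - p3 == xh * yl + xl * yh
```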
The Complete Formula:
$$X \times Y = P_1 \times 10^n + (P_2 - P_1 - P_3) \times 10^{n/2} + P_3$$
Where P₁, P₂, and P₃ are the three products defined above.
Only three multiplications!
The additions, subtractions, and shifts (multiply by 10^k) are all O(n) operations.
```python
def karatsuba(x: int, y: int) -> int:
    """
    Karatsuba multiplication algorithm.

    Demonstrates the D&C approach that achieves O(n^log₂3) ≈ O(n^1.585)
    complexity instead of O(n²).

    For production use, Python's built-in multiplication is faster
    (uses optimized algorithms at the C level). This is for education.
    """
    # Base case: single-digit multiplication
    if x < 10 or y < 10:
        return x * y

    # Determine the size (number of digits)
    n = max(len(str(x)), len(str(y)))
    half = n // 2

    # Split the numbers into high and low parts
    # x = x_high * 10^half + x_low
    divisor = 10 ** half
    x_high, x_low = divmod(x, divisor)
    y_high, y_low = divmod(y, divisor)

    # Karatsuba's three multiplications
    p1 = karatsuba(x_high, y_high)                  # High × High
    p3 = karatsuba(x_low, y_low)                    # Low × Low
    p2 = karatsuba(x_high + x_low, y_high + y_low)  # (Sum) × (Sum)

    # Combine using the key identity:
    # x * y = p1 * 10^n + (p2 - p1 - p3) * 10^(n/2) + p3
    return (
        p1 * (10 ** (2 * half))
        + (p2 - p1 - p3) * (10 ** half)
        + p3
    )


def karatsuba_verbose(x: int, y: int, depth: int = 0) -> int:
    """
    Karatsuba with verbose output to trace execution.
    """
    indent = " " * depth
    print(f"{indent}karatsuba({x}, {y})")

    if x < 10 or y < 10:
        result = x * y
        print(f"{indent}  Base case: {result}")
        return result

    n = max(len(str(x)), len(str(y)))
    half = n // 2
    divisor = 10 ** half
    x_high, x_low = divmod(x, divisor)
    y_high, y_low = divmod(y, divisor)

    print(f"{indent}  Split: x = {x_high}*10^{half} + {x_low}")
    print(f"{indent}         y = {y_high}*10^{half} + {y_low}")

    print(f"{indent}  Computing P1 = {x_high} × {y_high}:")
    p1 = karatsuba_verbose(x_high, y_high, depth + 2)

    print(f"{indent}  Computing P3 = {x_low} × {y_low}:")
    p3 = karatsuba_verbose(x_low, y_low, depth + 2)

    print(f"{indent}  Computing P2 = ({x_high}+{x_low}) × ({y_high}+{y_low}):")
    p2 = karatsuba_verbose(x_high + x_low, y_high + y_low, depth + 2)

    middle = p2 - p1 - p3
    print(f"{indent}  Middle term = P2 - P1 - P3 = {p2} - {p1} - {p3} = {middle}")

    result = p1 * (10 ** (2 * half)) + middle * (10 ** half) + p3
    print(f"{indent}  Result = {p1}*10^{2*half} + {middle}*10^{half} + {p3} = {result}")
    return result


# Demonstration
if __name__ == "__main__":
    # Simple test
    x, y = 5678, 1234
    print("=" * 60)
    print(f"Computing {x} × {y}")
    print("=" * 60)

    result = karatsuba(x, y)
    expected = x * y
    print(f"\nKaratsuba result: {result}")
    print(f"Expected (x × y): {expected}")
    print(f"Correct: {result == expected}")

    print("\n" + "=" * 60)
    print("Verbose trace for smaller example: 56 × 12")
    print("=" * 60 + "\n")
    result = karatsuba_verbose(56, 12)
    print(f"\nFinal result: {result} (expected: {56 * 12})")
```

Let's trace through a complete example to crystallize the algorithm.
Problem: Compute 5678 × 1234 using Karatsuba
Step 1: Split the numbers (n=4, half=2): Xₕ = 56, Xₗ = 78; Yₕ = 12, Yₗ = 34
Step 2: Compute the three products recursively
P₁ = Xₕ × Yₕ = 56 × 12
Recursively with n=2, half=1:
P₁₁ = 5 × 1 = 5
P₁₃ = 6 × 2 = 12
P₁₂ = (5+6) × (1+2) = 11 × 3 = 33
Middle₁ = 33 − 5 − 12 = 16
P₁ = 5×100 + 16×10 + 12 = 500 + 160 + 12 = 672
P₃ = Xₗ × Yₗ = 78 × 34
P₃₁ = 7 × 3 = 21
P₃₃ = 8 × 4 = 32
P₃₂ = (7+8) × (3+4) = 15 × 7 = 105
Middle₃ = 105 − 21 − 32 = 52
P₃ = 21×100 + 52×10 + 32 = 2100 + 520 + 32 = 2652
P₂ = (Xₕ + Xₗ) × (Yₕ + Yₗ) = (56+78) × (12+34) = 134 × 46
Now we need 134 × 46. Note that 134 has 3 digits, so n = 3 and half = ⌊3/2⌋ = 1: we split 134 = 13×10 + 4 and 46 = 4×10 + 6.
P₂₁ = 13 × 4 = 52
P₂₃ = 4 × 6 = 24
P₂₂ = (13+4) × (4+6) = 17 × 10 = 170
Middle₂ = 170 − 52 − 24 = 94
P₂ = 52×100 + 94×10 + 24 = 5200 + 940 + 24 = 6164
Step 3: Combine
Middle = P₂ - P₁ - P₃ = 6164 - 672 - 2652 = 2840
Result = P₁×10⁴ + Middle×10² + P₃ = 672×10000 + 2840×100 + 2652 = 6,720,000 + 284,000 + 2,652 = 7,006,652
Verification: 5678 × 1234 = 7,006,652 ✓
At each level, we made 3 recursive calls instead of 4. This compounds: over log n levels, we save an exponential amount of work. That's how we get from O(n²) to O(n^1.585).
Let's rigorously prove the complexity improvement using the Master Theorem.
The Recurrence Relation:
Let T(n) be the time to multiply two n-digit numbers:
$$T(n) = 3T(n/2) + O(n)$$
Where:
- 3T(n/2): the three recursive multiplications of half-size numbers
- O(n): the additions, subtractions, and shifts at each level
Applying the Master Theorem:
Compare with T(n) = aT(n/b) + f(n): a = 3, b = 2, f(n) = O(n)
Compute: log_b(a) = log₂(3) ≈ 1.585
Compare f(n) = O(n) with n^(log₂3) ≈ n^1.585: since f(n) = O(n^(log₂3 − ε)) with ε ≈ 0.585 > 0, Case 1 of the Master Theorem applies.
Therefore: $$T(n) = \Theta(n^{\log_2 3}) \approx \Theta(n^{1.585})$$
| Digits (n) | Grade School (n²) | Karatsuba (n^1.585) | Speedup |
|---|---|---|---|
| 100 | 10,000 | ~1,500 | ~7× |
| 1,000 | 1,000,000 | ~57,000 | ~18× |
| 10,000 | 100,000,000 | ~2.2 million | ~46× |
| 100,000 | 10 billion | ~84 million | ~120× |
| 1,000,000 | 1 trillion | ~3.2 billion | ~310× |
n^1.585 vs n² might not seem like a huge difference. But remember: at n = 1 million digits: • n² = 10¹² operations • n^1.585 ≈ 3.2 × 10⁹ operations
That's roughly 300× fewer operations! And the gap keeps widening: the speedup grows as n^0.415, so the exponent difference compounds dramatically at scale.
Why the Improvement?
In the naive 4-multiplication approach: the recursion tree branches 4 ways at every level, so the work at the leaves totals n² single-digit multiplications.
Reducing from 4 to 3 multiplications: the tree branches only 3 ways, so the leaf count drops to n^(log₂3) ≈ n^1.585.
That one saved multiplication cascades through all recursion levels!
Each level of recursion does 3 calls instead of 4. Over log₂n levels, we go from 4^(log₂n) = n² leaves to 3^(log₂n) = n^1.585 leaves. This is the fundamental reason for the improvement.
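You can verify these leaf counts directly with a small instrumented recursion (illustrative):

```python
def count_leaves(n: int, branches: int) -> int:
    """Base-case (single-digit) multiplications when each level
    makes `branches` recursive calls on half-size inputs."""
    if n <= 1:
        return 1
    return branches * count_leaves(n // 2, branches)

for n in [16, 256, 4096]:
    # branches=4 gives 4^(log2 n) = n^2; branches=3 gives 3^(log2 n) = n^1.585
    print(f"n={n:5}: naive={count_leaves(n, 4):10,}, karatsuba={count_leaves(n, 3):8,}")
```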
While Karatsuba's algorithm is a theoretical breakthrough, its practical use requires understanding several considerations.
When to Switch to Grade School:
Karatsuba has higher constant factors (more additions, subtractions, and bookkeeping). For small numbers, the O(n²) grade school method is actually faster. Production implementations typically use a hybrid approach: recurse with Karatsuba while the operands are large, then switch to the low-overhead method once they fall below a tuned threshold (see the sketch below).
This is a classic example of algorithm engineering: using asymptotically optimal algorithms only when the input is large enough for the asymptotics to matter.
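A minimal sketch of such a hybrid, assuming an arbitrary cutoff of 64 digits (the constant, the function name, and the use of Python's built-in `*` as the small-number fallback are all illustrative; real libraries tune the threshold empirically):

```python
CUTOFF_DIGITS = 64  # illustrative threshold, not a tuned value

def hybrid_multiply(x: int, y: int) -> int:
    """Karatsuba above the cutoff, low-overhead multiplication below it."""
    if x < 10 ** CUTOFF_DIGITS or y < 10 ** CUTOFF_DIGITS:
        return x * y  # small operands: constant factors dominate

    half = max(len(str(x)), len(str(y))) // 2
    divisor = 10 ** half
    x_high, x_low = divmod(x, divisor)
    y_high, y_low = divmod(y, divisor)

    p1 = hybrid_multiply(x_high, y_high)
    p3 = hybrid_multiply(x_low, y_low)
    p2 = hybrid_multiply(x_high + x_low, y_high + y_low)

    return p1 * 10 ** (2 * half) + (p2 - p1 - p3) * 10 ** half + p3
```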
| Algorithm | Complexity | Best For | Notes |
|---|---|---|---|
| Grade School | O(n²) | n < 50 digits | Simple, low overhead |
| Karatsuba | O(n^1.585) | 50-1000 digits | Good balance of simplicity and speed |
| Toom-Cook (Toom-3) | O(n^1.465) | 1000-10000 digits | More complex, uses 5 multiplications |
| Schönhage–Strassen | O(n log n log log n) | 10000-10^7 digits | FFT-based, complex implementation |
| Harvey–van der Hoeven | O(n log n) | Theoretical | 2019 breakthrough, nearly optimal |
Although faster algorithms exist, Karatsuba remains important:
- It is simple enough to implement correctly in an afternoon.
- Its low overhead makes it the fastest option for mid-sized numbers (roughly 50–1000 digits).
- Production big-integer implementations (e.g., GMP, CPython's int) still use it as one layer of their hybrid strategies.
- It is the conceptual gateway to Toom-Cook and FFT-based multiplication.
Karatsuba's insight generalizes. Instead of splitting numbers into 2 parts, we can split into k parts. This is the Toom-Cook family of algorithms.
The Pattern: split each number into k parts. With suitable evaluation and interpolation, the product can be recovered from only 2k − 1 multiplications of (n/k)-digit numbers, giving T(n) = (2k−1)T(n/k) + O(n) and complexity O(n^(log_k(2k−1))). Karatsuba is the k = 2 case.
As k increases: the exponent log_k(2k−1) shrinks toward 1, but the evaluation and interpolation overhead grows, so larger k pays off only for larger inputs.
The Toom-3 algorithm achieves O(n^(log₃5)) ≈ O(n^1.465), even better than Karatsuba!
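A quick computation of the Toom-k exponent for small k (illustrative):

```python
import math

# Toom-k: 2k - 1 multiplications of size n/k => exponent log_k(2k - 1)
for k in range(2, 6):
    exponent = math.log(2 * k - 1, k)
    print(f"Toom-{k}: {2 * k - 1} mults of size n/{k} -> O(n^{exponent:.3f})")
```

The output for k = 2 and k = 3 matches Karatsuba (1.585) and Toom-3 (1.465).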
"""Conceptual comparison of multiplication algorithms. This demonstrates the D&C structure, not production code.""" def grade_school_recurrence(n: int) -> int: """ T(n) = 4T(n/2) + O(n) → O(n²) 4 multiplications of half-size numbers """ if n <= 1: return 1 return 4 * grade_school_recurrence(n // 2) + n def karatsuba_recurrence(n: int) -> int: """ T(n) = 3T(n/2) + O(n) → O(n^1.585) 3 multiplications of half-size numbers """ if n <= 1: return 1 return 3 * karatsuba_recurrence(n // 2) + n def toom3_recurrence(n: int) -> int: """ T(n) = 5T(n/3) + O(n) → O(n^1.465) 5 multiplications of third-size numbers """ if n <= 1: return 1 return 5 * toom3_recurrence(n // 3) + n # Compare growth ratesprint("n | Grade School | Karatsuba | Toom-3")print("-" * 55)for n in [10, 100, 1000, 10000]: gs = grade_school_recurrence(n) kar = karatsuba_recurrence(n) toom = toom3_recurrence(n) print(f"{n:7} | {gs:12} | {kar:9} | {toom:6}")In 2019, Harvey and van der Hoeven proved that integer multiplication can be done in O(n log n) time—essentially matching the complexity of addition up to a logarithmic factor! This is believed to be optimal. However, the algorithm is highly complex and only practical for astronomically large numbers.
Karatsuba multiplication demonstrates a profound D&C principle: sometimes, algebraic restructuring can reduce the number of expensive operations, trading them for cheaper ones. Let's consolidate the key insights:
- Naive D&C alone doesn't help: four half-size multiplications still solve to O(n²).
- One algebraic identity, XₕYₗ + XₗYₕ = P₂ − P₁ − P₃, removes a multiplication at every level of recursion.
- Fewer recursive calls change the exponent: T(n) = 3T(n/2) + O(n) solves to Θ(n^(log₂3)) ≈ Θ(n^1.585).
- Constants still matter: practical implementations fall back to grade school below a tuned threshold.
- The idea generalizes: Toom-Cook splits into k parts, and FFT-based methods push toward O(n log n).
You now understand Karatsuba multiplication—a landmark algorithm that disproved a centuries-old assumption. The technique of reducing recursive calls through clever algebraic identities appears throughout algorithm design. Next, we'll explore pattern recognition for D&C problems: how to identify when this paradigm applies and how to structure your solutions.