Every transformation from one string to another can be decomposed into a precise sequence of atomic operations. These operations—insertion, deletion, and substitution (replacement)—form the complete basis for string transformation. Understanding them deeply is essential not just for implementing edit distance correctly, but for reasoning about string algorithms, designing efficient solutions, and understanding the elegant mathematical structure underlying string comparison.
In this page, we examine each operation in meticulous detail, explore how they interact in the DP formulation, implement the complete algorithm, and investigate important variants where different operations carry different costs.
By the end of this page, you will deeply understand how each operation contributes to the edit distance computation, be able to trace through the algorithm step-by-step, implement both memoized and tabulation solutions, and understand weighted edit distance variants used in real-world applications.
Insertion adds a character to the source string to make it more like the target. It's the operation we use when the target has characters that the source lacks.
Formal Definition:
An insertion at position k in string s = s₀s₁...sₘ₋₁ with character c produces:
s' = s₀s₁...sₖ₋₁ · c · sₖ...sₘ₋₁
The new string s' has length m + 1.
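This definition maps directly onto string slicing. A minimal sketch in Python (the helper name `insert_char` is ours, for illustration only):

```python
def insert_char(s: str, k: int, c: str) -> str:
    """Insert character c at position k: s' = s[:k] + c + s[k:]."""
    return s[:k] + c + s[k:]

assert insert_char("car", 3, "t") == "cart"   # append at the end
assert insert_char("cat", 1, "h") == "chat"   # insert in the middle
assert len(insert_char("car", 3, "t")) == len("car") + 1  # length grows by 1
```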
The DP Perspective on Insertion:
In the recurrence relation, insertion corresponds to:
$$dp[i][j] = dp[i][j-1] + 1$$
Interpretation: We've matched the first i characters of s with the first j-1 characters of t. Now we insert t[j-1] to match the j-th character of t. The source pointer (i) stays the same because we didn't consume any source character—we added one.
Visual Intuition:
```
Source: c a r _    (pointer stays at position 3, after 'r')
Target: c a r t    (pointer moves from position 3 to 4)
              ↑
        insert 't' here
```
After insertion, both strings have matched up to position j.
Insertions are the primary operation when transforming a shorter string into a longer one. The base case dp[0][j] = j represents a sequence of j insertions, transforming the empty string into the first j characters of t.
Deletion removes a character from the source string. It's the operation we use when the source has characters that don't belong in the target.
Formal Definition:
A deletion at position k in string s = s₀s₁...sₘ₋₁ produces:
s' = s₀s₁...sₖ₋₁ · sₖ₊₁...sₘ₋₁
Character sₖ is removed. The new string s' has length m - 1.
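As with insertion, the definition is one slicing expression in Python (the helper name `delete_char` is ours):

```python
def delete_char(s: str, k: int) -> str:
    """Remove the character at position k: s' = s[:k] + s[k+1:]."""
    return s[:k] + s[k + 1:]

assert delete_char("cart", 3) == "car"   # drop the final 't'
assert delete_char("chat", 1) == "cat"   # drop the 'h' in the middle
```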
The DP Perspective on Deletion:
In the recurrence relation, deletion corresponds to:
$$dp[i][j] = dp[i-1][j] + 1$$
Interpretation: We delete s[i-1] and then solve the subproblem of matching the first i-1 characters of s with the first j characters of t. The target pointer (j) stays the same because we didn't progress in matching t—we just removed a source character.
Visual Intuition:
```
Source: c a r t    (pointer moves from position 4 to 3)
Target: c a r _    (pointer stays at position 3)
              ↑
        delete 't' here
```
After deletion, we continue trying to match the remaining source with the target.
Insertion into s is equivalent to deletion from t, and vice versa. This is why edit distance is symmetric: d(s, t) = d(t, s). Every insertion in one direction becomes a deletion in the reverse direction. This symmetry is a powerful insight for reasoning about string transformations.
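The symmetry is easy to check empirically. Here is a condensed Levenshtein implementation (a compact version of the full code later on this page), used to confirm d(s, t) = d(t, s) on a couple of pairs:

```python
def levenshtein(s: str, t: str) -> int:
    m, n = len(s), len(t)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            dp[i][j] = min(dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j] + 1,         # deletion
                           dp[i - 1][j - 1] + cost)  # substitution / match
    return dp[m][n]

# Every insertion in one direction is a deletion in the other,
# so swapping the arguments never changes the distance.
assert levenshtein("car", "cart") == levenshtein("cart", "car") == 1
assert levenshtein("kitten", "sitting") == levenshtein("sitting", "kitten") == 3
```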
Substitution (also called replacement) changes one character into another. It's used when the source and target have different characters at corresponding positions.
Formal Definition:
A substitution at position k in string s = s₀s₁...sₘ₋₁, replacing sₖ with character c, produces:
s' = s₀s₁...sₖ₋₁ · c · sₖ₊₁...sₘ₋₁
The string length remains m, but character sₖ is now c.
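Again the definition is a single slicing expression (the helper name `substitute_char` is ours):

```python
def substitute_char(s: str, k: int, c: str) -> str:
    """Replace the character at position k with c: s' = s[:k] + c + s[k+1:]."""
    return s[:k] + c + s[k + 1:]

assert substitute_char("cat", 1, "u") == "cut"
assert len(substitute_char("cat", 1, "u")) == len("cat")  # length unchanged
```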
The DP Perspective on Substitution:
In the recurrence relation, substitution corresponds to:
$$dp[i][j] = dp[i-1][j-1] + 1$$
Interpretation: We replace s[i-1] with t[j-1], making them match. Then we solve the subproblem of matching the first i-1 characters of s with the first j-1 characters of t. Both pointers advance because we've handled one position in both strings.
Visual Intuition:
```
Source: c a t    (pointer moves from position 2 to 1)
Target: c u t    (pointer moves from position 2 to 1)
          ↑
   substitute 'a' → 'u'
```
The Match Case (Free Substitution):
When s[i-1] == t[j-1], the substitution costs 0 instead of 1. We're effectively "substituting" a character with itself—which is a no-op. This is why the recurrence has the special case:
```python
if s[i - 1] == t[j - 1]:
    dp[i][j] = dp[i - 1][j - 1]  # No cost, characters already match
```
Note that a substitution can be simulated by a deletion followed by an insertion (or vice versa). Replacing 'a' with 'u' is equivalent to: delete 'a', then insert 'u'. This costs 2 with standard weights. The fact that substitution costs only 1 in Levenshtein distance reflects a modeling choice: we consider a single-character change to be a single atomic operation, not two operations.
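This equivalence can be checked numerically: if substitution is charged 2, the algorithm never prefers it over a delete-insert pair, and the distance becomes the insert/delete-only distance. A small sketch (the function and its cost parameters are ours, anticipating the weighted variant discussed later on this page):

```python
def edit_distance_costs(s: str, t: str, ins: int = 1, dele: int = 1, sub: int = 1) -> int:
    m, n = len(s), len(t)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i * dele
    for j in range(n + 1):
        dp[0][j] = j * ins
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = min(dp[i][j - 1] + ins,
                               dp[i - 1][j] + dele,
                               dp[i - 1][j - 1] + sub)
    return dp[m][n]

# Standard Levenshtein: substitution is one atomic operation.
assert edit_distance_costs("kitten", "sitting") == 3
# Charging 2 per substitution makes it exactly as expensive as
# delete + insert, so the distance rises to the delete/insert-only value.
assert edit_distance_costs("kitten", "sitting", sub=2) == 5
```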
Now that we understand each operation individually, let's see how they combine into the complete recurrence relation that drives the edit distance algorithm.
```
EDIT-DISTANCE(s, t, i, j):
    # Base cases
    if i == 0: return j   # transform empty prefix into t[0..j-1]: j insertions
    if j == 0: return i   # transform s[0..i-1] into empty: i deletions

    # If characters match, no operation needed at this position
    if s[i-1] == t[j-1]:
        return EDIT-DISTANCE(s, t, i-1, j-1)   # move diagonally, cost 0

    # Characters don't match: consider all three operations
    insertion    = EDIT-DISTANCE(s, t, i, j-1) + 1     # insert t[j-1]
    deletion     = EDIT-DISTANCE(s, t, i-1, j) + 1     # delete s[i-1]
    substitution = EDIT-DISTANCE(s, t, i-1, j-1) + 1   # replace s[i-1] with t[j-1]
    return min(insertion, deletion, substitution)
```
Dependency Diagram:
Each cell dp[i][j] depends on exactly three neighboring cells:
```
         j-1     j
      ┌─────┬─────┐
 i-1  │  ↖  │  ↑  │
      │ sub │ del │
      ├─────┼─────┤
 i    │  ←  │  ?  │   ? = dp[i][j]
      │ ins │     │
      └─────┴─────┘
```
This dependency structure means we can fill the table in any order that ensures these three cells are computed before dp[i][j]. Row-by-row (left to right) and column-by-column (top to bottom) both work.
At each cell, we take the minimum of the three options. This local minimization is safe because of optimal substructure: each of the three neighboring cells already holds the optimal cost of its subproblem, so choosing the cheapest way to extend them yields the globally optimal solution. The recurrence thereby implicitly explores every possible transformation strategy without enumerating them.
Let's implement edit distance in both top-down (memoized) and bottom-up (tabulation) styles. Both have the same O(mn) time complexity, but with different tradeoffs.
```python
def edit_distance(s: str, t: str) -> int:
    """
    Compute edit distance using bottom-up tabulation.
    Time: O(m * n), Space: O(m * n)
    """
    m, n = len(s), len(t)

    # Create DP table with (m+1) x (n+1) dimensions
    # dp[i][j] = edit distance between s[0:i] and t[0:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]

    # Base cases: transforming to/from empty string
    for i in range(m + 1):
        dp[i][0] = i  # Delete all characters from s[0:i]
    for j in range(n + 1):
        dp[0][j] = j  # Insert all characters of t[0:j]

    # Fill the table
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                # Characters match - no operation needed
                dp[i][j] = dp[i - 1][j - 1]
            else:
                # Take minimum of three operations
                dp[i][j] = 1 + min(
                    dp[i][j - 1],      # Insert t[j-1]
                    dp[i - 1][j],      # Delete s[i-1]
                    dp[i - 1][j - 1]   # Replace s[i-1] with t[j-1]
                )

    return dp[m][n]


# Example usage
print(edit_distance("kitten", "sitting"))    # Output: 3
print(edit_distance("saturday", "sunday"))   # Output: 3
```
Tabulation is generally preferred for edit distance because: (1) it avoids recursion stack overhead, (2) it has more predictable memory access patterns (cache-friendly), and (3) it's easier to optimize for space (as we'll see later). Memoization can be useful when you only need to compute a subset of the table or when the natural problem formulation is recursive.
Let's trace through the algorithm for computing edit distance between "INTENTION" and "EXECUTION". This classic example appears in many algorithms textbooks.
Strings: source s = "INTENTION" (length 9), target t = "EXECUTION" (length 9).
Expected Result: 5 operations
|   | ε | E | X | E | C | U | T | I | O | N |
|---|---|---|---|---|---|---|---|---|---|---|
| ε | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| I | 1 | 1 | 2 | 3 | 4 | 5 | 6 | 6 | 7 | 8 |
| N | 2 | 2 | 2 | 3 | 4 | 5 | 6 | 7 | 7 | 7 |
| T | 3 | 3 | 3 | 3 | 4 | 5 | 5 | 6 | 7 | 8 |
| E | 4 | 3 | 4 | 3 | 4 | 5 | 6 | 6 | 7 | 8 |
| N | 5 | 4 | 4 | 4 | 4 | 5 | 6 | 7 | 7 | 7 |
| T | 6 | 5 | 5 | 5 | 5 | 5 | 5 | 6 | 7 | 8 |
| I | 7 | 6 | 6 | 6 | 6 | 6 | 6 | 5 | 6 | 7 |
| O | 8 | 7 | 7 | 7 | 7 | 7 | 7 | 6 | 5 | 6 |
| N | 9 | 8 | 8 | 8 | 8 | 8 | 8 | 7 | 6 | 5 |
Key Observations:
Base cases (row 0 and column 0): Values increase from 0 to 9, representing pure runs of insertions (row 0, building t from the empty string) or deletions (column 0, reducing s to the empty string).
Diagonal matches: When characters match, the cell equals its diagonal predecessor. For example, dp[4][1] = dp[3][0] = 3 because s[3] = 'E' matches t[0] = 'E', and dp[9][9] = dp[8][8] = 5 because both strings end in 'N'.
The final answer: dp[9][9] = 5
The 5 operations (one possible sequence):
1. Delete 'I': INTENTION → NTENTION
2. Substitute 'N' → 'E': NTENTION → ETENTION
3. Substitute 'T' → 'X': ETENTION → EXENTION
4. Insert 'C': EXENTION → EXECNTION
5. Substitute 'N' → 'U': EXECNTION → EXECUTION
There are often multiple sequences of operations that achieve the minimum edit distance. The table tells us the cost but not uniquely which operations to use. In the next page, we'll learn how to reconstruct an optimal sequence by backtracking through the table.
In the standard Levenshtein distance, all operations cost 1. But in many real-world applications, different operations should have different costs.
Common Weighted Variants:
- Spell checking: substitutions between keyboard-adjacent keys cost less than distant ones, since they reflect likely typos.
- Biological sequence alignment: substitution costs come from matrices that encode how likely one residue is to mutate into another.
- Document comparison: operations on whole words or tokens can be weighted differently from character-level edits.
```python
def weighted_edit_distance(
    s: str,
    t: str,
    insert_cost: float = 1.0,
    delete_cost: float = 1.0,
    replace_cost: float = 1.0
) -> float:
    """
    Compute weighted edit distance with configurable operation costs.

    Args:
        s: Source string
        t: Target string
        insert_cost: Cost of inserting a character
        delete_cost: Cost of deleting a character
        replace_cost: Cost of replacing a character

    Returns:
        Minimum weighted cost to transform s into t
    """
    m, n = len(s), len(t)
    dp = [[0.0] * (n + 1) for _ in range(m + 1)]

    # Base cases with weighted costs
    for i in range(m + 1):
        dp[i][0] = i * delete_cost
    for j in range(n + 1):
        dp[0][j] = j * insert_cost

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = min(
                    dp[i][j - 1] + insert_cost,
                    dp[i - 1][j] + delete_cost,
                    dp[i - 1][j - 1] + replace_cost
                )

    return dp[m][n]


# Variant: position-dependent or character-dependent costs
def custom_edit_distance(
    s: str,
    t: str,
    insert_cost_fn,   # (char, position) -> cost
    delete_cost_fn,   # (char, position) -> cost
    replace_cost_fn   # (char_s, char_t, position) -> cost
) -> float:
    """
    Fully customizable edit distance with function-based costs.
    """
    m, n = len(s), len(t)
    dp = [[0.0] * (n + 1) for _ in range(m + 1)]

    # Base cases (accumulate position-dependent costs)
    for i in range(1, m + 1):
        dp[i][0] = dp[i - 1][0] + delete_cost_fn(s[i - 1], i - 1)
    for j in range(1, n + 1):
        dp[0][j] = dp[0][j - 1] + insert_cost_fn(t[j - 1], j - 1)

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = min(
                    dp[i][j - 1] + insert_cost_fn(t[j - 1], j - 1),
                    dp[i - 1][j] + delete_cost_fn(s[i - 1], i - 1),
                    dp[i - 1][j - 1] + replace_cost_fn(s[i - 1], t[j - 1], i - 1)
                )

    return dp[m][n]
```
The right weighting scheme depends on your domain. For spell checking, keyboard-based weights work well. For biological sequence alignment, bioinformatics uses substitution matrices such as BLOSUM (for protein sequences) that capture how likely one residue is to be exchanged for another.
For plagiarism detection, word-level operations might cost less than character-level. Always consider what 'similarity' means in your specific context.
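To illustrate the word-level idea: the same recurrence runs unchanged over sequences of tokens instead of characters. A sketch (the function name and example sentences are ours):

```python
def token_edit_distance(a: list, b: list) -> int:
    """Edit distance over tokens (e.g. words) instead of characters."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i][j - 1] + 1,         # insert a token
                           dp[i - 1][j] + 1,         # delete a token
                           dp[i - 1][j - 1] + cost)  # substitute / match
    return dp[m][n]


s1 = "the quick brown fox".split()
s2 = "the slow brown dog".split()
print(token_edit_distance(s1, s2))   # Output: 2 (two word substitutions)
```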
We've now fully explored the three fundamental operations that define edit distance. Let's consolidate our understanding:
- Insertion advances only the target pointer: dp[i][j] = dp[i][j-1] + 1.
- Deletion advances only the source pointer: dp[i][j] = dp[i-1][j] + 1.
- Substitution advances both pointers: dp[i][j] = dp[i-1][j-1] + 1, or dp[i-1][j-1] with no cost when the characters already match.
What's Next:
We can now compute the edit distance—the minimum cost. But often we need more: the actual sequence of operations that achieves this minimum. In the next page, we'll learn how to construct the DP table and then backtrack through it to reconstruct the optimal edit sequence.
You now deeply understand how insertion, deletion, and substitution work individually and together in the edit distance algorithm. You can implement the complete solution and understand weighted variants. Next, we'll learn table construction patterns and how to reconstruct the actual edit sequence from the DP table.