How do you quantify how different two strings are? This deceptively simple question lies at the heart of some of the most important applications in computer science—from the spell checker correcting your typos to the genetic sequencing algorithms that revolutionized modern medicine.
When you type "teh" and your word processor suggests "the," when a search engine understands that "restaraunt" means "restaurant," when DNA sequences are compared to identify mutations or evolutionary relationships—behind all of these is a precise mathematical framework for measuring string similarity. That framework is edit distance.
By the end of this page, you will understand what edit distance means mathematically, why it's defined the way it is, how it captures our intuitive notion of string similarity, and why it's one of the most important problems in string dynamic programming. You'll see the formal definition, develop deep intuition for what the metric represents, and understand why this problem is ideally suited for dynamic programming.
Edit distance (also known as Levenshtein distance after Soviet mathematician Vladimir Levenshtein, who defined it in 1965) is the minimum number of single-character edits required to transform one string into another.
The elegance of this definition lies in its simplicity: we're counting the fewest changes needed to make two strings identical. But what exactly counts as an "edit"? Three operations are allowed:

- Insertion: add a single character anywhere in the string.
- Deletion: remove a single character from the string.
- Substitution: replace a single character with a different one.

Each of these operations has a cost of 1 in the standard Levenshtein distance. The edit distance between two strings is the minimum total cost to transform one into the other.
Formal Definition:
Given two strings s of length m and t of length n, the edit distance d(s, t) is defined as:
$$d(s, t) = \min\,\{\text{cost of a sequence of edits transforming } s \text{ into } t\}$$
This minimum is taken over all possible sequences of insertions, deletions, and substitutions.
These three operations capture all possible single-character modifications. Any transformation between strings can be decomposed into a sequence of these atomic operations. Insertion adds information, deletion removes information, and substitution changes information. Together, they form a complete basis for string transformation.
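To make the three operations concrete, here is a minimal Python sketch showing one instance of each applied to "cat" (the specific characters are chosen purely for illustration):

```python
s = "cat"

# Insertion: add one character ('h' after the 'c')
inserted = s[:1] + "h" + s[1:]   # "chat"

# Deletion: remove one character (the 'a')
deleted = s[:1] + s[2:]          # "ct"

# Substitution: replace one character ('c' with 'b')
substituted = "b" + s[1:]        # "bat"

print(inserted, deleted, substituted)  # chat ct bat
```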
Let's build intuition by working through several examples, progressively increasing in complexity.
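For instance (these particular pairs are illustrative choices; each distance is easy to verify by hand):

- "cat" → "cut": substitute 'a' → 'u'. Distance 1.
- "cat" → "cats": insert 's'. Distance 1.
- "flaw" → "lawn": delete 'f', insert 'n'. Distance 2.
- "kitten" → "sitting": substitute 'k' → 's', substitute 'e' → 'i', insert 'g'. Distance 3.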
Why is the minimum important?
For any pair of strings, there are infinitely many edit sequences that transform one into the other, because you can always make unnecessary edits. For example, to transform "cat" into "bat," you could:

- delete 'c' and insert 'b' (cost 2), or
- delete all three characters and insert 'b', 'a', 't' from scratch (cost 6).

But this is clearly wasteful: a single substitution 'c' → 'b' (cost 1) does the job. The edit distance captures the minimum cost, reflecting the "true" difference between strings.
Two strings may have multiple different edit sequences that achieve the minimum cost, and when they do, all of them count as optimal solutions. Note that not every sequence is optimal, though: transforming "abc" into "adc" can be done by substituting 'b' → 'd' (cost 1) or by deleting 'b' and inserting 'd' (cost 2), and only the substitution achieves the minimum.
Edit distance isn't just a useful similarity measure: it's a proper mathematical distance function (a metric). For any strings s, t, and u:

- Non-negativity: d(s, t) ≥ 0.
- Identity: d(s, t) = 0 if and only if s = t.
- Symmetry: d(s, t) = d(t, s).
- Triangle inequality: d(s, u) ≤ d(s, t) + d(t, u).

These properties make it suitable for many applications.
Why Do These Properties Matter?
Because edit distance is a proper metric, we can:

- compare distances between different string pairs meaningfully,
- plug it into metric-space data structures (such as BK-trees) for fast similarity search,
- use the triangle inequality to prune candidates when searching large collections, and
- cluster strings by similarity.
The symmetry property is particularly important for applications: it doesn't matter which string we consider the "source" and which the "target."
For strings of lengths m and n, the edit distance is at most max(m, n). A trivial transformation deletes all m characters of s and then inserts all n characters of t, for a total cost of m + n. But we can always do better: substitute the first min(m, n) characters of s with the corresponding characters of t, then insert or delete the remaining |m − n| characters, for a total cost of max(m, n). For example, transforming "abc" into "xy" takes two substitutions and one deletion: cost 3 = max(3, 2). This maximum is achieved when the strings share no common structure.
Given how simple the definition is—just count the minimum edits—you might expect a straightforward algorithm. But computing edit distance naively is computationally expensive.
The Brute Force Approach:
To find the minimum, we could enumerate all possible edit sequences. But how many are there?
Consider transforming a string of length m into a string of length n. At each step, we have multiple choices: match the current characters (if they agree), insert a character, delete a character, or substitute one character for another.
The number of possible sequences grows exponentially. For the classic "kitten" to "sitting" example, there are thousands of possible edit sequences to explore.
```python
def naive_edit_distance(s, t):
    """
    Naive recursive edit distance - exponential time complexity!
    This demonstrates the approach but is impractical for real use.
    """
    # Base cases
    if len(s) == 0:
        return len(t)  # Insert all characters of t
    if len(t) == 0:
        return len(s)  # Delete all characters of s

    # If last characters match, no edit needed for them
    if s[-1] == t[-1]:
        return naive_edit_distance(s[:-1], t[:-1])

    # Otherwise, try all three operations and take minimum
    insert = naive_edit_distance(s, t[:-1])        # Insert t[-1]
    delete = naive_edit_distance(s[:-1], t)        # Delete s[-1]
    replace = naive_edit_distance(s[:-1], t[:-1])  # Replace s[-1] with t[-1]

    return 1 + min(insert, delete, replace)

# This works but is EXTREMELY slow!
# Time complexity: O(3^(m+n)) - exponential!
```

Exponential Explosion:
For strings of length 10, the naive algorithm might make millions of recursive calls. For length 100, it's computationally intractable: the heat death of the universe would occur before the algorithm completes.
But there's a saving grace: look at the recursive calls. We're computing naive_edit_distance(s[:-1], t[:-1]), naive_edit_distance(s, t[:-1]), etc. These are the same subproblems over and over!
If we have strings of length m and n, there are only (m + 1) × (n + 1) unique subproblems, one for each pair of prefixes (including the empty prefix). But the naive recursion explores exponentially many paths through these subproblems, recomputing the same answers repeatedly.
Edit distance exhibits both hallmarks of dynamic programming: overlapping subproblems (the same prefix pairs are computed repeatedly) and optimal substructure (the optimal solution to the whole problem is built from optimal solutions to subproblems). This makes it a perfect candidate for DP optimization, reducing time complexity from exponential to polynomial—specifically O(m × n).
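To see this concretely, here is a sketch of a top-down (memoized) version, assuming Python's `functools.lru_cache` as the cache. It is the same recursion as the naive version above, but each prefix pair is computed only once:

```python
from functools import lru_cache

def edit_distance_memo(s, t):
    """Top-down DP: same recursion as the naive version,
    but each (i, j) prefix pair is computed only once."""

    @lru_cache(maxsize=None)
    def d(i, j):
        # d(i, j) = edit distance between s[:i] and t[:j]
        if i == 0:
            return j  # insert the first j characters of t
        if j == 0:
            return i  # delete the first i characters of s
        if s[i - 1] == t[j - 1]:
            return d(i - 1, j - 1)  # match: no edit needed
        return 1 + min(
            d(i, j - 1),      # insert t[j-1]
            d(i - 1, j),      # delete s[i-1]
            d(i - 1, j - 1),  # substitute
        )

    return d(len(s), len(t))

print(edit_distance_memo("kitten", "sitting"))  # 3
```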
To solve edit distance efficiently with DP, we need to understand its recursive structure precisely. Let's define the problem in terms of subproblems.
State Definition:
Let dp[i][j] = the edit distance between the first i characters of string s (i.e., s[0..i-1]) and the first j characters of string t (i.e., t[0..j-1]).
Our goal is to compute dp[m][n] where m = len(s) and n = len(t).
Base Cases:
- dp[0][j] = j: Transforming an empty string into the first j characters of t requires j insertions.
- dp[i][0] = i: Transforming the first i characters of s into an empty string requires i deletions.

Recurrence Relation:
For dp[i][j] where i > 0 and j > 0, we consider two cases:
Case 1: The characters match (s[i-1] == t[j-1])
If the current characters match, no edit is needed for this position. The edit distance equals the cost of transforming the remaining prefixes:
$$dp[i][j] = dp[i-1][j-1]$$
Case 2: The characters don't match (s[i-1] ≠ t[j-1])
We must make an edit. We consider all three operations and take the minimum:
$$dp[i][j] = 1 + \min(dp[i][j-1], dp[i-1][j], dp[i-1][j-1])$$
Where:
- dp[i][j-1] + 1: Insert t[j-1] at position i in s, then solve for s[0..i-1] vs t[0..j-2].
- dp[i-1][j] + 1: Delete s[i-1], then solve for s[0..i-2] vs t[0..j-1].
- dp[i-1][j-1] + 1: Substitute s[i-1] with t[j-1], then solve for s[0..i-2] vs t[0..j-2].

Think of it this way: we're aligning the two strings from right to left. At each position, if the characters match, we just move left in both strings. If they don't match, we try all three ways to fix the mismatch (insert, delete, substitute) and take the cheapest. The DP table records the optimal cost for each prefix pair so we never recompute.
The Complete Recurrence:
```
dp[i][j] =
    if i == 0: j
    else if j == 0: i
    else if s[i-1] == t[j-1]: dp[i-1][j-1]
    else: 1 + min(
        dp[i][j-1],      // insert
        dp[i-1][j],      // delete
        dp[i-1][j-1]     // substitute
    )
```
Dependency Structure:
Each cell dp[i][j] depends on:
- dp[i-1][j-1] (diagonal)
- dp[i-1][j] (above)
- dp[i][j-1] (left)

This means we can fill the table row by row, left to right, or column by column, top to bottom. All dependencies will be satisfied.
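Putting the base cases, the recurrence, and the row-by-row fill order together, one possible bottom-up implementation is sketched below (Python, following the state definition above):

```python
def edit_distance(s, t):
    m, n = len(s), len(t)
    # dp[i][j] = edit distance between s[:i] and t[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]

    # Base cases: transforming to/from the empty string
    for i in range(m + 1):
        dp[i][0] = i  # i deletions
    for j in range(n + 1):
        dp[0][j] = j  # j insertions

    # Fill row by row, left to right: every dependency
    # (diagonal, above, left) is already computed.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]  # match: free
            else:
                dp[i][j] = 1 + min(
                    dp[i][j - 1],      # insert
                    dp[i - 1][j],      # delete
                    dp[i - 1][j - 1],  # substitute
                )
    return dp[m][n]

print(edit_distance("cat", "cut"))  # 1
```

Each cell is computed once from three already-filled neighbors, giving the promised O(m × n) time (and O(m × n) space for the table).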
Let's trace through a complete example to see how the DP table is constructed. We'll compute the edit distance between "cat" and "cut".
|   | ε | c | u | t |
|---|---|---|---|---|
| ε | 0 | 1 | 2 | 3 |
| c | 1 | 0 | 1 | 2 |
| a | 2 | 1 | 1 | 2 |
| t | 3 | 2 | 2 | 1 |
Reading the table: rows index prefixes of "cat" and columns index prefixes of "cut" (ε is the empty prefix). Cell dp[i][j] holds the edit distance between the first i characters of "cat" and the first j characters of "cut"; the answer is the bottom-right cell, dp[3][3] = 1.
Trace of key cells:
- dp[1][1]: 'c' == 'c', so dp[1][1] = dp[0][0] = 0 (characters match).
- dp[2][2]: 'a' ≠ 'u', so dp[2][2] = 1 + min(dp[1][2], dp[2][1], dp[1][1]) = 1 + min(1, 1, 0) = 1.
- dp[3][3]: 't' == 't', so dp[3][3] = dp[2][2] = 1 (characters match).

The edit distance is 1. This corresponds to substituting 'a' → 'u' to transform "cat" → "cut".
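As a sanity check on the hand trace, here is a small variant of the bottom-up sketch that returns the whole table so it can be printed (same fill logic as `edit_distance` above):

```python
def edit_distance_table(s, t):
    """Same fill as edit_distance above, but returns the whole table."""
    m, n = len(s), len(t)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = 1 + min(dp[i][j - 1], dp[i - 1][j], dp[i - 1][j - 1])
    return dp

for row in edit_distance_table("cat", "cut"):
    print(row)
# [0, 1, 2, 3]
# [1, 0, 1, 2]
# [2, 1, 1, 2]
# [3, 2, 2, 1]
```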
Every path from the top-left corner to the bottom-right corner represents a sequence of edit operations. Diagonal moves with no cost increase mean matching characters. Diagonal moves with +1 mean substitution. Horizontal moves mean insertion. Vertical moves mean deletion. The optimal path follows the cells that led to the minimum values.
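To recover one optimal edit sequence rather than just its cost, you can walk the filled table back from dp[m][n]. A minimal sketch, assuming the `edit_distance_table` helper above:

```python
def trace_edits(s, t, dp):
    """Recover one optimal edit sequence by walking back from dp[m][n].
    Expects the filled table from edit_distance_table above."""
    ops = []
    i, j = len(s), len(t)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and s[i - 1] == t[j - 1] and dp[i][j] == dp[i - 1][j - 1]:
            i, j = i - 1, j - 1  # free diagonal move: characters match
        elif i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + 1:
            ops.append(f"substitute {s[i - 1]!r} -> {t[j - 1]!r}")
            i, j = i - 1, j - 1  # diagonal move with +1: substitution
        elif j > 0 and dp[i][j] == dp[i][j - 1] + 1:
            ops.append(f"insert {t[j - 1]!r}")
            j -= 1               # horizontal move: insertion
        else:
            ops.append(f"delete {s[i - 1]!r}")
            i -= 1               # vertical move: deletion
    return ops[::-1]             # edits were collected right to left

dp = edit_distance_table("cat", "cut")
print(trace_edits("cat", "cut", dp))  # ["substitute 'a' -> 'u'"]
```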
Edit distance is not just an academic exercise—it's one of the most widely applied algorithms in computer science. Understanding its importance helps motivate the careful study we'll undertake in the following pages.
Historical Significance:
Vladimir Levenshtein introduced this distance in 1965 while studying error-correcting codes. The dynamic programming algorithm that computes it efficiently was developed soon after (it was discovered independently several times, most famously by Wagner and Fischer in 1974), and it became a cornerstone of string processing algorithms.
The DP solution for edit distance is considered one of the classic examples of dynamic programming—simple enough to teach the paradigm, yet representative of real algorithmic challenges. Nearly every algorithms course covers it, and it appears frequently in technical interviews.
Understanding edit distance provides the mental framework for tackling many other string DP problems. The pattern of building a 2D table over two string indices, with transitions based on character comparisons, recurs in problems like longest common subsequence, longest palindromic subsequence, regular expression matching, and more. Master edit distance, and you've mastered a fundamental pattern.
We've established the conceptual foundation for edit distance. Let's consolidate the key insights:
- State: dp[i][j] = the edit distance between the first i characters of s and the first j characters of t.
- Goal: compute dp[m][n], where m and n are the string lengths.

What's Next:
We've understood what edit distance is. In the next page, we'll dive deep into the three fundamental operations—insertion, deletion, and substitution—understanding how each contributes to the solution, exploring weighted variants, and examining how the choice of operation costs affects the optimal solution.
You now understand the edit distance problem at a conceptual level: its definition, properties, computational challenges, and significance. Next, we'll explore the three operations in detail and see how to implement the complete DP solution.