Imagine you're building a file comparison tool—like Git's diff command—that needs to show what changed between two versions of a document. Or perhaps you're developing a bioinformatics application that compares DNA sequences to find evolutionary relationships. Or maybe you're creating a plagiarism detection system that identifies shared content between documents despite reordering.
What do all these problems have in common? They all require finding the longest sequence of elements that appears in both inputs, maintaining their relative order but not necessarily their positions. This is precisely the Longest Common Subsequence (LCS) problem—one of the most fundamental and elegantly solvable problems in computer science.
By the end of this page, you will understand the precise formal definition of the LCS problem, why naive approaches fail dramatically, and how dynamic programming transforms an exponential problem into a polynomial one. You'll see the complete derivation of the DP recurrence and understand every detail of the classical implementation.
Before diving into algorithms, we must establish precise definitions. Ambiguity in problem understanding leads to incorrect solutions.
Definition: Subsequence
A subsequence of a sequence X = ⟨x₁, x₂, ..., xₘ⟩ is a sequence Z = ⟨z₁, z₂, ..., zₖ⟩ such that there exists a strictly increasing sequence of indices ⟨i₁, i₂, ..., iₖ⟩ with 1 ≤ i₁ < i₂ < ... < iₖ ≤ m and zⱼ = xᵢⱼ for all j = 1, 2, ..., k.
In simpler terms: A subsequence is obtained by deleting zero or more elements from the original sequence without changing the order of the remaining elements.
Critical Distinction: Subsequence vs. Substring
A substring must be contiguous; a subsequence need not be. For example, "ACE" is a subsequence of "ABCDE" but not a substring, while "BCD" is both. Every substring is a subsequence, but not vice versa.
Definition: Common Subsequence
Given two sequences X and Y, a sequence Z is a common subsequence of X and Y if Z is a subsequence of both X and Y simultaneously.
Definition: Longest Common Subsequence (LCS)
Given two sequences X = ⟨x₁, x₂, ..., xₘ⟩ and Y = ⟨y₁, y₂, ..., yₙ⟩, find a maximum-length common subsequence of X and Y.
Important Note: The LCS may not be unique. Multiple subsequences of the same maximum length may exist. For X = "ABCBDAB" and Y = "BDCABA", both "BCBA" and "BDAB" are common subsequences of maximum length 4.
The LCS problem has two forms: (1) Optimization: Find the length of the LCS, and (2) Construction: Actually produce one or all LCS sequences. The DP table we'll build solves the optimization problem; reconstruction (covered in a later section) addresses the construction problem.
To appreciate dynamic programming, we must first understand why simpler approaches are computationally infeasible.
Naive Approach 1: Enumerate All Subsequences
The most straightforward approach: generate every subsequence of X and, for each one, check whether it is also a subsequence of Y, keeping the longest that passes.
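As a concrete (and deliberately impractical) sketch, the enumeration can be written with `itertools.combinations`; the function name `lcs_brute_force` is mine, not a standard API. Trying candidate lengths from longest to shortest lets us stop at the first hit:

```python
from itertools import combinations


def lcs_brute_force(X: str, Y: str) -> int:
    """Enumerate subsequences of X, longest first; return the first length
    at which some subsequence of X is also a subsequence of Y.
    Worst-case time O(2^m x n): educational use only."""

    def is_subseq(z: str, y: str) -> bool:
        it = iter(y)
        return all(c in it for c in z)  # consumes y left to right

    for k in range(len(X), 0, -1):
        for indices in combinations(range(len(X)), k):
            candidate = "".join(X[i] for i in indices)
            if is_subseq(candidate, Y):
                return k  # longest possible, since k only decreases
    return 0


print(lcs_brute_force("ABCBDAB", "BDCABA"))  # 4
```

Even at these tiny sizes the outer loop already visits up to 2⁷ = 128 candidates; the loop body is what the complexity analysis below charges for.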
Complexity Analysis:
How many subsequences does a sequence of length m have? For each element, we make a binary choice: include it or exclude it. This gives us 2ᵐ possible subsequences (including the empty subsequence).
For each subsequence of length k, checking if it's a subsequence of Y takes O(n) time (linear scan with two pointers).
Total time: O(2ᵐ × n)
This is exponential in m. For two strings of length 1000, we'd need to check approximately 10³⁰¹ subsequences. Even at a trillion operations per second, this would take longer than the age of the universe, by a factor of more than 10²⁷⁰.
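The O(n) membership check mentioned above can be sketched as a two-pointer scan; the helper name `is_subsequence` is my own, not a standard API:

```python
def is_subsequence(Z: str, Y: str) -> bool:
    """Return True if Z is a subsequence of Y, via a two-pointer scan in O(len(Y)) time."""
    j = 0  # pointer into Y
    for c in Z:                          # pointer into Z advances once per character
        while j < len(Y) and Y[j] != c:
            j += 1                       # advance through Y until c is found
        if j == len(Y):
            return False                 # ran out of Y before matching all of Z
        j += 1                           # consume the matched character
    return True


print(is_subsequence("BCBA", "BDCABA"))  # True: B, C, B, A appear in order
print(is_subsequence("CAA", "ABC"))      # False: only one A remains after C
```

The scan never backtracks, so each call costs at most one pass over Y, which is where the O(n) factor in the total O(2ᵐ × n) comes from.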
| String Length (m) | Number of Subsequences (2ᵐ) | Time at 10⁹ ops/sec |
|---|---|---|
| 10 | 1,024 | ~1 microsecond |
| 20 | 1,048,576 | ~1 millisecond |
| 30 | 1,073,741,824 | ~1 second |
| 40 | ~1.1 trillion | ~18 minutes |
| 50 | ~1 quadrillion | ~13 days |
| 100 | ~1.27 × 10³⁰ | ~4 × 10¹³ years |
Naive Approach 2: Recursive Definition
A more structured approach uses recursion based on comparing characters:
LCS(X[1..m], Y[1..n]) =
| 0 if m = 0 or n = 0
| 1 + LCS(X[1..m-1], Y[1..n-1]) if xₘ = yₙ
| max(LCS(X[1..m-1], Y), LCS(X, Y[1..n-1])) if xₘ ≠ yₙ
This recurrence is correct, but direct implementation has a critical flaw: overlapping subproblems lead to exponential redundant computation.
```python
def lcs_naive(X: str, Y: str, m: int, n: int) -> int:
    """
    Naive recursive LCS implementation.

    Time Complexity: O(2^(m+n)) in the worst case
    Space Complexity: O(m + n) for recursion stack

    WARNING: This is for educational purposes only.
    Do NOT use this for strings longer than ~20 characters.
    """
    # Base case: empty string has LCS of 0
    if m == 0 or n == 0:
        return 0

    # If last characters match, they're part of the LCS
    if X[m - 1] == Y[n - 1]:
        return 1 + lcs_naive(X, Y, m - 1, n - 1)

    # If they don't match, try both possibilities
    # and take the maximum
    return max(
        lcs_naive(X, Y, m - 1, n),  # Exclude last char of X
        lcs_naive(X, Y, m, n - 1)   # Exclude last char of Y
    )


# Example usage (keep strings short!)
X = "ABCBDAB"
Y = "BDCABA"
print(f"LCS length: {lcs_naive(X, Y, len(X), len(Y))}")  # Output: 4
```

The recursion tree for this naive approach explodes exponentially. For strings "ABCD" and "EFGH" (worst case: no matches), every call spawns two more, so the tree has depth m + n = 8 and on the order of 2⁸ calls for just 4 characters each. For length-20 strings with no matches, the call count runs into the hundreds of billions.
Visualizing the Problem
Consider computing LCS("ABC", "AC"). The recursion creates:
LCS(3,2)  [C = C: match]
└── LCS(2,1)  [B ≠ A: mismatch]
    ├── LCS(1,1)  [A = A: match]
    │   └── LCS(0,0) → 0
    └── LCS(2,0) → 0
(when the last characters match, only the diagonal branch is taken)
But for non-matching characters, both branches are explored:
LCS("AXBY", "AYBX")  [start: LCS(4,4), Y ≠ X]
├── LCS(3,4) → explores many subproblems
└── LCS(4,3) → SAME subproblems recomputed!
The subproblem LCS(3,3) gets computed by both branches, as do many others. This redundancy is exactly what dynamic programming eliminates.
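The complexity table later on this page lists a memoized variant; a minimal sketch (the name `lcs_memoized` is mine) simply caches the recursive definition, collapsing the exponential tree to O(m × n) distinct calls:

```python
from functools import lru_cache


def lcs_memoized(X: str, Y: str) -> int:
    """Top-down LCS with memoization: O(m x n) time,
    O(m x n) cache plus O(m + n) recursion stack."""

    @lru_cache(maxsize=None)
    def solve(m: int, n: int) -> int:
        if m == 0 or n == 0:
            return 0                          # base case: empty prefix
        if X[m - 1] == Y[n - 1]:
            return 1 + solve(m - 1, n - 1)    # match: extend diagonal subproblem
        return max(solve(m - 1, n), solve(m, n - 1))  # mismatch: drop one tail char

    return solve(len(X), len(Y))


print(lcs_memoized("ABCBDAB", "BDCABA"))  # 4
```

Note that the recursion depth can reach m + n, so for very long inputs Python's recursion limit becomes a concern; the bottom-up formulation developed below avoids this entirely.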
Dynamic programming requires two properties: optimal substructure and overlapping subproblems. The LCS problem exhibits both beautifully.
Theorem: Optimal Substructure of LCS
Let X = ⟨x₁, x₂, ..., xₘ⟩ and Y = ⟨y₁, y₂, ..., yₙ⟩ be sequences, and let Z = ⟨z₁, z₂, ..., zₖ⟩ be any LCS of X and Y.
If xₘ = yₙ, then zₖ = xₘ = yₙ and Z[1..k-1] is an LCS of X[1..m-1] and Y[1..n-1]
If xₘ ≠ yₙ, then: (a) zₖ ≠ xₘ implies Z is an LCS of X[1..m-1] and Y, and (b) zₖ ≠ yₙ implies Z is an LCS of X and Y[1..n-1].
For case 1: if the last characters match, some LCS must end with that common character, because any common subsequence that omits it could be extended by appending it. For case 2: if the last characters differ, at least one of them cannot end the LCS, so that one can be dropped from its string without affecting the optimal solution.
Formal Proof of Case 1
Suppose xₘ = yₙ. We claim that this common character must appear at the end of some LCS.
Proof by contradiction: Assume Z = ⟨z₁, ..., zₖ⟩ is an LCS and zₖ ≠ xₘ (= yₙ). Then we could form Z' = ⟨z₁, ..., zₖ, xₘ⟩. Since xₘ appears at position m in X (after all elements of Z that came from X) and yₙ appears at position n in Y (after all elements of Z that came from Y), Z' is a valid common subsequence. But |Z'| = k + 1 > |Z| = k, contradicting that Z is longest. ∎
Now, if zₖ = xₘ = yₙ, then Z[1..k-1] must be a common subsequence of X[1..m-1] and Y[1..n-1]. If it weren't longest, we could replace it with a longer one and get a longer Z, contradiction.
Formal Proof of Case 2
Suppose xₘ ≠ yₙ. At most one of them can equal zₖ.
If zₖ ≠ xₘ: Z doesn't use xₘ, so Z is a common subsequence of X[1..m-1] and Y. If there were a longer common subsequence of X[1..m-1] and Y, it would also be a common subsequence of X and Y, contradicting that Z is longest.
The argument for zₖ ≠ yₙ is symmetric. ∎
The Recurrence Relation
Based on optimal substructure, we define:
c[i, j] = length of LCS of X[1..i] and Y[1..j]
The recurrence is:
c[i, j] =
| 0 if i = 0 or j = 0
| c[i-1, j-1] + 1 if i, j > 0 and xᵢ = yⱼ
| max(c[i-1, j], c[i, j-1]) if i, j > 0 and xᵢ ≠ yⱼ
This recurrence has only (m + 1) × (n + 1) distinct subproblems, and each is computed in O(1) time from previously solved ones.
This is the key insight: we've reduced exponential work to polynomial work by recognizing and eliminating redundant computation.
The transformation from O(2^(m+n)) to O(m×n) represents an enormous speedup. For two strings of length 1000, we go from ~10^600 operations to just 1,000,000 operations—a reduction by a factor of approximately 10^594. This is why dynamic programming is considered one of the most powerful algorithmic techniques.
Now we translate the recurrence into working code. We'll present both the standard 2D table approach and analyze every aspect of the implementation.
```python
def lcs_length(X: str, Y: str) -> int:
    """
    Compute the length of the Longest Common Subsequence using DP.

    Time Complexity: O(m × n) where m = len(X), n = len(Y)
    Space Complexity: O(m × n) for the DP table

    Args:
        X: First string
        Y: Second string

    Returns:
        Length of the longest common subsequence
    """
    m, n = len(X), len(Y)

    # Create DP table with (m+1) rows and (n+1) columns
    # dp[i][j] = LCS length of X[0:i] and Y[0:j]
    # Extra row/column for base case (empty prefix)
    dp = [[0] * (n + 1) for _ in range(m + 1)]

    # Fill the table bottom-up
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                # Characters match: extend LCS from diagonal
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                # Characters don't match: take max of excluding either
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Answer is in the bottom-right cell
    return dp[m][n]


def lcs_with_table(X: str, Y: str) -> tuple[int, list[list[int]]]:
    """
    Compute LCS length and return the full DP table for visualization.

    Returns:
        Tuple of (LCS length, DP table)
    """
    m, n = len(X), len(Y)
    dp = [[0] * (n + 1) for _ in range(m + 1)]

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    return dp[m][n], dp


# Example with detailed output
X = "ABCBDAB"
Y = "BDCABA"

length, table = lcs_with_table(X, Y)

print(f"X = '{X}'")
print(f"Y = '{Y}'")
print(f"LCS Length = {length}")

# Print the DP table
print("DP Table:")
print("   ", end="")
print(" ε " + " ".join(f" {c} " for c in Y))
print(" ε " + " ".join(f" {table[0][j]:2d}" for j in range(len(Y) + 1)))
for i in range(1, len(X) + 1):
    print(f" {X[i-1]} " + " ".join(f" {table[i][j]:2d}" for j in range(len(Y) + 1)))
```

Understanding the Table Structure
The DP table for X = "ABCBDAB" and Y = "BDCABA" looks like this:
ε B D C A B A
ε 0 0 0 0 0 0 0
A 0 0 0 0 1 1 1
B 0 1 1 1 1 2 2
C 0 1 1 2 2 2 2
B 0 1 1 2 2 3 3
D 0 1 2 2 2 3 3
A 0 1 2 2 3 3 4
B 0 1 2 2 3 4 4
Reading the Table:
- Cell (i, j) holds the LCS length of the first i characters of X and the first j characters of Y.
- Row ε and column ε encode the base case: any LCS involving an empty prefix has length 0.
- The final answer, 4, sits in the bottom-right cell.
Notice how values can only stay the same or increase as we move right or down. This monotonicity comes from the fact that adding characters can only maintain or improve the LCS, never shorten it. Also, values increase by at most 1 per cell—each match contributes exactly one character to the LCS.
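These invariants are easy to check mechanically. The sketch below (the helper name `lcs_table` is mine) rebuilds the table and asserts that every step right or down changes the value by 0 or 1:

```python
def lcs_table(X: str, Y: str) -> list[list[int]]:
    """Build the standard (m+1) x (n+1) LCS DP table."""
    m, n = len(X), len(Y)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp


dp = lcs_table("ABCBDAB", "BDCABA")
for i in range(1, len(dp)):
    for j in range(1, len(dp[0])):
        assert 0 <= dp[i][j] - dp[i - 1][j] <= 1  # moving down: value grows by 0 or 1
        assert 0 <= dp[i][j] - dp[i][j - 1] <= 1  # moving right: value grows by 0 or 1
print("monotonicity holds")
```

This "adjacent cells differ by at most 1" property is exactly what the space-optimized versions below rely on: each row is a small perturbation of the previous one.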
The standard implementation uses O(m × n) space for the DP table. For very long strings (e.g., genome sequences with millions of characters), this may be prohibitive. We can reduce space significantly.
Observation: When computing row i, we only need values from row i-1 and the current row. Rows 0 through i-2 are never accessed again.
Two-Row Optimization: O(min(m, n)) space
```python
def lcs_length_optimized(X: str, Y: str) -> int:
    """
    Space-optimized LCS using only two rows.

    Time Complexity: O(m × n)
    Space Complexity: O(min(m, n))

    Key insight: Each cell only depends on:
    - The cell directly above (previous row, same column)
    - The cell to the left (current row, previous column)
    - The diagonal cell (previous row, previous column)

    Therefore, we only need to keep track of two rows at a time.
    """
    # Ensure Y is the shorter string to minimize space
    if len(X) < len(Y):
        X, Y = Y, X

    m, n = len(X), len(Y)

    # Only two rows needed: previous and current
    prev = [0] * (n + 1)
    curr = [0] * (n + 1)

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(prev[j], curr[j - 1])
        # Swap rows: current becomes previous for next iteration
        prev, curr = curr, prev

    # Answer is in prev because we swapped after the last iteration
    return prev[n]


def lcs_length_single_row(X: str, Y: str) -> int:
    """
    Further optimized LCS using a single row plus one variable.

    Time Complexity: O(m × n)
    Space Complexity: O(min(m, n))

    The trick: We process left-to-right, so curr[j-1] is already updated.
    We just need to save the diagonal value before overwriting it.
    """
    if len(X) < len(Y):
        X, Y = Y, X

    m, n = len(X), len(Y)
    dp = [0] * (n + 1)

    for i in range(1, m + 1):
        prev_diagonal = 0  # Stores dp[i-1][j-1] before overwrite
        for j in range(1, n + 1):
            temp = dp[j]  # Save current value (will become diagonal)
            if X[i - 1] == Y[j - 1]:
                dp[j] = prev_diagonal + 1
            else:
                dp[j] = max(dp[j], dp[j - 1])
            prev_diagonal = temp

    return dp[n]


# Verify both implementations
X = "ABCBDAB"
Y = "BDCABA"

print(f"Two-row optimized: {lcs_length_optimized(X, Y)}")
print(f"Single-row optimized: {lcs_length_single_row(X, Y)}")
```

The space-optimized versions can only compute the LCS length, not the actual LCS sequence.
To reconstruct the LCS, we need the full table (or use Hirschberg's divide-and-conquer algorithm, which achieves O(n) space while still allowing reconstruction—but that's an advanced topic).
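Reconstruction is covered in detail later, but as a preview of why the full table suffices, here is a minimal backtracking sketch (the function name is mine): it walks from the bottom-right corner, following where each cell's value came from.

```python
def lcs_reconstruct(X: str, Y: str) -> str:
    """Build the full DP table, then walk it backward to recover one LCS."""
    m, n = len(X), len(Y)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Backtrack from the bottom-right corner
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1])          # matched character belongs to the LCS
            i, j = i - 1, j - 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1                        # value came from the cell above
        else:
            j -= 1                        # value came from the cell to the left
    return "".join(reversed(out))


print(lcs_reconstruct("ABCBDAB", "BDCABA"))  # "BCBA", one LCS of length 4
```

Because the backtrack consults dp[i-1][j] and dp[i][j-1] at arbitrary positions, discarding old rows (as the two-row and single-row versions do) destroys exactly the information this walk needs.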
Let's rigorously analyze the time and space complexity of our DP solution.
| Algorithm | Time Complexity | Space Complexity | Reconstruction? |
|---|---|---|---|
| Naive Recursive | O(2^(m+n)) | O(m + n) stack | Yes |
| Memoized Recursive | O(m × n) | O(m × n) + stack | Yes |
| Bottom-up DP | O(m × n) | O(m × n) | Yes |
| Two-row DP | O(m × n) | O(min(m, n)) | No |
| Single-row DP | O(m × n) | O(min(m, n)) | No |
| Hirschberg's Algorithm | O(m × n) | O(min(m, n)) | Yes |
Detailed Time Analysis: the doubly nested loop fills each of the (m + 1) × (n + 1) cells exactly once, and each cell costs O(1): one character comparison plus at most a two-way max.
Total: Θ(m × n)
Note: This is tight—we must fill every cell, and each cell takes constant time.
Detailed Space Analysis (standard version): the DP table stores (m + 1) × (n + 1) integers; everything else is O(1) loop state.
Total: Θ(m × n) (dominated by the table)
For practical applications, this means two 1,000-character strings need about a million table cells, which is trivial on modern hardware, while two sequences of a few million characters would need trillions of cells. For genome sequences with millions of base pairs, space optimization becomes essential.
For general LCS, quadratic time is believed to be essentially optimal: under the Strong Exponential Time Hypothesis (SETH), no O(n^(2−ε)) algorithm exists for any ε > 0. However, for strings with a bounded alphabet or specific structure, faster algorithms exist; for example, the "Four Russians" technique achieves O(m × n / log n) for constant-sized alphabets, though the constants are large.
We've established a complete, rigorous understanding of the LCS problem and its dynamic programming solution.
What's Next:
Now that we understand how to compute the LCS length, the next page dives deep into DP table construction—examining the table visually, understanding the flow of dependencies, and building intuition for why the algorithm works through detailed worked examples.
You now have a complete understanding of the LCS problem definition, why naive approaches fail, and how dynamic programming provides an efficient O(m × n) solution. You've seen both the standard and space-optimized implementations with full complexity analysis.