Data Structures & AlgorithmsZ-Algorithm

Z-Algorithm: Linear-Time Pattern Matching

LevelIntermediate

Duration50 mins

TopicZ-Algorithm

1 / 4

The Z-Array Definition

A Lens into String Structure

Strings are everywhere in computing—from the DNA sequences analyzed by bioinformaticians to the log files parsed by system administrators, from the code indexed by search engines to the text processed by natural language models. At the heart of working with strings lies a fundamental question: how do parts of a string relate to each other?

The Z-array is a deceptively simple data structure that answers this question with remarkable elegance. For any string, the Z-array captures precisely how the prefix of that string matches substrings starting at each position. This seemingly modest information unlocks a world of efficient algorithms—from pattern matching to string compression to finding the longest repeated prefix.

What You Will Learn

By the end of this page, you will understand exactly what the Z-array is, how it encodes prefix-matching information, and why this structure forms the foundation for the Z-algorithm. You'll be able to compute Z-arrays by hand and reason about their properties with confidence.

The Fundamental Definition

Let's establish the precise definition that will guide our entire discussion.

Definition (Z-array): Given a string S of length n (indexed from 0 to n-1), the Z-array Z[0..n-1] is defined such that:

Z[0] is typically defined as 0, undefined, or equal to n (conventions vary)
For i > 0, Z[i] equals the length of the longest substring starting at position i that matches a prefix of the string S

In other words, Z[i] answers the question: "Starting at position i, how many characters match the beginning of the string?"

This is a prefix-matching measure. It tells us exactly how much of the prefix is "repeated" starting at position i.

Intuitive Understanding

Think of Z[i] as asking: "If I overlay the string on top of itself, starting from position i, how far can I go before the characters stop matching?" This overlay perspective will help you visualize Z-values throughout this module.

Formal Expression:

For position i where 1 ≤ i < n:

Z[i] = max { k : S[0..k-1] = S[i..i+k-1] }

Where S[a..b] denotes the substring from index a to index b (inclusive). The value Z[i] is the largest k such that the first k characters of S match the k characters starting at position i.

If no match exists at all (i.e., S[0] ≠ S[i]), then Z[i] = 0.

Concrete Examples

Abstract definitions become clear through concrete examples. Let's compute Z-arrays for several strings of increasing complexity.

Example 1: S = "aabcaab"

Let's trace through each position:

Index	Character	Substring starting here	Prefix Match	Z[i]
0	a	(S itself or undefined)	—	0*
1	a	"abcaab"	S[0]="a" matches, S[1]="a" vs S[1]="b" ≠	1
2	b	"bcaab"	S[0]="a" vs "b" ≠	0
3	c	"caab"	S[0]="a" vs "c" ≠	0
4	a	"aab"	S[0..2]="aab" matches!	3
5	a	"ab"	S[0]="a" matches, S[1]="a" vs "b" ≠	1
6	b	"b"	S[0]="a" vs "b" ≠	0

Z-array: [0, 1, 0, 0, 3, 1, 0]

Note: Z[0] is often set to 0 by convention since comparing the entire string to itself is trivially n.

Key Observation

Notice at position 4, Z[4] = 3, indicating that "aab" appears again starting at index 4. This is precisely the kind of repetition information that makes the Z-array so powerful for pattern matching.

Example 2: S = "aaaaa"

A string of all identical characters:

Index	Substring	Prefix Match	Z[i]
0	"aaaaa"	—	0
1	"aaaa"	Matches S[0..3]="aaaa"	4
2	"aaa"	Matches S[0..2]="aaa"	3
3	"aa"	Matches S[0..1]="aa"	2
4	"a"	Matches S[0]="a"	1

Z-array: [0, 4, 3, 2, 1]

This reveals a beautiful pattern: in a string of identical characters, Z[i] = n - i for i > 0. Each position matches as much as possible until reaching the end.

Example 3: S = "abcdefg" (all distinct)

When no characters repeat:

Index	Substring	Prefix Match
0	"abcdefg"	—
1	"bcdefg"	"a" vs "b" ≠
2	"cdefg"	"a" vs "c" ≠
3	"defg"	"a" vs "d" ≠
4	"efg"	"a" vs "e" ≠
5	"fg"	"a" vs "f" ≠
6	"g"	"a" vs "g" ≠

Z-array: [0, 0, 0, 0, 0, 0, 0]

For strings with no character repetition, the Z-array is all zeros (except potentially Z[0]). No position matches even the first character of the prefix.

The Z-Box Concept

Understanding the Z-array leads naturally to a related concept that becomes crucial for the efficient algorithm: the Z-box.

Definition (Z-box): If Z[i] > 0, then the interval [i, i + Z[i] - 1] is called a Z-box. This interval represents the span of characters that match the prefix.

Visually, a Z-box is a "window" into the string where we know the content exactly matches the beginning of the string.

Example: In S = "aabcaab", at position 4, Z[4] = 3.

The Z-box is [4, 6] (indices 4, 5, 6)
This span "aab" matches S[0..2] = "aab"

Why Z-Boxes Matter

Z-boxes are not just theoretical constructs—they're the key to computing the Z-array in linear time. When we're inside a known Z-box, we can potentially reuse previously computed Z-values rather than comparing character by character. This insight transforms a naive O(n²) algorithm into an elegant O(n) solution.

The Rightmost Z-Box:

As we compute the Z-array from left to right, we track the rightmost Z-box seen so far. Let's denote:

l (left): the starting index of the rightmost Z-box
r (right): the ending index of the rightmost Z-box

This rightmost Z-box represents our "window of knowledge"—within this window, we know exactly how the characters relate to the prefix, and this information can accelerate our computation.

Visual Representation:

String S:   a a b c a a b
Index:      0 1 2 3 4 5 6

Z-array:    - 1 0 0 3 1 0

Z-boxes:
  Position 1: [1, 1] (length 1)
  Position 4: [4, 6] (length 3) ← This becomes the rightmost Z-box
  Position 5: [5, 5] (length 1, but inside [4,6])

Z-Box Properties

•Prefix Equivalence: Characters in a Z-box starting at i are identical to the corresponding prefix: S[i..i+Z[i]-1] = S[0..Z[i]-1]
•Nesting: Z-boxes can overlap and nest within each other, but we always track the rightmost one for efficiency
•Information Reuse: If position k falls within a Z-box [l, r], we know S[k] = S[k-l] without checking
•Boundary Handling: The key algorithmic insight is knowing when we can trust Z-box information versus when we must compare characters directly

Properties of Z-Values

Before we dive into algorithms, let's establish some important properties of Z-values that inform both correctness proofs and practical applications.

Fundamental Properties

•Boundedness: For any i > 0, we have 0 ≤ Z[i] ≤ n - i. The match cannot extend beyond the string's end.
•Zero Condition: Z[i] = 0 if and only if S[i] ≠ S[0]. The first character must match for any positive Z-value.
•Maximality: Z[i] is the maximum match length. If Z[i] = k, then S[0..k-1] = S[i..i+k-1], but S[k] ≠ S[i+k] (unless we've hit the string boundary).
•Uniqueness: The Z-array is uniquely determined by the input string—it's a function of the string.
•Prefix Recovery: From the Z-array, we can answer many prefix-related queries in O(1) time each.

Property: Immediate Mismatch Detection

If we know Z[i] = k, we immediately know:

S[0..k-1] = S[i..i+k-1] (the matching prefix)
If i+k < n, then S[k] ≠ S[i+k] (the mismatch point)

This "mismatch boundary" information is exactly what makes the Z-array useful for pattern matching. We don't just know that a match exists—we know precisely where it ends and why.

Connection to Prefix Function

The Z-array and the failure function (prefix function) from KMP encode related but subtly different information. The prefix function answers 'what's the longest proper prefix that's also a suffix up to this point?' while the Z-array answers 'how much of the prefix matches starting here?' These dual perspectives on prefix structure are both powerful and complementary.

Manual Computation Method

Before learning the efficient algorithm, every practitioner should be able to compute Z-arrays by hand. This builds the intuition necessary to understand and debug more complex implementations.

Step-by-Step Manual Process:

Write out the string with indices
For each position i from 1 to n-1:
- Compare S[i], S[i+1], S[i+2], ... with S[0], S[1], S[2], ...
- Count how many characters match consecutively
- This count is Z[i]
The result is your Z-array

Detailed Walkthrough: S = "aabxaabxcaab"

Let's compute this step by step:

Index:  0 1 2 3 4 5 6 7 8 9 10 11
String: a a b x a a b x c a a  b

Position 1: Compare S[1..] with S[0..]

S[1]='a' vs S[0]='a' ✓
S[2]='b' vs S[1]='a' ✗
Z[1] = 1

Position 2: Compare S[2..] with S[0..]

S[2]='b' vs S[0]='a' ✗
Z[2] = 0

Position 3: Compare S[3..] with S[0..]

S[3]='x' vs S[0]='a' ✗
Z[3] = 0

Position 4: Compare S[4..] with S[0..]

S[4]='a' vs S[0]='a' ✓
S[5]='a' vs S[1]='a' ✓
S[6]='b' vs S[2]='b' ✓
S[7]='x' vs S[3]='x' ✓
S[8]='c' vs S[4]='a' ✗
Z[4] = 4

Position 5: Compare S[5..] with S[0..]

S[5]='a' vs S[0]='a' ✓
S[6]='b' vs S[1]='a' ✗
Z[5] = 1

Continue for remaining positions...

Final Z-array: [0, 1, 0, 0, 4, 1, 0, 0, 0, 3, 1, 0]

Naive Complexity Alert

This manual method has O(n²) worst-case complexity. Consider S = "aaaa...a" (n a's): Position 1 requires n-1 comparisons, position 2 requires n-2, and so on. Total: O(n²) comparisons. The efficient Z-algorithm achieves O(n) by cleverly reusing Z-box information.

Z-Array vs Other String Functions

The Z-array is one member of a family of string preprocessing structures. Understanding how it compares to alternatives helps you choose the right tool for each problem.

Comparison of String Preprocessing Structures
Structure	What It Computes	Key Applications	Relation to Z-Array
Z-Array	Prefix matches at each position	Pattern matching, longest common prefix	The subject of this module
Prefix Function (KMP)	Longest proper prefix-suffix at each position	Pattern matching, string periodicity	Can be converted to/from Z-array in O(n)
Suffix Array	Sorted order of all suffixes	Pattern matching, LCP queries, compression	Different perspective; heavier preprocessing
Rolling Hash	Hash values for substrings	Pattern matching, similarity detection	Probabilistic; no structural information

The Z-Array ↔ Prefix Function Equivalence:

A profound result in string algorithmics is that the Z-array and the KMP prefix function contain equivalent information. Given one, you can compute the other in O(n) time:

From Z to Prefix Function:

for i = 1 to n-1:
    if Z[i] > 0:
        for j = Z[i]-1 down to 0:
            if prefix[i+j] > 0: break
            prefix[i+j] = j + 1

From Prefix Function to Z:

for i = 1 to n-1:
    if prefix[i] > 0:
        Z[i - prefix[i] + 1] = prefix[i]

(Note: These conversions require careful handling of overlapping cases, but the fundamental equivalence holds.)

This equivalence means that any problem solvable with the Z-array is also solvable with the prefix function, and vice versa. The choice often comes down to which is more intuitive for a given problem.

Practical Guidance

Many practitioners find the Z-array more intuitive than the prefix function because Z[i] directly answers 'how much matches starting here?' In contrast, the prefix function answers 'what is the longest proper prefix that is also a suffix?'—a more indirect question. However, both are worth mastering as different problems naturally suit different perspectives.

Visualizing the Z-Array

A powerful way to understand the Z-array is through visualization. Let's see how Z-values manifest as "match bars" extending from each position.

Visual Representation for S = "aabcaab":

Index:     0   1   2   3   4   5   6
String:    a   a   b   c   a   a   b
           ─────────────────────────────
Z[0]:      -   (not computed / convention)
Z[1]:      │   •   (length 1: matches 'a')
Z[2]:          -   (length 0: 'b' ≠ 'a')
Z[3]:              -   (length 0: 'c' ≠ 'a')
Z[4]:                  •───•───•  (length 3: matches 'aab')
Z[5]:                      •   (length 1: matches 'a')
Z[6]:                          -   (length 0: 'b' ≠ 'a')

Each "•" represents a matching character, and the "───" shows the extent of the match. The Z-box for position 4 spans indices [4, 5, 6].

Overlay Perspective:

Another way to visualize: imagine sliding a copy of the string starting at each position and counting matching characters:

Original:   a a b c a a b
                         
Slide to 1:   a a b c a a b
              ↓
              a ✓  (1 match)
                a ✗  

Slide to 4:         a a b c a a b
                    ↓ ↓ ↓
              a a b ✓ ✓ ✓  (3 matches)
                    c ✗ (position 7 would mismatch)

This overlay model makes it viscerally clear what Z[i] measures: the "overlap" between the prefix and the substring starting at i.

Building Intuition

Spend time with these visualizations until the Z-array feels intuitive. When you can look at a string and roughly estimate its Z-values, you'll find the algorithm and its applications much easier to grasp. This visual intuition is what separates proficient from fluent string algorithm practitioners.

Key Insights Summary

We've established the conceptual foundation for the Z-algorithm. Let's consolidate the key insights before moving to the efficient computation method.

Core Concepts Mastered

•Z[i] Definition: The length of the longest substring starting at position i that matches a prefix of the string.
•Z-Box: The interval [i, i+Z[i]-1] where we know the characters match the prefix exactly.
•Information Encoding: The Z-array encapsulates all prefix-matching information for a string in a single pass.
•Properties: Z-values are bounded, unique, and encode both match and mismatch information.
•Visualization: Z-values can be seen as "match bars" or "overlay lengths" at each position.
•Equivalence: The Z-array and KMP prefix function contain equivalent structural information.

What's Next:

We now understand what the Z-array is. The next page addresses the critical question: how do we compute it efficiently? We'll see how the Z-box concept enables a beautiful linear-time algorithm that computes all Z-values in just O(n) time, regardless of the string's structure.

Page Complete

You now have a solid understanding of the Z-array definition, its properties, and its relationship to other string structures. This conceptual foundation is essential for understanding both the efficient algorithm and its applications in pattern matching.

1 / 4

Loading learning content...

Data Structures & AlgorithmsZ-Algorithm

Z-Algorithm: Linear-Time Pattern Matching

LevelIntermediate

Duration50 mins

TopicZ-Algorithm

1 / 4

The Z-Array Definition

A Lens into String Structure

What You Will Learn

The Fundamental Definition

Let's establish the precise definition that will guide our entire discussion.

Definition (Z-array): Given a string S of length n (indexed from 0 to n-1), the Z-array Z[0..n-1] is defined such that:

Z[0] is typically defined as 0, undefined, or equal to n (conventions vary)
For i > 0, Z[i] equals the length of the longest substring starting at position i that matches a prefix of the string S

In other words, Z[i] answers the question: "Starting at position i, how many characters match the beginning of the string?"

This is a prefix-matching measure. It tells us exactly how much of the prefix is "repeated" starting at position i.

Intuitive Understanding

Formal Expression:

For position i where 1 ≤ i < n:

Z[i] = max { k : S[0..k-1] = S[i..i+k-1] }

If no match exists at all (i.e., S[0] ≠ S[i]), then Z[i] = 0.

Concrete Examples

Abstract definitions become clear through concrete examples. Let's compute Z-arrays for several strings of increasing complexity.

Example 1: S = "aabcaab"

Let's trace through each position:

Index	Character	Substring starting here	Prefix Match	Z[i]
0	a	(S itself or undefined)	—	0*
1	a	"abcaab"	S[0]="a" matches, S[1]="a" vs S[1]="b" ≠	1
2	b	"bcaab"	S[0]="a" vs "b" ≠	0
3	c	"caab"	S[0]="a" vs "c" ≠	0
4	a	"aab"	S[0..2]="aab" matches!	3
5	a	"ab"	S[0]="a" matches, S[1]="a" vs "b" ≠	1
6	b	"b"	S[0]="a" vs "b" ≠	0

Z-array: [0, 1, 0, 0, 3, 1, 0]

Note: Z[0] is often set to 0 by convention since comparing the entire string to itself is trivially n.

Key Observation

Notice at position 4, Z[4] = 3, indicating that "aab" appears again starting at index 4. This is precisely the kind of repetition information that makes the Z-array so powerful for pattern matching.

Example 2: S = "aaaaa"

A string of all identical characters:

Index	Substring	Prefix Match	Z[i]
0	"aaaaa"	—	0
1	"aaaa"	Matches S[0..3]="aaaa"	4
2	"aaa"	Matches S[0..2]="aaa"	3
3	"aa"	Matches S[0..1]="aa"	2
4	"a"	Matches S[0]="a"	1

Z-array: [0, 4, 3, 2, 1]

This reveals a beautiful pattern: in a string of identical characters, Z[i] = n - i for i > 0. Each position matches as much as possible until reaching the end.

Example 3: S = "abcdefg" (all distinct)

When no characters repeat:

Index	Substring	Prefix Match
0	"abcdefg"	—
1	"bcdefg"	"a" vs "b" ≠
2	"cdefg"	"a" vs "c" ≠
3	"defg"	"a" vs "d" ≠
4	"efg"	"a" vs "e" ≠
5	"fg"	"a" vs "f" ≠
6	"g"	"a" vs "g" ≠

Z-array: [0, 0, 0, 0, 0, 0, 0]

For strings with no character repetition, the Z-array is all zeros (except potentially Z[0]). No position matches even the first character of the prefix.

The Z-Box Concept

Understanding the Z-array leads naturally to a related concept that becomes crucial for the efficient algorithm: the Z-box.

Definition (Z-box): If Z[i] > 0, then the interval [i, i + Z[i] - 1] is called a Z-box. This interval represents the span of characters that match the prefix.

Visually, a Z-box is a "window" into the string where we know the content exactly matches the beginning of the string.

Example: In S = "aabcaab", at position 4, Z[4] = 3.

The Z-box is [4, 6] (indices 4, 5, 6)
This span "aab" matches S[0..2] = "aab"

Why Z-Boxes Matter

The Rightmost Z-Box:

As we compute the Z-array from left to right, we track the rightmost Z-box seen so far. Let's denote:

l (left): the starting index of the rightmost Z-box
r (right): the ending index of the rightmost Z-box

This rightmost Z-box represents our "window of knowledge"—within this window, we know exactly how the characters relate to the prefix, and this information can accelerate our computation.

Visual Representation:

String S:   a a b c a a b
Index:      0 1 2 3 4 5 6

Z-array:    - 1 0 0 3 1 0

Z-boxes:
  Position 1: [1, 1] (length 1)
  Position 4: [4, 6] (length 3) ← This becomes the rightmost Z-box
  Position 5: [5, 5] (length 1, but inside [4,6])

Z-Box Properties

•Prefix Equivalence: Characters in a Z-box starting at i are identical to the corresponding prefix: S[i..i+Z[i]-1] = S[0..Z[i]-1]
•Nesting: Z-boxes can overlap and nest within each other, but we always track the rightmost one for efficiency
•Information Reuse: If position k falls within a Z-box [l, r], we know S[k] = S[k-l] without checking
•Boundary Handling: The key algorithmic insight is knowing when we can trust Z-box information versus when we must compare characters directly

Properties of Z-Values

Before we dive into algorithms, let's establish some important properties of Z-values that inform both correctness proofs and practical applications.

Fundamental Properties

•Boundedness: For any i > 0, we have 0 ≤ Z[i] ≤ n - i. The match cannot extend beyond the string's end.
•Zero Condition: Z[i] = 0 if and only if S[i] ≠ S[0]. The first character must match for any positive Z-value.
•Maximality: Z[i] is the maximum match length. If Z[i] = k, then S[0..k-1] = S[i..i+k-1], but S[k] ≠ S[i+k] (unless we've hit the string boundary).
•Uniqueness: The Z-array is uniquely determined by the input string—it's a function of the string.
•Prefix Recovery: From the Z-array, we can answer many prefix-related queries in O(1) time each.

Property: Immediate Mismatch Detection

If we know Z[i] = k, we immediately know:

S[0..k-1] = S[i..i+k-1] (the matching prefix)
If i+k < n, then S[k] ≠ S[i+k] (the mismatch point)

This "mismatch boundary" information is exactly what makes the Z-array useful for pattern matching. We don't just know that a match exists—we know precisely where it ends and why.

Connection to Prefix Function

Manual Computation Method

Before learning the efficient algorithm, every practitioner should be able to compute Z-arrays by hand. This builds the intuition necessary to understand and debug more complex implementations.

Step-by-Step Manual Process:

Write out the string with indices
For each position i from 1 to n-1:
- Compare S[i], S[i+1], S[i+2], ... with S[0], S[1], S[2], ...
- Count how many characters match consecutively
- This count is Z[i]
The result is your Z-array

Detailed Walkthrough: S = "aabxaabxcaab"

Let's compute this step by step:

Index:  0 1 2 3 4 5 6 7 8 9 10 11
String: a a b x a a b x c a a  b

Position 1: Compare S[1..] with S[0..]

S[1]='a' vs S[0]='a' ✓
S[2]='b' vs S[1]='a' ✗
Z[1] = 1

Position 2: Compare S[2..] with S[0..]

S[2]='b' vs S[0]='a' ✗
Z[2] = 0

Position 3: Compare S[3..] with S[0..]

S[3]='x' vs S[0]='a' ✗
Z[3] = 0

Position 4: Compare S[4..] with S[0..]

S[4]='a' vs S[0]='a' ✓
S[5]='a' vs S[1]='a' ✓
S[6]='b' vs S[2]='b' ✓
S[7]='x' vs S[3]='x' ✓
S[8]='c' vs S[4]='a' ✗
Z[4] = 4

Position 5: Compare S[5..] with S[0..]

S[5]='a' vs S[0]='a' ✓
S[6]='b' vs S[1]='a' ✗
Z[5] = 1

Continue for remaining positions...

Final Z-array: [0, 1, 0, 0, 4, 1, 0, 0, 0, 3, 1, 0]

Naive Complexity Alert

Z-Array vs Other String Functions

The Z-array is one member of a family of string preprocessing structures. Understanding how it compares to alternatives helps you choose the right tool for each problem.

Comparison of String Preprocessing Structures
Structure	What It Computes	Key Applications	Relation to Z-Array
Z-Array	Prefix matches at each position	Pattern matching, longest common prefix	The subject of this module
Prefix Function (KMP)	Longest proper prefix-suffix at each position	Pattern matching, string periodicity	Can be converted to/from Z-array in O(n)
Suffix Array	Sorted order of all suffixes	Pattern matching, LCP queries, compression	Different perspective; heavier preprocessing
Rolling Hash	Hash values for substrings	Pattern matching, similarity detection	Probabilistic; no structural information

The Z-Array ↔ Prefix Function Equivalence:

A profound result in string algorithmics is that the Z-array and the KMP prefix function contain equivalent information. Given one, you can compute the other in O(n) time:

From Z to Prefix Function:

for i = 1 to n-1:
    if Z[i] > 0:
        for j = Z[i]-1 down to 0:
            if prefix[i+j] > 0: break
            prefix[i+j] = j + 1

From Prefix Function to Z:

for i = 1 to n-1:
    if prefix[i] > 0:
        Z[i - prefix[i] + 1] = prefix[i]

(Note: These conversions require careful handling of overlapping cases, but the fundamental equivalence holds.)

Practical Guidance

Visualizing the Z-Array

A powerful way to understand the Z-array is through visualization. Let's see how Z-values manifest as "match bars" extending from each position.

Visual Representation for S = "aabcaab":

Index:     0   1   2   3   4   5   6
String:    a   a   b   c   a   a   b
           ─────────────────────────────
Z[0]:      -   (not computed / convention)
Z[1]:      │   •   (length 1: matches 'a')
Z[2]:          -   (length 0: 'b' ≠ 'a')
Z[3]:              -   (length 0: 'c' ≠ 'a')
Z[4]:                  •───•───•  (length 3: matches 'aab')
Z[5]:                      •   (length 1: matches 'a')
Z[6]:                          -   (length 0: 'b' ≠ 'a')

Each "•" represents a matching character, and the "───" shows the extent of the match. The Z-box for position 4 spans indices [4, 5, 6].

Overlay Perspective:

Another way to visualize: imagine sliding a copy of the string starting at each position and counting matching characters:

Original:   a a b c a a b
                         
Slide to 1:   a a b c a a b
              ↓
              a ✓  (1 match)
                a ✗  

Slide to 4:         a a b c a a b
                    ↓ ↓ ↓
              a a b ✓ ✓ ✓  (3 matches)
                    c ✗ (position 7 would mismatch)

This overlay model makes it viscerally clear what Z[i] measures: the "overlap" between the prefix and the substring starting at i.

Building Intuition

Key Insights Summary

We've established the conceptual foundation for the Z-algorithm. Let's consolidate the key insights before moving to the efficient computation method.

Core Concepts Mastered

•Z[i] Definition: The length of the longest substring starting at position i that matches a prefix of the string.
•Z-Box: The interval [i, i+Z[i]-1] where we know the characters match the prefix exactly.
•Information Encoding: The Z-array encapsulates all prefix-matching information for a string in a single pass.
•Properties: Z-values are bounded, unique, and encode both match and mismatch information.
•Visualization: Z-values can be seen as "match bars" or "overlay lengths" at each position.
•Equivalence: The Z-array and KMP prefix function contain equivalent structural information.

What's Next:

Page Complete

1 / 4