Throughout this module, we've explored multiple algorithms for searching in 2D matrices, each suited to different sorting structures. Now it's time to bring everything together with a rigorous complexity analysis that compares all approaches.
Understanding complexity isn't just about memorizing formulas—it's about developing intuition for why different algorithms perform the way they do. This page derives each complexity from first principles, compares them mathematically across various matrix dimensions, and provides empirical benchmarks to ground the theory in reality.
By the end, you'll have a complete mental framework for analyzing and selecting 2D search algorithms, backed by both theoretical understanding and practical experience.
By the end of this page, you will derive and understand the time complexity of each 2D search algorithm, compare complexities mathematically for different matrix dimensions, understand the constants and practical factors that Big-O notation hides, and know exactly which algorithm to choose for any given matrix structure.
Let's establish the complete set of algorithms we'll analyze. For an m × n matrix:
1. Brute Force (Linear Scan)
2. Binary Search per Row
3. Binary Search per Column
4. Staircase Search
5. Binary Search (1D Treatment)
6. Two-Stage Binary Search
| Algorithm | Matrix Requirement | Time Complexity | Space Complexity |
|---|---|---|---|
| Brute Force | None | O(m × n) | O(1) |
| Binary Search/Row | Row-sorted | O(m × log n) | O(1) |
| Binary Search/Column | Column-sorted | O(n × log m) | O(1) |
| Staircase | Row-and-column sorted | O(m + n) | O(1) |
| Binary Search 1D | Fully sorted | O(log(m × n)) | O(1) |
| Two-Stage Binary | Fully sorted | O(log m + log n) | O(1) |
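Two-stage binary search is the only entry in this table that does not get a dedicated derivation later, so here is a minimal sketch (the function name is mine): first binary-search the first-column values to locate the candidate row in O(log m), then binary-search within that row in O(log n).

```python
def two_stage_search(matrix, target):
    """Two-stage binary search on a FULLY sorted matrix:
    O(log m) to find the row, then O(log n) within the row."""
    if not matrix or not matrix[0]:
        return False
    m, n = len(matrix), len(matrix[0])

    # Stage 1: last row whose first element is <= target
    lo, hi, r = 0, m - 1, -1
    while lo <= hi:
        mid = (lo + hi) // 2
        if matrix[mid][0] <= target:
            r, lo = mid, mid + 1
        else:
            hi = mid - 1
    if r == -1:
        return False  # target smaller than every element

    # Stage 2: standard binary search within row r
    lo, hi = 0, n - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if matrix[r][mid] == target:
            return True
        elif matrix[r][mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False
```

Since O(log m + log n) = O(log(m × n)), this is asymptotically identical to treating the matrix as one sorted 1D array; it is mainly useful when rows are stored as separate arrays.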
Every algorithm discussed operates in-place with only a constant number of variables. This is a significant advantage over approaches that might build auxiliary data structures. Space efficiency is particularly important for very large matrices.
Let's derive each complexity rigorously, understanding why rather than just what.
1. Brute Force: O(m × n)
The algorithm visits every element exactly once in the worst case (target not present or at the last position).
Total operations = m × n
Time complexity = O(m × n)
This is the baseline that all other algorithms improve upon.
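For reference, the baseline can be written in a few lines (a sketch; the function name is mine):

```python
def brute_force_search(matrix, target):
    """Scan every cell; O(m*n) worst case, no sorting assumptions needed."""
    for i, row in enumerate(matrix):
        for j, val in enumerate(row):
            if val == target:
                return (i, j)
    return (-1, -1)
```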
2. Binary Search per Row: O(m × log n)
For each of the m rows, we perform one binary search over that row's n elements.
Cost per row = O(log n)
Number of rows = m
Total = m × O(log n) = O(m × log n)
This is better than brute force when log n < n (always true for n > 1).
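The derivation above translates directly into code (a sketch; the function name is mine, and it returns the coordinates of the first match):

```python
def search_row_sorted(matrix, target):
    """Binary search each sorted row: m rows x O(log n) each = O(m log n)."""
    for i, row in enumerate(matrix):
        lo, hi = 0, len(row) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            if row[mid] == target:
                return (i, mid)
            elif row[mid] < target:
                lo = mid + 1
            else:
                hi = mid - 1
    return (-1, -1)
```

Note that only row-sortedness is required: the rows need not relate to each other at all.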
3. Staircase Search: O(m + n)
Proof by potential function:

Define the potential Φ = row + (n − 1 − col) for a top-right start, so Φ = 0 initially. Each iteration either increments row or decrements col, so Φ increases by exactly 1 per iteration. The loop runs only while row < m and col ≥ 0:

- row can increase at most m times (from 0 past m − 1 to m, where the loop exits).
- col can decrease at most n times (from n − 1 past 0 to −1, where the loop exits).

Each iteration does exactly one of these, therefore the maximum number of iterations is m + n.
```python
def staircase_with_operation_count(matrix, target):
    """
    Staircase search with operation counting to verify O(m + n).
    """
    if not matrix or not matrix[0]:
        return (-1, -1), 0

    m, n = len(matrix), len(matrix[0])
    row, col = 0, n - 1
    comparisons = 0
    row_moves = 0
    col_moves = 0

    while row < m and col >= 0:
        comparisons += 1
        if matrix[row][col] == target:
            return (row, col), comparisons
        elif matrix[row][col] > target:
            col -= 1
            col_moves += 1
        else:
            row += 1
            row_moves += 1

    # Verify the bound
    total_moves = row_moves + col_moves
    print(f"Matrix: {m}×{n}")
    print(f"  Row moves: {row_moves} (max possible: {m})")
    print(f"  Col moves: {col_moves} (max possible: {n})")
    print(f"  Total moves: {total_moves} ≤ m+n = {m + n} ✓")
    print(f"  Comparisons: {comparisons}")
    return (-1, -1), comparisons


# Test worst case: target not in matrix
matrix = [
    [1,  3,  5,  7,  9],
    [2,  4,  6,  8,  10],
    [11, 13, 15, 17, 19],
    [12, 14, 16, 18, 20],
]

result = staircase_with_operation_count(matrix, 0)  # Below all values

# Another test with a larger matrix
m, n = 100, 150
large_matrix = [[i * n + j for j in range(n)] for i in range(m)]
result = staircase_with_operation_count(large_matrix, -1)
```

4. Binary Search 1D: O(log(m × n))
The matrix has m × n total elements. Binary search on a sequence of length N takes O(log N) comparisons.
N = m × n
Time = O(log(m × n))
= O(log m + log n) // by logarithm properties
Why O(log(m × n)) ≠ O(log m × log n):
A common mistake is confusing log(m × n) with (log m) × (log n).
log(m × n) = log m + log n ← Correct (logarithm of product)
(log m)(log n) ← Wrong interpretation
For m = n = 1000:
log(1000 × 1000) = log(10^6) ≈ 20
(log 1000)(log 1000) = 10 × 10 = 100 ← 5x larger!
log(a × b) = log a + log b. This is why O(log(m × n)) = O(log m + log n). For balanced matrices where m ≈ n ≈ √(total), this becomes O(2 log √total) = O(log total). The fully sorted binary search is always logarithmic in the total element count.
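A quick numerical check of the identity (values are for m = n = 1000):

```python
import math

m, n = 1000, 1000
lhs = math.log2(m * n)                          # log of the product
rhs = math.log2(m) + math.log2(n)               # sum of the logs
product_of_logs = math.log2(m) * math.log2(n)   # the WRONG interpretation

print(f"log2(m*n)         = {lhs:.2f}")               # ≈ 19.93
print(f"log2(m) + log2(n) = {rhs:.2f}")               # ≈ 19.93
print(f"log2(m) * log2(n) = {product_of_logs:.2f}")   # ≈ 99.32 — 5x larger
```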
Let's compare the complexities for specific matrix dimensions to build intuition.
Comparison for Square Matrices (m = n):
For a square n × n matrix:
| n | Brute Force O(n²) | Binary/Row O(n log n) | Staircase O(2n) | 1D Binary O(2 log n) |
|---|---|---|---|---|
| 10 | 100 | 33 | 20 | 7 |
| 100 | 10,000 | 664 | 200 | 13 |
| 1,000 | 1,000,000 | 9,966 | 2,000 | 20 |
| 10,000 | 100,000,000 | 132,877 | 20,000 | 27 |
| 100,000 | 10,000,000,000 | 1,660,964 | 200,000 | 33 |
Key Observations:
Brute force grows quadratically and quickly becomes impractical.
Binary search per row is better but still grows faster than linear.
Staircase search grows linearly—excellent for row-and-column sorted matrices.
1D binary search grows logarithmically—the clear winner when applicable.
The Crossover Points:
When does staircase beat binary search per row?
O(m + n) < O(m × log n)
m + n < m × log n
1 + n/m < log n
For m = n: 2 < log n, meaning n > 4. So staircase wins for essentially all practical cases.
When is 1D binary search available?
Only for fully sorted matrices. If you only have row-and-column sorting, staircase is your best option (O(m+n)).
```python
import math

def compare_complexities(m, n):
    """
    Compare algorithm complexities for an m × n matrix.
    Returns actual operation counts (not just asymptotic).
    """
    brute_force = m * n
    binary_per_row = m * math.ceil(math.log2(n)) if n > 1 else m
    binary_per_col = n * math.ceil(math.log2(m)) if m > 1 else n
    staircase = m + n
    binary_1d = math.ceil(math.log2(m * n)) if m * n > 1 else 1

    return {
        'dimensions': f'{m}×{n}',
        'total_elements': m * n,
        'brute_force': brute_force,
        'binary_per_row': binary_per_row,
        'binary_per_col': binary_per_col,
        'staircase': staircase,
        'binary_1d': binary_1d,
    }

def print_comparison_table(dimensions_list):
    """Print formatted comparison table."""
    print(f"{'Dims':>12} {'Total':>12} {'Brute':>12} {'Bin/Row':>12} "
          f"{'Bin/Col':>12} {'Stair':>12} {'Bin1D':>12}")
    print("-" * 90)
    for m, n in dimensions_list:
        c = compare_complexities(m, n)
        print(f"{c['dimensions']:>12} {c['total_elements']:>12,} "
              f"{c['brute_force']:>12,} {c['binary_per_row']:>12,} "
              f"{c['binary_per_col']:>12,} {c['staircase']:>12,} "
              f"{c['binary_1d']:>12,}")

# Compare various dimensions
dimensions = [
    (10, 10),       # Small square
    (100, 100),     # Medium square
    (1000, 1000),   # Large square
    (10, 1000),     # Short wide
    (1000, 10),     # Tall narrow
    (100, 10000),   # Very wide
    (10000, 100),   # Very tall
]

print("\nComplexity Comparison (worst-case operation counts):\n")
print_comparison_table(dimensions)

# Find when staircase beats binary per row
print("\n\nWhen does staircase beat binary per row?")
print("(For square n×n matrices)\n")

for n in [2, 3, 4, 5, 10, 100]:
    staircase = 2 * n
    binary_row = n * math.ceil(math.log2(n))
    winner = "staircase" if staircase < binary_row else "binary/row"
    print(f"n={n}: staircase={staircase}, binary/row={binary_row} → {winner}")
```

For non-square matrices, the optimal algorithm may differ. For a 10×1000 matrix, binary search per row costs 10 × 10 = 100 operations, while staircase costs 1010 operations: binary search per row wins! For a 1000×10 matrix, binary search per column wins by the symmetric argument. Always consider the aspect ratio.
Each algorithm has different best, average, and worst cases. Understanding these helps predict real-world performance.
Brute Force: best case O(1) (target at the very first cell), worst case O(m × n) (target absent or at the last cell), average roughly (m × n)/2 comparisons when the target is present at a uniformly random position.

Binary Search per Row: best case O(log n) or better if an early row's midpoint probe hits the target; always Θ(m log n) when the target is absent, since every row must be searched.

Staircase Search: best case O(1) (target at the starting corner), worst case O(m + n); the average depends on how far along the staircase path the target lies.

1D Binary Search: best case O(1) (first midpoint probe hits), worst and average case O(log(m × n)); the cases are tightly clustered.
```python
import random
import statistics

def analyze_cases(matrix, search_func, trials=1000):
    """
    Empirically measure best, average, and worst case performance.
    search_func should return (result, comparison_count).
    """
    m, n = len(matrix), len(matrix[0])
    all_values = [matrix[i][j] for i in range(m) for j in range(n)]

    # Generate test targets
    test_targets = []

    # Cases where target exists
    test_targets.extend(random.choices(all_values, k=trials // 2))

    # Cases where target doesn't exist
    min_val, max_val = min(all_values), max(all_values)
    for _ in range(trials // 2):
        # Generate value likely not in matrix
        candidate = random.randint(min_val - 100, max_val + 100)
        if candidate not in all_values:
            test_targets.append(candidate)
        else:
            test_targets.append(max_val + random.randint(1, 100))

    # Run searches
    comparison_counts = []
    for target in test_targets:
        _, count = search_func(matrix, target)
        comparison_counts.append(count)

    return {
        'best': min(comparison_counts),
        'worst': max(comparison_counts),
        'average': statistics.mean(comparison_counts),
        'median': statistics.median(comparison_counts),
        'std_dev': statistics.stdev(comparison_counts) if len(comparison_counts) > 1 else 0,
    }

def staircase_search_counted(matrix, target):
    """Staircase search returning comparison count."""
    if not matrix or not matrix[0]:
        return (-1, -1), 0
    m, n = len(matrix), len(matrix[0])
    row, col = 0, n - 1
    comparisons = 0
    while row < m and col >= 0:
        comparisons += 1
        if matrix[row][col] == target:
            return (row, col), comparisons
        elif matrix[row][col] > target:
            col -= 1
        else:
            row += 1
    return (-1, -1), comparisons

def binary_1d_counted(matrix, target):
    """1D binary search returning comparison count."""
    if not matrix or not matrix[0]:
        return (-1, -1), 0
    m, n = len(matrix), len(matrix[0])
    left, right = 0, m * n - 1
    comparisons = 0
    while left <= right:
        comparisons += 1
        mid = (left + right) // 2
        val = matrix[mid // n][mid % n]
        if val == target:
            return (mid // n, mid % n), comparisons
        elif val < target:
            left = mid + 1
        else:
            right = mid - 1
    return (-1, -1), comparisons

# Create test matrices
def create_young_tableau(m, n):
    """Create a row-and-column sorted matrix."""
    # Use coordinate sums for guaranteed sorting
    return [[i + j for j in range(n)] for i in range(m)]

def create_fully_sorted(m, n):
    """Create a fully sorted matrix."""
    return [[i * n + j for j in range(n)] for i in range(m)]

# Analyze
print("Case Analysis for 100×100 matrices:\n")

young = create_young_tableau(100, 100)
fully = create_fully_sorted(100, 100)

print("Row-and-Column Sorted Matrix (Staircase Search):")
stats = analyze_cases(young, staircase_search_counted, trials=1000)
print(f"  Best: {stats['best']}, Worst: {stats['worst']}, "
      f"Avg: {stats['average']:.1f}, Median: {stats['median']:.1f}")
print(f"  Theoretical worst: {100 + 100} = 200")

print("\nFully Sorted Matrix (1D Binary Search):")
stats = analyze_cases(fully, binary_1d_counted, trials=1000)
print(f"  Best: {stats['best']}, Worst: {stats['worst']}, "
      f"Avg: {stats['average']:.1f}, Median: {stats['median']:.1f}")
print(f"  Theoretical worst: ceil(log2(10000)) = 14")
```

Binary search has remarkably consistent performance: its average and worst cases stay within a small constant factor of each other. This predictability is valuable for real-time systems where worst-case guarantees matter. Staircase search has more variance but a tight O(m+n) worst-case bound.
Big-O notation hides constant factors that can significantly impact real-world performance. Let's examine what's lurking beneath the asymptotic surface.
Operations per Comparison:
Not all comparisons are equal:

- Staircase search does one element comparison plus a single pointer increment or decrement per step.
- 1D binary search does one element comparison plus a midpoint computation and an index conversion (mid // n, mid % n) per step.

Binary search has slightly higher overhead per iteration, but far fewer iterations.
Memory Access Patterns:

Staircase search moves between adjacent cells, so consecutive accesses (especially leftward moves along a row) often fall in the same cache line. Binary search jumps across the matrix, so its early probes behave like random memory accesses. For very large matrices that don't fit in cache, this can matter.
Branch Prediction: both algorithms branch on data-dependent comparisons that the CPU cannot reliably predict, but binary search executes so few iterations that mispredictions cost little in absolute terms.
```python
import time
import random

def benchmark_real_time(m, n, trials=10000):
    """
    Measure actual wall-clock time for different algorithms.
    """
    # Create matrices
    young = [[i + j for j in range(n)] for i in range(m)]
    fully = [[i * n + j for j in range(n)] for i in range(m)]

    # Generate random targets
    existing_young = [(i, j) for i in range(m) for j in range(n)]
    targets_young = [young[i][j] for i, j in random.choices(existing_young, k=trials)]
    targets_fully = [random.randint(0, m*n-1) for _ in range(trials)]

    # Non-existing targets
    targets_missing = [m * n + random.randint(1, 1000) for _ in range(trials)]

    results = {}

    # Staircase search on Young tableau
    def staircase(target):
        row, col = 0, n - 1
        while row < m and col >= 0:
            if young[row][col] == target:
                return True
            elif young[row][col] > target:
                col -= 1
            else:
                row += 1
        return False

    start = time.perf_counter()
    for t in targets_young:
        staircase(t)
    results['staircase_found'] = (time.perf_counter() - start) * 1000 / trials

    start = time.perf_counter()
    for t in targets_missing:
        staircase(t)
    results['staircase_missing'] = (time.perf_counter() - start) * 1000 / trials

    # 1D binary search on fully sorted matrix
    def binary_1d(target):
        left, right = 0, m * n - 1
        while left <= right:
            mid = (left + right) // 2
            val = fully[mid // n][mid % n]
            if val == target:
                return True
            elif val < target:
                left = mid + 1
            else:
                right = mid - 1
        return False

    start = time.perf_counter()
    for t in targets_fully:
        binary_1d(t)
    results['binary1d_found'] = (time.perf_counter() - start) * 1000 / trials

    start = time.perf_counter()
    for t in targets_missing:
        binary_1d(t)
    results['binary1d_missing'] = (time.perf_counter() - start) * 1000 / trials

    return results

# Run benchmarks for different sizes
print("Real-world timing comparison (microseconds per search):\n")
print(f"{'Size':>12} {'Stair(hit)':>12} {'Stair(miss)':>12} "
      f"{'Bin1D(hit)':>12} {'Bin1D(miss)':>12}")
print("-" * 60)

for size in [100, 500, 1000, 2000]:
    results = benchmark_real_time(size, size, trials=5000)
    print(f"{size}×{size:>6} "
          f"{results['staircase_found']*1000:>12.3f} "
          f"{results['staircase_missing']*1000:>12.3f} "
          f"{results['binary1d_found']*1000:>12.3f} "
          f"{results['binary1d_missing']*1000:>12.3f}")
```

In practice, binary search often outperforms its theoretical advantage due to CPU optimizations and the small number of operations. However, for cache-hostile access patterns in extremely large matrices, staircase search's more sequential access can sometimes compensate for its higher operation count.
All algorithms discussed use O(1) auxiliary space, but there are space-time tradeoffs worth considering:
Preprocessing Options:
If we're willing to spend space, we can accelerate searches:
1. Hash Set Construction: O(m × n) space, O(1) search
2. Flattening to 1D Array: O(m × n) space, O(log(m×n)) search
3. Auxiliary Data Structures: Variable
```python
class MatrixSearcher:
    """
    Wrapper demonstrating space-time tradeoffs for matrix search.
    """

    def __init__(self, matrix, strategy='auto'):
        """
        Initialize with a matrix and search strategy.

        Strategies:
        - 'auto':   Detect matrix type and use the optimal algorithm
        - 'hash':   Build a hash set for O(1) lookup (uses O(m×n) space)
        - 'native': Use the in-place algorithm (O(1) space)
        """
        self.matrix = matrix
        self.m = len(matrix) if matrix else 0
        self.n = len(matrix[0]) if matrix and matrix[0] else 0
        self.strategy = strategy
        self.hash_set = None
        self.matrix_type = self._detect_type()

        # Build the hash set when asked to, or automatically for small matrices
        if strategy == 'hash' or (strategy == 'auto' and self.m * self.n < 10000):
            self._build_hash_set()

    def _detect_type(self):
        """Detect matrix sorting structure."""
        if not self.matrix:
            return 'empty'

        # Boundary condition for full sorting: each row starts
        # at or after the previous row ends
        fully_sorted = True
        for i in range(self.m - 1):
            if self.matrix[i][self.n - 1] > self.matrix[i + 1][0]:
                fully_sorted = False
                break

        row_sorted = all(
            self.matrix[i][j] <= self.matrix[i][j + 1]
            for i in range(self.m)
            for j in range(self.n - 1)
        )

        if not row_sorted:
            return 'unsorted'
        if fully_sorted:
            return 'fully_sorted'

        col_sorted = all(
            self.matrix[i][j] <= self.matrix[i + 1][j]
            for j in range(self.n)
            for i in range(self.m - 1)
        )

        if col_sorted:
            return 'row_column_sorted'
        return 'row_sorted'

    def _build_hash_set(self):
        """Build hash set for O(1) membership testing."""
        self.hash_set = set()
        for row in self.matrix:
            for val in row:
                self.hash_set.add(val)

    def search(self, target):
        """
        Search for target using the selected strategy.

        Returns: True if found, False otherwise
        """
        if self.hash_set is not None:
            return target in self.hash_set

        if self.matrix_type == 'fully_sorted':
            return self._binary_search_1d(target)
        elif self.matrix_type == 'row_column_sorted':
            return self._staircase_search(target)
        elif self.matrix_type == 'row_sorted':
            return self._binary_search_rows(target)
        else:
            return self._linear_search(target)

    def _binary_search_1d(self, target):
        left, right = 0, self.m * self.n - 1
        while left <= right:
            mid = (left + right) // 2
            val = self.matrix[mid // self.n][mid % self.n]
            if val == target:
                return True
            elif val < target:
                left = mid + 1
            else:
                right = mid - 1
        return False

    def _staircase_search(self, target):
        row, col = 0, self.n - 1
        while row < self.m and col >= 0:
            if self.matrix[row][col] == target:
                return True
            elif self.matrix[row][col] > target:
                col -= 1
            else:
                row += 1
        return False

    def _binary_search_rows(self, target):
        for row in self.matrix:
            left, right = 0, len(row) - 1
            while left <= right:
                mid = (left + right) // 2
                if row[mid] == target:
                    return True
                elif row[mid] < target:
                    left = mid + 1
                else:
                    right = mid - 1
        return False

    def _linear_search(self, target):
        for row in self.matrix:
            if target in row:
                return True
        return False

    def get_stats(self):
        """Return information about the searcher."""
        return {
            'matrix_type': self.matrix_type,
            'dimensions': f'{self.m}×{self.n}',
            'using_hash': self.hash_set is not None,
            'space_used': 'O(m×n)' if self.hash_set else 'O(1)',
            'search_time': 'O(1)' if self.hash_set else {
                'fully_sorted': 'O(log(m×n))',
                'row_column_sorted': 'O(m+n)',
                'row_sorted': 'O(m×log n)',
                'unsorted': 'O(m×n)',
            }.get(self.matrix_type, 'O(m×n)')
        }

# Demo
fully = [[i * 4 + j for j in range(4)] for i in range(3)]
searcher = MatrixSearcher(fully, strategy='native')
print("Stats:", searcher.get_stats())
print("Search for 7:", searcher.search(7))
print("Search for 99:", searcher.search(99))
```

Preprocessing (like building a hash set) is worthwhile when: (1) the matrix is static, (2) you'll perform many searches, and (3) the preprocessing cost is amortized across searches. For a single search, always use the in-place algorithm.
Let's consolidate everything into a clear decision framework.
Step 1: Identify the Matrix Structure

Ask these questions in order:

1. Is the matrix fully sorted (each row's first element greater than the previous row's last)? Then it is effectively one sorted 1D array.
2. Are rows and columns each sorted independently? Then the staircase property holds.
3. Are only rows (or only columns) sorted? Then binary search along the sorted dimension is available.
4. No structure at all? Then only a linear scan is possible.

Step 2: Consider the Use Case

How many searches will you run? Is the matrix static or frequently updated? Is extra memory available for preprocessing? A single search favors the in-place algorithm; many searches over a static matrix can justify building a hash set up front.

Step 3: Select Algorithm

Match the structure from Step 1 (adjusted by the use case from Step 2) against the complexity table: 1D binary search for fully sorted, staircase for row-and-column sorted, per-row or per-column binary search for singly sorted, linear scan otherwise. For extreme aspect ratios, compare m + n against m log n (or n log m) before committing to staircase.
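The framework can be condensed into a small dispatch function. This is a sketch: the type labels follow the MatrixSearcher class shown earlier on this page, except 'col_sorted', which is an extra label of mine that the detector there does not produce.

```python
import math

def choose_algorithm(matrix_type, m, n):
    """Map a detected matrix structure (plus dimensions) to an algorithm,
    following the three-step framework above."""
    if matrix_type == 'fully_sorted':
        return '1D binary search'              # O(log(m*n))
    if matrix_type == 'row_column_sorted':
        # Aspect-ratio check: for very wide matrices, m*log2(n)
        # operations can undercut the staircase's m + n
        if m * math.log2(n) < m + n:
            return 'binary search per row'     # O(m log n)
        return 'staircase'                     # O(m + n)
    if matrix_type == 'row_sorted':
        return 'binary search per row'         # O(m log n)
    if matrix_type == 'col_sorted':
        return 'binary search per column'      # O(n log m)
    return 'linear scan'                       # O(m*n)
```

For example, a 10×1000 row-and-column sorted matrix dispatches to binary search per row (about 100 operations) rather than staircase (1010 operations).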
2D matrix search is a staple of technical interviews. Here's how to excel:
1. Always Clarify the Sorting Structure
Don't assume! Ask: "Are both rows and columns sorted? Does each row end before the next row starts?"
2. State Your Approach Before Coding
"Given this matrix structure, I'll use [algorithm] which achieves O([complexity])."
3. Handle Edge Cases First
4. Know Both Top-Right and Bottom-Left Variants
Some interviewers prefer one; be ready for either.
5. Be Ready to Prove Correctness
Explain the loop invariant if asked: "At each step, if the target exists, it's in the remaining submatrix."
6. Discuss Trade-offs
Mention that staircase is O(m+n) while binary search is O(log(m×n)), but staircase requires weaker sorting assumptions.
```python
def searchMatrix(matrix, target):
    """
    Template for interview 2D matrix search problems.

    Before coding:
    1. Clarify: What sorting properties does the matrix have?
    2. Choose:  Based on structure, select algorithm
    3. State:   "I'll use [X] for O([Y]) complexity"
    4. Code:    Implement with clear structure
    5. Test:    Walk through with example

    This template handles the most common case: row-and-column sorted.
    """
    # 1. Edge case handling (DO THIS FIRST)
    if not matrix or not matrix[0]:
        return False

    m, n = len(matrix), len(matrix[0])

    # Optional: Quick rejection if outside range
    if target < matrix[0][0] or target > matrix[m-1][n-1]:
        return False

    # 2. Algorithm implementation
    # Staircase search from top-right (or bottom-left)
    row, col = 0, n - 1

    # 3. Clear loop with invariant comment
    # Invariant: target (if it exists) is in M[row:m][0:col+1]
    while row < m and col >= 0:
        current = matrix[row][col]
        if current == target:
            return True      # Found!
        elif current > target:
            col -= 1         # Eliminate column
        else:
            row += 1         # Eliminate row

    # 4. Target not found
    return False


def searchMatrix_FullySorted(matrix, target):
    """
    Alternative for a FULLY sorted matrix (LeetCode 74 style).

    Use when: "First integer of each row > last of previous row"
    """
    if not matrix or not matrix[0]:
        return False

    m, n = len(matrix), len(matrix[0])
    left, right = 0, m * n - 1

    while left <= right:
        mid = left + (right - left) // 2
        # Convert 1D index to 2D coordinates
        row, col = mid // n, mid % n
        val = matrix[row][col]

        if val == target:
            return True
        elif val < target:
            left = mid + 1
        else:
            right = mid - 1

    return False


# Interview walkthrough example:
# "Let me trace through with matrix [[1,3,5],[10,11,16],[23,30,34]] and target 11"
#
# Step 1: Start at (0, 2) = 5
# Step 2: 5 < 11, move down to (1, 2) = 16
# Step 3: 16 > 11, move left to (1, 1) = 11
# Step 4: 11 == 11, return True
#
# "This took 3 comparisons, well within the O(m+n) = 6 bound."
```

This module has taken you from basic 2D matrix concepts to complete mastery of efficient search algorithms.
Let's consolidate everything.
| Matrix Type | Best Algorithm | Time | When to Use |
|---|---|---|---|
| Fully sorted | 1D Binary Search | O(log(m×n)) | LeetCode 74, row ends < next start |
| Row-and-column sorted | Staircase | O(m + n) | LeetCode 240, independent row/col sorting |
| Row-sorted only | Binary/Row | O(m × log n) | Rows sorted, columns arbitrary |
| Column-sorted only | Binary/Column | O(n × log m) | Columns sorted, rows arbitrary |
| Unsorted | Linear scan | O(m × n) | No structure to exploit |
Congratulations!
You've completed the Searching in 2D Matrices module. You now possess the theoretical understanding and practical skills to tackle any 2D matrix search problem efficiently. This knowledge extends beyond interviews—it applies to image processing, database queries, geographic information systems, and countless other domains where 2D data must be searched.
The principles learned here—exploiting structure, choosing algorithms based on constraints, analyzing complexity rigorously—are transferable to all algorithmic problem-solving. You're not just learning techniques; you're developing the analytical mindset of a world-class engineer.
You've mastered 2D matrix search! You can identify matrix types, select optimal algorithms, implement them correctly, prove their correctness, and analyze their complexity. From O(m×n) brute force to O(log(m×n)) binary search, you understand the full spectrum of approaches and when each applies.