Every search algorithm, from the simplest linear scan to the most sophisticated distributed search engine, operates on two fundamental pillars: the search space (where we look) and the target (what we seek). These two components completely define the search problem—and understanding them deeply is what separates engineers who apply algorithms mechanically from those who design algorithms creatively.
Consider the difference between searching for a name in a phone book versus searching for a route through a city. The targets are different (a string versus a path), but more importantly, the search spaces are radically different. The phone book is a sorted linear sequence; the city is a graph of interconnected streets. Algorithms that work brilliantly for one fail utterly for the other.
By the end of this page, you will master the art of characterizing search spaces—their size, structure, ordering, and accessibility—and understand how to precisely specify search targets. These skills form the foundation for all algorithmic analysis and selection.
The search space is the universe of all possible candidates that might contain our target. It's not just a collection of elements—it's a structured mathematical entity with properties that enable or preclude different algorithmic approaches.
Formally, a search space S is a set (or multiset, if duplicates exist) of elements from some domain D, along with:
| Characteristic | Definition | Impact on Algorithm Selection |
|---|---|---|
| Size (n) | Number of elements in S | Determines asymptotic requirements; small n allows O(n²), large n demands O(log n) or better |
| Boundedness | Is n finite or infinite? | Infinite spaces require termination criteria; may need approximation algorithms |
| Static vs Dynamic | Does S change during search? | Dynamic spaces invalidate some optimizations; may need concurrent data structures |
| Explicit vs Implicit | Is S materialized or generated on-demand? | Implicit spaces save memory but may cost computation; crucial for state-space search |
| Uniform vs Structured | Do elements have meaningful relationships? | Structure enables divide-and-conquer; uniformity limits us to linear scanning |
Always ask: 'How many candidates exist?' For an array of n elements, it's simply n. For all possible chess positions, it's ~10^43. For all real numbers between 0 and 1, it's infinite. The answer fundamentally shapes your approach—exhaustive search is viable for n = 100, impossible for n = 10^43, and meaningless for infinite sets.
Search spaces can be classified by their inherent structure. This classification is perhaps the most important factor in algorithm selection.
Elements exist in a one-dimensional sequence. This includes:

- Arrays and lists (contiguous or linked)
- Strings (sequences of characters)
- Sorted sequences (ordered arrays, sorted files)
Elements form parent-child relationships:

- Binary search trees (ordered hierarchies)
- Heaps (partially ordered trees)
- Tries and B-trees (keyed hierarchies)
Why Structure Matters:
Structure determines information gain from each operation. In an unsorted array, checking element i tells you nothing about element j. In a sorted array, checking element i eliminates half the candidates. In a BST, each comparison eliminates an entire subtree.
The Elimination Principle: The more information each operation provides, the fewer operations needed. Structure is what creates information.
Among all properties a search space can possess, ordering is arguably the most powerful. A totally ordered search space transforms search from a scanning problem into an elimination problem.
Total Order: Every pair of elements is comparable. Examples:

- Integers and real numbers under ≤
- Strings under lexicographic (dictionary) order
- Timestamps and dates
Partial Order: Some pairs are comparable, others are not. Examples:

- Sets under inclusion ({1} ⊆ {1, 2}, but {1} and {2} are incomparable)
- Integers under divisibility
- Tasks under dependency ordering (a DAG of prerequisites)
```
// The profound impact of ordering on search

// Without ordering (unsorted array):
function linear_search(arr, target):
    for each element in arr:        // Must check ALL elements
        if element == target:
            return FOUND
    return NOT_FOUND
// Worst case: O(n) comparisons
// Each comparison eliminates: 1 element
// Information gain per comparison: O(1/n)

// With total ordering (sorted array):
function binary_search(arr, target):
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = floor((low + high) / 2)
        if arr[mid] == target:
            return FOUND
        else if arr[mid] < target:
            low = mid + 1           // Eliminate HALF of remaining
        else:
            high = mid - 1          // Eliminate HALF of remaining
    return NOT_FOUND
// Worst case: O(log n) comparisons
// Each comparison eliminates: half of the remaining candidates
// Information gain per comparison: 1 bit
```

From information theory: locating an element among n candidates requires log₂(n) bits of information. Each comparison in a sorted structure provides 1 bit (go left or go right), so log₂(n) comparisons suffice. In an unsorted structure, each comparison provides only about 1/n of a bit on average, requiring n comparisons. Ordering is information pre-computed and embedded in the structure.
Ordering isn't free. Maintaining a sorted structure:
The engineering decision: Is the one-time sort cost amortized across enough searches to be worthwhile? For one search in n elements, sorting (O(n log n)) then searching (O(log n)) is slower than a linear scan (O(n)). For k searches, sorting pays off when O(n log n) + k × O(log n) < k × O(n), which works out to approximately k > log n.
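The break-even point can be illustrated numerically. This sketch assumes unit constant factors on every comparison (a simplification for illustration, not a benchmark):

```python
import math

def total_comparisons(n: int, k: int) -> tuple:
    """Rough comparison counts for k searches over n elements,
    assuming unit constant factors on each operation."""
    linear = k * n                                  # k linear scans
    presort = n * math.log2(n) + k * math.log2(n)   # one sort + k binary searches
    return linear, presort

n = 1024
for k in (1, 10, 11, 100):
    lin, srt = total_comparisons(n, k)
    winner = "sort+binary" if srt < lin else "linear"
    print(f"k={k:3d}: linear≈{lin:.0f}, sort+binary≈{srt:.0f} → {winner}")
```

For n = 1024 the crossover lands near k ≈ log₂(n) ≈ 10, matching the rule of thumb above: pure linear scanning wins through k = 10, and sorting first wins from k = 11 onward.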
A crucial distinction often overlooked by novice programmers: search spaces can be explicit (physically stored in memory) or implicit (generated mathematically on demand).
The elements are materialized—stored in a data structure:

- Arrays and linked lists
- Hash tables and search trees
- Database tables and files on disk
Advantages: Fast access; can precompute properties; supports indexing. Disadvantages: Memory bounded; must construct before search; static snapshots.
Elements are defined by rules but not stored:

- State spaces (all reachable configurations of a puzzle or game)
- Combinatorial spaces (permutations, subsets, assignments)
- Numeric ranges defined only by their bounds
```python
from typing import Tuple, Generator

class PuzzleState:
    """
    Represents a state in the 8-puzzle implicit search space.
    The search space is defined by the get_neighbors function,
    not by enumerating all possible states.
    """
    def __init__(self, grid: Tuple[Tuple[int, ...], ...]):
        self.grid = grid
        self._blank_pos = self._find_blank()

    def _find_blank(self) -> Tuple[int, int]:
        """Find position of blank (0) tile."""
        for i, row in enumerate(self.grid):
            for j, val in enumerate(row):
                if val == 0:
                    return (i, j)
        raise ValueError("No blank tile found")

    def get_neighbors(self) -> Generator['PuzzleState', None, None]:
        """
        Generate all valid neighboring states.
        This is the key to implicit search spaces:
        we don't store all states, we GENERATE neighbors on demand.
        """
        i, j = self._blank_pos
        moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
        for di, dj in moves:
            ni, nj = i + di, j + dj
            if 0 <= ni < 3 and 0 <= nj < 3:
                # Create new grid with tiles swapped
                new_grid = [list(row) for row in self.grid]
                new_grid[i][j], new_grid[ni][nj] = new_grid[ni][nj], new_grid[i][j]
                yield PuzzleState(tuple(tuple(row) for row in new_grid))

    def is_goal(self) -> bool:
        """Check if this is the target state."""
        return self.grid == ((1, 2, 3), (4, 5, 6), (7, 8, 0))

    def __hash__(self):
        return hash(self.grid)

    def __eq__(self, other):
        return isinstance(other, PuzzleState) and self.grid == other.grid
```

Implicit spaces trade memory for computation. We save storage but recompute neighbors each time. For very deep searches, this recomputation can dominate. The visited set (needed to avoid cycles) still requires memory proportional to the explored states—implicit spaces don't eliminate memory constraints, they reshape them.
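Searching such an implicit space is typically done with breadth-first search, generating states on demand. Here is a minimal generic sketch; to keep it self-contained it uses a toy numeric space (start at 1, moves are `x + 1` and `2x`) rather than the puzzle, but the same `bfs` works with any `neighbors` function, including one built on `get_neighbors` above:

```python
from collections import deque

def bfs(start, is_goal, neighbors):
    """Breadth-first search over an implicit space: states are
    generated on demand; only the visited set consumes memory."""
    frontier = deque([(start, [start])])
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if is_goal(state):
            return path                       # shortest path to the goal
        for nxt in neighbors(state):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, path + [nxt]))
    return None                               # goal unreachable

# Toy implicit space: integers reachable from 1 via x -> x+1 or x -> 2x.
path = bfs(1, lambda x: x == 10, lambda x: (x + 1, 2 * x))
print(path)  # [1, 2, 4, 5, 10]
```

Note that the space of reachable integers is never materialized; only the states BFS actually touches ever exist in memory.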
While the search space defines where we look, the target defines what we're looking for. Imprecise targets lead to incorrect algorithms, edge-case bugs, and ambiguous results.
1. Exact Value Match
The simplest form: find element x where x = v for some specific value v.
Target: Find the integer 42 in array [17, 42, 8, 42, 23]
2. Predicate Satisfaction
Find any element x where predicate p(x) evaluates to true.
Target: Find any element x where x > 30 AND x is even
→ 42 satisfies this predicate
3. Optimization Objective
Find the element x that minimizes or maximizes some function f(x).
Target: Find the element with minimum absolute difference from 25
→ 23 minimizes |x - 25| with difference 2
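The three target forms can be expressed directly against the example array from above:

```python
arr = [17, 42, 8, 42, 23]

# 1. Exact value match: index of the first element equal to 42
exact = next((i for i, x in enumerate(arr) if x == 42), None)

# 2. Predicate satisfaction: any x that is even and greater than 30
pred = next((x for x in arr if x > 30 and x % 2 == 0), None)

# 3. Optimization objective: the element minimizing |x - 25|
best = min(arr, key=lambda x: abs(x - 25))

print(exact, pred, best)  # 1 42 23
```

All three run in O(n) on an unstructured array; as the following table notes, different structures (hash tables, sorted arrays) accelerate some target types but not others.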
| Target Type | Specification | Uniqueness | Algorithm Implications |
|---|---|---|---|
| Exact Value | x = v | May have 0, 1, or many matches | Hash lookup O(1) when applicable |
| Predicate | p(x) = true | Typically many matches | Must evaluate predicate on candidates |
| Minimum | argmin f(x) | Usually unique (tie-breaking needed) | May enable pruning with bounds |
| Maximum | argmax f(x) | Usually unique (tie-breaking needed) | Similar to minimum with negation |
| Range | a ≤ x ≤ b | Typically many matches | Sorted structure yields O(log n + k) for k results |
| Top-K | K largest/smallest | Exactly K elements | Heap enables O(n log k) |
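The Top-K row deserves a concrete sketch. Maintaining a min-heap of size k means each push/pop costs O(log k), giving O(n log k) overall instead of O(n log n) for a full sort:

```python
import heapq

def top_k_largest(items, k):
    """Return the k largest items using a size-k min-heap."""
    heap = []
    for x in items:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            heapq.heapreplace(heap, x)  # evict current smallest, admit x
    return sorted(heap, reverse=True)

print(top_k_largest([17, 42, 8, 42, 23, 5, 31], 3))  # [42, 42, 31]
```

The standard library's `heapq.nlargest` implements essentially this strategy; the hand-rolled version is shown to make the O(n log k) bound visible.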
Real-world targets often combine multiple forms:
Example: Database Query
SELECT * FROM users
WHERE age > 25 AND country = 'US'
ORDER BY registration_date
LIMIT 10
This query has:

- Predicate filters: `age > 25 AND country = 'US'`
- An ordering objective: `ORDER BY registration_date`
- A top-k limit: `LIMIT 10`
The query optimizer must decide how to decompose this into primitive searches, what indices to use, and in what order to apply filters.
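One naive decomposition of this query maps each clause onto a primitive search. The records below are hypothetical stand-ins for the `users` table:

```python
from datetime import date

# Hypothetical rows standing in for the `users` table.
users = [
    {"name": "Ana",  "age": 31, "country": "US", "registration_date": date(2021, 3, 1)},
    {"name": "Ben",  "age": 22, "country": "US", "registration_date": date(2020, 7, 9)},
    {"name": "Caro", "age": 45, "country": "DE", "registration_date": date(2019, 1, 5)},
    {"name": "Dee",  "age": 28, "country": "US", "registration_date": date(2022, 6, 2)},
]

# WHERE: predicate satisfaction
matches = [u for u in users if u["age"] > 25 and u["country"] == "US"]
# ORDER BY: impose a total order on the matches
matches.sort(key=lambda u: u["registration_date"])
# LIMIT: top-k truncation
result = matches[:10]

print([u["name"] for u in result])  # ['Ana', 'Dee']
```

A real query planner would often reorder these steps—for instance, walking an index on `registration_date` and filtering lazily until 10 rows are found—which is exactly the kind of space/target trade-off this page describes.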
The choice of algorithm emerges from the interaction between search space and target. The same target on different spaces requires different algorithms; the same space with different targets requires different algorithms.
| Search Space | Target: First x ≥ k | Best Algorithm | Complexity |
|---|---|---|---|
| Unsorted array | Linear scan until x ≥ k found | Linear search with early exit | O(n) worst case |
| Sorted array | Binary search for lower bound | Binary search variant | O(log n) |
| Balanced BST | Inorder successor search | Tree traversal | O(log n) |
| Min-heap | Must examine all elements | Linear scan of heap array | O(n) — heap order doesn't help |
| Hash set | Linear scan (no order) | Iterate through all | O(n) — hash doesn't preserve order |
Observation 1: The target "find first x ≥ k" only benefits from sorting/ordering. Hash structures provide O(1) lookup for equality but can't help with inequality targets.
Observation 2: Heaps are ordered, but only for min/max access. The heap property (parent ≤ children) doesn't help locate arbitrary thresholds.
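On the sorted-array row of the table, the "first x ≥ k" target is a lower-bound binary search, available in Python's standard library as `bisect_left`:

```python
import bisect

def first_at_least(arr, k):
    """First element >= k in a sorted array, in O(log n).
    bisect_left returns the leftmost valid insertion point for k."""
    i = bisect.bisect_left(arr, k)
    return arr[i] if i < len(arr) else None

sorted_arr = [8, 17, 23, 42, 42, 57]
print(first_at_least(sorted_arr, 25))  # 42
print(first_at_least(sorted_arr, 23))  # 23
print(first_at_least(sorted_arr, 99))  # None
```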
Now consider the reverse: same space, different targets:
| Target Type | Query Example | Algorithm | Complexity |
|---|---|---|---|
| Exact match | Find x where x = 42 | Binary search | O(log n) |
| Existence | Is 42 in the array? | Binary search (boolean result) | O(log n) |
| First occurrence | First x where x = 42 | Binary search (leftmost) | O(log n) |
| Count | How many elements = 42? | Two binary searches (bounds) | O(log n) |
| Range query | All x where 30 ≤ x ≤ 50 | Two binary searches + iterate | O(log n + k) |
| Nearest neighbor | Closest x to 42 | Binary search + neighbors | O(log n) |
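The count and range-query rows both reduce to a pair of binary searches over the same sorted array:

```python
from bisect import bisect_left, bisect_right

arr = [8, 17, 23, 42, 42, 57]  # sorted

# Count: how many elements equal 42? Two binary searches bound the run.
count_42 = bisect_right(arr, 42) - bisect_left(arr, 42)

# Range query: all x with 30 <= x <= 50.
# O(log n) to locate the window, O(k) to emit the k results.
lo, hi = bisect_left(arr, 30), bisect_right(arr, 50)
in_range = arr[lo:hi]

print(count_42, in_range)  # 2 [42, 42]
```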
Choose your data structure (search space construction) based on the most frequent target type. If you mostly do equality lookups, hash tables excel. If you need range queries, sorted structures win. If you need priority access, heaps are optimal. Structure your space to match your targets.
Many real-world search problems involve multi-dimensional data where elements have multiple attributes. Finding the right element requires searching across all dimensions.
Consider searching for restaurants, where each candidate has several attributes:

- Location (distance from you)
- Cuisine type
- Price
- Rating
A search like "Italian restaurants within 2 miles, under $30, rated above 4 stars" queries across all dimensions simultaneously.
The Curse of Dimensionality: As dimensions increase, the volume of the space grows exponentially. Methods that work for 1D or 2D often fail for higher dimensions because 'nearby' becomes increasingly rare.
| Structure | Dimensions | Query Types | Build Time | Query Time |
|---|---|---|---|---|
| Sorted array + binary search | 1D | Range, exact | O(n log n) | O(log n) |
| K-D Tree | Low-D (2-20) | Range, nearest neighbor | O(n log n) | O(n^(1-1/k) + m) |
| R-Tree | Low-D spatial | Range, containment | O(n log n) | O(log n + m) |
| Multi-dimensional index | Arbitrary | Composite queries | Variable | Variable |
| Linear scan with filters | Any | Any predicate | O(n) | O(n) |
Strategy 1: Reduce to 1D If one dimension dominates query patterns, index on that dimension and filter the rest.
Strategy 2: Composite Keys For exact matches across dimensions, combine dimensions into a single key for hash lookup.
Strategy 3: Dimensional Decomposition Query each dimension separately and intersect results. Works well when each dimension filter is highly selective.
Strategy 4: Space-Filling Curves Map multi-dimensional space to 1D using Z-order or Hilbert curves, preserving locality for range queries.
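Strategy 4 can be sketched for the 2-D case. This is a minimal Morton (Z-order) encoder—bit-interleaving coordinates so that nearby 2-D points tend to receive nearby 1-D keys, letting a plain sorted index approximate 2-D locality:

```python
def morton_2d(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of (x, y) into a single Z-order key.
    Bit i of x lands at position 2i; bit i of y at position 2i + 1."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z

# Points in the same small cell land next to each other in key order.
pts = [(7, 7), (0, 1), (1, 1), (0, 0), (1, 0)]
print(sorted(pts, key=lambda p: morton_2d(*p)))
# [(0, 0), (1, 0), (0, 1), (1, 1), (7, 7)]
```

Hilbert curves preserve locality better than Z-order at the cost of a more involved encoding; Z-order is shown here because the bit-interleaving is compact enough to read at a glance.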
Modern databases handle multi-dimensional search through query planners that analyze the selectivity of each dimension, choose indices strategically, and combine results optimally. Understanding search spaces helps you understand what the query planner is optimizing.
We've explored the two fundamental components of every search problem: the space of candidates and the specification of what we seek. Let's consolidate the key insights:

- A search space is characterized by its size, boundedness, dynamism, explicit vs implicit representation, and structure.
- Structure creates information: ordering turns search from a scanning problem into an elimination problem.
- Targets take distinct forms—exact match, predicate, optimization, range, top-k—each with different algorithmic implications.
- Algorithm choice emerges from the interaction of space and target, never from either alone.
What's Next:
With search space and target well understood, we'll next explore the types of results a search can produce. Is the goal to confirm existence? Find a location? Return all matches? Each answer type changes the algorithm's requirements and termination conditions.
You now understand how to characterize search spaces by their size, structure, and accessibility, and how to precisely specify search targets. These skills enable you to analyze any search problem and identify which algorithms are applicable. Next, we'll examine the different types of search results.