In the previous page, we discovered that arrays offer instant access when you know the index. Want element 5? Done. Element 5 million? Also done—same O(1) time.
But here's a fundamental question that changes everything:
What if you don't know the index?
What if, instead of asking "What's at position 7?", you're asking "Where is the value 42?" or "Is 'Alice' in this list?"
This is the search problem, and it reveals the first significant cost in working with arrays.
By the end of this page, you will understand why searching for a value in an unsorted array requires examining every element in the worst case—O(n) time. You'll learn the linear search algorithm, understand its unavoidable nature, and recognize when this cost is acceptable versus when you need strategies to avoid it.
Before diving deeper, let's crystallize the distinction that makes all the difference:
Access (O(1)): Given an index i, retrieve the value at that position.
Search (O(n) for unsorted): Given a value v, determine whether it exists and, if so, at which index.
This distinction is fundamental to data structure selection. If your use case is 'access by position frequently,' arrays excel. If your use case is 'search by value frequently,' arrays may not be the best choice—hash tables, trees, or sorted arrays with binary search might be better.
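The distinction is easy to see in a few lines of Python (the list and values here are purely illustrative):

```python
# Access vs. search, side by side (illustrative data).
names = ["Carol", "Alice", "Bob", "Dave"]

# Access: given an index, jump straight to the slot; one step, any size.
third = names[2]                 # "Bob"

# Search: given a value, the position is unknown, so Python scans the list.
has_alice = "Alice" in names     # True, found by checking elements in order
position = names.index("Alice")  # 1, also a linear scan under the hood

print(third, has_alice, position)
```

Both `in` and `.index()` look like single operations, but each hides a loop over the list.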
When your data is unsorted, there's only one way to find a value: check each element until you find it. This is called linear search (or sequential search), and it's the simplest searching algorithm that exists.
The algorithm:

1. Start at the first element (index 0).
2. Compare the current element with the target value.
3. If they match, return the current index.
4. If not, move to the next element.
5. If you reach the end without a match, report "not found" (conventionally -1).

That's it. There are no clever tricks when data has no order—you simply must look everywhere.
```python
def linear_search(arr, target):
    """
    Search for target in arr.
    Returns the index if found, -1 otherwise.

    Time Complexity: O(n) — must check each element
    Space Complexity: O(1) — only a few variables
    """
    for i in range(len(arr)):
        if arr[i] == target:
            return i  # Found! Return index
    return -1  # Not found after checking all elements

# Example usage
numbers = [64, 34, 25, 12, 22, 11, 90]
result = linear_search(numbers, 22)
print(f"Found at index: {result}")  # Output: Found at index: 4

# Worst case: element not in array or at the end
result = linear_search(numbers, 100)  # Returns -1, checked all 7 elements
result = linear_search(numbers, 90)   # Returns 6, checked all 7 elements
```

You might wonder: "Surely there's a clever trick to search faster?" For unsorted data, the answer is no—and here's why this is mathematically provable.
The Information-Theoretic Argument:
Consider an array of n distinct elements in arbitrary order. When you check element at position i, you learn exactly one fact: whether arr[i] equals the target.
If it doesn't match, what have you learned about the target's location? Absolutely nothing. Position i+1 could hold it, or position n-1, or anywhere else. Without order, positions are independent—checking one tells you nothing about others.
Imagine an adversary who answers your queries about 'is target at position i?' If you skip ANY position, the adversary can place the target there. To GUARANTEE finding the target (or confirming absence), you must check EVERY position. There's no way to be certain with fewer than n checks.
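The adversary argument can be sketched as a tiny simulation (the function and its name are hypothetical, written only to illustrate the counting):

```python
# Hypothetical adversary: it answers "not here" to every position you probe,
# and is free to hide the target in any position you never checked.
def adversary_beats_search(n, probe_order):
    probed = set(probe_order)            # every "is target at i?" query answered "no"
    unchecked = set(range(n)) - probed
    # If any position went unprobed, the adversary places the target there,
    # so a conclusion of "absent" would be wrong. Certainty costs all n probes.
    return len(unchecked) > 0

print(adversary_beats_search(8, [0, 1, 2, 3, 4, 5, 6]))  # True: position 7 was skipped
print(adversary_beats_search(8, range(8)))               # False: every position verified
```

No matter how cleverly you order the probes, skipping even one position leaves the adversary a winning move.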
Why guessing doesn't help:
Suppose your target is 42, and you cleverly guess position 500 first. You find 73 there. What does this tell you?
The value 73 gives you no directional information because the data is unsorted. Position 500 holding 73 tells you nothing about what's at position 0, 501, or anywhere else.
This is the fundamental limitation of unsorted data.
Contrast with sorted data (preview):
If the array were sorted, finding 73 at position 500 tells you:

- If your target is less than 73, it can only be in positions 0 through 499.
- If your target is greater than 73, it can only be in positions 501 and beyond.
- Either way, one comparison eliminates half the array.
This is the power of order—it creates relationships between elements. We'll explore this in the next page on binary search.
Let's rigorously analyze linear search across all scenarios.
Worst Case: O(n)
The worst case occurs when:

- The target is the last element in the array, or
- The target is not in the array at all.

In both cases, we must examine all n elements before we can conclude.
| Case | Comparisons | Big O | When It Happens |
|---|---|---|---|
| Best Case | 1 | O(1) | Target is the first element |
| Worst Case | n | O(n) | Target is last or absent |
| Average Case | n/2 | O(n) | Target is somewhere in the middle (on average) |
Understanding the Average Case:
If the target is equally likely to be at any position (1 through n), the expected number of comparisons is:
E[comparisons] = (1 + 2 + 3 + ... + n) / n = n(n+1)/(2n) = (n+1)/2 ≈ n/2
This is still O(n)—the constant factor of 1/2 doesn't change the growth rate. When we double n, the average search time doubles.
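A quick empirical check of the (n+1)/2 figure, assuming as above that every target position is equally likely (helper name is illustrative):

```python
# Count comparisons for every possible target position, then average.
def comparisons_to_find(arr, target):
    count = 0
    for val in arr:
        count += 1              # one comparison per element examined
        if val == target:
            break
    return count

n = 1000
arr = list(range(n))
avg = sum(comparisons_to_find(arr, t) for t in arr) / n
print(avg)  # 500.5, exactly (n + 1) / 2
```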
The Average Case Still Hurts:
Even the 'average' case becomes problematic at scale: halving the work still leaves O(n) growth, and half of a billion elements is 500 million comparisons.
In algorithm analysis, worst case usually matters most for system design—it tells you the guaranteed upper bound on performance. Systems must handle their heaviest loads, not just typical ones. A search that's 'usually fast' but occasionally takes hours is a reliability problem.
Let's make the O(n) cost concrete with realistic numbers. Assume each comparison takes a modest 10 nanoseconds (10⁻⁸ seconds).
| Array Size (n) | Worst Case Comparisons | Time (@ 10ns each) | Practical Impact |
|---|---|---|---|
| 100 | 100 | 1 microsecond | Imperceptible |
| 10,000 | 10,000 | 100 microseconds | Still fast |
| 1,000,000 | 1,000,000 | 10 milliseconds | Noticeable lag |
| 100,000,000 | 100,000,000 | 1 second | Unacceptable for UI |
| 10,000,000,000 | 10,000,000,000 | 100 seconds | System hang |
Now consider frequent searches:
If your application performs 1,000 searches per second on a 1-million-element array:

- Each worst-case search takes about 10 milliseconds (from the table above).
- 1,000 searches × 10 ms = 10 seconds of search work demanded every second.

You physically can't keep up. The searches queue up, latency grows, and eventually the system collapses.
This is why search performance matters. Linear search works fine for small datasets but becomes a bottleneck at scale.
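Comparison counts make this linear growth visible without machine-dependent timing (a small sketch; the sentinel value is arbitrary):

```python
# Worst case: search for a value that is absent, counting comparisons.
def worst_case_comparisons(n):
    arr = list(range(n))
    count = 0
    for val in arr:
        count += 1
        if val == -1:           # -1 never appears, so the scan runs to the end
            break
    return count

for n in (1_000, 10_000, 100_000):
    print(n, worst_case_comparisons(n))  # the count grows in lockstep with n
```

Multiply any of these counts by your per-comparison cost and you reproduce the table above.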
A major e-commerce site once had a product lookup that used linear search through a cached product list. As the catalog grew from 10K to 500K products, page load times increased 50x. Sales dropped 30% before engineers diagnosed the O(n) search buried in the code.
Linear search manifests in many forms—sometimes disguised. Learning to recognize these patterns helps you spot O(n) costs in your code.
```python
# Pattern 1: Finding first occurrence
def find_first(arr, target):
    for i, val in enumerate(arr):
        if val == target:
            return i
    return -1

# Pattern 2: Finding ALL occurrences — O(n) regardless
def find_all(arr, target):
    indices = []
    for i, val in enumerate(arr):
        if val == target:
            indices.append(i)
    return indices  # Must scan entire array

# Pattern 3: Finding element matching a condition
def find_first_even(arr):
    for val in arr:
        if val % 2 == 0:
            return val
    return None

# Pattern 4: Existence check (often hidden in the 'in' operator!)
if target in arr:  # This is O(n) linear search!
    print("Found")

# Pattern 5: Count occurrences
count = arr.count(target)  # O(n) — scans entire array

# Pattern 6: Finding max/min (linear scan required)
maximum = max(arr)  # O(n) — no way around it for unsorted data
```

Many built-in methods perform linear search: `in`, `includes()`, `indexOf()`, `find()`, `count()`, `any()`, `all()`. When used in loops, these create O(n²) patterns that devastate performance. Always ask: 'Is this operation O(n)?'
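The hidden-O(n²) trap that warning describes looks like this in practice (function names are illustrative):

```python
# Quadratic: each 'x in b' is itself a linear search, so the total is O(len(a) * len(b)).
def common_items_quadratic(a, b):
    return [x for x in a if x in b]

# Linear: pay O(len(b)) once to build a set, then each lookup is O(1) on average.
def common_items_linear(a, b):
    b_set = set(b)
    return [x for x in a if x in b_set]

a = list(range(0, 100, 2))   # even numbers
b = list(range(0, 100, 3))   # multiples of 3
print(common_items_linear(a, b))  # multiples of 6: [0, 6, 12, ..., 96]
```

Both versions return the same result; only the work done differs. The one-line `set(b)` conversion is often the cheapest performance fix available.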
Despite its O(n) cost, linear search is often the right choice. Don't optimize prematurely—understand when O(n) is acceptable.
The Engineering Judgment:
The question isn't 'Is O(n) bad?' but 'Is O(n) bad for my use case?'
Always profile with realistic data volumes before optimizing.
For small n, simpler algorithms often beat 'faster' ones due to constant factors. Binary search on 10 elements isn't necessarily faster than linear search because the bookkeeping overhead (computing midpoints, etc.) adds up. The crossover point varies but is often around n=50-100.
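A rough way to see the crossover yourself (timings vary by machine, so treat the numbers as illustrative, not definitive):

```python
import bisect
import timeit

def linear_search(arr, target):
    for i, val in enumerate(arr):
        if val == target:
            return i
    return -1

def binary_search(arr, target):
    i = bisect.bisect_left(arr, target)           # requires arr to be sorted
    return i if i < len(arr) and arr[i] == target else -1

arr = list(range(10))                             # tiny and already sorted
for name, fn in [("linear", linear_search), ("binary", binary_search)]:
    t = timeit.timeit(lambda: fn(arr, 9), number=100_000)
    print(name, round(t, 4))                      # on tiny arrays the gap is small, or reversed
```

Run it with larger arrays and binary search pulls decisively ahead; at n=10 the midpoint arithmetic can cost as much as it saves.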
When linear search becomes a bottleneck, you have several strategies to escape O(n) search time. Each has trade-offs:

- Sort the array once, then use binary search: O(n log n) to sort, O(log n) per search.
- Build a hash table: O(n) time and extra space to build, O(1) average per lookup.
- Maintain a tree or index structure: O(log n) searches that stay fast as data changes.
The Pattern: Trade Space or Preprocessing for Speed
All these strategies share a theme: invest time/space upfront to make subsequent searches faster. This is a fundamental trade-off in algorithm design—there's no free lunch, but the trade is often worth it.
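Here's the trade in miniature, using a Python set as the preprocessed structure (sizes and timings are illustrative):

```python
import timeit

data = list(range(1_000_000))
lookup = set(data)        # one-time O(n) build, O(n) extra space

# Query for the worst-case element (the last one) 100 times each way.
slow = timeit.timeit(lambda: 999_999 in data, number=100)    # O(n) per query
fast = timeit.timeit(lambda: 999_999 in lookup, number=100)  # O(1) average per query
print(f"list: {slow:.4f}s  set: {fast:.6f}s")                # set wins by orders of magnitude
```

The build cost is repaid after just a handful of queries; whether that trade is worth it depends on how often you search versus how often the data changes.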
We've thoroughly explored the first major cost in array operations. Let's consolidate the key insights:

- Access by index is O(1); search by value in unsorted data is O(n).
- Without order, checking one element tells you nothing about the others, so the worst case requires examining every position.
- The average case of roughly n/2 comparisons is still O(n): doubling the data doubles the search time.
- Linear search is perfectly reasonable for small or rarely-searched data, but becomes a bottleneck at scale and under frequent queries.
What's next:
We've seen that unsorted arrays force O(n) search. But what if the array IS sorted? Can we do better?
Yes—dramatically better. Next, we explore binary search, which reduces search time from O(n) to O(log n)—a transformation that makes searching a billion elements as fast as searching thirty.
You now understand why searching unsorted arrays requires O(n) time, when this cost is acceptable, and how to escape it when necessary. Next, we'll discover the magic of binary search on sorted arrays.