In the previous page, we discovered that arrays offer instant access when you know the index. Want element 5? Done. Element 5 million? Also done—same O(1) time.
But here's a fundamental question that changes everything:
What if you don't know the index?
What if, instead of asking "What's at position 7?", you're asking "Where is the value 42?" or "Is 'Alice' in this list?"
This is the search problem, and it reveals the first significant cost in working with arrays.
By the end of this page, you will understand why searching for a value in an unsorted array requires examining every element in the worst case—O(n) time. You'll learn the linear search algorithm, understand its unavoidable nature, and recognize when this cost is acceptable versus when you need strategies to avoid it.
Before diving deeper, let's crystallize the distinction that makes all the difference:
Access (O(1)): Given an index i, retrieve the value at that position.
Search (O(n) for unsorted): Given a value v, determine whether it exists and, if so, at which index.
This distinction is fundamental to data structure selection. If your use case is 'access by position frequently,' arrays excel. If your use case is 'search by value frequently,' arrays may not be the best choice—hash tables, trees, or sorted arrays with binary search might be better.
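The distinction is easy to see in a few lines of Python (the list and values here are purely illustrative):

```python
# Access vs. search, side by side (illustrative data).
names = ["Carol", "Alice", "Bob", "Dave"]

# Access: given an index, jump straight to the slot; one step, any size.
third = names[2]                 # "Bob"

# Search: given a value, the position is unknown, so Python scans the list.
has_alice = "Alice" in names     # True, found by checking elements in order
position = names.index("Alice")  # 1, also a linear scan under the hood

print(third, has_alice, position)
```

Both `in` and `.index()` look like single operations, but each hides a loop over the list.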
When your data is unsorted, there's only one way to find a value: check each element until you find it. This is called linear search (or sequential search), and it's the simplest searching algorithm that exists.
The algorithm:

1. Start at the first element (index 0).
2. Compare the current element with the target value.
3. If they match, return the current index.
4. If not, move to the next element.
5. If you reach the end without a match, report "not found" (conventionally -1).

That's it. There are no clever tricks when data has no order—you simply must look everywhere.
```python
def linear_search(arr, target):
    """
    Search for target in arr.
    Returns the index if found, -1 otherwise.

    Time Complexity: O(n) — must check each element
    Space Complexity: O(1) — only a few variables
    """
    for i in range(len(arr)):
        if arr[i] == target:
            return i  # Found! Return index
    return -1  # Not found after checking all elements

# Example usage
numbers = [64, 34, 25, 12, 22, 11, 90]
result = linear_search(numbers, 22)
print(f"Found at index: {result}")  # Output: Found at index: 4

# Worst case: element not in array or at the end
result = linear_search(numbers, 100)  # Returns -1, checked all 7 elements
result = linear_search(numbers, 90)   # Returns 6, checked all 7 elements
```

You might wonder: "Surely there's a clever trick to search faster?" For unsorted data, the answer is no—and here's why this is mathematically provable.
The Information-Theoretic Argument:
Consider an array of n distinct elements in arbitrary order. When you check element at position i, you learn exactly one fact: whether arr[i] equals the target.
If it doesn't match, what have you learned about the target's location? Absolutely nothing. Position i+1 could hold it, or position n-1, or anywhere else. Without order, positions are independent—checking one tells you nothing about others.
Imagine an adversary who answers your queries about 'is target at position i?' If you skip ANY position, the adversary can place the target there. To GUARANTEE finding the target (or confirming absence), you must check EVERY position. There's no way to be certain with fewer than n checks.
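The adversary argument can be sketched as a tiny simulation (the function and its name are hypothetical, written only to illustrate the counting):

```python
# Hypothetical adversary: it answers "not here" to every position you probe,
# and is free to hide the target in any position you never checked.
def adversary_beats_search(n, probe_order):
    probed = set(probe_order)            # every "is target at i?" query answered "no"
    unchecked = set(range(n)) - probed
    # If any position went unprobed, the adversary places the target there,
    # so a conclusion of "absent" would be wrong. Certainty costs all n probes.
    return len(unchecked) > 0

print(adversary_beats_search(8, [0, 1, 2, 3, 4, 5, 6]))  # True: position 7 was skipped
print(adversary_beats_search(8, range(8)))               # False: every position verified
```

No matter how cleverly you order the probes, skipping even one position leaves the adversary a winning move.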
Why guessing doesn't help:
Suppose your target is 42, and you cleverly guess position 500 first. You find 73 there. What does this tell you?
The value 73 gives you no directional information because the data is unsorted. Position 500 holding 73 tells you nothing about what's at position 0, 501, or anywhere else.
This is the fundamental limitation of unsorted data.
Contrast with sorted data (preview):
If the array were sorted, finding 73 at position 500 tells you:

- If your target is less than 73, it can only be in positions 0 through 499.
- If your target is greater than 73, it can only be in positions 501 and beyond.
- Either way, one comparison eliminates half the array.
This is the power of order—it creates relationships between elements. We'll explore this in the next page on binary search.
Let's rigorously analyze linear search across all scenarios.
Worst Case: O(n)
The worst case occurs when:

- The target is the last element in the array, or
- The target is not in the array at all.

In both cases, we must examine all n elements before we can conclude.
| Case | Comparisons | Big O | When It Happens |
|---|---|---|---|
| Best Case | 1 | O(1) | Target is the first element |
| Worst Case | n | O(n) | Target is last or absent |
| Average Case | n/2 | O(n) | Target is somewhere in the middle (on average) |
Understanding the Average Case:
If the target is equally likely to be at any position (1 through n), the expected number of comparisons is:
E[comparisons] = (1 + 2 + 3 + ... + n) / n = n(n+1)/(2n) = (n+1)/2 ≈ n/2
This is still O(n)—the constant factor of 1/2 doesn't change the growth rate. When we double n, the average search time doubles.
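A quick empirical check of the (n+1)/2 figure, assuming as above that every target position is equally likely (helper name is illustrative):

```python
# Count comparisons for every possible target position, then average.
def comparisons_to_find(arr, target):
    count = 0
    for val in arr:
        count += 1              # one comparison per element examined
        if val == target:
            break
    return count

n = 1000
arr = list(range(n))
avg = sum(comparisons_to_find(arr, t) for t in arr) / n
print(avg)  # 500.5, exactly (n + 1) / 2
```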
The Average Case Still Hurts:
Even the 'average' case becomes problematic at scale: halving the work still leaves O(n) growth, and half of a billion elements is 500 million comparisons.
In algorithm analysis, worst case usually matters most for system design—it tells you the guaranteed upper bound on performance. Systems must handle their heaviest loads, not just typical ones. A search that's 'usually fast' but occasionally takes hours is a reliability problem.
Let's make the O(n) cost concrete with realistic numbers. Assume each comparison takes a modest 10 nanoseconds (10⁻⁸ seconds).
| Array Size (n) | Worst Case Comparisons | Time (@ 10ns each) | Practical Impact |
|---|---|---|---|
| 100 | 100 | 1 microsecond | Imperceptible |
| 10,000 | 10,000 | 100 microseconds | Still fast |
| 1,000,000 | 1,000,000 | 10 milliseconds | Noticeable lag |
| 100,000,000 | 100,000,000 | 1 second | Unacceptable for UI |
| 10,000,000,000 | 10,000,000,000 | 100 seconds | System hang |
Now consider frequent searches:
If your application performs 1,000 searches per second on a 1-million-element array:

- Each worst-case search takes about 10 milliseconds (from the table above).
- 1,000 searches × 10 ms = 10 seconds of search work demanded every second.

You physically can't keep up. The searches queue up, latency grows, and eventually the system collapses.
This is why search performance matters. Linear search works fine for small datasets but becomes a bottleneck at scale.
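Comparison counts make this linear growth visible without machine-dependent timing (a small sketch; the sentinel value is arbitrary):

```python
# Worst case: search for a value that is absent, counting comparisons.
def worst_case_comparisons(n):
    arr = list(range(n))
    count = 0
    for val in arr:
        count += 1
        if val == -1:           # -1 never appears, so the scan runs to the end
            break
    return count

for n in (1_000, 10_000, 100_000):
    print(n, worst_case_comparisons(n))  # the count grows in lockstep with n
```

Multiply any of these counts by your per-comparison cost and you reproduce the table above.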
A major e-commerce site once had a product lookup that used linear search through a cached product list. As the catalog grew from 10K to 500K products, page load times increased 50x. Sales dropped 30% before engineers diagnosed the O(n) search buried in the code.
Linear search manifests in many forms—sometimes disguised. Learning to recognize these patterns helps you spot O(n) costs in your code.
```python
# Pattern 1: Finding first occurrence
def find_first(arr, target):
    for i, val in enumerate(arr):
        if val == target:
            return i
    return -1

# Pattern 2: Finding ALL occurrences — O(n) regardless
def find_all(arr, target):
    indices = []
    for i, val in enumerate(arr):
        if val == target:
            indices.append(i)
    return indices  # Must scan entire array

# Pattern 3: Finding element matching a condition
def find_first_even(arr):
    for val in arr:
        if val % 2 == 0:
            return val
    return None

# Pattern 4: Existence check (often hidden in the 'in' operator!)
if target in arr:  # This is O(n) linear search!
    print("Found")

# Pattern 5: Count occurrences
count = arr.count(target)  # O(n) — scans entire array

# Pattern 6: Finding max/min (linear scan required)
maximum = max(arr)  # O(n) — no way around it for unsorted data
```

Many built-in methods perform linear search: `in`, `includes()`, `indexOf()`, `find()`, `count()`, `any()`, `all()`. When used in loops, these create O(n²) patterns that devastate performance. Always ask: 'Is this operation O(n)?'
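The hidden-O(n²) trap that warning describes looks like this in practice (function names are illustrative):

```python
# Quadratic: each 'x in b' is itself a linear search, so the total is O(len(a) * len(b)).
def common_items_quadratic(a, b):
    return [x for x in a if x in b]

# Linear: pay O(len(b)) once to build a set, then each lookup is O(1) on average.
def common_items_linear(a, b):
    b_set = set(b)
    return [x for x in a if x in b_set]

a = list(range(0, 100, 2))   # even numbers
b = list(range(0, 100, 3))   # multiples of 3
print(common_items_linear(a, b))  # multiples of 6: [0, 6, 12, ..., 96]
```

Both versions return the same result; only the work done differs. The one-line `set(b)` conversion is often the cheapest performance fix available.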
Despite its O(n) cost, linear search is often the right choice. Don't optimize prematurely—understand when O(n) is acceptable.
The Engineering Judgment:
The question isn't 'Is O(n) bad?' but 'Is O(n) bad for my use case?'
Always profile with realistic data volumes before optimizing.
For small n, simpler algorithms often beat 'faster' ones due to constant factors. Binary search on 10 elements isn't necessarily faster than linear search because the bookkeeping overhead (computing midpoints, etc.) adds up. The crossover point varies but is often around n=50-100.
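A rough way to see the crossover yourself (timings vary by machine, so treat the numbers as illustrative, not definitive):

```python
import bisect
import timeit

def linear_search(arr, target):
    for i, val in enumerate(arr):
        if val == target:
            return i
    return -1

def binary_search(arr, target):
    i = bisect.bisect_left(arr, target)           # requires arr to be sorted
    return i if i < len(arr) and arr[i] == target else -1

arr = list(range(10))                             # tiny and already sorted
for name, fn in [("linear", linear_search), ("binary", binary_search)]:
    t = timeit.timeit(lambda: fn(arr, 9), number=100_000)
    print(name, round(t, 4))                      # on tiny arrays the gap is small, or reversed
```

Run it with larger arrays and binary search pulls decisively ahead; at n=10 the midpoint arithmetic can cost as much as it saves.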
When linear search becomes a bottleneck, you have several strategies to escape O(n) search time. Each has trade-offs:

- Sort the array once, then use binary search: O(n log n) to sort, O(log n) per search.
- Build a hash table: O(n) time and extra space to build, O(1) average per lookup.
- Maintain a tree or index structure: O(log n) searches that stay fast as data changes.
The Pattern: Trade Space or Preprocessing for Speed
All these strategies share a theme: invest time/space upfront to make subsequent searches faster. This is a fundamental trade-off in algorithm design—there's no free lunch, but the trade is often worth it.
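Here's the trade in miniature, using a Python set as the preprocessed structure (sizes and timings are illustrative):

```python
import timeit

data = list(range(1_000_000))
lookup = set(data)        # one-time O(n) build, O(n) extra space

# Query for the worst-case element (the last one) 100 times each way.
slow = timeit.timeit(lambda: 999_999 in data, number=100)    # O(n) per query
fast = timeit.timeit(lambda: 999_999 in lookup, number=100)  # O(1) average per query
print(f"list: {slow:.4f}s  set: {fast:.6f}s")                # set wins by orders of magnitude
```

The build cost is repaid after just a handful of queries; whether that trade is worth it depends on how often you search versus how often the data changes.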
We've thoroughly explored the first major cost in array operations. Let's consolidate the key insights:

- Access by index is O(1); search by value in unsorted data is O(n).
- Without order, checking one element tells you nothing about the others, so the worst case requires examining every position.
- The average case of roughly n/2 comparisons is still O(n): doubling the data doubles the search time.
- Linear search is perfectly reasonable for small or rarely-searched data, but becomes a bottleneck at scale and under frequent queries.
What's next:
We've seen that unsorted arrays force O(n) search. But what if the array IS sorted? Can we do better?
Yes—dramatically better. Next, we explore binary search, which reduces search time from O(n) to O(log n)—a transformation that makes searching a billion elements as fast as searching thirty.
You now understand why searching unsorted arrays requires O(n) time, when this cost is acceptable, and how to escape it when necessary. Next, we'll discover the magic of binary search on sorted arrays.