If you were to analyze every algorithm problem ever solved using hash tables, one pattern would appear more frequently than any other: frequency counting. It is so fundamental, so universally applicable, that mastering it unlocks a vast portion of algorithmic problem-solving.
Frequency counting answers the simplest yet most powerful question: How many times does each thing appear?
This deceptively simple question underlies countless real-world applications, from search-engine ranking and network rate limiting to genome analysis, recommendation systems, and log monitoring.
In every case, the naive approach—iterating through the data repeatedly for each unique element—yields O(n²) complexity. The hash-based approach? O(n). This single pattern optimization has saved more computing hours than perhaps any other algorithmic technique.
By the end of this page, you will deeply understand frequency counting as a pattern: when to apply it, how to implement it efficiently in multiple languages, and how to recognize disguised frequency problems. You will see how this single pattern forms the foundation for dozens of classic interview problems and real-world applications.
At its heart, frequency counting is a mapping problem. We need to map each unique element to its count of occurrences. This is precisely what hash tables excel at: mapping keys to values in O(1) average time.
The Mental Model:
Imagine you're a librarian cataloging books. As each book arrives, you check if you've seen this title before: if the title is already in your catalog, you add one to its tally; if it's new, you create an entry with a count of one.
After processing all books, you have a complete count of every title. The hash table is your catalog index—providing instant lookup by title.
The Invariant:
At any point during frequency counting, the hash table maintains a critical invariant:
For every key k in the map, map[k] equals the exact number of times k appeared in the processed portion of the input.
This invariant is what makes frequency counting so reliable. It's self-correcting—any element we encounter either creates or updates an entry, never leaving the count in an inconsistent state.
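One way to make the invariant tangible is to assert it after every step with a brute-force recount. This is a test-harness sketch, far too slow for real use, but it demonstrates that the counting loop never leaves the map in an inconsistent state:

```typescript
// Sketch: verify the frequency-counting invariant after each step.
// After processing elements[0..i], map[k] must equal the number of
// times k appears in that prefix, for every key k in the map.
function countWithInvariantCheck<T>(elements: T[]): Map<T, number> {
  const freq = new Map<T, number>();

  for (let i = 0; i < elements.length; i++) {
    const element = elements[i];
    freq.set(element, (freq.get(element) ?? 0) + 1);

    // Check the invariant against a brute-force recount of the prefix
    for (const [key, count] of freq) {
      const actual = elements.slice(0, i + 1).filter(e => e === key).length;
      if (count !== actual) {
        throw new Error(`Invariant violated for key ${String(key)}`);
      }
    }
  }

  return freq;
}

const checked = countWithInvariantCheck(["a", "b", "a", "c", "a"]);
console.log(checked.get("a")); // 3
```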
```typescript
/**
 * Basic frequency counting: count occurrences of each element
 * Time: O(n) where n is the number of elements
 * Space: O(k) where k is the number of unique elements
 */
function countFrequencies<T>(elements: T[]): Map<T, number> {
  const frequencyMap = new Map<T, number>();

  for (const element of elements) {
    // Get current count (defaults to 0 if not present)
    const currentCount = frequencyMap.get(element) ?? 0;

    // Increment and store
    frequencyMap.set(element, currentCount + 1);
  }

  return frequencyMap;
}

// Example usage
const words = ["apple", "banana", "apple", "cherry", "banana", "apple"];
const wordCounts = countFrequencies(words);

// Result: Map { "apple" => 3, "banana" => 2, "cherry" => 1 }
console.log(wordCounts);

// Converting to array for analysis
const sorted = [...wordCounts.entries()]
  .sort((a, b) => b[1] - a[1]); // Sort by count descending

console.log("Most frequent:", sorted[0]); // ["apple", 3]
```

Understanding the complexity of frequency counting requires examining both the best and worst cases, as well as understanding why this pattern is so efficient.
Time Complexity: O(n). Each of the n input elements triggers one hash lookup and one insertion, both O(1) on average.

Space Complexity: O(k), where k is the number of unique elements. In the worst case every element is distinct and k = n.

Why O(n) Matters:
Consider the alternative without hash tables. To count frequencies, you might iterate over the array once per unique element, scanning the entire input each time to tally that element's occurrences.

This approach is O(n²). On a million elements, that means roughly a trillion operations instead of a million.
The speedup factor is 1,000,000x. This is why frequency counting with hash tables is considered one of the most important algorithmic patterns.
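For contrast, the naive approach can be sketched like this; every unique element triggers a full scan of the array:

```typescript
// Naive frequency counting: for each not-yet-counted element,
// re-scan the entire array to count its occurrences.
// With n elements and k unique values this costs O(n * k),
// which degrades to O(n^2) when all elements are distinct.
function countFrequenciesNaive<T>(elements: T[]): Map<T, number> {
  const freq = new Map<T, number>();

  for (const element of elements) {
    if (freq.has(element)) continue; // already counted in a prior scan

    let count = 0;
    for (const other of elements) {
      if (other === element) count++;
    }
    freq.set(element, count);
  }

  return freq;
}

const naiveCounts = countFrequenciesNaive(["a", "b", "a"]);
console.log(naiveCounts); // Map { "a" => 2, "b" => 1 }
```

The hash-based version produces the same map, but each element is touched a constant number of times instead of once per unique value.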
| Metric | Hash Table Approach | Naive Approach | Improvement Factor |
|---|---|---|---|
| Time (n=1,000) | O(1,000) | O(1,000,000) | 1,000x |
| Time (n=1,000,000) | O(1,000,000) | O(1,000,000,000,000) | 1,000,000x |
| Space (worst) | O(n) | O(n) | Same |
| Space (typical) | O(k) where k << n | O(n) | Better |
In real-world applications, the space complexity O(k) is often much better than O(n). Counting word frequencies across a large text corpus may involve millions of words but typically only 10,000-30,000 unique words. Counting customer purchase categories for a retailer might involve billions of transactions but only hundreds of product categories.
The basic frequency counting pattern extends to many powerful variations. Each builds on the core concept while adding specific capabilities.
Variation 1: Bounded Frequency Counting
Sometimes we only care about counts up to some threshold. For example, when detecting duplicates, we only need to know whether a count exceeds 1, which enables early termination:
```typescript
/**
 * Find elements that appear more than k times
 * Optimization: Track only until we exceed threshold
 */
function findFrequentElements<T>(elements: T[], k: number): T[] {
  const counts = new Map<T, number>();
  const result: T[] = [];
  const alreadyAdded = new Set<T>();

  for (const element of elements) {
    const newCount = (counts.get(element) ?? 0) + 1;
    counts.set(element, newCount);

    // Add to result when threshold exceeded (only once)
    if (newCount > k && !alreadyAdded.has(element)) {
      result.push(element);
      alreadyAdded.add(element);
    }
  }

  return result;
}

/**
 * Find the first duplicate element (early termination)
 */
function findFirstDuplicate<T>(elements: T[]): T | undefined {
  const seen = new Set<T>();

  for (const element of elements) {
    if (seen.has(element)) {
      return element; // Early exit!
    }
    seen.add(element);
  }

  return undefined;
}
```

Variation 2: Frequency Comparison
Comparing frequencies between two collections reveals relationships like anagrams, permutations, or subset containment:
```typescript
/**
 * Check if two strings are anagrams
 * Key insight: Anagrams have identical character frequencies
 */
function areAnagrams(s1: string, s2: string): boolean {
  if (s1.length !== s2.length) return false;

  const freq = new Map<string, number>();

  // Build frequency map for first string
  for (const char of s1) {
    freq.set(char, (freq.get(char) ?? 0) + 1);
  }

  // Subtract frequencies for second string
  for (const char of s2) {
    const count = freq.get(char);
    if (count === undefined || count === 0) {
      return false; // Character not in s1 or depleted
    }
    freq.set(char, count - 1);
  }

  return true; // All counts are now 0
}

/**
 * Alternative: Compare two frequency maps
 */
function mapsEqual<K, V>(map1: Map<K, V>, map2: Map<K, V>): boolean {
  if (map1.size !== map2.size) return false;

  for (const [key, value] of map1) {
    if (map2.get(key) !== value) return false;
  }

  return true;
}

// Examples
console.log(areAnagrams("listen", "silent")); // true
console.log(areAnagrams("hello", "world"));   // false
```

Variation 3: Rolling/Sliding Window Frequency
Maintaining frequencies over a sliding window requires careful updates as elements enter and leave:
```typescript
/**
 * Find all anagram occurrences of pattern in text
 * Uses sliding window with frequency comparison
 */
function findAnagramIndices(text: string, pattern: string): number[] {
  if (pattern.length > text.length) return [];

  const result: number[] = [];
  const patternFreq = new Map<string, number>();
  const windowFreq = new Map<string, number>();

  // Build pattern frequency map
  for (const char of pattern) {
    patternFreq.set(char, (patternFreq.get(char) ?? 0) + 1);
  }

  // Track how many characters have matching counts
  let matchCount = 0;
  const targetMatchCount = patternFreq.size;

  for (let i = 0; i < text.length; i++) {
    // Add character entering window
    const entering = text[i];
    const newEnteringCount = (windowFreq.get(entering) ?? 0) + 1;
    windowFreq.set(entering, newEnteringCount);

    // Update match count for entering character
    const patternEnteringCount = patternFreq.get(entering) ?? 0;
    if (newEnteringCount === patternEnteringCount) {
      matchCount++;
    } else if (newEnteringCount === patternEnteringCount + 1) {
      matchCount--;
    }

    // Remove character leaving window (if window is full)
    if (i >= pattern.length) {
      const leaving = text[i - pattern.length];
      const oldLeavingCount = windowFreq.get(leaving)!;
      const newLeavingCount = oldLeavingCount - 1;

      if (newLeavingCount === 0) {
        windowFreq.delete(leaving);
      } else {
        windowFreq.set(leaving, newLeavingCount);
      }

      // Update match count for leaving character
      const patternLeavingCount = patternFreq.get(leaving) ?? 0;
      if (oldLeavingCount === patternLeavingCount) {
        matchCount--;
      } else if (newLeavingCount === patternLeavingCount) {
        matchCount++;
      }
    }

    // Check if current window is an anagram
    if (i >= pattern.length - 1 && matchCount === targetMatchCount) {
      result.push(i - pattern.length + 1);
    }
  }

  return result;
}

// Example
console.log(findAnagramIndices("cbaebabacd", "abc")); // [0, 6]
```

Frequency counting isn't just an interview pattern—it's a cornerstone of real production systems across every industry.
Let's examine how this pattern manifests in different domains:
1. Search Engines: TF-IDF and Document Indexing
When you search for "best pizza near me," search engines don't match your query against every document sequentially. Instead, they've pre-computed term frequencies for every document in their index.
Computing both term frequencies (TF) and inverse document frequencies (IDF) requires frequency counting at massive scale. Google processes over 8.5 billion searches per day, each matching against billions of indexed documents—all powered by frequency-based ranking.
```python
from collections import Counter
from math import log
from typing import List, Dict

def compute_tf(document: str) -> Dict[str, float]:
    """
    Compute term frequency for each word in a document.
    TF(word) = count(word) / total_words
    """
    words = document.lower().split()
    word_counts = Counter(words)
    total_words = len(words)
    return {word: count / total_words for word, count in word_counts.items()}

def compute_idf(documents: List[str]) -> Dict[str, float]:
    """
    Compute inverse document frequency across corpus.
    IDF(word) = log(total_docs / docs_containing_word)
    """
    total_docs = len(documents)
    doc_counts: Counter = Counter()

    for doc in documents:
        # Count each word once per document (use set)
        unique_words = set(doc.lower().split())
        doc_counts.update(unique_words)

    return {word: log(total_docs / count) for word, count in doc_counts.items()}

def compute_tfidf(document: str, idf: Dict[str, float]) -> Dict[str, float]:
    """Compute TF-IDF scores for a document."""
    tf = compute_tf(document)
    return {word: tf_score * idf.get(word, 0) for word, tf_score in tf.items()}

# Example: Simple search corpus
corpus = [
    "The quick brown fox jumps over the lazy dog",
    "The lazy cat sleeps all day long",
    "A quick solution requires quick thinking",
]

idf = compute_idf(corpus)
document_scores = compute_tfidf(corpus[0], idf)

# Words unique to document 0 have higher TF-IDF
print(sorted(document_scores.items(), key=lambda x: x[1], reverse=True)[:5])
```

2. Network Security: DDoS Detection and Rate Limiting
Frequency counting is essential for detecting abuse patterns in network traffic: counting requests per client or per IP address reveals when a single source suddenly exceeds normal rates, which is the core signal behind rate limiting and many DDoS detection schemes.
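As an illustration, here is a minimal fixed-window rate limiter built on a per-client frequency map. This is a hypothetical sketch (class and parameter names are mine); production limiters typically add sliding windows, per-entry expiry, and distributed state:

```typescript
// Sketch: fixed-window rate limiting via per-client request counts.
class FixedWindowRateLimiter {
  private counts = new Map<string, number>();
  private windowStart: number;

  constructor(
    private limit: number,                // max requests per window
    private windowMs: number,             // window length in milliseconds
    private now: () => number = Date.now  // injectable clock for testing
  ) {
    this.windowStart = this.now();
  }

  /** Returns true if the request is allowed, false if rate-limited. */
  allow(clientId: string): boolean {
    const t = this.now();
    if (t - this.windowStart >= this.windowMs) {
      this.counts.clear(); // new window: reset all counters
      this.windowStart = t;
    }

    const count = (this.counts.get(clientId) ?? 0) + 1;
    this.counts.set(clientId, count);
    return count <= this.limit;
  }
}

// Allow at most 3 requests per client per second
const limiter = new FixedWindowRateLimiter(3, 1000);
console.log(limiter.allow("10.0.0.1")); // true
console.log(limiter.allow("10.0.0.1")); // true
console.log(limiter.allow("10.0.0.1")); // true
console.log(limiter.allow("10.0.0.1")); // false (4th request in window)
```

The map here is exactly a frequency counter keyed by client, reset at each window boundary.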
3. Bioinformatics: Genome Analysis
Counting nucleotide (A, C, G, T) frequencies helps identify properties such as GC content, which varies across organisms and genomic regions, and over-represented k-mers (fixed-length subsequences) that can signal functional motifs.
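A small sketch of both ideas, assuming the input is a plain A/C/G/T string:

```typescript
// Count occurrences of each nucleotide in a DNA string
function nucleotideFrequencies(dna: string): Map<string, number> {
  const freq = new Map<string, number>();
  for (const base of dna) {
    freq.set(base, (freq.get(base) ?? 0) + 1);
  }
  return freq;
}

// GC content: fraction of bases that are G or C
function gcContent(dna: string): number {
  const freq = nucleotideFrequencies(dna);
  return ((freq.get("G") ?? 0) + (freq.get("C") ?? 0)) / dna.length;
}

// Count every substring of length k (a "k-mer")
function kmerFrequencies(dna: string, k: number): Map<string, number> {
  const freq = new Map<string, number>();
  for (let i = 0; i + k <= dna.length; i++) {
    const kmer = dna.slice(i, i + k);
    freq.set(kmer, (freq.get(kmer) ?? 0) + 1);
  }
  return freq;
}

const seq = "ACGTACGTGG";
console.log(gcContent(seq)); // 0.6
console.log(kmerFrequencies(seq, 3).get("ACG")); // 2
```

Note that k-mer counting is the same frequency pattern with composite string keys, which is why hash maps dominate genome-analysis tooling.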
4. Recommendation Systems
Frequency-based collaborative filtering counts how often items are viewed or purchased together; pairs that co-occur unusually often become candidate recommendations.
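A minimal sketch of co-occurrence counting, assuming purchases arrive as baskets of item names (the names and the `||` delimiter are illustrative):

```typescript
// Sketch: count how often each pair of items appears in the same basket.
// Frequently co-purchased pairs become recommendation candidates.
function coOccurrenceCounts(baskets: string[][]): Map<string, number> {
  const counts = new Map<string, number>();

  for (const basket of baskets) {
    // Dedupe and sort so (a, b) and (b, a) map to the same key
    const items = [...new Set(basket)].sort();

    for (let i = 0; i < items.length; i++) {
      for (let j = i + 1; j < items.length; j++) {
        const key = `${items[i]}||${items[j]}`; // composite string key
        counts.set(key, (counts.get(key) ?? 0) + 1);
      }
    }
  }

  return counts;
}

const baskets = [
  ["bread", "butter", "milk"],
  ["bread", "butter"],
  ["milk", "cereal"],
];
console.log(coOccurrenceCounts(baskets).get("bread||butter")); // 2
```

The composite-key trick here is the same one used later in this page to work around reference-equality pitfalls with non-primitive keys.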
5. Log Analysis and Observability
At extreme scale (billions of events), exact frequency counting becomes impractical due to memory constraints. Production systems often use probabilistic data structures like Count-Min Sketch or HyperLogLog for approximate counting with bounded error guarantees. These trade exact accuracy for O(1) space complexity.
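To make the idea concrete, here is a heavily simplified Count-Min Sketch. This is a sketch under stated assumptions: the seeded FNV-style hash mixing is illustrative rather than a vetted hash family, and real implementations size width and depth from formal error bounds:

```typescript
// Simplified Count-Min Sketch: approximate counts in fixed space.
// Estimates never under-count; hash collisions can only over-count.
class CountMinSketch {
  private table: number[][];

  constructor(private width: number, private depth: number) {
    this.table = Array.from({ length: depth }, () =>
      new Array<number>(width).fill(0)
    );
  }

  // Simple seeded string hash (FNV-style mixing; illustrative only)
  private hash(item: string, seed: number): number {
    let h = 2166136261 ^ seed;
    for (let i = 0; i < item.length; i++) {
      h ^= item.charCodeAt(i);
      h = Math.imul(h, 16777619);
    }
    return (h >>> 0) % this.width;
  }

  add(item: string): void {
    for (let row = 0; row < this.depth; row++) {
      this.table[row][this.hash(item, row)]++;
    }
  }

  // Estimate: the minimum across rows bounds the over-count
  count(item: string): number {
    let min = Infinity;
    for (let row = 0; row < this.depth; row++) {
      min = Math.min(min, this.table[row][this.hash(item, row)]);
    }
    return min;
  }
}

const sketch = new CountMinSketch(1024, 4);
for (const w of ["error", "warn", "error", "info", "error"]) sketch.add(w);
console.log(sketch.count("error")); // 3 (exact unless "error" collides in every row)
```

The memory footprint is fixed at width × depth counters regardless of how many distinct items stream through, which is precisely the trade-off the paragraph above describes.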
Frequency counting appears in numerous interview problems, often in disguised forms. Learning to recognize the pattern is key to solving them efficiently.
Problem 1: Top K Frequent Elements
Given an array of integers, return the k most frequent elements. This is a direct application of frequency counting followed by selection.
```typescript
/**
 * Top K Frequent Elements
 *
 * Approach 1: Sort by frequency - O(n log n)
 * Approach 2: Min-heap of size k - O(n log k)
 * Approach 3: Bucket sort - O(n) (when k can be large)
 */

// Approach 1: Simple sorting (O(n log n))
function topKFrequentSort(nums: number[], k: number): number[] {
  const freq = new Map<number, number>();
  for (const num of nums) {
    freq.set(num, (freq.get(num) ?? 0) + 1);
  }

  return [...freq.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, k)
    .map(entry => entry[0]);
}

// Approach 3: Bucket sort (O(n))
// Key insight: frequency can be at most n, so we can use an array of buckets
function topKFrequentBucket(nums: number[], k: number): number[] {
  const freq = new Map<number, number>();
  for (const num of nums) {
    freq.set(num, (freq.get(num) ?? 0) + 1);
  }

  // Create buckets: bucket[i] contains elements with frequency i
  // Index 0 unused since no element has frequency 0
  const buckets: number[][] = Array.from(
    { length: nums.length + 1 },
    () => []
  );

  for (const [num, count] of freq) {
    buckets[count].push(num);
  }

  // Collect from highest frequency buckets
  const result: number[] = [];
  for (let i = buckets.length - 1; i >= 0 && result.length < k; i--) {
    result.push(...buckets[i]);
  }

  return result.slice(0, k);
}

// Example
console.log(topKFrequentBucket([1, 1, 1, 2, 2, 3], 2)); // [1, 2]
```

Problem 2: Valid Anagram
Given two strings s and t, return true if t is an anagram of s. This is pure frequency comparison.
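When the alphabet is known and small, such as 26 lowercase letters, a fixed-size count array is a common constant-factor alternative to the hash map used in `areAnagrams` earlier. A sketch assuming lowercase ASCII input:

```typescript
// Variant: fixed-size count array for lowercase a-z.
// One pass increments for s and decrements for t;
// all counts ending at zero means the strings are anagrams.
function isAnagramArray(s: string, t: string): boolean {
  if (s.length !== t.length) return false;

  const counts = new Array<number>(26).fill(0);
  const aCode = "a".charCodeAt(0);

  for (let i = 0; i < s.length; i++) {
    counts[s.charCodeAt(i) - aCode]++;
    counts[t.charCodeAt(i) - aCode]--;
  }

  return counts.every(c => c === 0);
}

console.log(isAnagramArray("listen", "silent")); // true
console.log(isAnagramArray("rat", "car")); // false
```

The array plays the same role as the frequency map; it simply exploits the bounded key space for faster, allocation-free counting.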
Problem 3: First Unique Character in a String
Given a string, find the first non-repeating character and return its index. If it doesn't exist, return -1.
```typescript
/**
 * First Unique Character in a String
 *
 * Key insight: Count frequencies first, then scan for count === 1
 * Time: O(n), Space: O(1) if alphabet is fixed (26 lowercase letters)
 */
function firstUniqChar(s: string): number {
  const freq = new Map<string, number>();

  // First pass: count frequencies
  for (const char of s) {
    freq.set(char, (freq.get(char) ?? 0) + 1);
  }

  // Second pass: find first with frequency 1
  for (let i = 0; i < s.length; i++) {
    if (freq.get(s[i]) === 1) {
      return i;
    }
  }

  return -1;
}

/**
 * Variant: Track order of appearance for single-pass streaming
 * Useful when you can't re-scan the input
 */
function firstUniqCharStream(s: string): number {
  const freq = new Map<string, number>();
  const order: number[] = []; // Index of each character's first occurrence

  for (let i = 0; i < s.length; i++) {
    const char = s[i];
    const count = (freq.get(char) ?? 0) + 1;
    freq.set(char, count);

    if (count === 1) {
      order.push(i);
    }
  }

  // Find the first recorded index whose character stayed unique
  for (const idx of order) {
    if (freq.get(s[idx]) === 1) {
      return idx;
    }
  }

  return -1;
}

console.log(firstUniqChar("onenoughtone")); // 3 ("u" is the first unique character)
console.log(firstUniqChar("algorithm")); // 0 ("a" is the first unique character)
```

Problem 4: Ransom Note
Given two strings ransomNote and magazine, return true if ransomNote can be constructed from the letters in magazine. Each letter in magazine can only be used once.
```python
from collections import Counter

def can_construct(ransom_note: str, magazine: str) -> bool:
    """
    Can ransom_note be constructed from magazine letters?
    Key insight: magazine must have >= frequency of each char in note
    """
    note_freq = Counter(ransom_note)
    magazine_freq = Counter(magazine)

    for char, count in note_freq.items():
        if magazine_freq[char] < count:
            return False

    return True

def can_construct_subtract(ransom_note: str, magazine: str) -> bool:
    """Alternative: subtract and check for negatives"""
    magazine_freq = Counter(magazine)

    for char in ransom_note:
        magazine_freq[char] -= 1
        if magazine_freq[char] < 0:
            return False

    return True

def can_construct_pythonic(ransom_note: str, magazine: str) -> bool:
    """
    Most Pythonic: Counter subtraction leaves only the shortage
    A non-empty result indicates characters that are missing
    """
    return not (Counter(ransom_note) - Counter(magazine))

# Examples
print(can_construct("aa", "aab"))  # True
print(can_construct("aa", "ab"))   # False
```

Even experienced engineers make mistakes with frequency counting. Here are the most common pitfalls and how to avoid them:
- Missing-key defaults differ by language: JavaScript's `map.get(key)` returns `undefined` for missing keys, Python's `dict[key]` raises `KeyError`, and Java's `map.get(key)` returns `null`. Always use `get(key, default)` or equivalent: `map.get(key) ?? 0`.
- Case sensitivity silently splits counts: normalize keys with `str.toLowerCase()` or `str.lower()` when case should not matter.
- Floating-point keys are unreliable: `0.1 + 0.2 !== 0.3` due to floating-point representation, so avoid floats as map keys.
- Mutable or composite keys compare by reference: `[1,2,3] !== [1,2,3]` - arrays are compared by reference. Convert to a canonical immutable form first with `JSON.stringify()` or `tuple()`.
```typescript
// BAD: Using arrays as Map keys
const badMap = new Map<number[], number>();
const key1 = [1, 2, 3];
const key2 = [1, 2, 3];
badMap.set(key1, 1);
console.log(badMap.get(key2)); // undefined! (different reference)

// GOOD: Use string representation
const goodMap = new Map<string, number>();
goodMap.set(JSON.stringify([1, 2, 3]), 1);
console.log(goodMap.get(JSON.stringify([1, 2, 3]))); // 1

// GOOD: Group by creating composite key strings
function groupByPair(items: [string, string][]): Map<string, number> {
  const counts = new Map<string, number>();

  for (const [a, b] of items) {
    // Create deterministic composite key
    const key = `${a}||${b}`; // Use delimiter unlikely in data
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }

  return counts;
}
```

When frequency counting in production code, normalize keys up front, lean on battle-tested built-ins, and reach for approximate structures only when exact counting no longer fits in memory.
We've explored frequency counting from first principles to advanced applications. Let's consolidate the key insights:
- Frequency counting maps each unique element to its count, turning O(n²) repeated scans into a single O(n) pass.
- The invariant (every stored count exactly matches the processed portion of the input) makes the pattern self-correcting and easy to reason about.
- Language built-ins like `Counter`, `defaultdict`, and `Map.merge()` make code cleaner and less error-prone.

You now deeply understand frequency counting—the most fundamental hashing pattern. This single concept underpins countless algorithms and appears in nearly every technical interview. Next, we'll explore the Two-Sum pattern and how hashing enables pair-finding problems.