Loading learning content...
Grouping is one of the most natural human cognitive operations. We instinctively categorize animals by species, books by genre, and emails by sender. In computing, this same operation appears everywhere:
Without hash tables, grouping n items into k groups requires O(n²) time in the worst case. With hash tables, it's O(n). This page explores the grouping pattern in depth—from the basic technique to sophisticated applications.
By the end of this page, you will understand grouping as a fundamental pattern: how to choose the right key function, how to group complex objects, and how this pattern solves problems that don't obviously look like grouping problems (like detecting anagrams or finding isomorphic strings).
The grouping pattern consists of three conceptual steps:
The Mental Model:
Imagine you're a postal worker sorting mail. Each piece of mail goes into a bin based on its destination zip code. The zip code is your "key function." After processing all mail, each bin contains all letters going to the same area.
123456789101112131415161718192021222324252627282930313233343536373839404142434445
/** * Generic grouping function * Groups elements by the result of a key function * * Time: O(n * k) where k is the complexity of keyFn * Space: O(n) */function groupBy<T, K>(items: T[], keyFn: (item: T) => K): Map<K, T[]> { const groups = new Map<K, T[]>(); for (const item of items) { const key = keyFn(item); if (!groups.has(key)) { groups.set(key, []); } groups.get(key)!.push(item); } return groups;} // Example 1: Group numbers by parity (even/odd)const numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];const byParity = groupBy(numbers, n => n % 2 === 0 ? 'even' : 'odd');console.log(byParity);// Map { 'odd' => [1, 3, 5, 7, 9], 'even' => [2, 4, 6, 8, 10] } // Example 2: Group strings by first letterconst words = ['apple', 'banana', 'apricot', 'blueberry', 'cherry'];const byFirstLetter = groupBy(words, w => w[0].toLowerCase());console.log(byFirstLetter);// Map { 'a' => ['apple', 'apricot'], 'b' => ['banana', 'blueberry'], 'c' => ['cherry'] } // Example 3: Group objects by propertyinterface Person { name: string; age: number; city: string; }const people: Person[] = [ { name: 'Alice', age: 30, city: 'NYC' }, { name: 'Bob', age: 25, city: 'LA' }, { name: 'Charlie', age: 30, city: 'NYC' }, { name: 'Diana', age: 25, city: 'NYC' },];const byCity = groupBy(people, p => p.city);console.log(byCity);// Map { 'NYC' => [Alice, Charlie, Diana], 'LA' => [Bob] }The key function is the heart of grouping. It must satisfy several properties:
1. Determinism
The same input must always produce the same key. This is essential for correctness:
// BAD: Non-deterministic key (uses random or time)
const badKeyFn = (item) => Math.random() > 0.5 ? 'A' : 'B';
// GOOD: Deterministic key
const goodKeyFn = (item) => item.category;
2. Hashability
The key must be usable in a hash map. In most languages:
123456789101112131415161718192021222324252627282930313233343536
// PROBLEM: Objects as keys don't work by defaultconst badGrouping = new Map<number[], string[]>();const key1 = [1, 2];const key2 = [1, 2]; // Same content, different referencebadGrouping.set(key1, ['item1']);badGrouping.get(key2); // undefined! Different object reference // SOLUTION 1: Serialize to stringfunction groupByArrayKey<T>(items: T[], keyFn: (item: T) => number[]): Map<string, T[]> { const groups = new Map<string, T[]>(); for (const item of items) { const keyArray = keyFn(item); const keyString = JSON.stringify(keyArray); // Serialize if (!groups.has(keyString)) { groups.set(keyString, []); } groups.get(keyString)!.push(item); } return groups;} // Example: Group points by their quadrant signatureinterface Point { x: number; y: number; }const points: Point[] = [ { x: 1, y: 2 }, { x: -1, y: 3 }, { x: 2, y: 1 }, { x: -2, y: -1 }]; const byQuadrant = groupByArrayKey(points, p => [ Math.sign(p.x), // -1, 0, or 1 Math.sign(p.y)]);console.log(byQuadrant);// Map { '[1,1]' => [Point(1,2), Point(2,1)], '[-1,1]' => [Point(-1,3)], '[-1,-1]' => [Point(-2,-1)] }3. Semantic Correctness
The key function should capture the "sameness" you care about:
4. Efficiency
The key function runs once per item. If it's O(k) per item, total time becomes O(n × k). Keep it efficient:
Group Anagrams (LeetCode 49)
Given an array of strings, group anagrams together. This is the canonical grouping problem.
Key Insight:
Two strings are anagrams if and only if they have the same character frequency. We can represent this either by:
Both approaches produce the same key for anagrams.
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758
/** * Group Anagrams - Sorting Approach * Time: O(n * k log k) where k is max string length * Space: O(n * k) */function groupAnagramsSorting(strs: string[]): string[][] { const groups = new Map<string, string[]>(); for (const str of strs) { // Key: sorted characters const key = str.split('').sort().join(''); if (!groups.has(key)) { groups.set(key, []); } groups.get(key)!.push(str); } return [...groups.values()];} /** * Group Anagrams - Character Count Approach * Time: O(n * k) where k is max string length * Space: O(n * k) * * Faster for long strings (avoids O(k log k) sorting) */function groupAnagramsCounting(strs: string[]): string[][] { const groups = new Map<string, string[]>(); for (const str of strs) { // Build character count key const counts = new Array(26).fill(0); for (const char of str) { counts[char.charCodeAt(0) - 'a'.charCodeAt(0)]++; } // Convert to string key: "1#0#0#..." for counts of a, b, c, ... const key = counts.join('#'); if (!groups.has(key)) { groups.set(key, []); } groups.get(key)!.push(str); } return [...groups.values()];} // Exampleconst words = ["eat", "tea", "tan", "ate", "nat", "bat"];console.log(groupAnagramsSorting(words));// [["eat", "tea", "ate"], ["tan", "nat"], ["bat"]] // Trace the counting approach:// "eat" → counts = [1,0,0,0,1,0,...,1,...] (a:1, e:1, t:1)// Key = "1#0#0#0#1#0#0#0#0#0#0#0#0#0#0#0#0#0#0#1#0#0#0#0#0#0"// "tea" → same counts → same key → grouped with "eat"• Sorting Key: Simpler code, works well for short strings (k < 100) • Counting Key: Better for long strings, fixed alphabet size (e.g., lowercase letters) • Unicode Strings: Use Counter (Python) or Map (JS/TS) for variable character sets
Isomorphic Strings (LeetCode 205)
Two strings are isomorphic if characters can be replaced to get the other string (one-to-one mapping).
The Key Insight:
Encode each string by the pattern of first occurrences. If both strings have the same pattern, they're isomorphic:
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273
/** * Check if two strings are isomorphic * Time: O(n), Space: O(n) */function isIsomorphic(s: string, t: string): boolean { if (s.length !== t.length) return false; // Pattern: encode each position as the index of first occurrence const patternize = (str: string): string => { const firstOccurrence = new Map<string, number>(); const pattern: number[] = []; for (let i = 0; i < str.length; i++) { const char = str[i]; if (!firstOccurrence.has(char)) { firstOccurrence.set(char, i); } pattern.push(firstOccurrence.get(char)!); } return pattern.join(','); }; return patternize(s) === patternize(t);} // Examples:// "egg" → pattern [0, 1, 1] (e first seen at 0, g at 1, g at 1)// "add" → pattern [0, 1, 1] (a first seen at 0, d at 1, d at 1)// Same pattern → isomorphic! // "foo" → pattern [0, 1, 1]// "bar" → pattern [0, 1, 2] (all different)// Different patterns → not isomorphic console.log(isIsomorphic("egg", "add")); // trueconsole.log(isIsomorphic("foo", "bar")); // falseconsole.log(isIsomorphic("paper", "title")); // true /** * Group Isomorphic Strings * Given a list of strings, group all isomorphic strings together */function groupIsomorphic(strs: string[]): string[][] { const groups = new Map<string, string[]>(); const patternize = (str: string): string => { const firstOccurrence = new Map<string, number>(); const pattern: number[] = []; for (let i = 0; i < str.length; i++) { if (!firstOccurrence.has(str[i])) { firstOccurrence.set(str[i], i); } pattern.push(firstOccurrence.get(str[i])!); } return pattern.join(','); }; for (const str of strs) { const key = patternize(str); if (!groups.has(key)) { groups.set(key, []); } groups.get(key)!.push(str); } return [...groups.values()];} console.log(groupIsomorphic(["egg", "add", "foo", "bar", "paper", "title"]));// [["egg", "add"], ["foo"], ["bar"], ["paper", "title"]]Word Pattern (LeetCode 290)
Given a pattern and a string, check if the string follows the same pattern:
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758
def word_pattern(pattern: str, s: str) -> bool: """ Check if string follows the pattern. Time: O(n + m), Space: O(n) """ words = s.split() if len(pattern) != len(words): return False # Two mappings needed: pattern → word AND word → pattern # Both must be bijective (one-to-one and onto) pattern_to_word = {} word_to_pattern = {} for char, word in zip(pattern, words): # Check pattern → word mapping if char in pattern_to_word: if pattern_to_word[char] != word: return False else: pattern_to_word[char] = word # Check word → pattern mapping if word in word_to_pattern: if word_to_pattern[word] != char: return False else: word_to_pattern[word] = char return True def word_pattern_using_pattern(pattern: str, s: str) -> bool: """ Alternative: Compare normalized patterns. """ words = s.split() if len(pattern) != len(words): return False def normalize(items): mapping = {} result = [] for item in items: if item not in mapping: mapping[item] = len(mapping) result.append(mapping[item]) return tuple(result) return normalize(pattern) == normalize(words) # Examplesprint(word_pattern("abba", "dog cat cat dog")) # Trueprint(word_pattern("abba", "dog cat cat fish")) # Falseprint(word_pattern("aaaa", "dog cat cat dog")) # False (a→dog, a→cat conflict)Real-world data often requires grouping at multiple levels—like an organizational hierarchy or a date-time breakdown (year → month → day).
Approach 1: Nested Maps
Build a tree-like structure of nested maps:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192939495
/** * Multi-level grouping with nested maps */interface SaleRecord { year: number; month: number; product: string; amount: number;} function groupSalesMultiLevel(sales: SaleRecord[]): Map<number, Map<number, Map<string, SaleRecord[]>>> { const result = new Map<number, Map<number, Map<string, SaleRecord[]>>>(); for (const sale of sales) { // Level 1: Year if (!result.has(sale.year)) { result.set(sale.year, new Map()); } const yearMap = result.get(sale.year)!; // Level 2: Month if (!yearMap.has(sale.month)) { yearMap.set(sale.month, new Map()); } const monthMap = yearMap.get(sale.month)!; // Level 3: Product if (!monthMap.has(sale.product)) { monthMap.set(sale.product, []); } monthMap.get(sale.product)!.push(sale); } return result;} /** * Generic multi-level grouping */type KeyFunction<T> = (item: T) => string | number; function groupByMultiple<T>( items: T[], keyFns: KeyFunction<T>[]): Map<any, any> { if (keyFns.length === 0) { throw new Error("At least one key function required"); } if (keyFns.length === 1) { // Base case: single-level grouping const groups = new Map<any, T[]>(); for (const item of items) { const key = keyFns[0](item); if (!groups.has(key)) { groups.set(key, []); } groups.get(key)!.push(item); } return groups; } // Recursive case: group by first key, then recursively group each bucket const firstLevel = new Map<any, T[]>(); for (const item of items) { const key = keyFns[0](item); if (!firstLevel.has(key)) { firstLevel.set(key, []); } firstLevel.get(key)!.push(item); } const result = new Map<any, any>(); for (const [key, group] of firstLevel) { result.set(key, groupByMultiple(group, keyFns.slice(1))); } return result;} // Example usageconst sales: SaleRecord[] = [ { year: 2024, month: 1, product: 'Widget', amount: 100 }, { year: 2024, month: 1, product: 'Gadget', amount: 200 }, { year: 2024, month: 2, product: 'Widget', amount: 150 }, { year: 2023, month: 12, product: 'Widget', amount: 90 },]; const grouped = groupByMultiple(sales, [ s => s.year, s => s.month, s => s.product]); // Navigate: grouped.get(2024).get(1).get('Widget') → [first sale]Approach 2: Composite Key
Often simpler: use a single level with a composite key:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859
from collections import defaultdictfrom typing import List, Dict, Tuplefrom dataclasses import dataclass @dataclassclass SaleRecord: year: int month: int product: str amount: float def group_by_composite_key( sales: List[SaleRecord]) -> Dict[Tuple[int, int, str], List[SaleRecord]]: """ Single-level grouping with composite key. Often simpler than nested maps. """ groups = defaultdict(list) for sale in sales: key = (sale.year, sale.month, sale.product) groups[key].append(sale) return dict(groups) def aggregate_by_composite_key( sales: List[SaleRecord]) -> Dict[Tuple[int, int], float]: """ Group and aggregate in one pass. Sum amounts by (year, month). """ totals = defaultdict(float) for sale in sales: key = (sale.year, sale.month) totals[key] += sale.amount return dict(totals) # Examplesales = [ SaleRecord(2024, 1, 'Widget', 100), SaleRecord(2024, 1, 'Gadget', 200), SaleRecord(2024, 1, 'Widget', 50), SaleRecord(2024, 2, 'Widget', 150),] # Group by all three dimensionsby_all = group_by_composite_key(sales)print(by_all[(2024, 1, 'Widget')]) # [SaleRecord(2024,1,Widget,100), SaleRecord(2024,1,Widget,50)] # Aggregate by year and monthmonthly_totals = aggregate_by_composite_key(sales)print(monthly_totals) # {(2024, 1): 350.0, (2024, 2): 150.0}Grouping often precedes aggregation—computing summary statistics per group:
This is the algorithmic equivalent of SQL's GROUP BY with aggregate functions.
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273
/** * Generic group and aggregate */function groupAndAggregate<T, K, R>( items: T[], keyFn: (item: T) => K, aggregator: (group: T[]) => R): Map<K, R> { // First, group const groups = new Map<K, T[]>(); for (const item of items) { const key = keyFn(item); if (!groups.has(key)) { groups.set(key, []); } groups.get(key)!.push(item); } // Then, aggregate each group const result = new Map<K, R>(); for (const [key, group] of groups) { result.set(key, aggregator(group)); } return result;} // Example: Calculate statistics per categoryinterface Product { category: string; name: string; price: number;} const products: Product[] = [ { category: 'Electronics', name: 'Phone', price: 999 }, { category: 'Electronics', name: 'Tablet', price: 599 }, { category: 'Clothing', name: 'Shirt', price: 49 }, { category: 'Clothing', name: 'Pants', price: 79 }, { category: 'Electronics', name: 'Laptop', price: 1299 },]; // Sum prices per categoryconst sumByCategory = groupAndAggregate( products, p => p.category, group => group.reduce((sum, p) => sum + p.price, 0));console.log(sumByCategory);// Map { 'Electronics' => 2897, 'Clothing' => 128 } // Statistics per categoryinterface Stats { count: number; sum: number; avg: number; min: number; max: number;} const statsByCategory = groupAndAggregate<Product, string, Stats>( products, p => p.category, group => ({ count: group.length, sum: group.reduce((s, p) => s + p.price, 0), avg: group.reduce((s, p) => s + p.price, 0) / group.length, min: Math.min(...group.map(p => p.price)), max: Math.max(...group.map(p => p.price)), }));console.log(statsByCategory.get('Electronics'));// { count: 3, sum: 2897, avg: 965.67, min: 599, max: 1299 }Grouping is a fundamental pattern that appears across data processing, from simple categorization to complex analytics. Let's consolidate the key insights:
| Problem Type | Key Function | Example |
|---|---|---|
| Anagrams | Sorted characters or char counts | "eat" → "aet" |
| Isomorphic strings | First-occurrence pattern | "egg" → [0,1,1] |
| By property | Object property value | person.city |
| By range/bucket | value // bucket_size | age // 10 → decade |
| By composite | Tuple of properties | (year, month, category) |
| By hash | Hash of content | MD5(file_contents) |
You now understand grouping as a fundamental pattern—from simple categorization to complex multi-level hierarchy construction. This pattern is ubiquitous in data processing and forms the core of many analytics operations. Next, we'll explore caching and memoization patterns.