Data Structures & AlgorithmsHash Tables

Common Hashing Patterns & Problem Types

LevelIntermediate

Duration75 mins

TopicHash Tables

4 / 5

Grouping by Characteristic

The Power of Classification

Grouping is one of the most natural human cognitive operations. We instinctively categorize animals by species, books by genre, and emails by sender. In computing, this same operation appears everywhere:

Group customers by region
Group log entries by error type
Group words by their sorted letters (anagram detection)
Group transactions by date
Group identical images by perceptual hash

Without hash tables, grouping n items into k groups requires O(n²) time in the worst case. With hash tables, it's O(n). This page explores the grouping pattern in depth—from the basic technique to sophisticated applications.

What You Will Master

By the end of this page, you will understand grouping as a fundamental pattern: how to choose the right key function, how to group complex objects, and how this pattern solves problems that don't obviously look like grouping problems (like detecting anagrams or finding isomorphic strings).

The Core Grouping Pattern

The grouping pattern consists of three conceptual steps:

Define the key function: What characteristic determines which group an item belongs to?
Build the group map: Iterate through items, computing keys and adding items to their groups
Return/process the groups: Extract the grouped results

The Mental Model:

Imagine you're a postal worker sorting mail. Each piece of mail goes into a bin based on its destination zip code. The zip code is your "key function." After processing all mail, each bin contains all letters going to the same area.

grouping_basic
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
/**
 * Generic grouping function
 * Groups elements by the result of a key function
 * 
 * Time: O(n * k) where k is the complexity of keyFn
 * Space: O(n)
 */
function groupBy<T, K>(items: T[], keyFn: (item: T) => K): Map<K, T[]> {
    const groups = new Map<K, T[]>();
    
    for (const item of items) {
        const key = keyFn(item);
        
        if (!groups.has(key)) {
            groups.set(key, []);
        }
        groups.get(key)!.push(item);
    }
    
    return groups;
}
 
// Example 1: Group numbers by parity (even/odd)
const numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
const byParity = groupBy(numbers, n => n % 2 === 0 ? 'even' : 'odd');
console.log(byParity);
// Map { 'odd' => [1, 3, 5, 7, 9], 'even' => [2, 4, 6, 8, 10] }
 
// Example 2: Group strings by first letter
const words = ['apple', 'banana', 'apricot', 'blueberry', 'cherry'];
const byFirstLetter = groupBy(words, w => w[0].toLowerCase());
console.log(byFirstLetter);
// Map { 'a' => ['apple', 'apricot'], 'b' => ['banana', 'blueberry'], 'c' => ['cherry'] }
 
// Example 3: Group objects by property
interface Person { name: string; age: number; city: string; }
const people: Person[] = [
    { name: 'Alice', age: 30, city: 'NYC' },
    { name: 'Bob', age: 25, city: 'LA' },
    { name: 'Charlie', age: 30, city: 'NYC' },
    { name: 'Diana', age: 25, city: 'NYC' },
];
const byCity = groupBy(people, p => p.city);
console.log(byCity);
// Map { 'NYC' => [Alice, Charlie, Diana], 'LA' => [Bob] }

Choosing the Right Key Function

The key function is the heart of grouping. It must satisfy several properties:

1. Determinism

The same input must always produce the same key. This is essential for correctness:

// BAD: Non-deterministic key (uses random or time)
const badKeyFn = (item) => Math.random() > 0.5 ? 'A' : 'B';

// GOOD: Deterministic key
const goodKeyFn = (item) => item.category;

2. Hashability

The key must be usable in a hash map. In most languages:

Primitives (strings, numbers) are always hashable
Objects require special handling (serialize to string, use ID, etc.)

hashable_keys
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
// PROBLEM: Objects as keys don't work by default
const badGrouping = new Map<number[], string[]>();
const key1 = [1, 2];
const key2 = [1, 2]; // Same content, different reference
badGrouping.set(key1, ['item1']);
badGrouping.get(key2); // undefined! Different object reference
 
// SOLUTION 1: Serialize to string
function groupByArrayKey<T>(items: T[], keyFn: (item: T) => number[]): Map<string, T[]> {
    const groups = new Map<string, T[]>();
    
    for (const item of items) {
        const keyArray = keyFn(item);
        const keyString = JSON.stringify(keyArray); // Serialize
        
        if (!groups.has(keyString)) {
            groups.set(keyString, []);
        }
        groups.get(keyString)!.push(item);
    }
    
    return groups;
}
 
// Example: Group points by their quadrant signature
interface Point { x: number; y: number; }
const points: Point[] = [
    { x: 1, y: 2 }, { x: -1, y: 3 }, { x: 2, y: 1 }, { x: -2, y: -1 }
];
 
const byQuadrant = groupByArrayKey(points, p => [
    Math.sign(p.x),  // -1, 0, or 1
    Math.sign(p.y)
]);
console.log(byQuadrant);
// Map { '[1,1]' => [Point(1,2), Point(2,1)], '[-1,1]' => [Point(-1,3)], '[-1,-1]' => [Point(-2,-1)] }

3. Semantic Correctness

The key function should capture the "sameness" you care about:

Group anagrams: Key is sorted characters → "listen" and "silent" both become "eilnst"
Group by first initial: Key is first character → "Alice" and "Amanda" both have key "A"
Group duplicates: Key is the element itself → identical items group together

4. Efficiency

The key function runs once per item. If it's O(k) per item, total time becomes O(n × k). Keep it efficient:

Sorting a string of length m is O(m log m)
Counting characters in a string of length m is O(m)
Both work for anagram detection, but counting is faster for long strings

Classic Application: Anagram Grouping

Group Anagrams (LeetCode 49)

Given an array of strings, group anagrams together. This is the canonical grouping problem.

Key Insight:

Two strings are anagrams if and only if they have the same character frequency. We can represent this either by:

Sorting the characters ("eat" → "aet")
Counting character frequencies ("eat" → "a1e1t1")

Both approaches produce the same key for anagrams.

group_anagrams
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
/**
 * Group Anagrams - Sorting Approach
 * Time: O(n * k log k) where k is max string length
 * Space: O(n * k)
 */
function groupAnagramsSorting(strs: string[]): string[][] {
    const groups = new Map<string, string[]>();
    
    for (const str of strs) {
        // Key: sorted characters
        const key = str.split('').sort().join('');
        
        if (!groups.has(key)) {
            groups.set(key, []);
        }
        groups.get(key)!.push(str);
    }
    
    return [...groups.values()];
}
 
/**
 * Group Anagrams - Character Count Approach
 * Time: O(n * k) where k is max string length
 * Space: O(n * k)
 * 
 * Faster for long strings (avoids O(k log k) sorting)
 */
function groupAnagramsCounting(strs: string[]): string[][] {
    const groups = new Map<string, string[]>();
    
    for (const str of strs) {
        // Build character count key
        const counts = new Array(26).fill(0);
        for (const char of str) {
            counts[char.charCodeAt(0) - 'a'.charCodeAt(0)]++;
        }
        // Convert to string key: "1#0#0#..." for counts of a, b, c, ...
        const key = counts.join('#');
        
        if (!groups.has(key)) {
            groups.set(key, []);
        }
        groups.get(key)!.push(str);
    }
    
    return [...groups.values()];
}
 
// Example
const words = ["eat", "tea", "tan", "ate", "nat", "bat"];
console.log(groupAnagramsSorting(words));
// [["eat", "tea", "ate"], ["tan", "nat"], ["bat"]]
 
// Trace the counting approach:
// "eat" → counts = [1,0,0,0,1,0,...,1,...] (a:1, e:1, t:1)
// Key = "1#0#0#0#1#0#0#0#0#0#0#0#0#0#0#0#0#0#0#1#0#0#0#0#0#0"
// "tea" → same counts → same key → grouped with "eat"

When to Use Which Approach

• Sorting Key: Simpler code, works well for short strings (k < 100) • Counting Key: Better for long strings, fixed alphabet size (e.g., lowercase letters) • Unicode Strings: Use Counter (Python) or Map (JS/TS) for variable character sets

Isomorphic Strings and Pattern Matching

Isomorphic Strings (LeetCode 205)

Two strings are isomorphic if characters can be replaced to get the other string (one-to-one mapping).

"egg" and "add" are isomorphic: e→a, g→d
"foo" and "bar" are not: f→b, o→a, but then o→r? (o can't map to two things)

The Key Insight:

Encode each string by the pattern of first occurrences. If both strings have the same pattern, they're isomorphic:

isomorphic_strings
TypeScript
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
/**
 * Check if two strings are isomorphic
 * Time: O(n), Space: O(n)
 */
function isIsomorphic(s: string, t: string): boolean {
    if (s.length !== t.length) return false;
    
    // Pattern: encode each position as the index of first occurrence
    const patternize = (str: string): string => {
        const firstOccurrence = new Map<string, number>();
        const pattern: number[] = [];
        
        for (let i = 0; i < str.length; i++) {
            const char = str[i];
            if (!firstOccurrence.has(char)) {
                firstOccurrence.set(char, i);
            }
            pattern.push(firstOccurrence.get(char)!);
        }
        
        return pattern.join(',');
    };
    
    return patternize(s) === patternize(t);
}
 
// Examples:
// "egg" → pattern [0, 1, 1] (e first seen at 0, g at 1, g at 1)
// "add" → pattern [0, 1, 1] (a first seen at 0, d at 1, d at 1)
// Same pattern → isomorphic!
 
// "foo" → pattern [0, 1, 1]
// "bar" → pattern [0, 1, 2] (all different)
// Different patterns → not isomorphic
 
console.log(isIsomorphic("egg", "add"));   // true
console.log(isIsomorphic("foo", "bar"));   // false
console.log(isIsomorphic("paper", "title")); // true
 
/**
 * Group Isomorphic Strings
 * Given a list of strings, group all isomorphic strings together
 */
function groupIsomorphic(strs: string[]): string[][] {
    const groups = new Map<string, string[]>();
    
    const patternize = (str: string): string => {
        const firstOccurrence = new Map<string, number>();
        const pattern: number[] = [];
        
        for (let i = 0; i < str.length; i++) {
            if (!firstOccurrence.has(str[i])) {
                firstOccurrence.set(str[i], i);
            }
            pattern.push(firstOccurrence.get(str[i])!);
        }
        
        return pattern.join(',');
    };
    
    for (const str of strs) {
        const key = patternize(str);
        if (!groups.has(key)) {
            groups.set(key, []);
        }
        groups.get(key)!.push(str);
    }
    
    return [...groups.values()];
}
 
console.log(groupIsomorphic(["egg", "add", "foo", "bar", "paper", "title"]));
// [["egg", "add"], ["foo"], ["bar"], ["paper", "title"]]

Word Pattern (LeetCode 290)

Given a pattern and a string, check if the string follows the same pattern:

Pattern "abba", String "dog cat cat dog" → true
Pattern "abba", String "dog cat cat fish" → false

word_pattern
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
def word_pattern(pattern: str, s: str) -> bool:
    """
    Check if string follows the pattern.
    Time: O(n + m), Space: O(n)
    """
    words = s.split()
    
    if len(pattern) != len(words):
        return False
    
    # Two mappings needed: pattern → word AND word → pattern
    # Both must be bijective (one-to-one and onto)
    pattern_to_word = {}
    word_to_pattern = {}
    
    for char, word in zip(pattern, words):
        # Check pattern → word mapping
        if char in pattern_to_word:
            if pattern_to_word[char] != word:
                return False
        else:
            pattern_to_word[char] = word
        
        # Check word → pattern mapping
        if word in word_to_pattern:
            if word_to_pattern[word] != char:
                return False
        else:
            word_to_pattern[word] = char
    
    return True
 
 
def word_pattern_using_pattern(pattern: str, s: str) -> bool:
    """
    Alternative: Compare normalized patterns.
    """
    words = s.split()
    
    if len(pattern) != len(words):
        return False
    
    def normalize(items):
        mapping = {}
        result = []
        for item in items:
            if item not in mapping:
                mapping[item] = len(mapping)
            result.append(mapping[item])
        return tuple(result)
    
    return normalize(pattern) == normalize(words)
 
 
# Examples
print(word_pattern("abba", "dog cat cat dog"))  # True
print(word_pattern("abba", "dog cat cat fish")) # False
print(word_pattern("aaaa", "dog cat cat dog"))  # False (a→dog, a→cat conflict)

Multi-Level Grouping

Real-world data often requires grouping at multiple levels—like an organizational hierarchy or a date-time breakdown (year → month → day).

Approach 1: Nested Maps

Build a tree-like structure of nested maps:

multi_level_grouping
TypeScript
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
/**
 * Multi-level grouping with nested maps
 */
interface SaleRecord {
    year: number;
    month: number;
    product: string;
    amount: number;
}
 
function groupSalesMultiLevel(sales: SaleRecord[]): Map<number, Map<number, Map<string, SaleRecord[]>>> {
    const result = new Map<number, Map<number, Map<string, SaleRecord[]>>>();
    
    for (const sale of sales) {
        // Level 1: Year
        if (!result.has(sale.year)) {
            result.set(sale.year, new Map());
        }
        const yearMap = result.get(sale.year)!;
        
        // Level 2: Month
        if (!yearMap.has(sale.month)) {
            yearMap.set(sale.month, new Map());
        }
        const monthMap = yearMap.get(sale.month)!;
        
        // Level 3: Product
        if (!monthMap.has(sale.product)) {
            monthMap.set(sale.product, []);
        }
        monthMap.get(sale.product)!.push(sale);
    }
    
    return result;
}
 
/**
 * Generic multi-level grouping
 */
type KeyFunction<T> = (item: T) => string | number;
 
function groupByMultiple<T>(
    items: T[], 
    keyFns: KeyFunction<T>[]
): Map<any, any> {
    if (keyFns.length === 0) {
        throw new Error("At least one key function required");
    }
    
    if (keyFns.length === 1) {
        // Base case: single-level grouping
        const groups = new Map<any, T[]>();
        for (const item of items) {
            const key = keyFns[0](item);
            if (!groups.has(key)) {
                groups.set(key, []);
            }
            groups.get(key)!.push(item);
        }
        return groups;
    }
    
    // Recursive case: group by first key, then recursively group each bucket
    const firstLevel = new Map<any, T[]>();
    for (const item of items) {
        const key = keyFns[0](item);
        if (!firstLevel.has(key)) {
            firstLevel.set(key, []);
        }
        firstLevel.get(key)!.push(item);
    }
    
    const result = new Map<any, any>();
    for (const [key, group] of firstLevel) {
        result.set(key, groupByMultiple(group, keyFns.slice(1)));
    }
    
    return result;
}
 
// Example usage
const sales: SaleRecord[] = [
    { year: 2024, month: 1, product: 'Widget', amount: 100 },
    { year: 2024, month: 1, product: 'Gadget', amount: 200 },
    { year: 2024, month: 2, product: 'Widget', amount: 150 },
    { year: 2023, month: 12, product: 'Widget', amount: 90 },
];
 
const grouped = groupByMultiple(sales, [
    s => s.year,
    s => s.month,
    s => s.product
]);
 
// Navigate: grouped.get(2024).get(1).get('Widget') → [first sale]

Approach 2: Composite Key

Often simpler: use a single level with a composite key:

composite_key_grouping
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
from collections import defaultdict
from typing import List, Dict, Tuple
from dataclasses import dataclass
 
@dataclass
class SaleRecord:
    year: int
    month: int
    product: str
    amount: float
 
 
def group_by_composite_key(
    sales: List[SaleRecord]
) -> Dict[Tuple[int, int, str], List[SaleRecord]]:
    """
    Single-level grouping with composite key.
    Often simpler than nested maps.
    """
    groups = defaultdict(list)
    
    for sale in sales:
        key = (sale.year, sale.month, sale.product)
        groups[key].append(sale)
    
    return dict(groups)
 
 
def aggregate_by_composite_key(
    sales: List[SaleRecord]
) -> Dict[Tuple[int, int], float]:
    """
    Group and aggregate in one pass.
    Sum amounts by (year, month).
    """
    totals = defaultdict(float)
    
    for sale in sales:
        key = (sale.year, sale.month)
        totals[key] += sale.amount
    
    return dict(totals)
 
 
# Example
sales = [
    SaleRecord(2024, 1, 'Widget', 100),
    SaleRecord(2024, 1, 'Gadget', 200),
    SaleRecord(2024, 1, 'Widget', 50),
    SaleRecord(2024, 2, 'Widget', 150),
]
 
# Group by all three dimensions
by_all = group_by_composite_key(sales)
print(by_all[(2024, 1, 'Widget')])  # [SaleRecord(2024,1,Widget,100), SaleRecord(2024,1,Widget,50)]
 
# Aggregate by year and month
monthly_totals = aggregate_by_composite_key(sales)
print(monthly_totals)  # {(2024, 1): 350.0, (2024, 2): 150.0}

Aggregation with Grouping

Grouping often precedes aggregation—computing summary statistics per group:

Sum/Average revenue per category
Count of events per user
Min/Max values per region

This is the algorithmic equivalent of SQL's GROUP BY with aggregate functions.

group_and_aggregate
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
/**
 * Generic group and aggregate
 */
function groupAndAggregate<T, K, R>(
    items: T[],
    keyFn: (item: T) => K,
    aggregator: (group: T[]) => R
): Map<K, R> {
    // First, group
    const groups = new Map<K, T[]>();
    for (const item of items) {
        const key = keyFn(item);
        if (!groups.has(key)) {
            groups.set(key, []);
        }
        groups.get(key)!.push(item);
    }
    
    // Then, aggregate each group
    const result = new Map<K, R>();
    for (const [key, group] of groups) {
        result.set(key, aggregator(group));
    }
    
    return result;
}
 
// Example: Calculate statistics per category
interface Product {
    category: string;
    name: string;
    price: number;
}
 
const products: Product[] = [
    { category: 'Electronics', name: 'Phone', price: 999 },
    { category: 'Electronics', name: 'Tablet', price: 599 },
    { category: 'Clothing', name: 'Shirt', price: 49 },
    { category: 'Clothing', name: 'Pants', price: 79 },
    { category: 'Electronics', name: 'Laptop', price: 1299 },
];
 
// Sum prices per category
const sumByCategory = groupAndAggregate(
    products,
    p => p.category,
    group => group.reduce((sum, p) => sum + p.price, 0)
);
console.log(sumByCategory);
// Map { 'Electronics' => 2897, 'Clothing' => 128 }
 
// Statistics per category
interface Stats {
    count: number;
    sum: number;
    avg: number;
    min: number;
    max: number;
}
 
const statsByCategory = groupAndAggregate<Product, string, Stats>(
    products,
    p => p.category,
    group => ({
        count: group.length,
        sum: group.reduce((s, p) => s + p.price, 0),
        avg: group.reduce((s, p) => s + p.price, 0) / group.length,
        min: Math.min(...group.map(p => p.price)),
        max: Math.max(...group.map(p => p.price)),
    })
);
console.log(statsByCategory.get('Electronics'));
// { count: 3, sum: 2897, avg: 965.67, min: 599, max: 1299 }

Summary: Grouping Mastery

Grouping is a fundamental pattern that appears across data processing, from simple categorization to complex analytics. Let's consolidate the key insights:

Key Takeaways

•The key function defines the grouping — Choose it carefully; it determines what "sameness" means
•Keys must be hashable — Use strings, tuples, or serialized representations for complex keys
•Anagram grouping is the classic example — Sort or count characters to create the key
•Isomorphic patterns use position encoding — First-occurrence index reveals structural equivalence
•Multi-level grouping can use nested maps or composite keys — Composite keys are often simpler
•Grouping + aggregation = powerful analytics — The algorithmic equivalent of SQL GROUP BY

Common Grouping Patterns Quick Reference
Problem Type	Key Function	Example
Anagrams	Sorted characters or char counts	"eat" → "aet"
Isomorphic strings	First-occurrence pattern	"egg" → [0,1,1]
By property	Object property value	person.city
By range/bucket	value // bucket_size	age // 10 → decade
By composite	Tuple of properties	(year, month, category)
By hash	Hash of content	MD5(file_contents)

Page Complete

You now understand grouping as a fundamental pattern—from simple categorization to complex multi-level hierarchy construction. This pattern is ubiquitous in data processing and forms the core of many analytics operations. Next, we'll explore caching and memoization patterns.

4 / 5

Loading learning content...

Data Structures & AlgorithmsHash Tables

Common Hashing Patterns & Problem Types

LevelIntermediate

Duration75 mins

TopicHash Tables

4 / 5

Grouping by Characteristic

The Power of Classification

Group customers by region
Group log entries by error type
Group words by their sorted letters (anagram detection)
Group transactions by date
Group identical images by perceptual hash

What You Will Master

The Core Grouping Pattern

The grouping pattern consists of three conceptual steps:

Define the key function: What characteristic determines which group an item belongs to?
Build the group map: Iterate through items, computing keys and adding items to their groups
Return/process the groups: Extract the grouped results

The Mental Model:

grouping_basic
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
/**
 * Generic grouping function
 * Groups elements by the result of a key function
 * 
 * Time: O(n * k) where k is the complexity of keyFn
 * Space: O(n)
 */
function groupBy<T, K>(items: T[], keyFn: (item: T) => K): Map<K, T[]> {
    const groups = new Map<K, T[]>();
    
    for (const item of items) {
        const key = keyFn(item);
        
        if (!groups.has(key)) {
            groups.set(key, []);
        }
        groups.get(key)!.push(item);
    }
    
    return groups;
}
 
// Example 1: Group numbers by parity (even/odd)
const numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
const byParity = groupBy(numbers, n => n % 2 === 0 ? 'even' : 'odd');
console.log(byParity);
// Map { 'odd' => [1, 3, 5, 7, 9], 'even' => [2, 4, 6, 8, 10] }
 
// Example 2: Group strings by first letter
const words = ['apple', 'banana', 'apricot', 'blueberry', 'cherry'];
const byFirstLetter = groupBy(words, w => w[0].toLowerCase());
console.log(byFirstLetter);
// Map { 'a' => ['apple', 'apricot'], 'b' => ['banana', 'blueberry'], 'c' => ['cherry'] }
 
// Example 3: Group objects by property
interface Person { name: string; age: number; city: string; }
const people: Person[] = [
    { name: 'Alice', age: 30, city: 'NYC' },
    { name: 'Bob', age: 25, city: 'LA' },
    { name: 'Charlie', age: 30, city: 'NYC' },
    { name: 'Diana', age: 25, city: 'NYC' },
];
const byCity = groupBy(people, p => p.city);
console.log(byCity);
// Map { 'NYC' => [Alice, Charlie, Diana], 'LA' => [Bob] }

Choosing the Right Key Function

The key function is the heart of grouping. It must satisfy several properties:

1. Determinism

The same input must always produce the same key. This is essential for correctness:

// BAD: Non-deterministic key (uses random or time)
const badKeyFn = (item) => Math.random() > 0.5 ? 'A' : 'B';

// GOOD: Deterministic key
const goodKeyFn = (item) => item.category;

2. Hashability

The key must be usable in a hash map. In most languages:

Primitives (strings, numbers) are always hashable
Objects require special handling (serialize to string, use ID, etc.)

hashable_keys
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
// PROBLEM: Objects as keys don't work by default
const badGrouping = new Map<number[], string[]>();
const key1 = [1, 2];
const key2 = [1, 2]; // Same content, different reference
badGrouping.set(key1, ['item1']);
badGrouping.get(key2); // undefined! Different object reference
 
// SOLUTION 1: Serialize to string
function groupByArrayKey<T>(items: T[], keyFn: (item: T) => number[]): Map<string, T[]> {
    const groups = new Map<string, T[]>();
    
    for (const item of items) {
        const keyArray = keyFn(item);
        const keyString = JSON.stringify(keyArray); // Serialize
        
        if (!groups.has(keyString)) {
            groups.set(keyString, []);
        }
        groups.get(keyString)!.push(item);
    }
    
    return groups;
}
 
// Example: Group points by their quadrant signature
interface Point { x: number; y: number; }
const points: Point[] = [
    { x: 1, y: 2 }, { x: -1, y: 3 }, { x: 2, y: 1 }, { x: -2, y: -1 }
];
 
const byQuadrant = groupByArrayKey(points, p => [
    Math.sign(p.x),  // -1, 0, or 1
    Math.sign(p.y)
]);
console.log(byQuadrant);
// Map { '[1,1]' => [Point(1,2), Point(2,1)], '[-1,1]' => [Point(-1,3)], '[-1,-1]' => [Point(-2,-1)] }

3. Semantic Correctness

The key function should capture the "sameness" you care about:

Group anagrams: Key is sorted characters → "listen" and "silent" both become "eilnst"
Group by first initial: Key is first character → "Alice" and "Amanda" both have key "A"
Group duplicates: Key is the element itself → identical items group together

4. Efficiency

The key function runs once per item. If it's O(k) per item, total time becomes O(n × k). Keep it efficient:

Sorting a string of length m is O(m log m)
Counting characters in a string of length m is O(m)
Both work for anagram detection, but counting is faster for long strings

Classic Application: Anagram Grouping

Group Anagrams (LeetCode 49)

Given an array of strings, group anagrams together. This is the canonical grouping problem.

Key Insight:

Two strings are anagrams if and only if they have the same character frequency. We can represent this either by:

Sorting the characters ("eat" → "aet")
Counting character frequencies ("eat" → "a1e1t1")

Both approaches produce the same key for anagrams.

group_anagrams
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
/**
 * Group Anagrams - Sorting Approach
 * Time: O(n * k log k) where k is max string length
 * Space: O(n * k)
 */
function groupAnagramsSorting(strs: string[]): string[][] {
    const groups = new Map<string, string[]>();
    
    for (const str of strs) {
        // Key: sorted characters
        const key = str.split('').sort().join('');
        
        if (!groups.has(key)) {
            groups.set(key, []);
        }
        groups.get(key)!.push(str);
    }
    
    return [...groups.values()];
}
 
/**
 * Group Anagrams - Character Count Approach
 * Time: O(n * k) where k is max string length
 * Space: O(n * k)
 * 
 * Faster for long strings (avoids O(k log k) sorting)
 */
function groupAnagramsCounting(strs: string[]): string[][] {
    const groups = new Map<string, string[]>();
    
    for (const str of strs) {
        // Build character count key
        const counts = new Array(26).fill(0);
        for (const char of str) {
            counts[char.charCodeAt(0) - 'a'.charCodeAt(0)]++;
        }
        // Convert to string key: "1#0#0#..." for counts of a, b, c, ...
        const key = counts.join('#');
        
        if (!groups.has(key)) {
            groups.set(key, []);
        }
        groups.get(key)!.push(str);
    }
    
    return [...groups.values()];
}
 
// Example
const words = ["eat", "tea", "tan", "ate", "nat", "bat"];
console.log(groupAnagramsSorting(words));
// [["eat", "tea", "ate"], ["tan", "nat"], ["bat"]]
 
// Trace the counting approach:
// "eat" → counts = [1,0,0,0,1,0,...,1,...] (a:1, e:1, t:1)
// Key = "1#0#0#0#1#0#0#0#0#0#0#0#0#0#0#0#0#0#0#1#0#0#0#0#0#0"
// "tea" → same counts → same key → grouped with "eat"

When to Use Which Approach

Isomorphic Strings and Pattern Matching

Isomorphic Strings (LeetCode 205)

Two strings are isomorphic if characters can be replaced to get the other string (one-to-one mapping).

"egg" and "add" are isomorphic: e→a, g→d
"foo" and "bar" are not: f→b, o→a, but then o→r? (o can't map to two things)

The Key Insight:

Encode each string by the pattern of first occurrences. If both strings have the same pattern, they're isomorphic:

isomorphic_strings
TypeScript
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
/**
 * Check if two strings are isomorphic
 * Time: O(n), Space: O(n)
 */
function isIsomorphic(s: string, t: string): boolean {
    if (s.length !== t.length) return false;
    
    // Pattern: encode each position as the index of first occurrence
    const patternize = (str: string): string => {
        const firstOccurrence = new Map<string, number>();
        const pattern: number[] = [];
        
        for (let i = 0; i < str.length; i++) {
            const char = str[i];
            if (!firstOccurrence.has(char)) {
                firstOccurrence.set(char, i);
            }
            pattern.push(firstOccurrence.get(char)!);
        }
        
        return pattern.join(',');
    };
    
    return patternize(s) === patternize(t);
}
 
// Examples:
// "egg" → pattern [0, 1, 1] (e first seen at 0, g at 1, g at 1)
// "add" → pattern [0, 1, 1] (a first seen at 0, d at 1, d at 1)
// Same pattern → isomorphic!
 
// "foo" → pattern [0, 1, 1]
// "bar" → pattern [0, 1, 2] (all different)
// Different patterns → not isomorphic
 
console.log(isIsomorphic("egg", "add"));   // true
console.log(isIsomorphic("foo", "bar"));   // false
console.log(isIsomorphic("paper", "title")); // true
 
/**
 * Group Isomorphic Strings
 * Given a list of strings, group all isomorphic strings together
 */
function groupIsomorphic(strs: string[]): string[][] {
    const groups = new Map<string, string[]>();
    
    const patternize = (str: string): string => {
        const firstOccurrence = new Map<string, number>();
        const pattern: number[] = [];
        
        for (let i = 0; i < str.length; i++) {
            if (!firstOccurrence.has(str[i])) {
                firstOccurrence.set(str[i], i);
            }
            pattern.push(firstOccurrence.get(str[i])!);
        }
        
        return pattern.join(',');
    };
    
    for (const str of strs) {
        const key = patternize(str);
        if (!groups.has(key)) {
            groups.set(key, []);
        }
        groups.get(key)!.push(str);
    }
    
    return [...groups.values()];
}
 
console.log(groupIsomorphic(["egg", "add", "foo", "bar", "paper", "title"]));
// [["egg", "add"], ["foo"], ["bar"], ["paper", "title"]]

Word Pattern (LeetCode 290)

Given a pattern and a string, check if the string follows the same pattern:

Pattern "abba", String "dog cat cat dog" → true
Pattern "abba", String "dog cat cat fish" → false

word_pattern
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
def word_pattern(pattern: str, s: str) -> bool:
    """
    Check if string follows the pattern.
    Time: O(n + m), Space: O(n)
    """
    words = s.split()
    
    if len(pattern) != len(words):
        return False
    
    # Two mappings needed: pattern → word AND word → pattern
    # Both must be bijective (one-to-one and onto)
    pattern_to_word = {}
    word_to_pattern = {}
    
    for char, word in zip(pattern, words):
        # Check pattern → word mapping
        if char in pattern_to_word:
            if pattern_to_word[char] != word:
                return False
        else:
            pattern_to_word[char] = word
        
        # Check word → pattern mapping
        if word in word_to_pattern:
            if word_to_pattern[word] != char:
                return False
        else:
            word_to_pattern[word] = char
    
    return True
 
 
def word_pattern_using_pattern(pattern: str, s: str) -> bool:
    """
    Alternative: Compare normalized patterns.
    """
    words = s.split()
    
    if len(pattern) != len(words):
        return False
    
    def normalize(items):
        mapping = {}
        result = []
        for item in items:
            if item not in mapping:
                mapping[item] = len(mapping)
            result.append(mapping[item])
        return tuple(result)
    
    return normalize(pattern) == normalize(words)
 
 
# Examples
print(word_pattern("abba", "dog cat cat dog"))  # True
print(word_pattern("abba", "dog cat cat fish")) # False
print(word_pattern("aaaa", "dog cat cat dog"))  # False (a→dog, a→cat conflict)

Multi-Level Grouping

Real-world data often requires grouping at multiple levels—like an organizational hierarchy or a date-time breakdown (year → month → day).

Approach 1: Nested Maps

Build a tree-like structure of nested maps:

multi_level_grouping
TypeScript
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
/**
 * Multi-level grouping with nested maps
 */
interface SaleRecord {
    year: number;
    month: number;
    product: string;
    amount: number;
}
 
function groupSalesMultiLevel(sales: SaleRecord[]): Map<number, Map<number, Map<string, SaleRecord[]>>> {
    const result = new Map<number, Map<number, Map<string, SaleRecord[]>>>();
    
    for (const sale of sales) {
        // Level 1: Year
        if (!result.has(sale.year)) {
            result.set(sale.year, new Map());
        }
        const yearMap = result.get(sale.year)!;
        
        // Level 2: Month
        if (!yearMap.has(sale.month)) {
            yearMap.set(sale.month, new Map());
        }
        const monthMap = yearMap.get(sale.month)!;
        
        // Level 3: Product
        if (!monthMap.has(sale.product)) {
            monthMap.set(sale.product, []);
        }
        monthMap.get(sale.product)!.push(sale);
    }
    
    return result;
}
 
/**
 * Generic multi-level grouping
 */
type KeyFunction<T> = (item: T) => string | number;
 
function groupByMultiple<T>(
    items: T[], 
    keyFns: KeyFunction<T>[]
): Map<any, any> {
    if (keyFns.length === 0) {
        throw new Error("At least one key function required");
    }
    
    if (keyFns.length === 1) {
        // Base case: single-level grouping
        const groups = new Map<any, T[]>();
        for (const item of items) {
            const key = keyFns[0](item);
            if (!groups.has(key)) {
                groups.set(key, []);
            }
            groups.get(key)!.push(item);
        }
        return groups;
    }
    
    // Recursive case: group by first key, then recursively group each bucket
    const firstLevel = new Map<any, T[]>();
    for (const item of items) {
        const key = keyFns[0](item);
        if (!firstLevel.has(key)) {
            firstLevel.set(key, []);
        }
        firstLevel.get(key)!.push(item);
    }
    
    const result = new Map<any, any>();
    for (const [key, group] of firstLevel) {
        result.set(key, groupByMultiple(group, keyFns.slice(1)));
    }
    
    return result;
}
 
// Example usage
const sales: SaleRecord[] = [
    { year: 2024, month: 1, product: 'Widget', amount: 100 },
    { year: 2024, month: 1, product: 'Gadget', amount: 200 },
    { year: 2024, month: 2, product: 'Widget', amount: 150 },
    { year: 2023, month: 12, product: 'Widget', amount: 90 },
];
 
const grouped = groupByMultiple(sales, [
    s => s.year,
    s => s.month,
    s => s.product
]);
 
// Navigate: grouped.get(2024).get(1).get('Widget') → [first sale]

Approach 2: Composite Key

Often simpler: use a single level with a composite key:

composite_key_grouping
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
from collections import defaultdict
from typing import List, Dict, Tuple
from dataclasses import dataclass
 
@dataclass
class SaleRecord:
    year: int
    month: int
    product: str
    amount: float
 
 
def group_by_composite_key(
    sales: List[SaleRecord]
) -> Dict[Tuple[int, int, str], List[SaleRecord]]:
    """
    Single-level grouping with composite key.
    Often simpler than nested maps.
    """
    groups = defaultdict(list)
    
    for sale in sales:
        key = (sale.year, sale.month, sale.product)
        groups[key].append(sale)
    
    return dict(groups)
 
 
def aggregate_by_composite_key(
    sales: List[SaleRecord]
) -> Dict[Tuple[int, int], float]:
    """
    Group and aggregate in one pass.
    Sum amounts by (year, month).
    """
    totals = defaultdict(float)
    
    for sale in sales:
        key = (sale.year, sale.month)
        totals[key] += sale.amount
    
    return dict(totals)
 
 
# Example
sales = [
    SaleRecord(2024, 1, 'Widget', 100),
    SaleRecord(2024, 1, 'Gadget', 200),
    SaleRecord(2024, 1, 'Widget', 50),
    SaleRecord(2024, 2, 'Widget', 150),
]
 
# Group by all three dimensions
by_all = group_by_composite_key(sales)
print(by_all[(2024, 1, 'Widget')])  # [SaleRecord(2024,1,Widget,100), SaleRecord(2024,1,Widget,50)]
 
# Aggregate by year and month
monthly_totals = aggregate_by_composite_key(sales)
print(monthly_totals)  # {(2024, 1): 350.0, (2024, 2): 150.0}

Aggregation with Grouping

Grouping often precedes aggregation—computing summary statistics per group:

Sum/Average revenue per category
Count of events per user
Min/Max values per region

This is the algorithmic equivalent of SQL's GROUP BY with aggregate functions.

group_and_aggregate
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
/**
 * Generic group and aggregate
 */
function groupAndAggregate<T, K, R>(
    items: T[],
    keyFn: (item: T) => K,
    aggregator: (group: T[]) => R
): Map<K, R> {
    // First, group
    const groups = new Map<K, T[]>();
    for (const item of items) {
        const key = keyFn(item);
        if (!groups.has(key)) {
            groups.set(key, []);
        }
        groups.get(key)!.push(item);
    }
    
    // Then, aggregate each group
    const result = new Map<K, R>();
    for (const [key, group] of groups) {
        result.set(key, aggregator(group));
    }
    
    return result;
}
 
// Example: Calculate statistics per category
interface Product {
    category: string;
    name: string;
    price: number;
}
 
const products: Product[] = [
    { category: 'Electronics', name: 'Phone', price: 999 },
    { category: 'Electronics', name: 'Tablet', price: 599 },
    { category: 'Clothing', name: 'Shirt', price: 49 },
    { category: 'Clothing', name: 'Pants', price: 79 },
    { category: 'Electronics', name: 'Laptop', price: 1299 },
];
 
// Sum prices per category
const sumByCategory = groupAndAggregate(
    products,
    p => p.category,
    group => group.reduce((sum, p) => sum + p.price, 0)
);
console.log(sumByCategory);
// Map { 'Electronics' => 2897, 'Clothing' => 128 }
 
// Statistics per category
interface Stats {
    count: number;
    sum: number;
    avg: number;
    min: number;
    max: number;
}
 
const statsByCategory = groupAndAggregate<Product, string, Stats>(
    products,
    p => p.category,
    group => ({
        count: group.length,
        sum: group.reduce((s, p) => s + p.price, 0),
        avg: group.reduce((s, p) => s + p.price, 0) / group.length,
        min: Math.min(...group.map(p => p.price)),
        max: Math.max(...group.map(p => p.price)),
    })
);
console.log(statsByCategory.get('Electronics'));
// { count: 3, sum: 2897, avg: 965.67, min: 599, max: 1299 }

Summary: Grouping Mastery

Grouping is a fundamental pattern that appears across data processing, from simple categorization to complex analytics. Let's consolidate the key insights:

Key Takeaways

•The key function defines the grouping — Choose it carefully; it determines what "sameness" means
•Keys must be hashable — Use strings, tuples, or serialized representations for complex keys
•Anagram grouping is the classic example — Sort or count characters to create the key
•Isomorphic patterns use position encoding — First-occurrence index reveals structural equivalence
•Multi-level grouping can use nested maps or composite keys — Composite keys are often simpler
•Grouping + aggregation = powerful analytics — The algorithmic equivalent of SQL GROUP BY

Common Grouping Patterns Quick Reference
Problem Type	Key Function	Example
Anagrams	Sorted characters or char counts	"eat" → "aet"
Isomorphic strings	First-occurrence pattern	"egg" → [0,1,1]
By property	Object property value	person.city
By range/bucket	value // bucket_size	age // 10 → decade
By composite	Tuple of properties	(year, month, category)
By hash	Hash of content	MD5(file_contents)

Page Complete

4 / 5