What if you could map any key to any shard, with complete flexibility? What if migration were as simple as updating a database row? Directory-based sharding provides exactly this—an explicit lookup table that maps partition keys to shards.
Unlike range or hash sharding where the mapping is computed algorithmically, directory sharding stores the mapping explicitly. This provides maximum flexibility: any key can be moved to any shard at any time. But this power comes with costs—every query requires a lookup, and the directory itself becomes a critical system component.
By the end of this page, you will understand how directory-based sharding works, when it's the right choice, how to implement it efficiently, and the tradeoffs compared to algorithmic sharding. You'll learn patterns for directory caching, consistency, and high availability that make directory sharding practical at scale.
In directory-based sharding, a centralized directory (lookup table) maintains the mapping from partition keys to shards. Every query requires consulting this directory to determine the target shard.
The Basic Mechanism:
1. Maintain a Directory — A database table or key-value store that maps partition keys to shard identifiers

| tenant_id | shard_id |
|-----------|----------|
| acme-corp | shard-3 |
| globex | shard-1 |
| initech | shard-2 |

2. Lookup Before Query — Before any data operation, query the directory to find the target shard
3. Route to Shard — Direct the query to the resolved shard
4. Cache for Performance — Cache directory lookups aggressively to avoid repeated directory queries
Compared to Algorithmic Sharding:
With hash sharding: shard = hash(key) % N — computed instantly, no lookup
With directory sharding: shard = directory.get(key) — requires a query
```typescript
/**
 * Directory-Based Shard Router
 *
 * Uses an explicit lookup table to map keys to shards
 * Includes caching for performance
 */

interface ShardMapping {
  partitionKey: string;
  shardId: string;
  lastUpdated: Date;
}

interface DirectoryStore {
  get(key: string): Promise<ShardMapping | null>;
  set(key: string, shardId: string): Promise<void>;
  getMany(keys: string[]): Promise<Map<string, ShardMapping>>;
}

class DirectoryShardRouter {
  private directory: DirectoryStore;
  private cache: Map<string, { shardId: string; expiry: Date }> = new Map();
  private cacheTtlMs: number;

  constructor(directory: DirectoryStore, cacheTtlMs: number = 60000) {
    this.directory = directory;
    this.cacheTtlMs = cacheTtlMs;
  }

  /**
   * Get the shard for a partition key
   * Checks cache first, then queries directory
   */
  async getShard(partitionKey: string): Promise<string | null> {
    // Check cache first
    const cached = this.cache.get(partitionKey);
    if (cached && cached.expiry > new Date()) {
      return cached.shardId;
    }

    // Cache miss - query directory
    const mapping = await this.directory.get(partitionKey);
    if (!mapping) {
      return null; // Key not registered
    }

    // Update cache
    this.cache.set(partitionKey, {
      shardId: mapping.shardId,
      expiry: new Date(Date.now() + this.cacheTtlMs),
    });

    return mapping.shardId;
  }

  /**
   * Batch lookup for multiple keys
   * More efficient than individual lookups
   */
  async getShards(partitionKeys: string[]): Promise<Map<string, string>> {
    const result = new Map<string, string>();
    const cacheMisses: string[] = [];

    // Check cache for all keys
    for (const key of partitionKeys) {
      const cached = this.cache.get(key);
      if (cached && cached.expiry > new Date()) {
        result.set(key, cached.shardId);
      } else {
        cacheMisses.push(key);
      }
    }

    // Batch query directory for cache misses
    if (cacheMisses.length > 0) {
      const mappings = await this.directory.getMany(cacheMisses);
      for (const [key, mapping] of mappings) {
        result.set(key, mapping.shardId);
        this.cache.set(key, {
          shardId: mapping.shardId,
          expiry: new Date(Date.now() + this.cacheTtlMs),
        });
      }
    }

    return result;
  }

  /**
   * Register a new partition key to a shard
   * Used when creating new tenants/entities
   */
  async registerKey(partitionKey: string, shardId: string): Promise<void> {
    await this.directory.set(partitionKey, shardId);

    // Update cache with new mapping
    this.cache.set(partitionKey, {
      shardId,
      expiry: new Date(Date.now() + this.cacheTtlMs),
    });
  }

  /**
   * Move a key to a different shard
   * This is where directory sharding shines!
   */
  async migrateKey(partitionKey: string, newShardId: string): Promise<void> {
    // Step 1: Update directory (with lock)
    await this.directory.set(partitionKey, newShardId);

    // Step 2: Invalidate cache across all application instances
    // (In production, use pub/sub or cache invalidation broadcast)
    this.cache.delete(partitionKey);

    // Step 3: Actual data migration happens separately
    // (Copy data from old shard to new, then update directory)
  }

  /**
   * Invalidate cache for a key
   * Called when directory update notifications arrive
   */
  invalidateCache(partitionKey: string): void {
    this.cache.delete(partitionKey);
  }
}
```

Without caching, every data query requires two round-trips: one to the directory, one to the shard. This doubles latency. Production directory-sharding implementations cache aggressively, with cache hit rates typically above 99%. The tradeoff is cache invalidation complexity when mappings change.
Directory-based sharding isn't the default choice—algorithmic approaches (hash, range) are simpler. But certain scenarios make directory sharding the best option:
Ideal Use Cases:
| Scenario | Why Directory Works | Alternative Approach |
|---|---|---|
| GDPR data residency | Map EU customers to EU shards explicitly | Complex hash function per region |
| VIP tenant isolation | Place large tenants on dedicated shards | None - requires explicit control |
| Mixed storage tiers | Hot tenants on SSD, cold on HDD | Complex tiering logic |
| Gradual migration | Move tenants one by one during migration | Big-bang migration |
| Acquisition integration | Keep acquired company's data separate | Schema merge required |
The Multi-Tenant SaaS Pattern:
Directory sharding is particularly powerful for multi-tenant SaaS. Consider a B2B application with 10,000 tenants: the vast majority are small enough to share shards, while a handful of large enterprise tenants are big enough (or pay enough) to justify dedicated shards of their own.
This mixed approach—small tenants sharing, large tenants isolated—requires directory sharding. No algorithmic approach can express "tenant X gets shard Y because they pay $1M/year."
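A placement policy like this is easy to express in code. The sketch below is illustrative: the tenant profile shape, the revenue threshold, and the shard names are assumptions, not from the text, but it shows how a directory lets you mix explicit placement for big tenants with hash placement for everyone else.

```typescript
// Hypothetical placement policy: small tenants are packed onto shared
// shards, while tenants above a revenue threshold get a dedicated shard.
// Tier thresholds and shard names here are illustrative.

interface TenantProfile {
  tenantId: string;
  annualRevenueUsd: number;
}

function chooseShard(tenant: TenantProfile, sharedShards: string[]): string {
  // Large accounts get explicit, dedicated placement in the directory
  if (tenant.annualRevenueUsd >= 1_000_000) {
    return `shard-dedicated-${tenant.tenantId}`;
  }
  // Everyone else is spread across the shared pool (simple hash here)
  let hash = 0;
  for (const ch of tenant.tenantId) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  }
  return sharedShards[hash % sharedShards.length];
}

console.log(
  chooseShard({ tenantId: "acme", annualRevenueUsd: 2_000_000 }, ["shard-1", "shard-2"])
); // "shard-dedicated-acme"
// A small tenant lands somewhere in the shared pool
console.log(
  chooseShard({ tenantId: "smallco", annualRevenueUsd: 5_000 }, ["shard-1", "shard-2"])
);
```

The decision made here would then be written into the directory (and is fully reversible later), which is exactly what no pure hash function can express.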
Directory sharding allows a hybrid approach: use hash sharding as the default, but override specific keys in the directory. New tenants get hash-assigned shards, but if a tenant grows huge, manually reassign them to a dedicated shard via directory entry.
The fundamental challenge of directory sharding is that the directory itself becomes a critical dependency. If the directory is unavailable, no queries can be routed. This creates a single point of failure (SPOF) that must be addressed.
Failure Scenarios:

- Directory database down: no new lookups can be resolved; only already-cached mappings keep working
- Directory slow or overloaded: every cache miss inherits the added latency
- Network partition: application servers that cannot reach the directory are stuck with whatever they have cached

Mitigation Strategies:

- Aggressive caching, with a long-lived "stale" cache kept as a fallback
- Timeouts plus a circuit breaker, so a sick directory does not stall every request
- Replicating the directory itself (it is small, so replication is cheap)
- Periodic local snapshots of the full directory for cold starts and recovery
```typescript
/**
 * Resilient Directory Router with Fallback Strategies
 *
 * Handles directory unavailability gracefully
 */

interface ResilientRouterConfig {
  directoryTimeoutMs: number;
  staleCacheFallback: boolean;
  circuitBreakerThreshold: number;
  localSnapshotPath?: string;
}

class ResilientDirectoryRouter {
  private config: ResilientRouterConfig;
  private directory: DirectoryStore;
  private staleCache: Map<string, string> = new Map();
  private freshCache: Map<string, { shardId: string; expiry: Date }> = new Map();

  // Circuit breaker state
  private consecutiveFailures: number = 0;
  private circuitOpen: boolean = false;
  private circuitOpenUntil: Date = new Date(0);

  constructor(directory: DirectoryStore, config: ResilientRouterConfig) {
    this.directory = directory;
    this.config = config;
  }

  async getShard(partitionKey: string): Promise<string | null> {
    // Check fresh cache first
    const fresh = this.freshCache.get(partitionKey);
    if (fresh && fresh.expiry > new Date()) {
      return fresh.shardId;
    }

    // Check if circuit breaker is open
    if (this.circuitOpen && new Date() < this.circuitOpenUntil) {
      console.warn(`Circuit open, using stale cache for ${partitionKey}`);
      return this.getFromStaleCache(partitionKey);
    }

    // Try directory with timeout
    try {
      const mapping = await this.queryDirectoryWithTimeout(partitionKey);

      if (mapping) {
        // Update both caches
        this.freshCache.set(partitionKey, {
          shardId: mapping.shardId,
          expiry: new Date(Date.now() + 60000),
        });
        this.staleCache.set(partitionKey, mapping.shardId);

        // Reset circuit breaker on success
        this.consecutiveFailures = 0;
        this.circuitOpen = false;

        return mapping.shardId;
      }

      return null;
    } catch (error) {
      // Directory query failed
      this.consecutiveFailures++;

      if (this.consecutiveFailures >= this.config.circuitBreakerThreshold) {
        console.error(`Circuit breaker triggered after ${this.consecutiveFailures} failures`);
        this.circuitOpen = true;
        this.circuitOpenUntil = new Date(Date.now() + 30000); // 30s cooldown
      }

      // Fall back to stale cache
      if (this.config.staleCacheFallback) {
        return this.getFromStaleCache(partitionKey);
      }

      throw error;
    }
  }

  private async queryDirectoryWithTimeout(key: string): Promise<ShardMapping | null> {
    const timeoutPromise = new Promise<never>((_, reject) => {
      setTimeout(() => reject(new Error('Directory timeout')), this.config.directoryTimeoutMs);
    });

    return Promise.race([
      this.directory.get(key),
      timeoutPromise,
    ]);
  }

  private getFromStaleCache(key: string): string | null {
    const stale = this.staleCache.get(key);
    if (stale) {
      console.warn(`Serving stale cache for ${key}`);
      return stale;
    }
    return null;
  }

  /**
   * Load directory snapshot for cold start or recovery
   */
  async loadSnapshot(snapshot: Map<string, string>): Promise<void> {
    this.staleCache = new Map(snapshot);
    console.log(`Loaded snapshot with ${snapshot.size} entries`);
  }
}
```

Using the stale cache during a directory outage can route queries to the wrong shard if migrations occurred in the meantime. Weigh this risk: is it better to serve potentially stale data, or to fail entirely? For most systems, serving from the stale cache is preferable to a complete outage.
When a partition key moves to a new shard, all cached mappings must be invalidated. This is the classic distributed cache invalidation problem—one of the hardest problems in computer science.
The Challenge:
Imagine 100 application servers, each caching directory mappings locally. Tenant "acme" moves from shard-1 to shard-2. Each server must:

- Learn that the mapping changed
- Drop or overwrite its cached entry for "acme"
- Route all subsequent queries for "acme" to shard-2
If any server misses the invalidation, it continues routing to shard-1, causing queries to fail or return stale data.
Invalidation Strategies:

- TTL expiry: cache entries simply age out, so staleness is bounded by the TTL
- Pub/sub broadcast: the migration process publishes an invalidation event that every application instance consumes
- Versioned mappings: each cache entry carries a version (e.g., a timestamp), so a stale update can never overwrite a newer one
```typescript
/**
 * Pub/Sub Cache Invalidation for Directory Sharding
 *
 * Uses Redis Pub/Sub to broadcast invalidation events
 */

import { createClient, RedisClientType } from 'redis';

interface InvalidationEvent {
  type: 'invalidate' | 'update';
  partitionKey: string;
  newShardId?: string;
  timestamp: number;
}

class PubSubCacheInvalidator {
  private subscriber: RedisClientType;
  private publisher: RedisClientType;
  private cache: Map<string, { shardId: string; version: number }>;
  private channel: string = 'shard-directory-updates';

  constructor(redisUrl: string) {
    this.subscriber = createClient({ url: redisUrl });
    this.publisher = createClient({ url: redisUrl });
    this.cache = new Map();
  }

  async initialize(): Promise<void> {
    await this.subscriber.connect();
    await this.publisher.connect();

    // Subscribe to invalidation events
    await this.subscriber.subscribe(this.channel, (message) => {
      this.handleInvalidationEvent(JSON.parse(message));
    });

    console.log('Cache invalidation listener initialized');
  }

  /**
   * Called when directory is updated
   * Broadcasts invalidation to all application instances
   */
  async broadcastInvalidation(partitionKey: string, newShardId?: string): Promise<void> {
    const event: InvalidationEvent = {
      type: newShardId ? 'update' : 'invalidate',
      partitionKey,
      newShardId,
      timestamp: Date.now(),
    };

    await this.publisher.publish(this.channel, JSON.stringify(event));
    console.log(`Broadcast invalidation for ${partitionKey}`);
  }

  /**
   * Handle incoming invalidation event
   */
  private handleInvalidationEvent(event: InvalidationEvent): void {
    console.log(`Received invalidation event for ${event.partitionKey}`);

    if (event.type === 'invalidate') {
      // Just remove from cache
      this.cache.delete(event.partitionKey);
    } else if (event.type === 'update' && event.newShardId) {
      // Update cache with new value (optimistic update)
      this.cache.set(event.partitionKey, {
        shardId: event.newShardId,
        version: event.timestamp,
      });
    }
  }

  /**
   * Get shard from cache
   */
  getShard(partitionKey: string): string | null {
    const entry = this.cache.get(partitionKey);
    return entry?.shardId || null;
  }

  /**
   * Update cache (called after directory query)
   */
  setCache(partitionKey: string, shardId: string, version: number): void {
    const existing = this.cache.get(partitionKey);
    // Only update if we have a newer version
    if (!existing || existing.version < version) {
      this.cache.set(partitionKey, { shardId, version });
    }
  }
}

// Usage during migration:
async function migrateTenant(
  tenantId: string,
  oldShardId: string,
  newShardId: string,
  invalidator: PubSubCacheInvalidator
): Promise<void> {
  console.log(`Migrating ${tenantId} from ${oldShardId} to ${newShardId}`);

  // Step 1: Copy data to new shard
  await copyTenantData(tenantId, oldShardId, newShardId);

  // Step 2: Update directory (source of truth)
  await updateDirectory(tenantId, newShardId);

  // Step 3: Broadcast invalidation to all instances
  await invalidator.broadcastInvalidation(tenantId, newShardId);

  // Step 4: Wait for propagation (optional, depends on requirements)
  await new Promise(resolve => setTimeout(resolve, 1000));

  // Step 5: Delete data from old shard
  await deleteTenantData(tenantId, oldShardId);

  console.log(`Migration complete for ${tenantId}`);
}

// Placeholder functions
async function copyTenantData(t: string, from: string, to: string) {}
async function updateDirectory(t: string, shard: string) {}
async function deleteTenantData(t: string, shard: string) {}
```

For strict consistency, some systems use a two-phase approach: (1) mark the key as 'migrating' in the directory, causing queries to wait; (2) migrate the data; (3) update the directory to the new shard; (4) remove the migrating flag. This eliminates inconsistency but introduces migration latency.
The directory's schema significantly impacts performance, flexibility, and operational management. A well-designed directory schema anticipates future needs and enables efficient querying.
Basic Schema:
CREATE TABLE shard_directory (
partition_key VARCHAR(255) PRIMARY KEY,
shard_id VARCHAR(50) NOT NULL,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
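The basic schema maps directly onto the `DirectoryStore` interface the routers use. Before wiring up a real database, an in-memory implementation is handy for unit tests; the sketch below is one way to write it (a production version would issue SELECT/upsert statements against `shard_directory` instead).

```typescript
// Minimal in-memory DirectoryStore -- useful for unit tests before
// wiring up the real shard_directory table. A production implementation
// would run SELECT / upsert queries against that table instead.

interface ShardMapping {
  partitionKey: string;
  shardId: string;
  lastUpdated: Date;
}

class InMemoryDirectoryStore {
  private rows = new Map<string, ShardMapping>();

  async get(key: string): Promise<ShardMapping | null> {
    return this.rows.get(key) ?? null;
  }

  async set(key: string, shardId: string): Promise<void> {
    // Mirrors an upsert on shard_directory (updated_at = NOW())
    this.rows.set(key, { partitionKey: key, shardId, lastUpdated: new Date() });
  }

  async getMany(keys: string[]): Promise<Map<string, ShardMapping>> {
    const found = new Map<string, ShardMapping>();
    for (const key of keys) {
      const row = this.rows.get(key);
      if (row) found.set(key, row);
    }
    return found;
  }
}

(async () => {
  const store = new InMemoryDirectoryStore();
  await store.set("acme-corp", "shard-3");
  console.log((await store.get("acme-corp"))?.shardId); // "shard-3"
})();
```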
Enhanced Schema with Metadata:
```sql
-- Enhanced directory schema for production use

CREATE TABLE shard_directory (
    -- Core mapping
    partition_key VARCHAR(255) PRIMARY KEY,
    shard_id VARCHAR(50) NOT NULL,

    -- Metadata for operational management
    entity_type VARCHAR(50) NOT NULL,  -- 'tenant', 'user', 'workspace'
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW(),

    -- Migration state
    migration_state VARCHAR(20) DEFAULT 'stable',  -- 'stable', 'migrating', 'pending_delete'
    migration_target_shard VARCHAR(50),            -- Where it's migrating to

    -- Sizing hints for rebalancing
    estimated_size_bytes BIGINT DEFAULT 0,
    last_size_update TIMESTAMP,

    -- Operational metadata
    tier VARCHAR(20) DEFAULT 'standard',  -- 'standard', 'premium', 'dedicated'
    region VARCHAR(20),                   -- For data residency
    compliance_flags JSONB                -- GDPR, HIPAA, etc.
);

-- Index for finding all entities on a shard (for rebalancing)
CREATE INDEX idx_shard_directory_shard ON shard_directory(shard_id);

-- Index for finding migrating entities
CREATE INDEX idx_shard_directory_migration ON shard_directory(migration_state)
    WHERE migration_state != 'stable';

-- Index for querying by entity type
CREATE INDEX idx_shard_directory_type ON shard_directory(entity_type);

-- Shard metadata table
CREATE TABLE shards (
    shard_id VARCHAR(50) PRIMARY KEY,
    host VARCHAR(255) NOT NULL,
    port INTEGER NOT NULL,

    -- Capacity tracking
    max_entities INTEGER,
    current_entities INTEGER DEFAULT 0,
    max_size_bytes BIGINT,
    current_size_bytes BIGINT DEFAULT 0,

    -- Status
    status VARCHAR(20) DEFAULT 'active',  -- 'active', 'draining', 'readonly', 'offline'
    region VARCHAR(20),
    tier VARCHAR(20) DEFAULT 'standard',

    -- Connection info
    connection_string TEXT,
    created_at TIMESTAMP DEFAULT NOW()
);

-- View for routing decisions
CREATE VIEW routing_info AS
SELECT
    d.partition_key,
    d.shard_id,
    d.migration_state,
    d.migration_target_shard,
    s.host,
    s.port,
    s.status AS shard_status
FROM shard_directory d
JOIN shards s ON d.shard_id = s.shard_id;

-- Function to get routing with migration awareness
CREATE OR REPLACE FUNCTION get_shard_routing(p_key VARCHAR(255))
RETURNS TABLE(shard_id VARCHAR(50), host VARCHAR(255), port INTEGER, is_migrating BOOLEAN) AS $$
BEGIN
    RETURN QUERY
    SELECT
        COALESCE(d.migration_target_shard, d.shard_id) AS shard_id,
        s.host,
        s.port,
        d.migration_state = 'migrating' AS is_migrating
    FROM shard_directory d
    JOIN shards s ON s.shard_id = COALESCE(d.migration_target_shard, d.shard_id)
    WHERE d.partition_key = p_key
      AND s.status = 'active';
END;
$$ LANGUAGE plpgsql;
```

| Element | Purpose | Impact |
|---|---|---|
| migration_state | Track in-flight migrations | Enables safe, observable migrations |
| estimated_size_bytes | Track entity size | Enables intelligent rebalancing |
| tier | Differentiate service levels | Route premium tenants to better hardware |
| region | Data residency tracking | Compliance with GDPR, data sovereignty |
| compliance_flags | Regulatory requirements | Ensure data stays on compliant shards |
Tracking estimated_size_bytes per entity enables intelligent shard rebalancing. When a shard is overloaded, you can identify the largest entities to move. Without this, rebalancing is guesswork. Update size estimates periodically (e.g., nightly) by querying shard databases.
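A sketch of how those size estimates might drive a rebalancing decision: given the entities on an overloaded shard, pick the fewest entities to move to shed a target number of bytes. The entity names, sizes, and greedy largest-first strategy below are illustrative assumptions, not from the text.

```typescript
// Sketch: choose which entities to migrate off an overloaded shard,
// using the directory's estimated_size_bytes. Largest-first greedy
// selection moves the fewest entities to shed the target bytes.

interface EntitySize {
  partitionKey: string;
  estimatedSizeBytes: number;
}

function pickMigrationCandidates(entities: EntitySize[], bytesToShed: number): string[] {
  const sorted = [...entities].sort(
    (a, b) => b.estimatedSizeBytes - a.estimatedSizeBytes,
  );
  const picked: string[] = [];
  let shed = 0;
  for (const e of sorted) {
    if (shed >= bytesToShed) break;
    picked.push(e.partitionKey);
    shed += e.estimatedSizeBytes;
  }
  return picked;
}

// Illustrative sizes for entities currently on an overloaded shard
const onShard: EntitySize[] = [
  { partitionKey: "acme", estimatedSizeBytes: 50_000_000 },
  { partitionKey: "globex", estimatedSizeBytes: 5_000_000 },
  { partitionKey: "initech", estimatedSizeBytes: 30_000_000 },
];
console.log(pickMigrationCandidates(onShard, 60_000_000));
// ["acme", "initech"] -- two moves shed 80MB
```

Each selected key would then go through the migration flow described earlier (copy data, update directory, broadcast invalidation, delete from old shard).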
In practice, pure directory sharding is rarely used alone. Most production systems use a hybrid approach that combines algorithmic sharding (hash or range) with directory overrides.
The Hybrid Pattern:

- By default, compute the shard algorithmically (e.g., hash(key) % N)
- Before routing, check the directory for an explicit override
- If an override exists, it wins; otherwise use the computed shard
- Add overrides only for the keys that need special placement
This gives you the simplicity of algorithmic sharding for the 99% case, with directory flexibility for the 1% that need special handling.
```typescript
/**
 * Hybrid Shard Router
 *
 * Combines algorithmic (hash) sharding with directory overrides
 * Directory entries take precedence over computed shards
 *
 * Note: assumes the DirectoryStore interface is extended with
 * delete() and count() in addition to the get/set methods used earlier
 */

class HybridShardRouter {
  private directoryCache: Map<string, string> = new Map();
  private directory: DirectoryStore;
  private shardCount: number;

  constructor(directory: DirectoryStore, shardCount: number) {
    this.directory = directory;
    this.shardCount = shardCount;
  }

  /**
   * Get shard for a partition key
   * 1. Check directory (for overrides)
   * 2. Fall back to hash computation
   */
  async getShard(partitionKey: string): Promise<string> {
    // Step 1: Check cache for directory override
    const cached = this.directoryCache.get(partitionKey);
    if (cached) {
      return cached;
    }

    // Step 2: Query directory for explicit mapping
    const directoryMapping = await this.directory.get(partitionKey);
    if (directoryMapping) {
      this.directoryCache.set(partitionKey, directoryMapping.shardId);
      return directoryMapping.shardId;
    }

    // Step 3: No directory entry - compute via hash
    return this.computeHashShard(partitionKey);
  }

  /**
   * Compute shard using hash (default behavior)
   */
  private computeHashShard(partitionKey: string): string {
    const hash = this.hashKey(partitionKey);
    const shardIndex = hash % this.shardCount;
    return `shard-${shardIndex}`;
  }

  /**
   * Override a partition key to a specific shard
   * Used for:
   * - Moving large tenants to dedicated shards
   * - Data residency requirements
   * - Isolating problematic tenants
   */
  async overrideShard(partitionKey: string, shardId: string): Promise<void> {
    // Add to directory
    await this.directory.set(partitionKey, shardId);

    // Update cache
    this.directoryCache.set(partitionKey, shardId);

    console.log(`Override: ${partitionKey} -> ${shardId}`);
  }

  /**
   * Remove override, revert to algorithmic sharding
   */
  async removeOverride(partitionKey: string): Promise<void> {
    await this.directory.delete(partitionKey);
    this.directoryCache.delete(partitionKey);

    // Entity now routes via hash
    const computedShard = this.computeHashShard(partitionKey);
    console.log(`Override removed: ${partitionKey} reverts to ${computedShard}`);
  }

  /**
   * Get stats on overrides vs algorithmic routing
   */
  async getRoutingStats(): Promise<{ overrides: number; algorithmic: number }> {
    const overrideCount = await this.directory.count();
    return {
      overrides: overrideCount,
      algorithmic: -1, // Unknown without querying the shard databases
    };
  }

  // Simplified multiply-and-xor string hash
  // (not a full MurmurHash3 implementation)
  private hashKey(key: string): number {
    let hash = 0x12345678;
    for (let i = 0; i < key.length; i++) {
      hash ^= key.charCodeAt(i);
      hash = Math.imul(hash, 0x5bd1e995);
      hash ^= hash >>> 15;
    }
    return hash >>> 0; // unsigned, so already non-negative
  }
}

// Example usage (directoryStore is any DirectoryStore implementation)
declare const directoryStore: DirectoryStore;
const router = new HybridShardRouter(directoryStore, 16);

// Most tenants use hash sharding
const smallTenantShard = await router.getShard("small-tenant-123");
// Returns computed shard, e.g., "shard-7"

// Large tenant gets explicit override
await router.overrideShard("enterprise-acme", "shard-dedicated-acme");

// Now always routes to dedicated shard
const acmeShard = await router.getShard("enterprise-acme");
// Returns "shard-dedicated-acme"
```

Hybrid sharding is often the sweet spot for growing systems. Start with pure algorithmic sharding. When specific tenants need special handling, add directory overrides for just those cases. You maintain simplicity for the common case while gaining flexibility where needed.
As the number of partition keys grows into millions, the directory itself can become a scalability challenge. Let's examine strategies for scaling the directory.
Directory Scaling Challenges:

- Size: millions of rows must be stored, replicated, and snapshotted
- Read load: every cache miss on every application server hits the directory
- Write bursts: bulk migrations and rebalancing touch many rows at once
- Cache fan-out: more entries and more servers mean more invalidation traffic
Scaling Strategies:
| Entity Count | Recommended Approach | Directory Size | Cache Strategy |
|---|---|---|---|
| < 10,000 | Full directory in any DB | < 1MB | Cache everything |
| 10K - 100K | PostgreSQL/MySQL with caching | 1-10MB | LRU cache, 60s TTL |
| 100K - 1M | Hybrid (overrides only) or Redis | 10-100MB | Distributed cache |
| 1M - 10M | Group-level sharding or sharded directory | 100MB-1GB | Tiered caching |
| > 10M | Hierarchical + sharded directory | 1GB+ | Regional caching, pre-warming |
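One way to read the "group-level sharding" rows in the table above: instead of one directory entry per entity, keys are hashed into a fixed number of groups, and the directory stores only the group-to-shard map. The directory then stays small no matter how many keys exist, and migrating a group moves all of its keys at once. A sketch, with an illustrative group count of 4096 (an assumption, not from the text):

```typescript
// Group-level directory sketch: millions of keys hash into a fixed
// number of groups; only the group -> shard map is stored explicitly.

const GROUP_COUNT = 4096; // illustrative

function groupOf(partitionKey: string): number {
  let hash = 0;
  for (const ch of partitionKey) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  }
  return hash % GROUP_COUNT;
}

class GroupDirectory {
  // At most GROUP_COUNT entries, regardless of how many keys exist
  private groupToShard = new Map<number, string>();

  assignGroup(group: number, shardId: string): void {
    this.groupToShard.set(group, shardId);
  }

  getShard(partitionKey: string): string | undefined {
    return this.groupToShard.get(groupOf(partitionKey));
  }
}

const dir = new GroupDirectory();
for (let g = 0; g < GROUP_COUNT; g++) {
  dir.assignGroup(g, `shard-${g % 16}`); // spread groups over 16 shards
}
// Any key resolves through its group
console.log(dir.getShard("tenant-12345"));
```

The tradeoff is granularity: you can no longer place an individual key, only its whole group, which is why very large systems combine this with per-key overrides for exceptional tenants.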
Case Study: Slack's Enterprise Grid
Slack uses a form of directory sharding for Enterprise Grid: workspace-to-shard mappings are tracked explicitly, so most workspaces share pooled shards while large organizations that need it can be placed on dedicated infrastructure.
This hybrid approach scales to millions of workspaces while providing enterprise-grade isolation for customers who need it.
Don't over-engineer the directory initially. Start with a simple PostgreSQL table and in-memory cache. Only add complexity (Redis, sharded directory, hierarchical lookup) when you actually hit scale limits. Many successful systems run for years with a simple directory in a single database table.
We've covered directory-based sharding comprehensively. Let's consolidate the key insights:

- Directory sharding stores the key-to-shard mapping explicitly, trading an extra lookup for complete placement flexibility
- Aggressive caching (hit rates above 99%) is what makes the lookup cost acceptable in production
- The directory is a critical dependency: mitigate with replication, stale-cache fallback, circuit breakers, and snapshots
- Cache invalidation on migration needs a broadcast mechanism such as pub/sub, plus versioning so stale updates never overwrite newer ones
- Most production systems are hybrid: algorithmic sharding by default, directory overrides for the exceptions
- A rich directory schema (migration state, size estimates, tier, region) turns the directory into an operational control plane
What's Next:
Now that you understand all three sharding strategies (range, hash, directory), we'll tackle the most critical decision in any sharding implementation: shard key selection. The shard key determines query efficiency, data distribution, and migration complexity. Choosing poorly can doom your sharding implementation; choosing well makes everything else easier.
You now understand directory-based sharding: how it works, when to use it, how to make it resilient, and how to combine it with algorithmic approaches. You have the complete picture of sharding strategies. Next, we'll learn how to choose the shard key that makes your entire sharding architecture work.