What if you could map any key to any shard, with complete flexibility? What if migration were as simple as updating a database row? Directory-based sharding provides exactly this—an explicit lookup table that maps partition keys to shards.
Unlike range or hash sharding where the mapping is computed algorithmically, directory sharding stores the mapping explicitly. This provides maximum flexibility: any key can be moved to any shard at any time. But this power comes with costs—every query requires a lookup, and the directory itself becomes a critical system component.
By the end of this page, you will understand how directory-based sharding works, when it's the right choice, how to implement it efficiently, and the tradeoffs compared to algorithmic sharding. You'll learn patterns for directory caching, consistency, and high availability that make directory sharding practical at scale.
In directory-based sharding, a centralized directory (lookup table) maintains the mapping from partition keys to shards. Every query requires consulting this directory to determine the target shard.
The Basic Mechanism:
1. Maintain a Directory — A database table or key-value store that maps partition keys to shard identifiers

| tenant_id | shard_id |
|-----------|----------|
| acme-corp | shard-3 |
| globex | shard-1 |
| initech | shard-2 |

2. Lookup Before Query — Before any data operation, query the directory to find the target shard
3. Route to Shard — Direct the query to the resolved shard
4. Cache for Performance — Cache directory lookups aggressively to avoid repeated directory queries
Compared to Algorithmic Sharding:
With hash sharding: shard = hash(key) % N — computed instantly, no lookup
With directory sharding: shard = directory.get(key) — requires a query
```typescript
/**
 * Directory-Based Shard Router
 *
 * Uses an explicit lookup table to map keys to shards
 * Includes caching for performance
 */

interface ShardMapping {
  partitionKey: string;
  shardId: string;
  lastUpdated: Date;
}

interface DirectoryStore {
  get(key: string): Promise<ShardMapping | null>;
  set(key: string, shardId: string): Promise<void>;
  getMany(keys: string[]): Promise<Map<string, ShardMapping>>;
}

class DirectoryShardRouter {
  private directory: DirectoryStore;
  private cache: Map<string, { shardId: string; expiry: Date }> = new Map();
  private cacheTtlMs: number;

  constructor(directory: DirectoryStore, cacheTtlMs: number = 60000) {
    this.directory = directory;
    this.cacheTtlMs = cacheTtlMs;
  }

  /**
   * Get the shard for a partition key
   * Checks cache first, then queries directory
   */
  async getShard(partitionKey: string): Promise<string | null> {
    // Check cache first
    const cached = this.cache.get(partitionKey);
    if (cached && cached.expiry > new Date()) {
      return cached.shardId;
    }

    // Cache miss - query directory
    const mapping = await this.directory.get(partitionKey);
    if (!mapping) {
      return null; // Key not registered
    }

    // Update cache
    this.cache.set(partitionKey, {
      shardId: mapping.shardId,
      expiry: new Date(Date.now() + this.cacheTtlMs),
    });

    return mapping.shardId;
  }

  /**
   * Batch lookup for multiple keys
   * More efficient than individual lookups
   */
  async getShards(partitionKeys: string[]): Promise<Map<string, string>> {
    const result = new Map<string, string>();
    const cacheMisses: string[] = [];

    // Check cache for all keys
    for (const key of partitionKeys) {
      const cached = this.cache.get(key);
      if (cached && cached.expiry > new Date()) {
        result.set(key, cached.shardId);
      } else {
        cacheMisses.push(key);
      }
    }

    // Batch query directory for cache misses
    if (cacheMisses.length > 0) {
      const mappings = await this.directory.getMany(cacheMisses);
      for (const [key, mapping] of mappings) {
        result.set(key, mapping.shardId);
        this.cache.set(key, {
          shardId: mapping.shardId,
          expiry: new Date(Date.now() + this.cacheTtlMs),
        });
      }
    }

    return result;
  }

  /**
   * Register a new partition key to a shard
   * Used when creating new tenants/entities
   */
  async registerKey(partitionKey: string, shardId: string): Promise<void> {
    await this.directory.set(partitionKey, shardId);

    // Update cache with new mapping
    this.cache.set(partitionKey, {
      shardId,
      expiry: new Date(Date.now() + this.cacheTtlMs),
    });
  }

  /**
   * Move a key to a different shard
   * This is where directory sharding shines!
   */
  async migrateKey(partitionKey: string, newShardId: string): Promise<void> {
    // Step 1: Update directory (with lock)
    await this.directory.set(partitionKey, newShardId);

    // Step 2: Invalidate cache across all application instances
    // (In production, use pub/sub or cache invalidation broadcast)
    this.cache.delete(partitionKey);

    // Step 3: Actual data migration happens separately
    // (Copy data from old shard to new, then update directory)
  }

  /**
   * Invalidate cache for a key
   * Called when directory update notifications arrive
   */
  invalidateCache(partitionKey: string): void {
    this.cache.delete(partitionKey);
  }
}
```

Without caching, every data query requires two round-trips: one to the directory, one to the shard. This doubles latency. Production directory-sharding implementations cache aggressively, with cache hit rates typically above 99%. The tradeoff is cache invalidation complexity when mappings change.
Directory-based sharding isn't the default choice—algorithmic approaches (hash, range) are simpler. But certain scenarios make directory sharding the best option:
Ideal Use Cases:
| Scenario | Why Directory Works | Alternative Approach |
|---|---|---|
| GDPR data residency | Map EU customers to EU shards explicitly | Complex hash function per region |
| VIP tenant isolation | Place large tenants on dedicated shards | None - requires explicit control |
| Mixed storage tiers | Hot tenants on SSD, cold on HDD | Complex tiering logic |
| Gradual migration | Move tenants one by one during migration | Big-bang migration |
| Acquisition integration | Keep acquired company's data separate | Schema merge required |
The Multi-Tenant SaaS Pattern:
Directory sharding is particularly powerful for multi-tenant SaaS. Consider a B2B application with 10,000 tenants: the vast majority are small enough to share shards, while a handful of large enterprise tenants are big enough (or pay enough) to justify dedicated shards of their own.
This mixed approach—small tenants sharing, large tenants isolated—requires directory sharding. No algorithmic approach can express "tenant X gets shard Y because they pay $1M/year."
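A placement policy like this is easy to express in code. The sketch below is illustrative: the tenant profile shape, the revenue threshold, and the shard names are assumptions, not from the text, but it shows how a directory lets you mix explicit placement for big tenants with hash placement for everyone else.

```typescript
// Hypothetical placement policy: small tenants are packed onto shared
// shards, while tenants above a revenue threshold get a dedicated shard.
// Tier thresholds and shard names here are illustrative.

interface TenantProfile {
  tenantId: string;
  annualRevenueUsd: number;
}

function chooseShard(tenant: TenantProfile, sharedShards: string[]): string {
  // Large accounts get explicit, dedicated placement in the directory
  if (tenant.annualRevenueUsd >= 1_000_000) {
    return `shard-dedicated-${tenant.tenantId}`;
  }
  // Everyone else is spread across the shared pool (simple hash here)
  let hash = 0;
  for (const ch of tenant.tenantId) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  }
  return sharedShards[hash % sharedShards.length];
}

console.log(
  chooseShard({ tenantId: "acme", annualRevenueUsd: 2_000_000 }, ["shard-1", "shard-2"])
); // "shard-dedicated-acme"
// A small tenant lands somewhere in the shared pool
console.log(
  chooseShard({ tenantId: "smallco", annualRevenueUsd: 5_000 }, ["shard-1", "shard-2"])
);
```

The decision made here would then be written into the directory (and is fully reversible later), which is exactly what no pure hash function can express.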
Directory sharding allows a hybrid approach: use hash sharding as the default, but override specific keys in the directory. New tenants get hash-assigned shards, but if a tenant grows huge, manually reassign them to a dedicated shard via directory entry.
The fundamental challenge of directory sharding is that the directory itself becomes a critical dependency. If the directory is unavailable, no queries can be routed. This creates a single point of failure (SPOF) that must be addressed.
Failure Scenarios:

- Directory database down: no new lookups can be resolved; only already-cached mappings keep working
- Directory slow or overloaded: every cache miss inherits the added latency
- Network partition: application servers that cannot reach the directory are stuck with whatever they have cached

Mitigation Strategies:

- Aggressive caching, with a long-lived "stale" cache kept as a fallback
- Timeouts plus a circuit breaker, so a sick directory does not stall every request
- Replicating the directory itself (it is small, so replication is cheap)
- Periodic local snapshots of the full directory for cold starts and recovery
```typescript
/**
 * Resilient Directory Router with Fallback Strategies
 *
 * Handles directory unavailability gracefully
 */

interface ResilientRouterConfig {
  directoryTimeoutMs: number;
  staleCacheFallback: boolean;
  circuitBreakerThreshold: number;
  localSnapshotPath?: string;
}

class ResilientDirectoryRouter {
  private config: ResilientRouterConfig;
  private directory: DirectoryStore;
  private staleCache: Map<string, string> = new Map();
  private freshCache: Map<string, { shardId: string; expiry: Date }> = new Map();

  // Circuit breaker state
  private consecutiveFailures: number = 0;
  private circuitOpen: boolean = false;
  private circuitOpenUntil: Date = new Date(0);

  constructor(directory: DirectoryStore, config: ResilientRouterConfig) {
    this.directory = directory;
    this.config = config;
  }

  async getShard(partitionKey: string): Promise<string | null> {
    // Check fresh cache first
    const fresh = this.freshCache.get(partitionKey);
    if (fresh && fresh.expiry > new Date()) {
      return fresh.shardId;
    }

    // Check if circuit breaker is open
    if (this.circuitOpen && new Date() < this.circuitOpenUntil) {
      console.warn(`Circuit open, using stale cache for ${partitionKey}`);
      return this.getFromStaleCache(partitionKey);
    }

    // Try directory with timeout
    try {
      const mapping = await this.queryDirectoryWithTimeout(partitionKey);

      if (mapping) {
        // Update both caches
        this.freshCache.set(partitionKey, {
          shardId: mapping.shardId,
          expiry: new Date(Date.now() + 60000),
        });
        this.staleCache.set(partitionKey, mapping.shardId);

        // Reset circuit breaker on success
        this.consecutiveFailures = 0;
        this.circuitOpen = false;

        return mapping.shardId;
      }

      return null;
    } catch (error) {
      // Directory query failed
      this.consecutiveFailures++;

      if (this.consecutiveFailures >= this.config.circuitBreakerThreshold) {
        console.error(`Circuit breaker triggered after ${this.consecutiveFailures} failures`);
        this.circuitOpen = true;
        this.circuitOpenUntil = new Date(Date.now() + 30000); // 30s cooldown
      }

      // Fall back to stale cache
      if (this.config.staleCacheFallback) {
        return this.getFromStaleCache(partitionKey);
      }

      throw error;
    }
  }

  private async queryDirectoryWithTimeout(key: string): Promise<ShardMapping | null> {
    const timeoutPromise = new Promise<never>((_, reject) => {
      setTimeout(() => reject(new Error('Directory timeout')), this.config.directoryTimeoutMs);
    });

    return Promise.race([
      this.directory.get(key),
      timeoutPromise,
    ]);
  }

  private getFromStaleCache(key: string): string | null {
    const stale = this.staleCache.get(key);
    if (stale) {
      console.warn(`Serving stale cache for ${key}`);
      return stale;
    }
    return null;
  }

  /**
   * Load directory snapshot for cold start or recovery
   */
  async loadSnapshot(snapshot: Map<string, string>): Promise<void> {
    this.staleCache = new Map(snapshot);
    console.log(`Loaded snapshot with ${snapshot.size} entries`);
  }
}
```

Using the stale cache during a directory outage can route queries to the wrong shard if migrations occurred in the meantime. Weigh this risk: is it better to serve potentially stale data, or to fail entirely? For most systems, serving from the stale cache is preferable to a complete outage.
When a partition key moves to a new shard, all cached mappings must be invalidated. This is the classic distributed cache invalidation problem—one of the hardest problems in computer science.
The Challenge:
Imagine 100 application servers, each caching directory mappings locally. Tenant "acme" moves from shard-1 to shard-2. Each server must:

- Learn that the mapping changed
- Drop or overwrite its cached entry for "acme"
- Route all subsequent queries for "acme" to shard-2
If any server misses the invalidation, it continues routing to shard-1, causing queries to fail or return stale data.
Invalidation Strategies:

- TTL expiry: cache entries simply age out, so staleness is bounded by the TTL
- Pub/sub broadcast: the migration process publishes an invalidation event that every application instance consumes
- Versioned mappings: each cache entry carries a version (e.g., a timestamp), so a stale update can never overwrite a newer one
```typescript
/**
 * Pub/Sub Cache Invalidation for Directory Sharding
 *
 * Uses Redis Pub/Sub to broadcast invalidation events
 */

import { createClient, RedisClientType } from 'redis';

interface InvalidationEvent {
  type: 'invalidate' | 'update';
  partitionKey: string;
  newShardId?: string;
  timestamp: number;
}

class PubSubCacheInvalidator {
  private subscriber: RedisClientType;
  private publisher: RedisClientType;
  private cache: Map<string, { shardId: string; version: number }>;
  private channel: string = 'shard-directory-updates';

  constructor(redisUrl: string) {
    this.subscriber = createClient({ url: redisUrl });
    this.publisher = createClient({ url: redisUrl });
    this.cache = new Map();
  }

  async initialize(): Promise<void> {
    await this.subscriber.connect();
    await this.publisher.connect();

    // Subscribe to invalidation events
    await this.subscriber.subscribe(this.channel, (message) => {
      this.handleInvalidationEvent(JSON.parse(message));
    });

    console.log('Cache invalidation listener initialized');
  }

  /**
   * Called when directory is updated
   * Broadcasts invalidation to all application instances
   */
  async broadcastInvalidation(partitionKey: string, newShardId?: string): Promise<void> {
    const event: InvalidationEvent = {
      type: newShardId ? 'update' : 'invalidate',
      partitionKey,
      newShardId,
      timestamp: Date.now(),
    };

    await this.publisher.publish(this.channel, JSON.stringify(event));
    console.log(`Broadcast invalidation for ${partitionKey}`);
  }

  /**
   * Handle incoming invalidation event
   */
  private handleInvalidationEvent(event: InvalidationEvent): void {
    console.log(`Received invalidation event for ${event.partitionKey}`);

    if (event.type === 'invalidate') {
      // Just remove from cache
      this.cache.delete(event.partitionKey);
    } else if (event.type === 'update' && event.newShardId) {
      // Update cache with new value (optimistic update)
      this.cache.set(event.partitionKey, {
        shardId: event.newShardId,
        version: event.timestamp,
      });
    }
  }

  /**
   * Get shard from cache
   */
  getShard(partitionKey: string): string | null {
    const entry = this.cache.get(partitionKey);
    return entry?.shardId || null;
  }

  /**
   * Update cache (called after directory query)
   */
  setCache(partitionKey: string, shardId: string, version: number): void {
    const existing = this.cache.get(partitionKey);
    // Only update if we have a newer version
    if (!existing || existing.version < version) {
      this.cache.set(partitionKey, { shardId, version });
    }
  }
}

// Usage during migration:
async function migrateTenant(
  tenantId: string,
  oldShardId: string,
  newShardId: string,
  invalidator: PubSubCacheInvalidator
): Promise<void> {
  console.log(`Migrating ${tenantId} from ${oldShardId} to ${newShardId}`);

  // Step 1: Copy data to new shard
  await copyTenantData(tenantId, oldShardId, newShardId);

  // Step 2: Update directory (source of truth)
  await updateDirectory(tenantId, newShardId);

  // Step 3: Broadcast invalidation to all instances
  await invalidator.broadcastInvalidation(tenantId, newShardId);

  // Step 4: Wait for propagation (optional, depends on requirements)
  await new Promise(resolve => setTimeout(resolve, 1000));

  // Step 5: Delete data from old shard
  await deleteTenantData(tenantId, oldShardId);

  console.log(`Migration complete for ${tenantId}`);
}

// Placeholder functions
async function copyTenantData(t: string, from: string, to: string) {}
async function updateDirectory(t: string, shard: string) {}
async function deleteTenantData(t: string, shard: string) {}
```

For strict consistency, some systems use a two-phase approach: (1) mark the key as 'migrating' in the directory, causing queries to wait; (2) migrate the data; (3) update the directory to the new shard; (4) remove the migrating flag. This eliminates inconsistency but introduces migration latency.
The directory's schema significantly impacts performance, flexibility, and operational management. A well-designed directory schema anticipates future needs and enables efficient querying.
Basic Schema:
CREATE TABLE shard_directory (
partition_key VARCHAR(255) PRIMARY KEY,
shard_id VARCHAR(50) NOT NULL,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
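The basic schema maps directly onto the `DirectoryStore` interface the routers use. Before wiring up a real database, an in-memory implementation is handy for unit tests; the sketch below is one way to write it (a production version would issue SELECT/upsert statements against `shard_directory` instead).

```typescript
// Minimal in-memory DirectoryStore -- useful for unit tests before
// wiring up the real shard_directory table. A production implementation
// would run SELECT / upsert queries against that table instead.

interface ShardMapping {
  partitionKey: string;
  shardId: string;
  lastUpdated: Date;
}

class InMemoryDirectoryStore {
  private rows = new Map<string, ShardMapping>();

  async get(key: string): Promise<ShardMapping | null> {
    return this.rows.get(key) ?? null;
  }

  async set(key: string, shardId: string): Promise<void> {
    // Mirrors an upsert on shard_directory (updated_at = NOW())
    this.rows.set(key, { partitionKey: key, shardId, lastUpdated: new Date() });
  }

  async getMany(keys: string[]): Promise<Map<string, ShardMapping>> {
    const found = new Map<string, ShardMapping>();
    for (const key of keys) {
      const row = this.rows.get(key);
      if (row) found.set(key, row);
    }
    return found;
  }
}

(async () => {
  const store = new InMemoryDirectoryStore();
  await store.set("acme-corp", "shard-3");
  console.log((await store.get("acme-corp"))?.shardId); // "shard-3"
})();
```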
Enhanced Schema with Metadata:
```sql
-- Enhanced directory schema for production use

CREATE TABLE shard_directory (
    -- Core mapping
    partition_key VARCHAR(255) PRIMARY KEY,
    shard_id VARCHAR(50) NOT NULL,

    -- Metadata for operational management
    entity_type VARCHAR(50) NOT NULL,  -- 'tenant', 'user', 'workspace'
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW(),

    -- Migration state
    migration_state VARCHAR(20) DEFAULT 'stable',  -- 'stable', 'migrating', 'pending_delete'
    migration_target_shard VARCHAR(50),            -- Where it's migrating to

    -- Sizing hints for rebalancing
    estimated_size_bytes BIGINT DEFAULT 0,
    last_size_update TIMESTAMP,

    -- Operational metadata
    tier VARCHAR(20) DEFAULT 'standard',  -- 'standard', 'premium', 'dedicated'
    region VARCHAR(20),                   -- For data residency
    compliance_flags JSONB                -- GDPR, HIPAA, etc.
);

-- Index for finding all entities on a shard (for rebalancing)
CREATE INDEX idx_shard_directory_shard ON shard_directory(shard_id);

-- Index for finding migrating entities
CREATE INDEX idx_shard_directory_migration ON shard_directory(migration_state)
    WHERE migration_state != 'stable';

-- Index for querying by entity type
CREATE INDEX idx_shard_directory_type ON shard_directory(entity_type);

-- Shard metadata table
CREATE TABLE shards (
    shard_id VARCHAR(50) PRIMARY KEY,
    host VARCHAR(255) NOT NULL,
    port INTEGER NOT NULL,

    -- Capacity tracking
    max_entities INTEGER,
    current_entities INTEGER DEFAULT 0,
    max_size_bytes BIGINT,
    current_size_bytes BIGINT DEFAULT 0,

    -- Status
    status VARCHAR(20) DEFAULT 'active',  -- 'active', 'draining', 'readonly', 'offline'
    region VARCHAR(20),
    tier VARCHAR(20) DEFAULT 'standard',

    -- Connection info
    connection_string TEXT,
    created_at TIMESTAMP DEFAULT NOW()
);

-- View for routing decisions
CREATE VIEW routing_info AS
SELECT
    d.partition_key,
    d.shard_id,
    d.migration_state,
    d.migration_target_shard,
    s.host,
    s.port,
    s.status AS shard_status
FROM shard_directory d
JOIN shards s ON d.shard_id = s.shard_id;

-- Function to get routing with migration awareness
CREATE OR REPLACE FUNCTION get_shard_routing(p_key VARCHAR(255))
RETURNS TABLE(shard_id VARCHAR(50), host VARCHAR(255), port INTEGER, is_migrating BOOLEAN) AS $$
BEGIN
    RETURN QUERY
    SELECT
        COALESCE(d.migration_target_shard, d.shard_id) AS shard_id,
        s.host,
        s.port,
        d.migration_state = 'migrating' AS is_migrating
    FROM shard_directory d
    JOIN shards s ON s.shard_id = COALESCE(d.migration_target_shard, d.shard_id)
    WHERE d.partition_key = p_key
      AND s.status = 'active';
END;
$$ LANGUAGE plpgsql;
```

| Element | Purpose | Impact |
|---|---|---|
| migration_state | Track in-flight migrations | Enables safe, observable migrations |
| estimated_size_bytes | Track entity size | Enables intelligent rebalancing |
| tier | Differentiate service levels | Route premium tenants to better hardware |
| region | Data residency tracking | Compliance with GDPR, data sovereignty |
| compliance_flags | Regulatory requirements | Ensure data stays on compliant shards |
Tracking estimated_size_bytes per entity enables intelligent shard rebalancing. When a shard is overloaded, you can identify the largest entities to move. Without this, rebalancing is guesswork. Update size estimates periodically (e.g., nightly) by querying shard databases.
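A sketch of how those size estimates might drive a rebalancing decision: given the entities on an overloaded shard, pick the fewest entities to move to shed a target number of bytes. The entity names, sizes, and greedy largest-first strategy below are illustrative assumptions, not from the text.

```typescript
// Sketch: choose which entities to migrate off an overloaded shard,
// using the directory's estimated_size_bytes. Largest-first greedy
// selection moves the fewest entities to shed the target bytes.

interface EntitySize {
  partitionKey: string;
  estimatedSizeBytes: number;
}

function pickMigrationCandidates(entities: EntitySize[], bytesToShed: number): string[] {
  const sorted = [...entities].sort(
    (a, b) => b.estimatedSizeBytes - a.estimatedSizeBytes,
  );
  const picked: string[] = [];
  let shed = 0;
  for (const e of sorted) {
    if (shed >= bytesToShed) break;
    picked.push(e.partitionKey);
    shed += e.estimatedSizeBytes;
  }
  return picked;
}

// Illustrative sizes for entities currently on an overloaded shard
const onShard: EntitySize[] = [
  { partitionKey: "acme", estimatedSizeBytes: 50_000_000 },
  { partitionKey: "globex", estimatedSizeBytes: 5_000_000 },
  { partitionKey: "initech", estimatedSizeBytes: 30_000_000 },
];
console.log(pickMigrationCandidates(onShard, 60_000_000));
// ["acme", "initech"] -- two moves shed 80MB
```

Each selected key would then go through the migration flow described earlier (copy data, update directory, broadcast invalidation, delete from old shard).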
In practice, pure directory sharding is rarely used alone. Most production systems use a hybrid approach that combines algorithmic sharding (hash or range) with directory overrides.
The Hybrid Pattern:

- By default, compute the shard algorithmically (e.g., hash(key) % N)
- Before routing, check the directory for an explicit override
- If an override exists, it wins; otherwise use the computed shard
- Add overrides only for the keys that need special placement
This gives you the simplicity of algorithmic sharding for the 99% case, with directory flexibility for the 1% that need special handling.
```typescript
/**
 * Hybrid Shard Router
 *
 * Combines algorithmic (hash) sharding with directory overrides
 * Directory entries take precedence over computed shards
 *
 * Note: assumes the DirectoryStore interface is extended with
 * delete() and count() in addition to the get/set methods used earlier
 */

class HybridShardRouter {
  private directoryCache: Map<string, string> = new Map();
  private directory: DirectoryStore;
  private shardCount: number;

  constructor(directory: DirectoryStore, shardCount: number) {
    this.directory = directory;
    this.shardCount = shardCount;
  }

  /**
   * Get shard for a partition key
   * 1. Check directory (for overrides)
   * 2. Fall back to hash computation
   */
  async getShard(partitionKey: string): Promise<string> {
    // Step 1: Check cache for directory override
    const cached = this.directoryCache.get(partitionKey);
    if (cached) {
      return cached;
    }

    // Step 2: Query directory for explicit mapping
    const directoryMapping = await this.directory.get(partitionKey);
    if (directoryMapping) {
      this.directoryCache.set(partitionKey, directoryMapping.shardId);
      return directoryMapping.shardId;
    }

    // Step 3: No directory entry - compute via hash
    return this.computeHashShard(partitionKey);
  }

  /**
   * Compute shard using hash (default behavior)
   */
  private computeHashShard(partitionKey: string): string {
    const hash = this.hashKey(partitionKey);
    const shardIndex = hash % this.shardCount;
    return `shard-${shardIndex}`;
  }

  /**
   * Override a partition key to a specific shard
   * Used for:
   * - Moving large tenants to dedicated shards
   * - Data residency requirements
   * - Isolating problematic tenants
   */
  async overrideShard(partitionKey: string, shardId: string): Promise<void> {
    // Add to directory
    await this.directory.set(partitionKey, shardId);

    // Update cache
    this.directoryCache.set(partitionKey, shardId);

    console.log(`Override: ${partitionKey} -> ${shardId}`);
  }

  /**
   * Remove override, revert to algorithmic sharding
   */
  async removeOverride(partitionKey: string): Promise<void> {
    await this.directory.delete(partitionKey);
    this.directoryCache.delete(partitionKey);

    // Entity now routes via hash
    const computedShard = this.computeHashShard(partitionKey);
    console.log(`Override removed: ${partitionKey} reverts to ${computedShard}`);
  }

  /**
   * Get stats on overrides vs algorithmic routing
   */
  async getRoutingStats(): Promise<{ overrides: number; algorithmic: number }> {
    const overrideCount = await this.directory.count();
    return {
      overrides: overrideCount,
      algorithmic: -1, // Unknown without querying the shard databases
    };
  }

  // Simplified multiply-and-xor string hash
  // (not a full MurmurHash3 implementation)
  private hashKey(key: string): number {
    let hash = 0x12345678;
    for (let i = 0; i < key.length; i++) {
      hash ^= key.charCodeAt(i);
      hash = Math.imul(hash, 0x5bd1e995);
      hash ^= hash >>> 15;
    }
    return hash >>> 0; // unsigned, so already non-negative
  }
}

// Example usage (directoryStore is any DirectoryStore implementation)
declare const directoryStore: DirectoryStore;
const router = new HybridShardRouter(directoryStore, 16);

// Most tenants use hash sharding
const smallTenantShard = await router.getShard("small-tenant-123");
// Returns computed shard, e.g., "shard-7"

// Large tenant gets explicit override
await router.overrideShard("enterprise-acme", "shard-dedicated-acme");

// Now always routes to dedicated shard
const acmeShard = await router.getShard("enterprise-acme");
// Returns "shard-dedicated-acme"
```

Hybrid sharding is often the sweet spot for growing systems. Start with pure algorithmic sharding. When specific tenants need special handling, add directory overrides for just those cases. You maintain simplicity for the common case while gaining flexibility where needed.
As the number of partition keys grows into millions, the directory itself can become a scalability challenge. Let's examine strategies for scaling the directory.
Directory Scaling Challenges:

- Size: millions of rows must be stored, replicated, and snapshotted
- Read load: every cache miss on every application server hits the directory
- Write bursts: bulk migrations and rebalancing touch many rows at once
- Cache fan-out: more entries and more servers mean more invalidation traffic
Scaling Strategies:
| Entity Count | Recommended Approach | Directory Size | Cache Strategy |
|---|---|---|---|
| < 10,000 | Full directory in any DB | < 1MB | Cache everything |
| 10K - 100K | PostgreSQL/MySQL with caching | 1-10MB | LRU cache, 60s TTL |
| 100K - 1M | Hybrid (overrides only) or Redis | 10-100MB | Distributed cache |
| 1M - 10M | Group-level sharding or sharded directory | 100MB-1GB | Tiered caching |
| > 10M | Hierarchical + sharded directory | 1GB+ | Regional caching, pre-warming |
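One way to read the "group-level sharding" rows in the table above: instead of one directory entry per entity, keys are hashed into a fixed number of groups, and the directory stores only the group-to-shard map. The directory then stays small no matter how many keys exist, and migrating a group moves all of its keys at once. A sketch, with an illustrative group count of 4096 (an assumption, not from the text):

```typescript
// Group-level directory sketch: millions of keys hash into a fixed
// number of groups; only the group -> shard map is stored explicitly.

const GROUP_COUNT = 4096; // illustrative

function groupOf(partitionKey: string): number {
  let hash = 0;
  for (const ch of partitionKey) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  }
  return hash % GROUP_COUNT;
}

class GroupDirectory {
  // At most GROUP_COUNT entries, regardless of how many keys exist
  private groupToShard = new Map<number, string>();

  assignGroup(group: number, shardId: string): void {
    this.groupToShard.set(group, shardId);
  }

  getShard(partitionKey: string): string | undefined {
    return this.groupToShard.get(groupOf(partitionKey));
  }
}

const dir = new GroupDirectory();
for (let g = 0; g < GROUP_COUNT; g++) {
  dir.assignGroup(g, `shard-${g % 16}`); // spread groups over 16 shards
}
// Any key resolves through its group
console.log(dir.getShard("tenant-12345"));
```

The tradeoff is granularity: you can no longer place an individual key, only its whole group, which is why very large systems combine this with per-key overrides for exceptional tenants.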
Case Study: Slack's Enterprise Grid
Slack uses a form of directory sharding for Enterprise Grid: workspace-to-shard mappings are tracked explicitly, so most workspaces share pooled shards while large organizations that need it can be placed on dedicated infrastructure.
This hybrid approach scales to millions of workspaces while providing enterprise-grade isolation for customers who need it.
Don't over-engineer the directory initially. Start with a simple PostgreSQL table and in-memory cache. Only add complexity (Redis, sharded directory, hierarchical lookup) when you actually hit scale limits. Many successful systems run for years with a simple directory in a single database table.
We've covered directory-based sharding comprehensively. Let's consolidate the key insights:

- Directory sharding stores the key-to-shard mapping explicitly, trading an extra lookup for complete placement flexibility
- Aggressive caching (hit rates above 99%) is what makes the lookup cost acceptable in production
- The directory is a critical dependency: mitigate with replication, stale-cache fallback, circuit breakers, and snapshots
- Cache invalidation on migration needs a broadcast mechanism such as pub/sub, plus versioning so stale updates never overwrite newer ones
- Most production systems are hybrid: algorithmic sharding by default, directory overrides for the exceptions
- A rich directory schema (migration state, size estimates, tier, region) turns the directory into an operational control plane
What's Next:
Now that you understand all three sharding strategies (range, hash, directory), we'll tackle the most critical decision in any sharding implementation: shard key selection. The shard key determines query efficiency, data distribution, and migration complexity. Choosing poorly can doom your sharding implementation; choosing well makes everything else easier.
You now understand directory-based sharding: how it works, when to use it, how to make it resilient, and how to combine it with algorithmic approaches. You have the complete picture of sharding strategies. Next, we'll learn how to choose the shard key that makes your entire sharding architecture work.