Loading content...
In 1967, psychologist Stanley Milgram conducted his famous "small world experiment," concluding that any two people in the United States are connected by approximately six intermediate acquaintances—the origin of "six degrees of separation."
LinkedIn brings this concept to the digital professional world. When you view someone's profile, LinkedIn shows "2nd" or "3rd" next to their name, with a hover revealing the exact path: "You → Sarah → Michael → them." This seemingly simple feature requires solving computationally expensive problems on a graph with billions of edges in under one second.
This page explores the algorithms, data structures, and optimizations that make degree-of-separation queries possible at LinkedIn scale. We'll cover bidirectional search, landmark-based distance estimation, path ranking, and the unique challenges of professional network topology.
By the end of this page, you will understand how to compute shortest paths in massive graphs, optimize queries using landmarks and precomputation, rank multiple paths by quality, handle edge cases like disconnected users and supernodes, and design systems that return degree labels in real-time.
Let's formally define the degree-of-separation problem and understand why it's computationally challenging at scale.
Formal Definition:
Given an undirected graph G = (V, E) representing the connection network:
The Degree Query: For members u and v, find the shortest path length d(u, v), or determine that no path exists.
The Path Query: For members u and v, find one or more shortest paths P = [u, n₁, n₂, ..., v] where each consecutive pair is connected.
Complexity Analysis:
| Algorithm | Time Complexity | Space Complexity | Practical at LinkedIn Scale? |
|---|---|---|---|
| BFS (naive) | O(|V| + |E|) | O(|V|) | No - explores too many nodes |
| Bidirectional BFS | O(b^(d/2)) | O(b^(d/2)) | Yes - meets in middle |
| A* with heuristic | O(b^d) worst case | O(b^d) | If good heuristic available |
| Landmark-based | O(L) lookup + correction | O(L × |V|) precompute | Yes - for approximate |
| All-pairs shortest path | O(|V|³) | O(|V|²) | Absolutely not - infeasible |
Where:
Why This Is Hard:
Branching Factor Explosion: With 500 average connections, 1-hop = 500 nodes, 2-hop = 250,000 nodes, 3-hop = 125 million nodes. Naive BFS quickly exceeds memory.
Latency Requirement: Users expect instant feedback (<1 second) when viewing profiles.
Query Volume: LinkedIn serves billions of profile views per day, each potentially triggering a path query.
Dynamic Graph: Connections change constantly—10+ million new connections per day invalidate precomputed results.
Supernodes: Celebrities and influencers distort typical algorithms. A supernode with 100K connections explodes search space.
LinkedIn's data shows the average degrees of separation between any two random users is approximately 4.5. Most practical queries are between users within 3 degrees (connected through a single mutual connection). The 4th and 5th degree cases are less common and can tolerate slightly higher latency.
Bidirectional BFS is the core algorithm for degree-of-separation queries. By searching from both the source and target simultaneously, we reduce the explored graph exponentially.
Mathematical Intuition:
For a graph with branching factor b and a shortest path of length d:
For b=500 and d=4:
This is a reduction by a factor of 125,000!
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230
interface PathResult { degrees: number; paths: string[][]; // Multiple shortest paths searchStats: { nodesExplored: number; timeMs: number; cacheHits: number; };} class DegreeOfSeparationService { private graphStore: GraphStore; private cache: PathCache; private metrics: MetricsRecorder; // Configuration private MAX_DEGREES = 4; // Don't search beyond 4th degree private MAX_PATHS = 5; // Return up to 5 shortest paths private MAX_NEIGHBORS = 1000; // Limit neighbors for supernodes private TIMEOUT_MS = 800; // Hard timeout for user experience async findPath( viewerId: string, targetId: string ): Promise<PathResult> { const startTime = Date.now(); // Quick checks if (viewerId === targetId) { return this.selfPath(viewerId); } // Check cache for previously computed path const cached = await this.cache.getPath(viewerId, targetId); if (cached && !this.isStale(cached)) { return cached; } // Check direct connection first (1st degree) if (await this.graphStore.areConnected(viewerId, targetId)) { return this.directPath(viewerId, targetId); } // Bidirectional BFS const result = await this.bidirectionalSearch(viewerId, targetId); // Cache the result if (result.degrees > 0) { await this.cache.setPath(viewerId, targetId, result); } result.searchStats.timeMs = Date.now() - startTime; return result; } private async bidirectionalSearch( start: string, end: string ): Promise<PathResult> { // Forward search state (from viewer) const forwardVisited = new Map<string, string[]>(); // node -> parents const forwardQueue: string[] = [start]; forwardVisited.set(start, []); // Backward search state (from target) const backwardVisited = new Map<string, string[]>(); // node -> children const backwardQueue: string[] = [end]; backwardVisited.set(end, []); let depth = 0; let nodesExplored = 0; let meetingNodes: string[] = []; const startTime = Date.now(); while ( forwardQueue.length > 0 && backwardQueue.length > 0 && depth < this.MAX_DEGREES && (Date.now() - startTime) < this.TIMEOUT_MS ) { depth++; // Expand smaller frontier (optimization) if (forwardQueue.length <= backwardQueue.length) { nodesExplored += forwardQueue.length; meetingNodes = await this.expandFrontier( forwardQueue, forwardVisited, backwardVisited, true ); } else { nodesExplored += backwardQueue.length; meetingNodes = await this.expandFrontier( backwardQueue, backwardVisited, forwardVisited, false ); } if (meetingNodes.length > 0) { // Found meeting points - reconstruct paths const paths = this.reconstructPaths( start, end, meetingNodes, forwardVisited, backwardVisited ); return { degrees: depth, paths: paths.slice(0, this.MAX_PATHS), searchStats: { nodesExplored, timeMs: 0, cacheHits: 0 }, }; } } // Not connected within MAX_DEGREES return { degrees: -1, // Indicates out of network paths: [], searchStats: { nodesExplored, timeMs: 0, cacheHits: 0 }, }; } private async expandFrontier( queue: string[], myVisited: Map<string, string[]>, otherVisited: Map<string, string[]>, isForward: boolean ): Promise<string[]> { const frontierSize = queue.length; const meetingNodes: string[] = []; // Process all nodes at current depth for (let i = 0; i < frontierSize; i++) { const node = queue.shift()!; // Get neighbors with supernode protection const neighbors = await this.getNeighborsSafe(node); for (const neighbor of neighbors) { if (otherVisited.has(neighbor)) { // Found meeting point! meetingNodes.push(neighbor); // Track parent for path reconstruction const parents = myVisited.get(neighbor) || []; parents.push(node); myVisited.set(neighbor, parents); } else if (!myVisited.has(neighbor)) { myVisited.set(neighbor, [node]); queue.push(neighbor); } else { // Already visited - add additional parent for multiple paths myVisited.get(neighbor)!.push(node); } } } return meetingNodes; } // Supernode protection: limit neighbors explored private async getNeighborsSafe(node: string): Promise<string[]> { const connectionCount = await this.graphStore.getConnectionCount(node); if (connectionCount > this.MAX_NEIGHBORS) { // Supernode: sample high-quality connections instead of all return this.graphStore.getTopConnections(node, this.MAX_NEIGHBORS, { sortBy: 'interaction_recency' }); } return this.graphStore.getConnections(node); } // Reconstruct all shortest paths through meeting nodes private reconstructPaths( start: string, end: string, meetingNodes: string[], forwardVisited: Map<string, string[]>, backwardVisited: Map<string, string[]> ): string[][] { const allPaths: string[][] = []; for (const meeting of meetingNodes) { // BFS backward through forwardVisited to get all paths to start const pathsToMeeting = this.backtrackPaths(meeting, start, forwardVisited); // BFS forward through backwardVisited to get all paths to end const pathsFromMeeting = this.backtrackPaths(meeting, end, backwardVisited); // Combine: path to meeting + meeting + path from meeting for (const toPath of pathsToMeeting) { for (const fromPath of pathsFromMeeting) { // toPath is reversed (meeting -> start), needs reversal // fromPath goes meeting -> end, skip first element (meeting) const fullPath = [ ...toPath.reverse(), ...fromPath.slice(1) ]; allPaths.push(fullPath); } } } return allPaths; } private backtrackPaths( from: string, to: string, parentMap: Map<string, string[]> ): string[][] { if (from === to) return [[to]]; const paths: string[][] = []; const parents = parentMap.get(from) || []; for (const parent of parents) { const subPaths = this.backtrackPaths(parent, to, parentMap); for (const subPath of subPaths) { paths.push([from, ...subPath]); } } return paths; }}Finding all shortest paths is exponentially more expensive than finding one. In dense networks, there may be thousands of equally short paths. The algorithm must limit path enumeration to avoid combinatorial explosion. LinkedIn shows only a few paths—typically the ones through most relevant mutual connections.
Bidirectional BFS is accurate but expensive. For displaying degree labels on search results (potentially hundreds of results), we need faster approximate methods. Landmark-based estimation provides O(1) distance lookups after O(L × |V|) precomputation.
The Landmark Algorithm:
Precomputation Phase:
Query Phase: For query (u, v):
Key Insight: The triangle inequality guarantees this is an upper bound on the true distance. With well-chosen landmarks, it's often exact or very close.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166
class LandmarkDistanceEstimator { private landmarks: string[]; private distanceFromLandmark: Map<string, Map<string, number>>; private distanceToLandmark: Map<string, Map<string, number>>; // Configuration private LANDMARK_COUNT = 500; private INF = 999; // Distance if unreachable constructor(private graphStore: GraphStore) { this.landmarks = []; this.distanceFromLandmark = new Map(); this.distanceToLandmark = new Map(); } // Precomputation: run offline periodically (e.g., daily) async precompute(): Promise<void> { console.log('Starting landmark precomputation...'); // Step 1: Select landmarks strategically this.landmarks = await this.selectLandmarks(); // Step 2: BFS from each landmark to compute distances for (const landmark of this.landmarks) { const distFrom = await this.bfsFromNode(landmark); this.distanceFromLandmark.set(landmark, distFrom); // For directed graphs, also compute distances TO landmark // For undirected connection graph, this is symmetric this.distanceToLandmark.set(landmark, distFrom); } console.log(`Precomputation complete: ${this.landmarks.length} landmarks`); } // Landmark selection: choose high-degree, diverse nodes private async selectLandmarks(): Promise<string[]> { const candidates: Array<{ id: string; score: number }> = []; // Get high-degree nodes as candidates const topNodes = await this.graphStore.getTopNodesByDegree( this.LANDMARK_COUNT * 5 ); // Score candidates by degree and diversity for (const node of topNodes) { const degree = await this.graphStore.getConnectionCount(node.id); const centralityEstimate = degree; // Simple proxy candidates.push({ id: node.id, score: centralityEstimate, }); } // Greedy selection with diversity constraint const selected: string[] = []; const selectedSet = new Set<string>(); candidates.sort((a, b) => b.score - a.score); for (const candidate of candidates) { if (selected.length >= this.LANDMARK_COUNT) break; // Check that this landmark isn't too close to existing landmarks const tooClose = await this.isTooCloseToSelected( candidate.id, selectedSet ); if (!tooClose) { selected.push(candidate.id); selectedSet.add(candidate.id); } } return selected; } // O(1) distance estimation after precomputation estimateDistance(source: string, target: string): number { if (source === target) return 0; let minDistance = this.INF; for (const landmark of this.landmarks) { const distSourceToLandmark = this.distanceToLandmark.get(landmark)?.get(source); const distLandmarkToTarget = this.distanceFromLandmark.get(landmark)?.get(target); if (distSourceToLandmark !== undefined && distLandmarkToTarget !== undefined) { const viaLandmark = distSourceToLandmark + distLandmarkToTarget; minDistance = Math.min(minDistance, viaLandmark); } } return minDistance; } // Lower bound using triangle inequality estimateDistanceLowerBound(source: string, target: string): number { let maxLowerBound = 0; for (const landmark of this.landmarks) { const distSourceToLandmark = this.distanceToLandmark.get(landmark)?.get(source); const distTargetToLandmark = this.distanceToLandmark.get(landmark)?.get(target); if (distSourceToLandmark !== undefined && distTargetToLandmark !== undefined) { // |d(s,t) - d(l,t)| <= d(s,l) via triangle inequality const lowerBound = Math.abs(distSourceToLandmark - distTargetToLandmark); maxLowerBound = Math.max(maxLowerBound, lowerBound); } } return maxLowerBound; } // Combined estimate with bounds getDistanceEstimate(source: string, target: string): DistanceEstimate { const upperBound = this.estimateDistance(source, target); const lowerBound = this.estimateDistanceLowerBound(source, target); return { estimate: Math.ceil((upperBound + lowerBound) / 2), lowerBound, upperBound, confidence: upperBound === lowerBound ? 1.0 : 1.0 - (upperBound - lowerBound) / upperBound, }; } // BFS for precomputation private async bfsFromNode(start: string): Promise<Map<string, number>> { const distances = new Map<string, number>(); distances.set(start, 0); const queue: string[] = [start]; const maxDepth = 6; // Limit search depth let depth = 0; while (queue.length > 0 && depth < maxDepth) { const levelSize = queue.length; depth++; for (let i = 0; i < levelSize; i++) { const node = queue.shift()!; const neighbors = await this.graphStore.getConnections(node); for (const neighbor of neighbors) { if (!distances.has(neighbor)) { distances.set(neighbor, depth); queue.push(neighbor); } } } } return distances; }} interface DistanceEstimate { estimate: number; lowerBound: number; upperBound: number; confidence: number; // 1.0 = exact, lower = less confident}| Aspect | More Landmarks | Fewer Landmarks |
|---|---|---|
| Accuracy | Higher (tighter bounds) | Lower (looser bounds) |
| Precompute Time | O(L × E) - Higher | Lower |
| Storage | O(L × V) - Higher | Lower |
| Query Time | O(L) - Slightly Higher | Faster |
| Update Cost | Higher (more landmarks affected) | Lower |
For search results showing '2nd' vs '3rd' degree labels, off-by-one errors are acceptable. Users care about 'Is this person in my network?' more than the exact path. Landmark estimation handles 99% of cases correctly with O(1) lookups, reserving expensive exact computation for profile views where the actual path is shown.
When multiple shortest paths exist, which one should LinkedIn display? The path "You → CEO → Target" is more valuable than "You → Acquaintance from 10 years ago → Target." Path ranking considers relationship quality, not just topological distance.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192
interface RankedPath { path: string[]; degrees: number; qualityScore: number; explanation: PathExplanation;} interface PathExplanation { intermediates: IntermediateNode[]; context: string; // "Through 2 colleagues at Google"} interface IntermediateNode { memberId: string; name: string; headline: string; relationshipToViewer: RelationshipContext; relationshipToTarget: RelationshipContext;} interface RelationshipContext { type: 'colleague' | 'classmate' | 'connection' | 'unknown'; sharedEntity?: string; // Company or school name connectionStrength: number;} class PathRanker { private memberService: MemberService; private interactionStore: InteractionStore; async rankPaths( viewerId: string, targetId: string, paths: string[][] ): Promise<RankedPath[]> { const rankedPaths: RankedPath[] = []; for (const path of paths) { const score = await this.scorePath(viewerId, targetId, path); const explanation = await this.explainPath(viewerId, targetId, path); rankedPaths.push({ path, degrees: path.length - 1, qualityScore: score, explanation, }); } // Sort by quality (higher is better) rankedPaths.sort((a, b) => b.qualityScore - a.qualityScore); return rankedPaths; } private async scorePath( viewerId: string, targetId: string, path: string[] ): Promise<number> { let totalScore = 0; const edgeCount = path.length - 1; for (let i = 0; i < edgeCount; i++) { const from = path[i]; const to = path[i + 1]; const edgeScore = await this.scoreEdge(from, to); totalScore += edgeScore; } // Normalize by path length // Bonus for shorter paths const lengthBonus = Math.pow(0.9, edgeCount - 1); return (totalScore / edgeCount) * lengthBonus; } private async scoreEdge(from: string, to: string): Promise<number> { const scores = await Promise.all([ this.scoreRecency(from, to), this.scoreInteraction(from, to), this.scoreContext(from, to), this.scoreActivity(to), // Intermediate node should be active ]); // Weighted combination const weights = [0.2, 0.3, 0.3, 0.2]; return scores.reduce((sum, score, i) => sum + score * weights[i], 0); } private async scoreRecency(from: string, to: string): Promise<number> { const connection = await this.getConnectionMetadata(from, to); if (!connection) return 0; const ageInDays = (Date.now() - connection.connectedAt.getTime()) / (1000 * 60 * 60 * 24); // Decay: connections older than 5 years score lower if (ageInDays < 365) return 1.0; if (ageInDays < 365 * 2) return 0.9; if (ageInDays < 365 * 5) return 0.7; return 0.5; } private async scoreInteraction(from: string, to: string): Promise<number> { const interactions = await this.interactionStore.getInteractions(from, to); if (!interactions) return 0.3; // No data, assume weak // Recent messages are strong signal if (interactions.lastMessage && this.isRecent(interactions.lastMessage, 30)) { return 1.0; } // Recent profile view if (interactions.lastProfileView && this.isRecent(interactions.lastProfileView, 90)) { return 0.8; } // Any interaction in past year if (interactions.lastAnyInteraction && this.isRecent(interactions.lastAnyInteraction, 365)) { return 0.6; } return 0.3; } private async scoreContext(from: string, to: string): Promise<number> { const sharedContext = await this.getSharedContext(from, to); if (sharedContext.currentColleagues) return 1.0; // Work together now if (sharedContext.pastColleagues) return 0.8; // Worked together before if (sharedContext.classmates) return 0.7; // Same school if (sharedContext.sameIndustry) return 0.5; // Same industry return 0.3; // No shared context } private async scoreActivity(memberId: string): Promise<number> { const member = await this.memberService.getMember(memberId); if (!member.lastActive) return 0.3; const daysSinceActive = (Date.now() - member.lastActive.getTime()) / (1000 * 60 * 60 * 24); if (daysSinceActive < 1) return 1.0; // Active today if (daysSinceActive < 7) return 0.9; // Active this week if (daysSinceActive < 30) return 0.7; // Active this month if (daysSinceActive < 365) return 0.4; // Active this year return 0.1; // Dormant account } private async explainPath( viewerId: string, targetId: string, path: string[] ): Promise<PathExplanation> { const intermediates: IntermediateNode[] = []; for (let i = 1; i < path.length - 1; i++) { const memberId = path[i]; const member = await this.memberService.getMember(memberId); intermediates.push({ memberId, name: `${member.firstName} ${member.lastName}`, headline: member.headline, relationshipToViewer: await this.getRelationship(viewerId, memberId), relationshipToTarget: await this.getRelationship(memberId, targetId), }); } const context = this.generateContextString(intermediates); return { intermediates, context }; } private generateContextString(intermediates: IntermediateNode[]): string { if (intermediates.length === 1) { const int = intermediates[0]; if (int.relationshipToViewer.type === 'colleague') { return `Through ${int.name}, your colleague at ${int.relationshipToViewer.sharedEntity}`; } return `Through ${int.name}`; } // Multiple intermediates const names = intermediates.map(i => i.name); return `Through ${names.join(' and ')}`; }}Production degree-of-separation systems must handle numerous edge cases gracefully. Let's examine the common challenges and solutions.
| Edge Case | Challenge | Solution |
|---|---|---|
| Disconnected Users | No path exists between users | Return 'Out of network' after bounded search |
| Supernodes | 100K+ connections explode search | Limit neighbor expansion, prioritize by quality |
| Private Profiles | Can't traverse through private users | Exclude from intermediate path nodes |
| Blocked Users | Must hide paths through blocked | Filter blocked from results, respect blocking |
| Deleted Accounts | Stale data shows dead paths | Validate path before display, cache invalidation |
| Self-Loops | User viewing own profile | Short-circuit return 'You' |
| Very Distant Users | 4+ degrees take too long | Show 'Out of your immediate network' |
| Celebrity Accounts | Millions want path to celebrity | Precompute/cache paths to popular targets |
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144
class RobustPathFinder { private pathService: DegreeOfSeparationService; private memberService: MemberService; private privacyService: PrivacyService; private blockService: BlockService; async findPathSafe( viewerId: string, targetId: string ): Promise<SafePathResult> { // Edge case 1: Self-viewing if (viewerId === targetId) { return { type: 'self', degrees: 0, path: null }; } // Edge case 2: Check if target is blocked or viewer is blocked by target const blockStatus = await this.blockService.checkBlockStatus( viewerId, targetId ); if (blockStatus.isBlocked) { // Don't reveal that they're blocked - show as "out of network" return { type: 'out_of_network', degrees: -1, path: null }; } // Edge case 3: Check if target profile is visible const canView = await this.privacyService.canViewProfile(viewerId, targetId); if (!canView) { return { type: 'private', degrees: -1, path: null }; } // Edge case 4: Check if target is a deleted/suspended account const targetMember = await this.memberService.getMember(targetId); if (!targetMember || targetMember.status !== 'active') { return { type: 'unavailable', degrees: -1, path: null }; } // Perform the actual path finding try { const result = await this.pathService.findPath(viewerId, targetId); if (result.degrees === -1) { return { type: 'out_of_network', degrees: -1, path: null }; } // Edge case 5: Validate path doesn't contain blocked/private users const validatedPath = await this.validatePath(viewerId, result.paths[0]); if (!validatedPath) { // Path contains invalid node, find alternative return this.findAlternativePath(viewerId, targetId, result.paths); } return { type: 'connected', degrees: result.degrees, path: validatedPath, }; } catch (error) { if (error instanceof TimeoutError) { // Edge case 6: Query took too long return { type: 'timeout', degrees: -1, path: null }; } throw error; } } // Validate each intermediate node in path private async validatePath( viewerId: string, path: string[] ): Promise<string[] | null> { if (!path || path.length < 2) return null; for (let i = 1; i < path.length - 1; i++) { const intermediateId = path[i]; // Check intermediate is active const member = await this.memberService.getMember(intermediateId); if (!member || member.status !== 'active') { return null; // Dead node in path } // Check viewer can "see through" this intermediate // (their connections are visible) const connectionsVisible = await this.privacyService.areConnectionsVisible( intermediateId, viewerId ); if (!connectionsVisible) { return null; // Privacy prevents this path } // Check no blocks const blocked = await this.blockService.isBlocked(intermediateId, viewerId); if (blocked) { return null; // Intermediate blocked viewer } } return path; } // Try alternative paths when primary is invalid private async findAlternativePath( viewerId: string, targetId: string, paths: string[][] ): Promise<SafePathResult> { for (const path of paths.slice(1)) { // Skip first (already validated as bad) const validated = await this.validatePath(viewerId, path); if (validated) { return { type: 'connected', degrees: validated.length - 1, path: validated, }; } } // All cached paths invalid, recompute excluding known bad nodes const badNodes = new Set( paths.flat().filter(async (id) => !await this.validatePath(viewerId, [viewerId, id, targetId])) ); const freshResult = await this.pathService.findPath(viewerId, targetId, { excludeNodes: badNodes, }); if (freshResult.degrees > 0) { return { type: 'connected', degrees: freshResult.degrees, path: freshResult.paths[0], }; } return { type: 'out_of_network', degrees: -1, path: null }; }} type SafePathResult = { type: 'self' | 'connected' | 'out_of_network' | 'private' | 'unavailable' | 'timeout'; degrees: number; path: string[] | null;};Given the computational cost of path queries and the high volume of requests, intelligent caching and precomputation are essential.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139
class PathCacheService { private redis: RedisCluster; private popularTargets: Set<string>; // TTL configuration based on data stability private TTL = { pathWithDegree: 3600, // 1 hour - includes full path degreeOnly: 7200, // 2 hours - just the number outOfNetwork: 86400, // 24 hours - unlikely to change popularTarget: 1800, // 30 min - high traffic, fresher }; // Get cached path result async getCachedPath( viewerId: string, targetId: string ): Promise<CachedPathResult | null> { // Try full path cache first const pathKey = this.pathCacheKey(viewerId, targetId); const cached = await this.redis.get(pathKey); if (cached) { return JSON.parse(cached); } // Fall back to degree-only cache const degreeKey = this.degreeCacheKey(viewerId, targetId); const degree = await this.redis.get(degreeKey); if (degree !== null) { return { degrees: parseInt(degree), path: null, // Path not cached, just degree fromCache: 'degree_only', }; } return null; } // Cache a computed path result async cachePath( viewerId: string, targetId: string, result: PathResult ): Promise<void> { const isPopular = this.popularTargets.has(targetId); if (result.degrees === -1) { // Out of network - cache longer await this.cacheOutOfNetwork(viewerId, targetId); return; } // Cache full path const pathKey = this.pathCacheKey(viewerId, targetId); const ttl = isPopular ? this.TTL.popularTarget : this.TTL.pathWithDegree; await this.redis.setex(pathKey, ttl, JSON.stringify({ degrees: result.degrees, path: result.paths[0], rankedPaths: result.paths.slice(1, 5), // Store alternatives cachedAt: Date.now(), })); // Also cache degree-only (lasts longer, used for fallback) const degreeKey = this.degreeCacheKey(viewerId, targetId); await this.redis.setex(degreeKey, this.TTL.degreeOnly, result.degrees.toString()); } private async cacheOutOfNetwork( viewerId: string, targetId: string ): Promise<void> { const key = this.degreeCacheKey(viewerId, targetId); await this.redis.setex(key, this.TTL.outOfNetwork, '-1'); } // Cache invalidation on connection changes async invalidateOnConnection( member1: string, member2: string ): Promise<void> { // When two people connect, paths involving them change // We can't enumerate all affected pairs, so we: // 1. Invalidate direct path between them await this.invalidatePair(member1, member2); // 2. Invalidate cached "out of network" for their connections // (They might now have paths to previously unreachable users) // This is expensive - we do it lazily via TTL // 3. If either is popular, trigger background recomputation if (this.popularTargets.has(member1)) { this.queuePrecomputation(member1); } if (this.popularTargets.has(member2)) { this.queuePrecomputation(member2); } } private pathCacheKey(viewer: string, target: string): string { return `path:${viewer}:${target}`; } private degreeCacheKey(viewer: string, target: string): string { // Symmetric for undirected graph const sorted = [viewer, target].sort(); return `degree:${sorted[0]}:${sorted[1]}`; } // Precompute paths to popular targets (celebrities, executives) async precomputePopularPaths(): Promise<void> { // Get sample of active users const activeUsers = await this.memberService.getActiveUserSample(10000); for (const target of this.popularTargets) { for (const viewer of activeUsers) { if (viewer !== target) { // Use low-priority queue to not impact live traffic this.precomputeQueue.addLowPriority({ viewer, target, priority: 'low', }); } } } }} interface CachedPathResult { degrees: number; path: string[] | null; rankedPaths?: string[][]; fromCache: 'full' | 'degree_only'; cachedAt?: number;}When a new connection is created, paths through those users may change. However, invalidating all affected cached paths is infeasible—one connection could affect millions of cached pairs. The solution is TTL-based expiration plus targeted invalidation for high-value cases (popular targets, direct paths).
We've explored the complete lifecycle of degree-of-separation queries at LinkedIn scale. Let's consolidate the key learnings:
You now understand how to compute and display degrees of separation efficiently. In the next page, we'll explore recommendation algorithms—how 'People You May Know' identifies valuable potential connections from billions of possibilities.