Playlists are the soul of music personalization. They transform a massive catalog into personal collections that reflect individual tastes, moods, and memories. With over 4 billion user-created playlists on Spotify—more than the number of tracks in the catalog—playlist management is as important as streaming itself.
Designing a playlist system at this scale involves solving complex distributed systems problems: How do you enable millions of concurrent edits without data loss? How do you sync playlist changes across devices within seconds? How do you handle playlists with 10,000 tracks without performance degradation? How do collaborative playlists work when multiple users edit simultaneously?
This page explores the architecture that makes seamless playlist management possible.
You will understand the data models for playlists and libraries, database selection and sharding strategies, techniques for handling concurrency and conflicts, cross-device synchronization, and performance optimizations for large playlists.
A well-designed data model is foundational. It must support all playlist operations efficiently while enabling horizontal scaling across billions of playlists.
Core Entities:
```sql
-- Core playlist entity
-- Sharded by playlist_id for horizontal scaling
CREATE TABLE playlists (
    playlist_id UUID PRIMARY KEY,          -- Globally unique ID
    owner_id UUID NOT NULL,                -- User who created playlist
    name VARCHAR(256) NOT NULL,            -- Playlist name
    description TEXT,                      -- Optional description
    image_url VARCHAR(512),                -- Custom or generated cover

    -- Visibility and sharing
    is_public BOOLEAN DEFAULT false,
    is_collaborative BOOLEAN DEFAULT false,

    -- Metadata
    track_count INT DEFAULT 0,             -- Denormalized for quick access
    total_duration_ms BIGINT DEFAULT 0,    -- Denormalized total duration
    follower_count INT DEFAULT 0,          -- Denormalized followers

    -- Versioning for sync
    version BIGINT DEFAULT 1,              -- Incremented on any change
    snapshot_id VARCHAR(64),               -- Unique snapshot for API

    -- Timestamps
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW(),

    -- Indexes
    INDEX idx_owner (owner_id),
    INDEX idx_public (is_public, follower_count DESC)
);

-- Playlist tracks - the core relationship
-- Sharded by playlist_id (co-located with playlist)
CREATE TABLE playlist_tracks (
    playlist_id UUID NOT NULL,
    position INT NOT NULL,                 -- 0-indexed position in playlist
    track_uri VARCHAR(64) NOT NULL,        -- Spotify track URI

    -- Snapshot of track info at add time (for deleted tracks)
    added_by UUID,                         -- Who added this track
    added_at TIMESTAMP DEFAULT NOW(),

    -- Composite primary key
    PRIMARY KEY (playlist_id, position),

    -- For finding all positions of a track
    INDEX idx_track (playlist_id, track_uri)
);

-- User library - saved items
CREATE TABLE user_library (
    user_id UUID NOT NULL,
    item_type ENUM('track', 'album', 'artist', 'playlist', 'episode'),
    item_uri VARCHAR(64) NOT NULL,
    saved_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (user_id, item_type, item_uri),
    INDEX idx_saved_at (user_id, item_type, saved_at DESC)
);

-- Playlist followers
CREATE TABLE playlist_followers (
    playlist_id UUID NOT NULL,
    user_id UUID NOT NULL,
    followed_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (playlist_id, user_id),
    INDEX idx_user_follows (user_id, followed_at DESC)
);

-- Collaborative playlist permissions
CREATE TABLE playlist_collaborators (
    playlist_id UUID NOT NULL,
    user_id UUID NOT NULL,
    role ENUM('editor', 'viewer') DEFAULT 'editor',
    added_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (playlist_id, user_id),
    INDEX idx_user_collaborations (user_id)
);
```

Some databases support array columns, which might seem perfect for track lists. However, arrays have real limitations: no efficient insertion in the middle, awkward querying by position, and challenging atomic operations. The position-table approach is more flexible and scalable.
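The `snapshot_id` column deserves a note: it gives API clients an opaque token identifying one exact state of a playlist (the Spotify Web API exposes a similar `snapshot_id`). The schema doesn't prescribe how to generate it; a minimal sketch, assuming any collision-resistant derivation will do:

```python
import hashlib
import time
import uuid

def generate_snapshot_id(playlist_id: str, version: int) -> str:
    """Build an opaque token that changes on every mutation.
    Hashing playlist identity + version + fresh entropy keeps it
    unique and unguessable; sha256 hex is 64 chars, fitting VARCHAR(64)."""
    raw = f"{playlist_id}:{version}:{time.time_ns()}:{uuid.uuid4()}"
    return hashlib.sha256(raw.encode()).hexdigest()
```

Clients can echo the snapshot back on mutating calls, letting the server detect edits made against a stale view.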
With 4+ billion playlists and billions of track entries, no single database can handle this load. We need a horizontally scalable database with a well-designed sharding strategy.
Database Requirements:
| Database | Scalability | Consistency | Use Case Fit | Trade-offs |
|---|---|---|---|---|
| PostgreSQL + Citus | Excellent | Strong | Great | Operational complexity |
| CockroachDB | Excellent | Strong | Great | Latency for global distribution |
| Vitess (MySQL) | Excellent | Strong | Great (proven at YouTube scale) | Operational complexity of the sharding middleware |
| Cassandra | Excellent | Tunable | Good | No transactions, eventual consistency |
| DynamoDB | Excellent | Strong (per-item) | Good | Limited query flexibility |
| Spanner | Excellent | Strong | Excellent | Cost, vendor lock-in |
Sharding Strategy:
For playlist data, we shard by playlist_id. This ensures that all data for a single playlist (the playlist record and all its tracks) is co-located on the same shard, enabling single-shard transactions.
```python
import hashlib


class PlaylistShardRouter:
    """
    Routes playlist operations to appropriate database shards.
    Hashes on playlist_id for even distribution.
    """

    def __init__(self, shard_count=256):
        self.shard_count = shard_count
        self.shard_connections = self._initialize_connections()

    def get_shard(self, playlist_id: str) -> int:
        """
        Determine shard for a playlist.

        Goals:
        1. Distribute playlists evenly across shards
        2. Minimize resharding when adding/removing nodes

        NOTE: hash-mod (below) is the simple version. True consistent
        hashing uses a hash ring or rendezvous hashing so that adding
        a node moves only ~1/N of the keys instead of most of them.
        """
        # Hash the playlist_id to get a deterministic shard
        hash_value = hashlib.md5(playlist_id.encode()).hexdigest()
        numeric_hash = int(hash_value[:8], 16)
        return numeric_hash % self.shard_count

    def get_connection(self, playlist_id: str):
        """Get database connection for the appropriate shard."""
        shard_id = self.get_shard(playlist_id)
        return self.shard_connections[shard_id]

    def execute_playlist_query(self, playlist_id: str, query: str, params: tuple):
        """Execute query on the correct shard."""
        conn = self.get_connection(playlist_id)
        with conn.cursor() as cursor:
            cursor.execute(query, params)
            return cursor.fetchall()


# Sharding boundaries for user library (different shard key)
class UserLibraryShardRouter:
    """
    User library is sharded by user_id to co-locate all of a
    user's saved items on one shard.
    """

    def __init__(self, shard_count=256):
        self.shard_count = shard_count

    def get_shard(self, user_id: str) -> int:
        hash_value = hashlib.md5(user_id.encode()).hexdigest()
        return int(hash_value[:8], 16) % self.shard_count

    # Cross-shard query for "playlists where I'm a collaborator"
    def get_user_collaborations(self, user_id: str):
        """
        This requires a scatter-gather query since collaborations
        are stored on playlist shards, not user shards.

        Options:
        1. Secondary index table sharded by user_id
        2. Scatter-gather across all shards (expensive)
        3. Cache in separate system like Redis

        We use option 1 - a separate user_collaborations table
        sharded by user_id that mirrors playlist_collaborators.
        """
        pass
```

Queries like "find all playlists a user follows" span multiple shards since playlists are sharded by playlist_id, not user_id. These require either scatter-gather queries (expensive) or maintaining a secondary index table sharded by user_id. Design carefully to minimize cross-shard operations.
Every playlist operation must be implemented efficiently, atomically, and with proper versioning for sync. Let's examine the critical operations:
Add Tracks Operation:
```python
from typing import List, Optional


class PlaylistService:
    """
    Core playlist operations with transactional safety.
    """

    async def add_tracks(
        self,
        playlist_id: str,
        track_uris: List[str],
        position: Optional[int] = None,  # None = append
        user_id: str = None
    ) -> PlaylistSnapshot:
        """
        Add tracks to playlist at specified position.

        Complexity considerations:
        - Appending is O(1) - just insert at end position
        - Inserting in middle requires shifting subsequent positions
        - Spotify limits playlists to 10,000 tracks
        """
        async with self.db.transaction() as tx:
            # Get current playlist state with lock
            playlist = await tx.fetch_one(
                "SELECT * FROM playlists WHERE playlist_id = %s FOR UPDATE",
                (playlist_id,)
            )

            # Verify permissions
            if not self._can_edit(playlist, user_id):
                raise PermissionDeniedError()

            # Check track limit
            new_count = playlist['track_count'] + len(track_uris)
            if new_count > 10000:
                raise PlaylistLimitExceeded("Max 10,000 tracks allowed")

            # Determine insert position
            if position is None:
                insert_position = playlist['track_count']
            else:
                insert_position = min(position, playlist['track_count'])

            # Shift existing tracks to make room
            await tx.execute(
                """UPDATE playlist_tracks
                   SET position = position + %s
                   WHERE playlist_id = %s AND position >= %s""",
                (len(track_uris), playlist_id, insert_position)
            )

            # Insert new tracks
            for i, track_uri in enumerate(track_uris):
                await tx.execute(
                    """INSERT INTO playlist_tracks
                       (playlist_id, position, track_uri, added_by, added_at)
                       VALUES (%s, %s, %s, %s, NOW())""",
                    (playlist_id, insert_position + i, track_uri, user_id)
                )

            # Get track durations for total_duration update
            durations = await self._get_track_durations(track_uris)
            total_new_duration = sum(durations)

            # Update playlist metadata
            new_version = playlist['version'] + 1
            new_snapshot = self._generate_snapshot_id()
            await tx.execute(
                """UPDATE playlists
                   SET track_count = track_count + %s,
                       total_duration_ms = total_duration_ms + %s,
                       version = %s,
                       snapshot_id = %s,
                       updated_at = NOW()
                   WHERE playlist_id = %s""",
                (len(track_uris), total_new_duration, new_version,
                 new_snapshot, playlist_id)
            )

            # Emit change event for sync
            await self._emit_playlist_changed(playlist_id, new_version)

            return PlaylistSnapshot(
                playlist_id=playlist_id,
                version=new_version,
                snapshot_id=new_snapshot
            )

    async def reorder_tracks(
        self,
        playlist_id: str,
        range_start: int,
        range_length: int,
        insert_before: int,
        user_id: str
    ) -> PlaylistSnapshot:
        """
        Move a range of tracks to a new position.

        This is the most complex operation - requires careful
        position management to avoid gaps or overlaps.
        """
        async with self.db.transaction() as tx:
            playlist = await tx.fetch_one(
                "SELECT * FROM playlists WHERE playlist_id = %s FOR UPDATE",
                (playlist_id,)
            )

            if not self._can_edit(playlist, user_id):
                raise PermissionDeniedError()

            # Validate ranges
            if range_start < 0 or range_start + range_length > playlist['track_count']:
                raise InvalidRangeError()

            # Calculate effective insert position after removing the range
            if insert_before > range_start:
                effective_insert = insert_before - range_length
            else:
                effective_insert = insert_before

            # Step 1: Move target tracks to negative positions (temporary).
            # The k-th track of the range (k = 1..range_length) lands at -k.
            await tx.execute(
                """UPDATE playlist_tracks
                   SET position = -(position - %s + 1)
                   WHERE playlist_id = %s
                     AND position >= %s AND position < %s""",
                (range_start, playlist_id, range_start,
                 range_start + range_length)
            )

            # Step 2: Close the gap left by removed tracks
            if insert_before > range_start:
                # Shift tracks between old position and insert down
                await tx.execute(
                    """UPDATE playlist_tracks
                       SET position = position - %s
                       WHERE playlist_id = %s
                         AND position >= %s AND position < %s""",
                    (range_length, playlist_id,
                     range_start + range_length, insert_before)
                )
            else:
                # Shift tracks between insert and old position up
                await tx.execute(
                    """UPDATE playlist_tracks
                       SET position = position + %s
                       WHERE playlist_id = %s
                         AND position >= %s AND position < %s""",
                    (range_length, playlist_id, insert_before, range_start)
                )

            # Step 3: Move tracks from negative to final position.
            # position = -k here, so %s - position - 1 maps the k-th
            # track of the range to effective_insert + k - 1.
            await tx.execute(
                """UPDATE playlist_tracks
                   SET position = %s - position - 1
                   WHERE playlist_id = %s AND position < 0""",
                (effective_insert, playlist_id)
            )

            # Update version
            new_version = playlist['version'] + 1
            new_snapshot = self._generate_snapshot_id()
            await tx.execute(
                """UPDATE playlists
                   SET version = %s, snapshot_id = %s, updated_at = NOW()
                   WHERE playlist_id = %s""",
                (new_version, new_snapshot, playlist_id)
            )

            await self._emit_playlist_changed(playlist_id, new_version)

            return PlaylistSnapshot(
                playlist_id=playlist_id,
                version=new_version,
                snapshot_id=new_snapshot
            )
```

Users often perform bulk operations (add album, copy playlist). These should be implemented as single transactions with batch inserts for efficiency, not as multiple individual add_track calls.
Collaborative playlists present the ultimate concurrency challenge: multiple users editing the same playlist simultaneously from different devices. Without careful handling, edits can be lost or corrupt ordering.
Concurrency Scenarios:
| Scenario | Risk | Solution |
|---|---|---|
| Two users add same track | Duplicate entries | Allow duplicates or last-write-wins |
| Delete while reordering | Invalid positions | Optimistic locking with retry |
| Simultaneous reorders | Corrupted order | Transaction serialization |
| Add during position shift | Wrong final position | Position recalculation |
| Offline edits sync | Conflicting changes | CRDT-based merge or user resolution |
Optimistic Concurrency Control:
We use optimistic concurrency control with version checks: clients submit changes along with the version they last saw, and the server rejects the write if that version no longer matches, returning the current state so the client can reconcile.
```python
from typing import List, Optional, Union


class PlaylistServiceWithConcurrency:
    """
    Playlist operations with optimistic concurrency control.
    """

    async def add_tracks_with_version(
        self,
        playlist_id: str,
        track_uris: List[str],
        position: Optional[int],
        expected_version: int,  # Client's known version
        user_id: str
    ) -> Union[PlaylistSnapshot, ConcurrencyConflict]:
        """
        Add tracks only if playlist hasn't changed since client last fetched.
        If version mismatch, return current state for client to reconcile.
        """
        async with self.db.transaction() as tx:
            playlist = await tx.fetch_one(
                "SELECT * FROM playlists WHERE playlist_id = %s FOR UPDATE",
                (playlist_id,)
            )

            # Version check
            if playlist['version'] != expected_version:
                current_tracks = await self._get_playlist_tracks(tx, playlist_id)
                return ConcurrencyConflict(
                    expected_version=expected_version,
                    actual_version=playlist['version'],
                    current_tracks=current_tracks,
                    message="Playlist was modified. Please review changes."
                )

            # Proceed with normal add operation
            return await self._perform_add(
                tx, playlist_id, track_uris, position, user_id, playlist
            )

    async def resolve_offline_sync(
        self,
        playlist_id: str,
        client_version: int,
        client_operations: List[Operation],
        user_id: str
    ) -> SyncResult:
        """
        Reconcile offline changes with current server state.
        Uses Operational Transformation (OT) to merge operations.
        """
        async with self.db.transaction() as tx:
            playlist = await tx.fetch_one(
                "SELECT * FROM playlists WHERE playlist_id = %s FOR UPDATE",
                (playlist_id,)
            )

            if playlist['version'] == client_version:
                # No server changes, apply client ops directly
                for op in client_operations:
                    await self._apply_operation(tx, playlist_id, op, user_id)
                return SyncResult(status='applied', conflicts=[])

            # Get server operations since client's version
            server_ops = await self._get_operations_since(
                tx, playlist_id, client_version
            )

            # Transform client operations against server operations
            transformed_ops = self._operational_transform(
                client_operations, server_ops
            )

            # Apply transformed operations
            conflicts = []
            for op in transformed_ops:
                try:
                    await self._apply_operation(tx, playlist_id, op, user_id)
                except OperationConflict as e:
                    conflicts.append(e)

            return SyncResult(
                status='merged' if not conflicts else 'partial',
                conflicts=conflicts
            )

    def _operational_transform(
        self,
        client_ops: List[Operation],
        server_ops: List[Operation]
    ) -> List[Operation]:
        """
        Transform client operations to apply cleanly after server operations.

        Example:
        - Server added track at position 5
        - Client wants to add track at position 7
        - Transform: Client should now add at position 8
        """
        transformed = []
        for client_op in client_ops:
            adjusted_op = client_op.copy()
            for server_op in server_ops:
                # Compare against the already-adjusted position so that
                # successive server operations compose correctly
                if server_op.type == 'add' and adjusted_op.type == 'add':
                    if server_op.position <= adjusted_op.position:
                        adjusted_op.position += len(server_op.tracks)
                elif server_op.type == 'remove' and adjusted_op.type == 'add':
                    if server_op.position < adjusted_op.position:
                        adjusted_op.position -= 1
                # More complex transformations for reorder, etc.
            transformed.append(adjusted_op)
        return transformed
```

For highly collaborative scenarios, CRDTs (Conflict-free Replicated Data Types) can automatically merge concurrent changes without conflicts. However, CRDTs add complexity and may not preserve user intent in all cases. Most music services use simpler optimistic locking with conflict reporting.
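From the client's perspective, "optimistic locking with conflict reporting" usually boils down to a small retry loop: on a version conflict, refresh the expected version (and optionally re-check user intent) before resubmitting. A synchronous sketch with hypothetical names:

```python
class ConcurrencyConflict(Exception):
    """Raised (or returned) when the expected version is stale."""
    def __init__(self, actual_version: int):
        self.actual_version = actual_version

def add_with_retry(service, playlist_id, track_uris, known_version, max_retries=3):
    """Retry an optimistic add, refreshing the expected version on conflict."""
    version = known_version
    for _ in range(max_retries):
        try:
            return service.add_tracks_with_version(playlist_id, track_uris, version)
        except ConcurrencyConflict as conflict:
            # Someone else won the race: adopt their version and retry
            version = conflict.actual_version
    raise RuntimeError("playlist too contended; surface the conflict to the user")
```

A real client would also re-render the playlist from the conflict payload before retrying, so the user sees what changed.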
Users expect changes made on one device to appear on all their other devices almost instantly. This requires a real-time synchronization system that pushes updates without requiring polling.
Sync Architecture:
```
                 CROSS-DEVICE SYNC ARCHITECTURE

  ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐
  │  Phone  │   │ Desktop │   │   Web   │   │  Smart  │
  │   App   │   │   App   │   │ Player  │   │ Speaker │
  └────┬────┘   └────┬────┘   └────┬────┘   └────┬────┘
       └─────────────┴──────┬──────┴─────────────┘
                            │  WebSocket
                            ▼
  ┌─────────────────────────────────────────────────────────┐
  │              REAL-TIME NOTIFICATION SERVICE             │
  │  • Maintains WebSocket connections to all active clients│
  │  • Routes change notifications to affected users        │
  │  • Groups connections by user_id for efficient broadcast│
  │  Tech: Redis Pub/Sub + WebSocket clusters               │
  └───────────────────────────▲─────────────────────────────┘
                              │  Subscribe to changes
  ┌───────────────────────────┴─────────────────────────────┐
  │                    CHANGE EVENT BUS                     │
  │  • Playlist Service emits events on every change        │
  │  • Events include: playlist_id, version, change_type    │
  │  • Stored for replay (handles client reconnection)      │
  │  Tech: Kafka / Pulsar / Redis Streams                   │
  └───────────────────────────▲─────────────────────────────┘
                              │  Events
  ┌───────────────────────────┴─────────────────────────────┐
  │                    PLAYLIST SERVICE                     │
  │  All playlist mutations emit change events              │
  └─────────────────────────────────────────────────────────┘
```

Sync Protocol:
```python
import json
from datetime import datetime


class PlaylistSyncService:
    """
    Manages real-time playlist synchronization across devices.
    """

    async def handle_client_connect(self, user_id: str, connection):
        """
        When client connects, subscribe to user's relevant channels.
        """
        # Subscribe to user's personal library changes
        await self.pubsub.subscribe(f"user:{user_id}:library")

        # Subscribe to playlists user owns or follows
        playlist_ids = await self._get_user_playlist_ids(user_id)
        for playlist_id in playlist_ids:
            await self.pubsub.subscribe(f"playlist:{playlist_id}:changes")

        # Store connection for push notifications
        self.user_connections[user_id].add(connection)

    async def emit_playlist_changed(
        self,
        playlist_id: str,
        version: int,
        change_type: str,
        details: dict
    ):
        """
        Broadcast playlist change to all interested clients.
        """
        # Get all users who need this notification (used when pushing
        # directly over per-user connections rather than via channels)
        owner_id = await self._get_playlist_owner(playlist_id)
        follower_ids = await self._get_playlist_followers(playlist_id)
        collaborator_ids = await self._get_playlist_collaborators(playlist_id)
        interested_users = {owner_id} | set(follower_ids) | set(collaborator_ids)

        # Publish to channel
        message = {
            'type': 'playlist_changed',
            'playlist_id': playlist_id,
            'version': version,
            'change_type': change_type,  # 'tracks_added', 'tracks_removed', 'reordered'
            'timestamp': datetime.utcnow().isoformat(),
            **details
        }

        await self.pubsub.publish(
            f"playlist:{playlist_id}:changes",
            json.dumps(message)
        )

        # Store for replay (clients that reconnect)
        await self._store_change_event(playlist_id, version, message)

    async def handle_client_sync_request(
        self,
        user_id: str,
        playlist_id: str,
        client_version: int
    ) -> SyncResponse:
        """
        Client requests sync - provide delta if possible,
        full refresh if needed.
        """
        current_version = await self._get_playlist_version(playlist_id)

        if client_version == current_version:
            return SyncResponse(status='up_to_date')

        # Try to provide delta (changes since client's version)
        version_gap = current_version - client_version

        if version_gap <= 100:
            # Delta sync for small gaps
            changes = await self._get_changes_since(playlist_id, client_version)
            return SyncResponse(
                status='delta',
                changes=changes,
                new_version=current_version
            )
        else:
            # Full refresh for large gaps
            full_playlist = await self._get_full_playlist(playlist_id)
            return SyncResponse(
                status='full_refresh',
                playlist=full_playlist,
                new_version=current_version
            )


# Client-side sync handling
class ClientSyncManager:
    """
    Client-side sync manager that integrates with server sync service.
    """

    async def handle_sync_notification(self, notification: dict):
        """
        Handle incoming sync notification from server.
        """
        if notification['type'] == 'playlist_changed':
            playlist_id = notification['playlist_id']
            server_version = notification['version']
            local_version = self.local_cache.get_version(playlist_id)

            if server_version == local_version + 1:
                # Simple case: sequential change, apply delta
                await self._apply_delta(playlist_id, notification)
            elif server_version > local_version + 1:
                # Missed changes: request full sync
                await self._request_sync(playlist_id, local_version)
            # else: we're ahead (our change), ignore

    async def _apply_delta(self, playlist_id: str, change: dict):
        """Apply incremental change to local cache."""
        if change['change_type'] == 'tracks_added':
            self.local_cache.insert_tracks(
                playlist_id, change['position'], change['tracks']
            )
        elif change['change_type'] == 'tracks_removed':
            self.local_cache.remove_tracks(
                playlist_id, change['positions']
            )

        # Update local version
        self.local_cache.set_version(playlist_id, change['version'])
```

Spotify Connect extends this sync further: not just playlist state but playback state (current track, position, volume) syncs in real-time. Users can start playing on phone and continue on desktop seamlessly. This uses the same WebSocket infrastructure.
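As an illustration of what playback-state transfer involves, the sketch below adopts a remote device's state and compensates for the delay between when the state was captured and when it arrives. All field names here are assumptions for illustration, not Spotify Connect's actual wire format:

```python
def apply_playback_transfer(local_state: dict, message: dict) -> dict:
    """Adopt remote playback state, advancing the position by the time
    elapsed since the source device captured it (if still playing)."""
    elapsed_ms = max(0, message["server_time_ms"] - message["state_time_ms"])
    position = message["position_ms"]
    if message["is_playing"]:
        position += elapsed_ms  # keep the resumed position in step
    return {
        **local_state,
        "track_uri": message["track_uri"],
        "position_ms": position,
        "is_playing": message["is_playing"],
        "volume": message.get("volume", local_state.get("volume")),
    }
```

Clock skew between devices makes the timestamps approximate; in practice the server stamps both times so the delta is measured on one clock.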
Spotify allows playlists with up to 10,000 tracks. At this scale, naive implementations become unusable—imagine fetching or rendering 10,000 track cards. We need specialized optimizations.
Loading Strategies:
```python
from typing import List, Optional, Tuple

from intervaltree import Interval, IntervalTree  # third-party package


class PlaylistAPI:
    """
    Playlist API with efficient pagination for large playlists.
    """

    async def get_playlist_tracks(
        self,
        playlist_id: str,
        offset: int = 0,
        limit: int = 100,  # Max 100 per request
        fields: Optional[List[str]] = None
    ) -> PlaylistTracksResponse:
        """
        Get paginated tracks from playlist.

        Fields parameter allows clients to request only needed data:
        - 'items.track.id,items.track.name' for minimal
        - Full response for detail view
        """
        limit = min(limit, 100)  # Cap at 100

        # Get total count for pagination info
        playlist = await self.db.fetch_one(
            "SELECT track_count FROM playlists WHERE playlist_id = %s",
            (playlist_id,)
        )

        # Fetch requested page
        tracks = await self.db.fetch_all(
            """SELECT pt.position, pt.track_uri, pt.added_at, pt.added_by,
                      t.name, t.artist_ids, t.album_id, t.duration_ms
               FROM playlist_tracks pt
               JOIN tracks t ON pt.track_uri = t.uri
               WHERE pt.playlist_id = %s
               ORDER BY pt.position
               LIMIT %s OFFSET %s""",
            (playlist_id, limit, offset)
        )

        return PlaylistTracksResponse(
            items=self._format_tracks(tracks, fields),
            total=playlist['track_count'],
            offset=offset,
            limit=limit,
            next=self._build_next_url(playlist_id, offset, limit,
                                      playlist['track_count']),
            previous=self._build_prev_url(playlist_id, offset, limit)
        )

    async def search_playlist_tracks(
        self,
        playlist_id: str,
        query: str,
        limit: int = 20
    ) -> List[TrackSearchResult]:
        """
        Search within a playlist - server-side for large playlists.
        Uses full-text search on track name, artist name, album name.
        """
        results = await self.db.fetch_all(
            """SELECT pt.position, t.name, t.artist_names, t.album_name
               FROM playlist_tracks pt
               JOIN tracks t ON pt.track_uri = t.uri
               WHERE pt.playlist_id = %s
                 AND (
                     t.name ILIKE %s
                     OR t.artist_names ILIKE %s
                     OR t.album_name ILIKE %s
                 )
               ORDER BY pt.position
               LIMIT %s""",
            (playlist_id, f'%{query}%', f'%{query}%', f'%{query}%', limit)
        )
        return results


# Virtual scroll integration
class VirtualScrollPlaylistView:
    """
    Client-side virtual scroll implementation for large playlists.
    """

    def __init__(self, playlist_id: str, total_tracks: int):
        self.playlist_id = playlist_id
        self.total_tracks = total_tracks
        self.row_height = 64          # pixels per track row
        self.viewport_height = 800    # visible area
        self.buffer_rows = 10         # extra rows to pre-fetch
        self.loaded_ranges = IntervalTree()  # Track which ranges are loaded
        self.tracks_cache = {}        # position -> track data

    def get_visible_range(self, scroll_position: int) -> Tuple[int, int]:
        """Calculate which track positions are currently visible."""
        first_visible = scroll_position // self.row_height
        visible_count = self.viewport_height // self.row_height

        # Add buffer
        start = max(0, first_visible - self.buffer_rows)
        end = min(self.total_tracks,
                  first_visible + visible_count + self.buffer_rows)
        return (start, end)

    async def on_scroll(self, scroll_position: int):
        """Handle scroll event - fetch needed data."""
        start, end = self.get_visible_range(scroll_position)

        # Find gaps in loaded data
        needed_ranges = self._find_missing_ranges(start, end)

        for range_start, range_end in needed_ranges:
            tracks = await self.api.get_playlist_tracks(
                self.playlist_id,
                offset=range_start,
                limit=range_end - range_start
            )

            # Cache loaded tracks
            for i, track in enumerate(tracks.items):
                self.tracks_cache[range_start + i] = track

            self.loaded_ranges.add(Interval(range_start, range_end))

        # Render visible tracks
        self._render_visible(start, end)
```

Even with virtual scrolling, caching 10,000 tracks consumes memory. Implement cache eviction for tracks far from the current scroll position. Also consider storing only essential fields in memory, fetching full details on demand.
Beyond playlists, users maintain a personal library of saved content: liked songs, saved albums, followed artists, and podcasts. The library is essentially a collection of special-purpose saved items.
Library Data Model:
```python
from typing import List


class UserLibraryService:
    """
    Manages user's personal library of saved content.
    """

    # Special playlist for Liked Songs
    LIKED_SONGS_PSEUDO_PLAYLIST = "liked-songs"

    async def save_track(self, user_id: str, track_uri: str):
        """
        Save a track to user's library (heart/like action).

        This is a high-frequency operation - users like many tracks daily.
        Must be optimized for speed.
        """
        await self.db.execute(
            """INSERT INTO user_library (user_id, item_type, item_uri, saved_at)
               VALUES (%s, 'track', %s, NOW())
               ON CONFLICT (user_id, item_type, item_uri) DO NOTHING""",
            (user_id, track_uri)
        )

        # Update denormalized count
        await self._increment_saved_count(user_id, 'track')

        # Emit for sync
        await self._emit_library_changed(user_id, 'track_saved', track_uri)

    async def get_saved_tracks(
        self,
        user_id: str,
        offset: int = 0,
        limit: int = 50
    ) -> SavedTracksResponse:
        """
        Get user's saved tracks (Liked Songs).
        Ordered by save time, most recent first.
        """
        tracks = await self.db.fetch_all(
            """SELECT ul.item_uri, ul.saved_at, t.*
               FROM user_library ul
               JOIN tracks t ON ul.item_uri = t.uri
               WHERE ul.user_id = %s AND ul.item_type = 'track'
               ORDER BY ul.saved_at DESC
               LIMIT %s OFFSET %s""",
            (user_id, limit, offset)
        )

        total = await self._get_saved_count(user_id, 'track')
        return SavedTracksResponse(items=tracks, total=total, offset=offset)

    async def is_track_saved(self, user_id: str,
                             track_uris: List[str]) -> List[bool]:
        """
        Check if tracks are in user's library.
        Batched for efficiency (UI shows heart state for many tracks).
        """
        if not track_uris:
            return []

        # Batch query
        saved = await self.db.fetch_all(
            """SELECT item_uri FROM user_library
               WHERE user_id = %s AND item_type = 'track'
                 AND item_uri = ANY(%s)""",
            (user_id, track_uris)
        )

        saved_set = {row['item_uri'] for row in saved}
        return [uri in saved_set for uri in track_uris]

    async def save_album(self, user_id: str, album_uri: str):
        """Save an album to library."""
        await self.db.execute(
            """INSERT INTO user_library (user_id, item_type, item_uri, saved_at)
               VALUES (%s, 'album', %s, NOW())
               ON CONFLICT DO NOTHING""",
            (user_id, album_uri)
        )
        await self._emit_library_changed(user_id, 'album_saved', album_uri)

    async def follow_artist(self, user_id: str, artist_uri: str):
        """
        Follow an artist.

        Following affects:
        1. Artist appears in library
        2. User gets new release notifications
        3. Artist's music ranks higher in recommendations
        """
        await self.db.execute(
            """INSERT INTO user_library (user_id, item_type, item_uri, saved_at)
               VALUES (%s, 'artist', %s, NOW())
               ON CONFLICT DO NOTHING""",
            (user_id, artist_uri)
        )

        # Add to notification targets for new releases
        await self._add_artist_notification_target(user_id, artist_uri)

        # Update recommendation signals
        await self._emit_follow_signal(user_id, artist_uri)

    async def get_library_summary(self, user_id: str) -> LibrarySummary:
        """
        Get counts for library tab.
        Uses denormalized counts for speed.
        """
        counts = await self.db.fetch_one(
            """SELECT saved_tracks_count, saved_albums_count,
                      followed_artists_count, followed_playlists_count,
                      saved_podcasts_count
               FROM user_library_counts
               WHERE user_id = %s""",
            (user_id,)
        )
        return LibrarySummary(**counts)
```

The library is conceptually different from playlists. Library is about "I want to remember this exists" while playlists are about "I want to play these together in this order". Some services blur this distinction, but keeping them separate allows cleaner mental models and data structures.
We've covered the complete architecture for playlist and library management. Let's consolidate the key design decisions:
| Component | Decision | Rationale |
|---|---|---|
| Data Model | Position-based with versions | Efficient reordering, sync support |
| Sharding | By playlist_id | Co-locate playlist with its tracks |
| Concurrency | Optimistic locking + OT | Handle collaborative edits safely |
| Sync | WebSocket + version-based delta | Real-time cross-device updates |
| Large Playlists | Pagination + virtual scroll | Handle 10,000 track playlists |
| Library | Separate table, type-indexed | Fast save/unsave, batched lookups |
What's next:
With streaming and playlist architecture covered, we'll move to Recommendation System—how machine learning pipelines power personalized discovery at scale.
You now understand how to architect playlist and library management at scale: from data modeling and sharding strategies, through concurrency control and real-time synchronization, to performance optimizations for large playlists. This forms the organizational backbone of any music streaming platform.