Playlists are the soul of music personalization. They transform a massive catalog into personal collections that reflect individual tastes, moods, and memories. With over 4 billion user-created playlists on Spotify—more than the number of tracks in the catalog—playlist management is as important as streaming itself.
Designing a playlist system at this scale involves solving complex distributed systems problems: How do you enable millions of concurrent edits without data loss? How do you sync playlist changes across devices within seconds? How do you handle playlists with 10,000 tracks without performance degradation? How do collaborative playlists work when multiple users edit simultaneously?
This page explores the architecture that makes seamless playlist management possible.
You will understand the data models for playlists and libraries, database selection and sharding strategies, techniques for handling concurrency and conflicts, cross-device synchronization, and performance optimizations for large playlists.
A well-designed data model is foundational. It must support all playlist operations efficiently while enabling horizontal scaling across billions of playlists.
Core Entities:
```sql
-- Core playlist entity
-- Sharded by playlist_id for horizontal scaling
CREATE TABLE playlists (
    playlist_id UUID PRIMARY KEY,          -- Globally unique ID
    owner_id UUID NOT NULL,                -- User who created playlist
    name VARCHAR(256) NOT NULL,            -- Playlist name
    description TEXT,                      -- Optional description
    image_url VARCHAR(512),                -- Custom or generated cover

    -- Visibility and sharing
    is_public BOOLEAN DEFAULT false,
    is_collaborative BOOLEAN DEFAULT false,

    -- Metadata
    track_count INT DEFAULT 0,             -- Denormalized for quick access
    total_duration_ms BIGINT DEFAULT 0,    -- Denormalized total duration
    follower_count INT DEFAULT 0,          -- Denormalized followers

    -- Versioning for sync
    version BIGINT DEFAULT 1,              -- Incremented on any change
    snapshot_id VARCHAR(64),               -- Unique snapshot for API

    -- Timestamps
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW(),

    -- Indexes
    INDEX idx_owner (owner_id),
    INDEX idx_public (is_public, follower_count DESC)
);

-- Playlist tracks - the core relationship
-- Sharded by playlist_id (co-located with playlist)
CREATE TABLE playlist_tracks (
    playlist_id UUID NOT NULL,
    position INT NOT NULL,                 -- 0-indexed position in playlist
    track_uri VARCHAR(64) NOT NULL,        -- Spotify track URI

    -- Snapshot of track info at add time (for deleted tracks)
    added_by UUID,                         -- Who added this track
    added_at TIMESTAMP DEFAULT NOW(),

    -- Composite primary key
    PRIMARY KEY (playlist_id, position),

    -- For finding all positions of a track
    INDEX idx_track (playlist_id, track_uri)
);

-- User library - saved items
CREATE TABLE user_library (
    user_id UUID NOT NULL,
    item_type ENUM('track', 'album', 'artist', 'playlist', 'episode'),
    item_uri VARCHAR(64) NOT NULL,
    saved_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (user_id, item_type, item_uri),
    INDEX idx_saved_at (user_id, item_type, saved_at DESC)
);

-- Playlist followers
CREATE TABLE playlist_followers (
    playlist_id UUID NOT NULL,
    user_id UUID NOT NULL,
    followed_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (playlist_id, user_id),
    INDEX idx_user_follows (user_id, followed_at DESC)
);

-- Collaborative playlist permissions
CREATE TABLE playlist_collaborators (
    playlist_id UUID NOT NULL,
    user_id UUID NOT NULL,
    role ENUM('editor', 'viewer') DEFAULT 'editor',
    added_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (playlist_id, user_id),
    INDEX idx_user_collaborations (user_id)
);
```

Some databases support array columns, which might seem perfect for track lists. However, arrays have real limitations: no efficient insertion in the middle, awkward querying by position, and challenging atomic operations. The position-table approach is more flexible and scalable.
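The `snapshot_id` column deserves a note: it gives API clients an opaque token identifying one exact state of a playlist (the Spotify Web API exposes a similar `snapshot_id`). The schema doesn't prescribe how to generate it; a minimal sketch, assuming any collision-resistant derivation will do:

```python
import hashlib
import time
import uuid

def generate_snapshot_id(playlist_id: str, version: int) -> str:
    """Build an opaque token that changes on every mutation.
    Hashing playlist identity + version + fresh entropy keeps it
    unique and unguessable; sha256 hex is 64 chars, fitting VARCHAR(64)."""
    raw = f"{playlist_id}:{version}:{time.time_ns()}:{uuid.uuid4()}"
    return hashlib.sha256(raw.encode()).hexdigest()
```

Clients can echo the snapshot back on mutating calls, letting the server detect edits made against a stale view.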
With 4+ billion playlists and billions of track entries, no single database can handle this load. We need a horizontally scalable database with a well-designed sharding strategy.
Database Requirements:
| Database | Scalability | Consistency | Use Case Fit | Trade-offs |
|---|---|---|---|---|
| PostgreSQL + Citus | Excellent | Strong | Great | Operational complexity |
| CockroachDB | Excellent | Strong | Great | Latency for global distribution |
| Vitess (MySQL) | Excellent | Strong | Great (proven at YouTube scale) | Operational complexity of the sharding middleware |
| Cassandra | Excellent | Tunable | Good | No transactions, eventual consistency |
| DynamoDB | Excellent | Strong (per-item) | Good | Limited query flexibility |
| Spanner | Excellent | Strong | Excellent | Cost, vendor lock-in |
Sharding Strategy:
For playlist data, we shard by playlist_id. This ensures that all data for a single playlist (the playlist record and all its tracks) is co-located on the same shard, enabling single-shard transactions.
```python
import hashlib


class PlaylistShardRouter:
    """
    Routes playlist operations to appropriate database shards.
    Hashes on playlist_id for even distribution.
    """

    def __init__(self, shard_count=256):
        self.shard_count = shard_count
        self.shard_connections = self._initialize_connections()

    def get_shard(self, playlist_id: str) -> int:
        """
        Determine shard for a playlist.

        Goals:
        1. Distribute playlists evenly across shards
        2. Minimize resharding when adding/removing nodes

        NOTE: hash-mod (below) is the simple version. True consistent
        hashing uses a hash ring or rendezvous hashing so that adding
        a node moves only ~1/N of the keys instead of most of them.
        """
        # Hash the playlist_id to get a deterministic shard
        hash_value = hashlib.md5(playlist_id.encode()).hexdigest()
        numeric_hash = int(hash_value[:8], 16)
        return numeric_hash % self.shard_count

    def get_connection(self, playlist_id: str):
        """Get database connection for the appropriate shard."""
        shard_id = self.get_shard(playlist_id)
        return self.shard_connections[shard_id]

    def execute_playlist_query(self, playlist_id: str, query: str, params: tuple):
        """Execute query on the correct shard."""
        conn = self.get_connection(playlist_id)
        with conn.cursor() as cursor:
            cursor.execute(query, params)
            return cursor.fetchall()


# Sharding boundaries for user library (different shard key)
class UserLibraryShardRouter:
    """
    User library is sharded by user_id to co-locate all of a
    user's saved items on one shard.
    """

    def __init__(self, shard_count=256):
        self.shard_count = shard_count

    def get_shard(self, user_id: str) -> int:
        hash_value = hashlib.md5(user_id.encode()).hexdigest()
        return int(hash_value[:8], 16) % self.shard_count

    # Cross-shard query for "playlists where I'm a collaborator"
    def get_user_collaborations(self, user_id: str):
        """
        This requires a scatter-gather query since collaborations
        are stored on playlist shards, not user shards.

        Options:
        1. Secondary index table sharded by user_id
        2. Scatter-gather across all shards (expensive)
        3. Cache in separate system like Redis

        We use option 1 - a separate user_collaborations table
        sharded by user_id that mirrors playlist_collaborators.
        """
        pass
```

Queries like "find all playlists a user follows" span multiple shards since playlists are sharded by playlist_id, not user_id. These require either scatter-gather queries (expensive) or maintaining a secondary index table sharded by user_id. Design carefully to minimize cross-shard operations.
Every playlist operation must be implemented efficiently, atomically, and with proper versioning for sync. Let's examine the critical operations:
Add Tracks Operation:
```python
from typing import List, Optional


class PlaylistService:
    """
    Core playlist operations with transactional safety.
    """

    async def add_tracks(
        self,
        playlist_id: str,
        track_uris: List[str],
        position: Optional[int] = None,  # None = append
        user_id: str = None
    ) -> PlaylistSnapshot:
        """
        Add tracks to playlist at specified position.

        Complexity considerations:
        - Appending is O(1) - just insert at end position
        - Inserting in middle requires shifting subsequent positions
        - Spotify limits playlists to 10,000 tracks
        """
        async with self.db.transaction() as tx:
            # Get current playlist state with lock
            playlist = await tx.fetch_one(
                "SELECT * FROM playlists WHERE playlist_id = %s FOR UPDATE",
                (playlist_id,)
            )

            # Verify permissions
            if not self._can_edit(playlist, user_id):
                raise PermissionDeniedError()

            # Check track limit
            new_count = playlist['track_count'] + len(track_uris)
            if new_count > 10000:
                raise PlaylistLimitExceeded("Max 10,000 tracks allowed")

            # Determine insert position
            if position is None:
                insert_position = playlist['track_count']
            else:
                insert_position = min(position, playlist['track_count'])

            # Shift existing tracks to make room
            await tx.execute(
                """UPDATE playlist_tracks
                   SET position = position + %s
                   WHERE playlist_id = %s AND position >= %s""",
                (len(track_uris), playlist_id, insert_position)
            )

            # Insert new tracks
            for i, track_uri in enumerate(track_uris):
                await tx.execute(
                    """INSERT INTO playlist_tracks
                       (playlist_id, position, track_uri, added_by, added_at)
                       VALUES (%s, %s, %s, %s, NOW())""",
                    (playlist_id, insert_position + i, track_uri, user_id)
                )

            # Get track durations for total_duration update
            durations = await self._get_track_durations(track_uris)
            total_new_duration = sum(durations)

            # Update playlist metadata
            new_version = playlist['version'] + 1
            new_snapshot = self._generate_snapshot_id()
            await tx.execute(
                """UPDATE playlists
                   SET track_count = track_count + %s,
                       total_duration_ms = total_duration_ms + %s,
                       version = %s,
                       snapshot_id = %s,
                       updated_at = NOW()
                   WHERE playlist_id = %s""",
                (len(track_uris), total_new_duration, new_version,
                 new_snapshot, playlist_id)
            )

            # Emit change event for sync
            await self._emit_playlist_changed(playlist_id, new_version)

            return PlaylistSnapshot(
                playlist_id=playlist_id,
                version=new_version,
                snapshot_id=new_snapshot
            )

    async def reorder_tracks(
        self,
        playlist_id: str,
        range_start: int,
        range_length: int,
        insert_before: int,
        user_id: str
    ) -> PlaylistSnapshot:
        """
        Move a range of tracks to a new position.

        This is the most complex operation - requires careful
        position management to avoid gaps or overlaps.
        """
        async with self.db.transaction() as tx:
            playlist = await tx.fetch_one(
                "SELECT * FROM playlists WHERE playlist_id = %s FOR UPDATE",
                (playlist_id,)
            )

            if not self._can_edit(playlist, user_id):
                raise PermissionDeniedError()

            # Validate ranges
            if range_start < 0 or range_start + range_length > playlist['track_count']:
                raise InvalidRangeError()

            # Calculate effective insert position after removing the range
            if insert_before > range_start:
                effective_insert = insert_before - range_length
            else:
                effective_insert = insert_before

            # Step 1: Move target tracks to negative positions (temporary).
            # The k-th track of the range (k = 1..range_length) lands at -k.
            await tx.execute(
                """UPDATE playlist_tracks
                   SET position = -(position - %s + 1)
                   WHERE playlist_id = %s
                     AND position >= %s AND position < %s""",
                (range_start, playlist_id, range_start,
                 range_start + range_length)
            )

            # Step 2: Close the gap left by removed tracks
            if insert_before > range_start:
                # Shift tracks between old position and insert down
                await tx.execute(
                    """UPDATE playlist_tracks
                       SET position = position - %s
                       WHERE playlist_id = %s
                         AND position >= %s AND position < %s""",
                    (range_length, playlist_id,
                     range_start + range_length, insert_before)
                )
            else:
                # Shift tracks between insert and old position up
                await tx.execute(
                    """UPDATE playlist_tracks
                       SET position = position + %s
                       WHERE playlist_id = %s
                         AND position >= %s AND position < %s""",
                    (range_length, playlist_id, insert_before, range_start)
                )

            # Step 3: Move tracks from negative to final position.
            # position = -k here, so %s - position - 1 maps the k-th
            # track of the range to effective_insert + k - 1.
            await tx.execute(
                """UPDATE playlist_tracks
                   SET position = %s - position - 1
                   WHERE playlist_id = %s AND position < 0""",
                (effective_insert, playlist_id)
            )

            # Update version
            new_version = playlist['version'] + 1
            new_snapshot = self._generate_snapshot_id()
            await tx.execute(
                """UPDATE playlists
                   SET version = %s, snapshot_id = %s, updated_at = NOW()
                   WHERE playlist_id = %s""",
                (new_version, new_snapshot, playlist_id)
            )

            await self._emit_playlist_changed(playlist_id, new_version)

            return PlaylistSnapshot(
                playlist_id=playlist_id,
                version=new_version,
                snapshot_id=new_snapshot
            )
```

Users often perform bulk operations (add album, copy playlist). These should be implemented as single transactions with batch inserts for efficiency, not as multiple individual add_track calls.
Collaborative playlists present the ultimate concurrency challenge: multiple users editing the same playlist simultaneously from different devices. Without careful handling, edits can be lost or corrupt ordering.
Concurrency Scenarios:
| Scenario | Risk | Solution |
|---|---|---|
| Two users add same track | Duplicate entries | Allow duplicates or last-write-wins |
| Delete while reordering | Invalid positions | Optimistic locking with retry |
| Simultaneous reorders | Corrupted order | Transaction serialization |
| Add during position shift | Wrong final position | Position recalculation |
| Offline edits sync | Conflicting changes | CRDT-based merge or user resolution |
Optimistic Concurrency Control:
We use optimistic concurrency control with version checks: clients submit changes along with the version they last saw, and the server rejects the write if that version no longer matches, returning the current state so the client can reconcile.
```python
from typing import List, Optional, Union


class PlaylistServiceWithConcurrency:
    """
    Playlist operations with optimistic concurrency control.
    """

    async def add_tracks_with_version(
        self,
        playlist_id: str,
        track_uris: List[str],
        position: Optional[int],
        expected_version: int,  # Client's known version
        user_id: str
    ) -> Union[PlaylistSnapshot, ConcurrencyConflict]:
        """
        Add tracks only if playlist hasn't changed since client last fetched.
        If version mismatch, return current state for client to reconcile.
        """
        async with self.db.transaction() as tx:
            playlist = await tx.fetch_one(
                "SELECT * FROM playlists WHERE playlist_id = %s FOR UPDATE",
                (playlist_id,)
            )

            # Version check
            if playlist['version'] != expected_version:
                current_tracks = await self._get_playlist_tracks(tx, playlist_id)
                return ConcurrencyConflict(
                    expected_version=expected_version,
                    actual_version=playlist['version'],
                    current_tracks=current_tracks,
                    message="Playlist was modified. Please review changes."
                )

            # Proceed with normal add operation
            return await self._perform_add(
                tx, playlist_id, track_uris, position, user_id, playlist
            )

    async def resolve_offline_sync(
        self,
        playlist_id: str,
        client_version: int,
        client_operations: List[Operation],
        user_id: str
    ) -> SyncResult:
        """
        Reconcile offline changes with current server state.
        Uses Operational Transformation (OT) to merge operations.
        """
        async with self.db.transaction() as tx:
            playlist = await tx.fetch_one(
                "SELECT * FROM playlists WHERE playlist_id = %s FOR UPDATE",
                (playlist_id,)
            )

            if playlist['version'] == client_version:
                # No server changes, apply client ops directly
                for op in client_operations:
                    await self._apply_operation(tx, playlist_id, op, user_id)
                return SyncResult(status='applied', conflicts=[])

            # Get server operations since client's version
            server_ops = await self._get_operations_since(
                tx, playlist_id, client_version
            )

            # Transform client operations against server operations
            transformed_ops = self._operational_transform(
                client_operations, server_ops
            )

            # Apply transformed operations
            conflicts = []
            for op in transformed_ops:
                try:
                    await self._apply_operation(tx, playlist_id, op, user_id)
                except OperationConflict as e:
                    conflicts.append(e)

            return SyncResult(
                status='merged' if not conflicts else 'partial',
                conflicts=conflicts
            )

    def _operational_transform(
        self,
        client_ops: List[Operation],
        server_ops: List[Operation]
    ) -> List[Operation]:
        """
        Transform client operations to apply cleanly after server operations.

        Example:
        - Server added track at position 5
        - Client wants to add track at position 7
        - Transform: Client should now add at position 8
        """
        transformed = []
        for client_op in client_ops:
            adjusted_op = client_op.copy()
            for server_op in server_ops:
                # Compare against the already-adjusted position so that
                # successive server operations compose correctly
                if server_op.type == 'add' and adjusted_op.type == 'add':
                    if server_op.position <= adjusted_op.position:
                        adjusted_op.position += len(server_op.tracks)
                elif server_op.type == 'remove' and adjusted_op.type == 'add':
                    if server_op.position < adjusted_op.position:
                        adjusted_op.position -= 1
                # More complex transformations for reorder, etc.
            transformed.append(adjusted_op)
        return transformed
```

For highly collaborative scenarios, CRDTs (Conflict-free Replicated Data Types) can automatically merge concurrent changes without conflicts. However, CRDTs add complexity and may not preserve user intent in all cases. Most music services use simpler optimistic locking with conflict reporting.
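From the client's perspective, "optimistic locking with conflict reporting" usually boils down to a small retry loop: on a version conflict, refresh the expected version (and optionally re-check user intent) before resubmitting. A synchronous sketch with hypothetical names:

```python
class ConcurrencyConflict(Exception):
    """Raised (or returned) when the expected version is stale."""
    def __init__(self, actual_version: int):
        self.actual_version = actual_version

def add_with_retry(service, playlist_id, track_uris, known_version, max_retries=3):
    """Retry an optimistic add, refreshing the expected version on conflict."""
    version = known_version
    for _ in range(max_retries):
        try:
            return service.add_tracks_with_version(playlist_id, track_uris, version)
        except ConcurrencyConflict as conflict:
            # Someone else won the race: adopt their version and retry
            version = conflict.actual_version
    raise RuntimeError("playlist too contended; surface the conflict to the user")
```

A real client would also re-render the playlist from the conflict payload before retrying, so the user sees what changed.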
Users expect changes made on one device to appear on all their other devices almost instantly. This requires a real-time synchronization system that pushes updates without requiring polling.
Sync Architecture:
```
                 CROSS-DEVICE SYNC ARCHITECTURE

  ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐
  │  Phone  │   │ Desktop │   │   Web   │   │  Smart  │
  │   App   │   │   App   │   │ Player  │   │ Speaker │
  └────┬────┘   └────┬────┘   └────┬────┘   └────┬────┘
       └─────────────┴──────┬──────┴─────────────┘
                            │  WebSocket
                            ▼
  ┌─────────────────────────────────────────────────────────┐
  │              REAL-TIME NOTIFICATION SERVICE             │
  │  • Maintains WebSocket connections to all active clients│
  │  • Routes change notifications to affected users        │
  │  • Groups connections by user_id for efficient broadcast│
  │  Tech: Redis Pub/Sub + WebSocket clusters               │
  └───────────────────────────▲─────────────────────────────┘
                              │  Subscribe to changes
  ┌───────────────────────────┴─────────────────────────────┐
  │                    CHANGE EVENT BUS                     │
  │  • Playlist Service emits events on every change        │
  │  • Events include: playlist_id, version, change_type    │
  │  • Stored for replay (handles client reconnection)      │
  │  Tech: Kafka / Pulsar / Redis Streams                   │
  └───────────────────────────▲─────────────────────────────┘
                              │  Events
  ┌───────────────────────────┴─────────────────────────────┐
  │                    PLAYLIST SERVICE                     │
  │  All playlist mutations emit change events              │
  └─────────────────────────────────────────────────────────┘
```

Sync Protocol:
```python
import json
from datetime import datetime


class PlaylistSyncService:
    """
    Manages real-time playlist synchronization across devices.
    """

    async def handle_client_connect(self, user_id: str, connection):
        """
        When client connects, subscribe to user's relevant channels.
        """
        # Subscribe to user's personal library changes
        await self.pubsub.subscribe(f"user:{user_id}:library")

        # Subscribe to playlists user owns or follows
        playlist_ids = await self._get_user_playlist_ids(user_id)
        for playlist_id in playlist_ids:
            await self.pubsub.subscribe(f"playlist:{playlist_id}:changes")

        # Store connection for push notifications
        self.user_connections[user_id].add(connection)

    async def emit_playlist_changed(
        self,
        playlist_id: str,
        version: int,
        change_type: str,
        details: dict
    ):
        """
        Broadcast playlist change to all interested clients.
        """
        # Get all users who need this notification (used when pushing
        # directly over per-user connections rather than via channels)
        owner_id = await self._get_playlist_owner(playlist_id)
        follower_ids = await self._get_playlist_followers(playlist_id)
        collaborator_ids = await self._get_playlist_collaborators(playlist_id)
        interested_users = {owner_id} | set(follower_ids) | set(collaborator_ids)

        # Publish to channel
        message = {
            'type': 'playlist_changed',
            'playlist_id': playlist_id,
            'version': version,
            'change_type': change_type,  # 'tracks_added', 'tracks_removed', 'reordered'
            'timestamp': datetime.utcnow().isoformat(),
            **details
        }

        await self.pubsub.publish(
            f"playlist:{playlist_id}:changes",
            json.dumps(message)
        )

        # Store for replay (clients that reconnect)
        await self._store_change_event(playlist_id, version, message)

    async def handle_client_sync_request(
        self,
        user_id: str,
        playlist_id: str,
        client_version: int
    ) -> SyncResponse:
        """
        Client requests sync - provide delta if possible,
        full refresh if needed.
        """
        current_version = await self._get_playlist_version(playlist_id)

        if client_version == current_version:
            return SyncResponse(status='up_to_date')

        # Try to provide delta (changes since client's version)
        version_gap = current_version - client_version

        if version_gap <= 100:
            # Delta sync for small gaps
            changes = await self._get_changes_since(playlist_id, client_version)
            return SyncResponse(
                status='delta',
                changes=changes,
                new_version=current_version
            )
        else:
            # Full refresh for large gaps
            full_playlist = await self._get_full_playlist(playlist_id)
            return SyncResponse(
                status='full_refresh',
                playlist=full_playlist,
                new_version=current_version
            )


# Client-side sync handling
class ClientSyncManager:
    """
    Client-side sync manager that integrates with server sync service.
    """

    async def handle_sync_notification(self, notification: dict):
        """
        Handle incoming sync notification from server.
        """
        if notification['type'] == 'playlist_changed':
            playlist_id = notification['playlist_id']
            server_version = notification['version']
            local_version = self.local_cache.get_version(playlist_id)

            if server_version == local_version + 1:
                # Simple case: sequential change, apply delta
                await self._apply_delta(playlist_id, notification)
            elif server_version > local_version + 1:
                # Missed changes: request full sync
                await self._request_sync(playlist_id, local_version)
            # else: we're ahead (our change), ignore

    async def _apply_delta(self, playlist_id: str, change: dict):
        """Apply incremental change to local cache."""
        if change['change_type'] == 'tracks_added':
            self.local_cache.insert_tracks(
                playlist_id, change['position'], change['tracks']
            )
        elif change['change_type'] == 'tracks_removed':
            self.local_cache.remove_tracks(
                playlist_id, change['positions']
            )

        # Update local version
        self.local_cache.set_version(playlist_id, change['version'])
```

Spotify Connect extends this sync further: not just playlist state but playback state (current track, position, volume) syncs in real-time. Users can start playing on phone and continue on desktop seamlessly. This uses the same WebSocket infrastructure.
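As an illustration of what playback-state transfer involves, the sketch below adopts a remote device's state and compensates for the delay between when the state was captured and when it arrives. All field names here are assumptions for illustration, not Spotify Connect's actual wire format:

```python
def apply_playback_transfer(local_state: dict, message: dict) -> dict:
    """Adopt remote playback state, advancing the position by the time
    elapsed since the source device captured it (if still playing)."""
    elapsed_ms = max(0, message["server_time_ms"] - message["state_time_ms"])
    position = message["position_ms"]
    if message["is_playing"]:
        position += elapsed_ms  # keep the resumed position in step
    return {
        **local_state,
        "track_uri": message["track_uri"],
        "position_ms": position,
        "is_playing": message["is_playing"],
        "volume": message.get("volume", local_state.get("volume")),
    }
```

Clock skew between devices makes the timestamps approximate; in practice the server stamps both times so the delta is measured on one clock.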
Spotify allows playlists with up to 10,000 tracks. At this scale, naive implementations become unusable—imagine fetching or rendering 10,000 track cards. We need specialized optimizations.
Loading Strategies:
```python
from typing import List, Optional, Tuple

from intervaltree import Interval, IntervalTree  # third-party package


class PlaylistAPI:
    """
    Playlist API with efficient pagination for large playlists.
    """

    async def get_playlist_tracks(
        self,
        playlist_id: str,
        offset: int = 0,
        limit: int = 100,  # Max 100 per request
        fields: Optional[List[str]] = None
    ) -> PlaylistTracksResponse:
        """
        Get paginated tracks from playlist.

        Fields parameter allows clients to request only needed data:
        - 'items.track.id,items.track.name' for minimal
        - Full response for detail view
        """
        limit = min(limit, 100)  # Cap at 100

        # Get total count for pagination info
        playlist = await self.db.fetch_one(
            "SELECT track_count FROM playlists WHERE playlist_id = %s",
            (playlist_id,)
        )

        # Fetch requested page
        tracks = await self.db.fetch_all(
            """SELECT pt.position, pt.track_uri, pt.added_at, pt.added_by,
                      t.name, t.artist_ids, t.album_id, t.duration_ms
               FROM playlist_tracks pt
               JOIN tracks t ON pt.track_uri = t.uri
               WHERE pt.playlist_id = %s
               ORDER BY pt.position
               LIMIT %s OFFSET %s""",
            (playlist_id, limit, offset)
        )

        return PlaylistTracksResponse(
            items=self._format_tracks(tracks, fields),
            total=playlist['track_count'],
            offset=offset,
            limit=limit,
            next=self._build_next_url(playlist_id, offset, limit,
                                      playlist['track_count']),
            previous=self._build_prev_url(playlist_id, offset, limit)
        )

    async def search_playlist_tracks(
        self,
        playlist_id: str,
        query: str,
        limit: int = 20
    ) -> List[TrackSearchResult]:
        """
        Search within a playlist - server-side for large playlists.
        Uses full-text search on track name, artist name, album name.
        """
        results = await self.db.fetch_all(
            """SELECT pt.position, t.name, t.artist_names, t.album_name
               FROM playlist_tracks pt
               JOIN tracks t ON pt.track_uri = t.uri
               WHERE pt.playlist_id = %s
                 AND (
                     t.name ILIKE %s
                     OR t.artist_names ILIKE %s
                     OR t.album_name ILIKE %s
                 )
               ORDER BY pt.position
               LIMIT %s""",
            (playlist_id, f'%{query}%', f'%{query}%', f'%{query}%', limit)
        )
        return results


# Virtual scroll integration
class VirtualScrollPlaylistView:
    """
    Client-side virtual scroll implementation for large playlists.
    """

    def __init__(self, playlist_id: str, total_tracks: int):
        self.playlist_id = playlist_id
        self.total_tracks = total_tracks
        self.row_height = 64          # pixels per track row
        self.viewport_height = 800    # visible area
        self.buffer_rows = 10         # extra rows to pre-fetch
        self.loaded_ranges = IntervalTree()  # Track which ranges are loaded
        self.tracks_cache = {}        # position -> track data

    def get_visible_range(self, scroll_position: int) -> Tuple[int, int]:
        """Calculate which track positions are currently visible."""
        first_visible = scroll_position // self.row_height
        visible_count = self.viewport_height // self.row_height

        # Add buffer
        start = max(0, first_visible - self.buffer_rows)
        end = min(self.total_tracks,
                  first_visible + visible_count + self.buffer_rows)
        return (start, end)

    async def on_scroll(self, scroll_position: int):
        """Handle scroll event - fetch needed data."""
        start, end = self.get_visible_range(scroll_position)

        # Find gaps in loaded data
        needed_ranges = self._find_missing_ranges(start, end)

        for range_start, range_end in needed_ranges:
            tracks = await self.api.get_playlist_tracks(
                self.playlist_id,
                offset=range_start,
                limit=range_end - range_start
            )

            # Cache loaded tracks
            for i, track in enumerate(tracks.items):
                self.tracks_cache[range_start + i] = track

            self.loaded_ranges.add(Interval(range_start, range_end))

        # Render visible tracks
        self._render_visible(start, end)
```

Even with virtual scrolling, caching 10,000 tracks consumes memory. Implement cache eviction for tracks far from the current scroll position. Also consider storing only essential fields in memory, fetching full details on demand.
Beyond playlists, users maintain a personal library of saved content: liked songs, saved albums, followed artists, and podcasts. The library is essentially a collection of special-purpose saved items.
Library Data Model:
```python
from typing import List


class UserLibraryService:
    """
    Manages user's personal library of saved content.
    """

    # Special playlist for Liked Songs
    LIKED_SONGS_PSEUDO_PLAYLIST = "liked-songs"

    async def save_track(self, user_id: str, track_uri: str):
        """
        Save a track to user's library (heart/like action).

        This is a high-frequency operation - users like many tracks daily.
        Must be optimized for speed.
        """
        await self.db.execute(
            """INSERT INTO user_library (user_id, item_type, item_uri, saved_at)
               VALUES (%s, 'track', %s, NOW())
               ON CONFLICT (user_id, item_type, item_uri) DO NOTHING""",
            (user_id, track_uri)
        )

        # Update denormalized count
        await self._increment_saved_count(user_id, 'track')

        # Emit for sync
        await self._emit_library_changed(user_id, 'track_saved', track_uri)

    async def get_saved_tracks(
        self,
        user_id: str,
        offset: int = 0,
        limit: int = 50
    ) -> SavedTracksResponse:
        """
        Get user's saved tracks (Liked Songs).
        Ordered by save time, most recent first.
        """
        tracks = await self.db.fetch_all(
            """SELECT ul.item_uri, ul.saved_at, t.*
               FROM user_library ul
               JOIN tracks t ON ul.item_uri = t.uri
               WHERE ul.user_id = %s AND ul.item_type = 'track'
               ORDER BY ul.saved_at DESC
               LIMIT %s OFFSET %s""",
            (user_id, limit, offset)
        )

        total = await self._get_saved_count(user_id, 'track')
        return SavedTracksResponse(items=tracks, total=total, offset=offset)

    async def is_track_saved(self, user_id: str,
                             track_uris: List[str]) -> List[bool]:
        """
        Check if tracks are in user's library.
        Batched for efficiency (UI shows heart state for many tracks).
        """
        if not track_uris:
            return []

        # Batch query
        saved = await self.db.fetch_all(
            """SELECT item_uri FROM user_library
               WHERE user_id = %s AND item_type = 'track'
                 AND item_uri = ANY(%s)""",
            (user_id, track_uris)
        )

        saved_set = {row['item_uri'] for row in saved}
        return [uri in saved_set for uri in track_uris]

    async def save_album(self, user_id: str, album_uri: str):
        """Save an album to library."""
        await self.db.execute(
            """INSERT INTO user_library (user_id, item_type, item_uri, saved_at)
               VALUES (%s, 'album', %s, NOW())
               ON CONFLICT DO NOTHING""",
            (user_id, album_uri)
        )
        await self._emit_library_changed(user_id, 'album_saved', album_uri)

    async def follow_artist(self, user_id: str, artist_uri: str):
        """
        Follow an artist.

        Following affects:
        1. Artist appears in library
        2. User gets new release notifications
        3. Artist's music ranks higher in recommendations
        """
        await self.db.execute(
            """INSERT INTO user_library (user_id, item_type, item_uri, saved_at)
               VALUES (%s, 'artist', %s, NOW())
               ON CONFLICT DO NOTHING""",
            (user_id, artist_uri)
        )

        # Add to notification targets for new releases
        await self._add_artist_notification_target(user_id, artist_uri)

        # Update recommendation signals
        await self._emit_follow_signal(user_id, artist_uri)

    async def get_library_summary(self, user_id: str) -> LibrarySummary:
        """
        Get counts for library tab.
        Uses denormalized counts for speed.
        """
        counts = await self.db.fetch_one(
            """SELECT saved_tracks_count, saved_albums_count,
                      followed_artists_count, followed_playlists_count,
                      saved_podcasts_count
               FROM user_library_counts
               WHERE user_id = %s""",
            (user_id,)
        )
        return LibrarySummary(**counts)
```

The library is conceptually different from playlists. Library is about "I want to remember this exists" while playlists are about "I want to play these together in this order". Some services blur this distinction, but keeping them separate allows cleaner mental models and data structures.
We've covered the complete architecture for playlist and library management. Let's consolidate the key design decisions:
| Component | Decision | Rationale |
|---|---|---|
| Data Model | Position-based with versions | Efficient reordering, sync support |
| Sharding | By playlist_id | Co-locate playlist with its tracks |
| Concurrency | Optimistic locking + OT | Handle collaborative edits safely |
| Sync | WebSocket + version-based delta | Real-time cross-device updates |
| Large Playlists | Pagination + virtual scroll | Handle 10,000 track playlists |
| Library | Separate table, type-indexed | Fast save/unsave, batched lookups |
What's next:
With streaming and playlist architecture covered, we'll move to Recommendation System—how machine learning pipelines power personalized discovery at scale.
You now understand how to architect playlist and library management at scale: from data modeling and sharding strategies, through concurrency control and real-time synchronization, to performance optimizations for large playlists. This forms the organizational backbone of any music streaming platform.