If item representations answer "What is this item?", then user profiles answer the equally critical question: "What does this user want?"
A user profile is a mathematical summary of a user's preferences, constructed from their interaction history. In content-based recommendation, user profiles are expressed in the same feature space as items—enabling direct comparison between what users like and what items offer.
The challenge is profound: users don't explicitly describe their preferences. We must infer them from noisy, sparse signals—a handful of clicks, purchases, or ratings scattered across a catalog of millions. From this incomplete picture, we must construct a profile that accurately predicts preferences for items the user has never seen.
By the end of this page, you will understand how to construct user profiles from interaction history, aggregate preferences across multiple items, handle temporal dynamics and evolving tastes, and balance exploitation of known preferences with exploration of new interests.
A user profile in content-based filtering is a vector representation that captures the user's preferences in the item feature space.
Formal Definition:
Given:
- A set of users $U$ and a set of items $I$
- An item representation $\phi: I \rightarrow \mathbb{R}^d$ mapping each item to a feature vector
- For each user $u \in U$, an interaction history $H_u \subseteq I$
The user profile is a function: $$\psi: U \rightarrow \mathbb{R}^d$$
Where $\psi(u)$ is a $d$-dimensional vector representing user $u$'s preferences.
The Recommendation Score:
With both user profiles and item representations in the same space, recommendation becomes a similarity computation:
$$\text{score}(u, i) = \text{sim}(\psi(u), \phi(i))$$
Common similarity functions include cosine similarity, the dot product, and negative Euclidean distance.

Building a useful profile involves several design decisions:
| Aspect | Question | Impact |
|---|---|---|
| Aggregation | How to combine multiple item interactions? | Affects profile accuracy and bias |
| Weighting | Are all interactions equally important? | Recency, engagement depth matter |
| Temporal Dynamics | How to handle evolving preferences? | Old vs new tastes balance |
| Multi-Interest | Does user have multiple taste clusters? | Single vs multi-vector profiles |
| Sparsity | How to profile users with few interactions? | Cold-start handling |
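To make the scoring step concrete, here is a small sketch in a made-up 3-dimensional feature space, comparing a user profile against two hypothetical items with cosine similarity:

```python
import numpy as np

def cosine_sim(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity: direction match, ignoring magnitude."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy 3-d feature space: [action, romance, documentary]
profile = np.array([0.8, 0.1, 0.4])   # psi(u): leans toward action
item_a = np.array([0.9, 0.0, 0.3])    # phi(a): action-heavy item
item_b = np.array([0.1, 0.9, 0.0])    # phi(b): romance-heavy item

score_a = cosine_sim(profile, item_a)
score_b = cosine_sim(profile, item_b)
assert score_a > score_b  # the profile aligns with the action item
```

Because both vectors live in the same feature space, the comparison needs no learned model: scoring is a single dot product once vectors are normalized.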
Simple Averaging:
The most straightforward approach—average the representations of items the user has interacted with:
$$\psi(u) = \frac{1}{|H_u|} \sum_{i \in H_u} \phi(i)$$
Pros: Simple, interpretable, computationally cheap.
Cons: Ignores interaction strength, recency, and negative signals.
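A quick sketch of simple averaging, using a hypothetical four-item history in a 3-dimensional feature space:

```python
import numpy as np

# Hypothetical history H_u: two action-ish items, two documentary-ish items
history_embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.0, 1.0],
    [0.2, 0.0, 0.8],
])

# psi(u) = (1/|H_u|) * sum over i in H_u of phi(i)
profile = history_embeddings.mean(axis=0)
assert np.allclose(profile, [0.5, 0.05, 0.45])
```

Note how the average sits between the two taste clusters, which previews the multi-interest problem discussed later on this page.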
Weighted Averaging:
Weight items by interaction strength (ratings, time spent, purchase amount):
$$\psi(u) = \frac{\sum_{i \in H_u} w_{ui} \cdot \phi(i)}{\sum_{i \in H_u} w_{ui}}$$
Weighting strategies:
- Rating-based: weight by explicit rating
- Recency: exponential decay with interaction age
- Engagement: weight by time spent or depth of interaction
TF-IDF Inspired:
Weight by item popularity (rare items are more informative):
$$w_{ui} = r_{ui} \cdot \log\frac{|U|}{|\{u' : i \in H_{u'}\}|}$$
Items consumed by everyone reveal little about individual preference.
```python
import numpy as np
from typing import List, Optional
from dataclasses import dataclass
from enum import Enum


class WeightingStrategy(Enum):
    UNIFORM = "uniform"
    RATING = "rating"
    RECENCY = "recency"
    ENGAGEMENT = "engagement"
    TFIDF = "tfidf"


@dataclass
class Interaction:
    item_id: int
    rating: float
    timestamp: float
    engagement_time: float = 0.0


class UserProfileBuilder:
    """
    Builds user profiles from interaction history and item embeddings.
    """

    def __init__(
        self,
        item_embeddings: np.ndarray,
        weighting: WeightingStrategy = WeightingStrategy.RATING,
        recency_decay: float = 0.01,
        normalize: bool = True
    ):
        self.item_embeddings = item_embeddings
        self.weighting = weighting
        self.recency_decay = recency_decay
        self.normalize = normalize
        self.item_popularity = None

    def set_item_popularity(self, interaction_counts: np.ndarray):
        """Set item popularity for TF-IDF weighting.

        Uses the total interaction count as a proxy for |U| in the IDF term.
        """
        total = interaction_counts.sum()
        self.item_popularity = np.log(total / (interaction_counts + 1))

    def _compute_weights(
        self,
        interactions: List[Interaction],
        current_time: Optional[float] = None
    ) -> np.ndarray:
        """Compute interaction weights based on strategy."""
        n = len(interactions)
        weights = np.ones(n)

        if self.weighting == WeightingStrategy.RATING:
            ratings = np.array([i.rating for i in interactions])
            mean_rating = ratings.mean()
            weights = ratings - mean_rating + 1  # Shift to positive

        elif self.weighting == WeightingStrategy.RECENCY:
            if current_time is None:
                current_time = max(i.timestamp for i in interactions)
            for idx, interaction in enumerate(interactions):
                age = current_time - interaction.timestamp
                weights[idx] = np.exp(-self.recency_decay * age)

        elif self.weighting == WeightingStrategy.ENGAGEMENT:
            weights = np.array([
                np.log1p(i.engagement_time) for i in interactions
            ])

        elif self.weighting == WeightingStrategy.TFIDF:
            if self.item_popularity is None:
                raise ValueError("Item popularity required for TF-IDF")
            ratings = np.array([i.rating for i in interactions])
            idf = np.array([
                self.item_popularity[i.item_id] for i in interactions
            ])
            weights = ratings * idf

        # Ensure positive weights
        weights = np.maximum(weights, 0.01)
        return weights

    def build_profile(
        self,
        interactions: List[Interaction],
        current_time: Optional[float] = None
    ) -> np.ndarray:
        """
        Build user profile from interaction history.

        Returns:
            User profile vector (same dimensionality as item embeddings)
        """
        if not interactions:
            return np.zeros(self.item_embeddings.shape[1])

        weights = self._compute_weights(interactions, current_time)

        # Gather item embeddings
        item_ids = [i.item_id for i in interactions]
        embeddings = self.item_embeddings[item_ids]

        # Weighted average
        profile = np.average(embeddings, axis=0, weights=weights)

        if self.normalize:
            norm = np.linalg.norm(profile)
            if norm > 0:
                profile = profile / norm

        return profile

    def compute_scores(
        self,
        user_profile: np.ndarray,
        candidate_items: Optional[List[int]] = None
    ) -> np.ndarray:
        """Compute recommendation scores for items."""
        if candidate_items is None:
            candidates = self.item_embeddings
        else:
            candidates = self.item_embeddings[candidate_items]

        # Cosine similarity: the profile is normalized in build_profile,
        # so candidates must be normalized here as well
        norms = np.linalg.norm(candidates, axis=1, keepdims=True)
        norms[norms == 0] = 1.0
        return (candidates / norms) @ user_profile
```

User preferences are not static—they evolve over time. A user's music taste at 20 differs from their taste at 40. Seasonal patterns emerge: holiday movies in December, fitness content in January.
Types of Temporal Effects:
1. Long-term Drift: Gradual evolution of preferences (e.g., maturing taste in wine)
2. Short-term Context: Recent items influence immediate preferences
3. Periodic Patterns: Recurring preferences based on time cycles
4. Life Events: Discrete changes (new baby → parenting content)
```python
import numpy as np
from collections import deque


class TemporalUserProfile:
    """
    User profile with temporal decay and session awareness.

    Maintains both long-term and short-term preference signals.
    """

    def __init__(
        self,
        embedding_dim: int,
        long_term_decay: float = 0.001,
        short_term_window: int = 10,
        long_short_ratio: float = 0.7
    ):
        self.embedding_dim = embedding_dim
        self.long_term_decay = long_term_decay
        self.short_term_window = short_term_window
        self.long_short_ratio = long_short_ratio

        # Long-term profile (exponential moving average)
        self.long_term_profile = np.zeros(embedding_dim)
        self.long_term_weight = 0.0

        # Short-term profile (recent window)
        self.recent_embeddings = deque(maxlen=short_term_window)

        self.last_update_time = 0.0

    def update(
        self,
        item_embedding: np.ndarray,
        timestamp: float,
        interaction_weight: float = 1.0
    ):
        """Update profile with new interaction."""
        # Apply decay to long-term profile
        if self.long_term_weight > 0:
            time_delta = timestamp - self.last_update_time
            decay = np.exp(-self.long_term_decay * time_delta)
            self.long_term_profile *= decay
            self.long_term_weight *= decay

        # Add new item to long-term
        self.long_term_profile += interaction_weight * item_embedding
        self.long_term_weight += interaction_weight

        # Add to short-term window
        self.recent_embeddings.append(
            (item_embedding, interaction_weight)
        )

        self.last_update_time = timestamp

    def get_profile(self) -> np.ndarray:
        """Get combined long-term and short-term profile."""
        # Long-term component
        if self.long_term_weight > 0:
            lt_profile = self.long_term_profile / self.long_term_weight
        else:
            lt_profile = np.zeros(self.embedding_dim)

        # Short-term component
        if self.recent_embeddings:
            embeddings = [e for e, w in self.recent_embeddings]
            weights = [w for e, w in self.recent_embeddings]
            st_profile = np.average(embeddings, axis=0, weights=weights)
        else:
            st_profile = np.zeros(self.embedding_dim)

        # Combine with ratio
        alpha = self.long_short_ratio
        combined = alpha * lt_profile + (1 - alpha) * st_profile

        # Normalize
        norm = np.linalg.norm(combined)
        return combined / norm if norm > 0 else combined
```

Many systems maintain both: a persistent long-term profile reflecting overall taste, and a session profile capturing current context. The final profile blends both—using session signals for immediate relevance while anchoring to long-term preferences.
A single vector often fails to capture the diversity of user interests. Someone who enjoys both classical music and heavy metal has preferences that a single averaged vector poorly represents.
The Averaging Problem:
If a user loves items at opposite ends of a dimension, the averaged profile lands in the middle, pointing at items the user actually has no interest in.
Multi-Interest Solutions:
1. Multiple Profile Vectors: Maintain $K$ separate interest vectors per user: $$\Psi(u) = \{\psi_1(u), \psi_2(u), \ldots, \psi_K(u)\}$$
Score via maximum: $\text{score}(u, i) = \max_k \text{sim}(\psi_k(u), \phi(i))$
2. Clustering-Based: Cluster the items in the user's history and build one profile vector per cluster, weighted by cluster size.
3. Attention-Based: Learn attention weights over history items conditioned on the candidate item, so the relevant interest dominates the score.
```python
import numpy as np
from sklearn.cluster import KMeans
from typing import List


class MultiInterestProfile:
    """
    Represents user with multiple interest clusters.
    """

    def __init__(
        self,
        max_interests: int = 5,
        min_items_per_interest: int = 3
    ):
        self.max_interests = max_interests
        self.min_items_per_interest = min_items_per_interest
        self.interest_vectors: List[np.ndarray] = []
        self.interest_weights: List[float] = []

    def _single_profile(
        self,
        item_embeddings: np.ndarray,
        interaction_weights: np.ndarray
    ):
        """Fall back to one averaged profile vector."""
        profile = np.average(
            item_embeddings, axis=0, weights=interaction_weights
        )
        self.interest_vectors = [profile / np.linalg.norm(profile)]
        self.interest_weights = [1.0]

    def build_from_history(
        self,
        item_embeddings: np.ndarray,
        interaction_weights: np.ndarray
    ):
        """Build multi-interest profile from item history."""
        n_items = len(item_embeddings)

        if n_items < self.min_items_per_interest:
            # Too few items - use single profile
            self._single_profile(item_embeddings, interaction_weights)
            return

        # Determine number of clusters
        n_clusters = min(
            self.max_interests,
            n_items // self.min_items_per_interest
        )

        # Cluster items
        kmeans = KMeans(n_clusters=n_clusters, random_state=42)
        clusters = kmeans.fit_predict(item_embeddings)

        # Build profile per cluster
        self.interest_vectors = []
        self.interest_weights = []

        for c in range(n_clusters):
            mask = clusters == c
            if mask.sum() < self.min_items_per_interest:
                continue

            cluster_embeddings = item_embeddings[mask]
            cluster_weights = interaction_weights[mask]

            profile = np.average(
                cluster_embeddings, axis=0, weights=cluster_weights
            )
            profile = profile / np.linalg.norm(profile)

            self.interest_vectors.append(profile)
            self.interest_weights.append(cluster_weights.sum())

        # Guard: if every cluster was below the size threshold,
        # fall back to a single averaged profile
        if not self.interest_vectors:
            self._single_profile(item_embeddings, interaction_weights)
            return

        # Normalize weights
        total = sum(self.interest_weights)
        self.interest_weights = [w / total for w in self.interest_weights]

    def score_item(self, item_embedding: np.ndarray) -> float:
        """Score item using max-similarity across interests."""
        if not self.interest_vectors:
            return 0.0

        item_norm = item_embedding / np.linalg.norm(item_embedding)
        similarities = [
            np.dot(interest, item_norm)
            for interest in self.interest_vectors
        ]
        return max(similarities)

    def score_item_weighted(self, item_embedding: np.ndarray) -> float:
        """Score using weighted combination of interest matches."""
        if not self.interest_vectors:
            return 0.0

        item_norm = item_embedding / np.linalg.norm(item_embedding)
        score = sum(
            weight * np.dot(interest, item_norm)
            for interest, weight in zip(
                self.interest_vectors, self.interest_weights
            )
        )
        return score
```

Users express preferences through both positive and negative signals. Knowing what users dislike is as valuable as knowing what they like.
Types of Negative Signals:
| Signal | Interpretation | Strength |
|---|---|---|
| Low rating (1-2 stars) | Explicit dislike | Strong |
| Skip/scroll past | Likely not interested | Weak |
| Short engagement | Content didn't resonate | Moderate |
| Explicit 'not interested' | Direct feedback | Strong |
| Return/refund | Post-purchase regret | Strong |
Incorporating Negative Signals:
Approach 1: Subtraction $$\psi(u) = \frac{\sum_{i \in H^+_u} \phi(i) - \beta \sum_{i \in H^-_u} \phi(i)}{|H^+_u| + \beta |H^-_u|}$$
Approach 2: Separate Profiles Maintain positive profile $\psi^+(u)$ and negative profile $\psi^-(u)$: $$\text{score}(u, i) = \text{sim}(\psi^+(u), \phi(i)) - \gamma \cdot \text{sim}(\psi^-(u), \phi(i))$$
Approach 3: Contrastive Learning Learn to push positive items closer and negative items farther in embedding space.
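As an illustration, here is a minimal sketch of Approach 2 with separate positive and negative profiles; the vectors and the $\gamma$ value are made up:

```python
import numpy as np

def score_with_negatives(pos_profile: np.ndarray,
                         neg_profile: np.ndarray,
                         item: np.ndarray,
                         gamma: float = 0.5) -> float:
    """score(u, i) = sim(psi+, phi(i)) - gamma * sim(psi-, phi(i))."""
    def cos(u, v):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
    return cos(pos_profile, item) - gamma * cos(neg_profile, item)

# Toy 2-d example: user likes direction [1, 0], dislikes direction [0, 1]
pos = np.array([1.0, 0.0])
neg = np.array([0.0, 1.0])

liked_score = score_with_negatives(pos, neg, np.array([0.9, 0.1]))
disliked_score = score_with_negatives(pos, neg, np.array([0.1, 0.9]))
assert liked_score > disliked_score
```

The $\gamma$ knob controls how strongly dislikes repel: $\gamma = 0$ ignores them entirely, while large values can suppress anything resembling a disliked item.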
Absence of interaction is not neutral. Users don't interact with items they're unaware of or items they've already consumed. Treating non-interactions as negative signals biases against long-tail items. Advanced approaches model the selection process explicitly.
New users have no interaction history—the cold-start problem. Content-based systems can still function using alternative signals.
Cold-Start Strategies:
1. Demographic Defaults: Initialize profiles based on demographic features: $$\psi_0(u) = f(\text{age}, \text{location}, \text{device}, ...)$$
2. Onboarding Preferences: Ask users to indicate interests during signup; selected genres or topics seed the initial profile.
3. Contextual Signals: Use available context for initial recommendations, such as time of day, device, location, or referral source.
4. Popularity Fallback: Recommend popular items until the profile develops; popularity is a safe default while interactions accumulate.
5. Exploration: Strategically show diverse items to learn preferences quickly; each interaction with a diverse slate is highly informative.
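One way to combine the popularity fallback with a growing personal profile is a confidence-weighted blend. This sketch is one such heuristic (the smoothing constant `k` and all vectors are illustrative, not a prescribed method):

```python
import numpy as np

def cold_start_profile(interaction_embeddings: list,
                       popularity_profile: np.ndarray,
                       k: float = 10.0) -> np.ndarray:
    """Blend a popularity-based prior with the user's own sparse history.

    With zero interactions the profile is the pure prior; confidence in
    the personal average grows as the history length approaches k.
    """
    n = len(interaction_embeddings)
    if n == 0:
        return popularity_profile
    personal = np.mean(interaction_embeddings, axis=0)
    alpha = n / (n + k)  # confidence in the personal signal
    return alpha * personal + (1 - alpha) * popularity_profile

prior = np.array([0.5, 0.5])
# Brand-new user: falls back to the popularity prior
assert np.allclose(cold_start_profile([], prior), prior)
# After a few interactions, the profile shifts toward personal taste
few = [np.array([1.0, 0.0])] * 5
blended = cold_start_profile(few, prior, k=10.0)
assert blended[0] > prior[0]
```

This makes the transition out of cold start gradual rather than a hard cutover at some interaction count.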
Production systems must efficiently store and update millions of user profiles.
Storage Considerations: A profile is a dense $d$-dimensional vector, typically a few hundred floats per user, so even millions of profiles fit comfortably in a low-latency key-value store or in memory.
Update Strategies:
Batch Updates: Recompute all profiles periodically (e.g., nightly) from full histories; simple and consistent, but profiles go stale between runs.
Real-Time Updates: Update each profile incrementally as interactions arrive; profiles stay fresh, at the cost of an always-on update path.
Hybrid: Recompute in batch periodically and apply incremental deltas in between, combining consistency with freshness.
For weighted averages, maintain two running quantities: the weighted sum $\sum_i w_i \, \phi(i)$ and the total weight $\sum_i w_i$. On each new interaction, add the new term to both; the profile is the running sum divided by the total weight. This enables $O(d)$ updates without reprocessing history.
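A minimal sketch of this incremental scheme (class and variable names are illustrative):

```python
import numpy as np

class IncrementalProfile:
    """O(d) incremental weighted-average profile via running sums."""

    def __init__(self, dim: int):
        self.running_sum = np.zeros(dim)  # sum of w * e over interactions
        self.total_weight = 0.0           # sum of w

    def add(self, embedding: np.ndarray, weight: float = 1.0):
        """Fold one new interaction into the running sums."""
        self.running_sum += weight * embedding
        self.total_weight += weight

    def profile(self) -> np.ndarray:
        """Current profile = running_sum / total_weight."""
        if self.total_weight == 0:
            return np.zeros_like(self.running_sum)
        return self.running_sum / self.total_weight

p = IncrementalProfile(dim=2)
p.add(np.array([1.0, 0.0]), weight=2.0)
p.add(np.array([0.0, 1.0]), weight=1.0)
# (2*[1,0] + 1*[0,1]) / 3 = [2/3, 1/3]
assert np.allclose(p.profile(), [2/3, 1/3])
```

Note this scheme does not handle recency decay on its own; combining it with exponential decay (as in `TemporalUserProfile` above) requires decaying both running quantities before each update.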
What's Next:
With item representations and user profiles established, we'll dive deep into TF-IDF and embeddings—the core techniques for representing textual content that powers many content-based recommendation systems.
You now understand how to construct, maintain, and evolve user profiles for content-based recommendation. You can handle diverse preferences, temporal dynamics, and cold-start scenarios while building systems that scale to millions of users.