Not all recommendations can be learned from data alone. When recommending a mortgage product, you need to understand income requirements, credit scoring, and regulatory constraints. When suggesting a medical treatment, clinical guidelines and contraindications must be respected. When configuring a complex industrial system, engineering specifications constrain what's possible.
Knowledge-based recommendation systems encode explicit domain expertise—rules, constraints, ontologies, and reasoning mechanisms—to provide recommendations that are not just statistically likely, but technically correct and domain-appropriate.
By the end of this page, you will understand constraint-based and case-based recommendation paradigms, knowledge representation techniques (ontologies, rules), when to use knowledge-based approaches, and how to integrate domain knowledge with data-driven methods.
Content-based and collaborative filtering learn patterns from historical data. But some domains pose unique challenges where pure learning falls short:
High-Stakes Domains: An incorrect recommendation is costly or harmful (mortgages, medical treatments), so validity must be guaranteed rather than merely probable.
Infrequent Purchase Domains: Users buy rarely (houses, cars), leaving little or no personal history to learn from.
Complex Configuration: Products are assembled from interdependent components, and only combinations that satisfy engineering constraints are valid.
Explainability Requirements: Users, and often regulators, must be told why an item was recommended.
| Characteristic | Data-Driven Suitable | Knowledge-Based Suitable |
|---|---|---|
| Purchase frequency | High (daily/weekly) | Low (yearly or less) |
| User history depth | Extensive | Sparse or none |
| Domain constraints | Few/soft | Many/hard |
| Explainability need | Nice-to-have | Essential |
| Stakes of recommendation | Low | High |
| Product complexity | Simple attributes | Complex configuration |
Constraint-based recommenders use explicit rules to filter and rank items based on user requirements and domain constraints.
Components:
User Requirements (REQ): Explicitly stated preferences
Product/Filter Constraints (FC): Domain rules on what's valid
Product Catalog (CAT): Items with their attributes
Recommendation: Items satisfying REQ ∩ FC
Formal Model:
$$\text{Recommendations} = \{\, i \in \text{CAT} : \text{SAT}(\text{REQ}_u \cup \text{FC} \cup \{i\})\,\}$$
Where SAT tests the satisfiability of the user's requirements and filter constraints combined with item $i$'s attribute assignments.
```python
from typing import List, Dict, Any, Callable
from dataclasses import dataclass
from enum import Enum

class ConstraintType(Enum):
    HARD = "hard"  # Must be satisfied
    SOFT = "soft"  # Prefer if possible

@dataclass
class Constraint:
    name: str
    condition: Callable[[Dict[str, Any], Dict[str, Any]], bool]
    type: ConstraintType = ConstraintType.HARD
    weight: float = 1.0
    explanation: str = ""

class ConstraintBasedRecommender:
    """
    Constraint-based recommendation system.

    Filters items based on user requirements and domain constraints,
    returning only items that satisfy all hard constraints and
    ranking by soft constraint satisfaction.
    """

    def __init__(self):
        self.filter_constraints: List[Constraint] = []
        self.product_constraints: List[Constraint] = []

    def add_filter_constraint(self, constraint: Constraint):
        """Add a constraint based on user requirements."""
        self.filter_constraints.append(constraint)

    def add_product_constraint(self, constraint: Constraint):
        """Add a domain/product constraint."""
        self.product_constraints.append(constraint)

    def check_constraint(
        self,
        constraint: Constraint,
        item: Dict[str, Any],
        requirements: Dict[str, Any]
    ) -> bool:
        """Check if constraint is satisfied."""
        try:
            return constraint.condition(item, requirements)
        except Exception:
            return False  # Constraint cannot be evaluated

    def recommend(
        self,
        items: List[Dict[str, Any]],
        requirements: Dict[str, Any],
        max_results: int = 10
    ) -> List[Dict[str, Any]]:
        """Generate recommendations satisfying constraints."""
        all_constraints = self.filter_constraints + self.product_constraints
        hard_constraints = [c for c in all_constraints if c.type == ConstraintType.HARD]
        soft_constraints = [c for c in all_constraints if c.type == ConstraintType.SOFT]

        valid_items = []
        for item in items:
            # Check all hard constraints
            hard_satisfied = all(
                self.check_constraint(c, item, requirements)
                for c in hard_constraints
            )
            if not hard_satisfied:
                continue

            # Score by soft constraint satisfaction
            soft_score = sum(
                c.weight for c in soft_constraints
                if self.check_constraint(c, item, requirements)
            )

            valid_items.append({
                'item': item,
                'score': soft_score,
                'explanations': self._generate_explanations(
                    item, requirements, all_constraints
                )
            })

        # Sort by soft constraint score
        valid_items.sort(key=lambda x: x['score'], reverse=True)
        return valid_items[:max_results]

    def _generate_explanations(
        self,
        item: Dict[str, Any],
        requirements: Dict[str, Any],
        constraints: List[Constraint]
    ) -> List[str]:
        """Generate explanations for satisfied constraints."""
        explanations = []
        for c in constraints:
            if c.explanation and self.check_constraint(c, item, requirements):
                explanations.append(c.explanation.format(**item, **requirements))
        return explanations

# Example: Real Estate Recommendation
def create_real_estate_recommender():
    rec = ConstraintBasedRecommender()

    # User-based filter constraints
    rec.add_filter_constraint(Constraint(
        name="budget",
        condition=lambda item, req: item['price'] <= req.get('max_price', float('inf')),
        type=ConstraintType.HARD,
        explanation="Within your budget of ${max_price:,}"
    ))
    rec.add_filter_constraint(Constraint(
        name="bedrooms",
        condition=lambda item, req: item['bedrooms'] >= req.get('min_bedrooms', 0),
        type=ConstraintType.HARD,
        explanation="Has {bedrooms} bedrooms (you need {min_bedrooms}+)"
    ))
    rec.add_filter_constraint(Constraint(
        name="location",
        condition=lambda item, req: item['city'] in req.get('preferred_cities', [item['city']]),
        type=ConstraintType.HARD
    ))

    # Soft preference constraints
    rec.add_filter_constraint(Constraint(
        name="garage",
        condition=lambda item, req: item.get('has_garage', False) if req.get('wants_garage') else True,
        type=ConstraintType.SOFT,
        weight=2.0,
        explanation="Has a garage as preferred"
    ))
    rec.add_filter_constraint(Constraint(
        name="pool",
        condition=lambda item, req: item.get('has_pool', False) if req.get('wants_pool') else True,
        type=ConstraintType.SOFT,
        weight=1.5,
        explanation="Has a pool as preferred"
    ))

    return rec
```

When no items satisfy all constraints, smart systems can: (1) Identify the minimal constraint relaxation that restores solutions, (2) Explain why the requirements are unsatisfiable, (3) Suggest alternative requirements. This requires constraint-solving techniques beyond simple filtering.
Case-based reasoning (CBR) recommends items similar to cases that previously satisfied similar requirements. It's based on the principle: similar problems have similar solutions.
CBR Cycle: Retrieve the most similar past cases, Reuse their solutions, Revise the solution for the current problem, and Retain the new case for future use.
In Recommendation: Past cases are requirement-recommendation pairs with observed outcomes; a new user's requirements are matched against similar past profiles, and the items that worked there are proposed.
Similarity in CBR:
Domain-specific similarity functions: $$\text{sim}(\text{req}_1, \text{req}_2) = \sum_i w_i \cdot \text{sim}_i(\text{req}_1[i], \text{req}_2[i])$$
Where attribute similarities may use normalized distance for numeric values, exact or taxonomy-based matching for categorical values, and Jaccard similarity for sets.
```python
from typing import List, Dict, Any, Tuple
from dataclasses import dataclass

@dataclass
class Case:
    requirements: Dict[str, Any]
    recommended_item_id: str
    outcome: float  # Success score (e.g., purchase, satisfaction)

class CaseBasedRecommender:
    """
    Case-based recommendation using past requirement-solution pairs.
    """

    def __init__(self):
        self.case_base: List[Case] = []
        self.attribute_weights: Dict[str, float] = {}
        self.items: Dict[str, Dict[str, Any]] = {}

    def add_case(self, case: Case):
        """Add a case to the case base."""
        self.case_base.append(case)

    def set_attribute_weights(self, weights: Dict[str, float]):
        """Set importance weights for requirement attributes."""
        self.attribute_weights = weights

    def _attribute_similarity(
        self,
        attr: str,
        val1: Any,
        val2: Any
    ) -> float:
        """Compute similarity for a single attribute."""
        if val1 is None or val2 is None:
            return 0.5  # Neutral for missing values

        if isinstance(val1, (int, float)) and isinstance(val2, (int, float)):
            # Numeric: normalized distance
            max_val = max(abs(val1), abs(val2), 1)
            return 1.0 - abs(val1 - val2) / max_val
        elif isinstance(val1, str) and isinstance(val2, str):
            # Categorical: exact match or use ontology
            return 1.0 if val1 == val2 else 0.0
        elif isinstance(val1, (list, set)) and isinstance(val2, (list, set)):
            # Set: Jaccard similarity
            set1, set2 = set(val1), set(val2)
            if not set1 and not set2:
                return 1.0
            return len(set1 & set2) / len(set1 | set2)
        return 0.0

    def case_similarity(
        self,
        req1: Dict[str, Any],
        req2: Dict[str, Any]
    ) -> float:
        """Compute weighted similarity between requirement profiles."""
        all_attrs = set(req1.keys()) | set(req2.keys())
        total_weight = 0.0
        weighted_sim = 0.0
        for attr in all_attrs:
            weight = self.attribute_weights.get(attr, 1.0)
            sim = self._attribute_similarity(
                attr, req1.get(attr), req2.get(attr)
            )
            weighted_sim += weight * sim
            total_weight += weight
        return weighted_sim / total_weight if total_weight > 0 else 0.0

    def retrieve(
        self,
        requirements: Dict[str, Any],
        k: int = 10
    ) -> List[Tuple[Case, float]]:
        """Retrieve k most similar cases."""
        similarities = [
            (case, self.case_similarity(requirements, case.requirements))
            for case in self.case_base
        ]
        # Sort by similarity descending
        similarities.sort(key=lambda x: x[1], reverse=True)
        return similarities[:k]

    def recommend(
        self,
        requirements: Dict[str, Any],
        n_items: int = 5,
        min_similarity: float = 0.5
    ) -> List[Dict[str, Any]]:
        """Recommend items based on similar cases."""
        similar_cases = self.retrieve(requirements, k=50)

        # Aggregate recommendations from similar cases
        item_scores = {}
        for case, similarity in similar_cases:
            if similarity < min_similarity:
                continue
            item_id = case.recommended_item_id
            weight = similarity * case.outcome
            if item_id not in item_scores:
                item_scores[item_id] = {'score': 0, 'count': 0}
            item_scores[item_id]['score'] += weight
            item_scores[item_id]['count'] += 1

        # Rank items
        recommendations = []
        for item_id, data in item_scores.items():
            avg_score = data['score'] / data['count']
            recommendations.append({
                'item_id': item_id,
                'item': self.items.get(item_id, {}),
                'score': avg_score,
                'support': data['count']
            })
        recommendations.sort(key=lambda x: x['score'], reverse=True)
        return recommendations[:n_items]
```

Ontologies:
Formal representations of domain concepts and their relationships, typically organized as class hierarchies with typed properties.
Ontologies enable hierarchical generalization (recommending within a parent category when an exact match fails), semantic similarity between concepts, and consistency checking of constraints.
Rule Systems:
Explicit IF-THEN rules encoding domain knowledge:
```
IF   user.experience = 'beginner' AND user.budget < 1000
THEN recommend entry_level_cameras
     WITH explanation 'These are great starter cameras within your budget'

IF   camera.sensor_size = 'full_frame' AND user.use_case = 'travel'
THEN add_concern 'Full-frame cameras are typically heavier'
```
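Rules like these can be sketched as data plus a tiny evaluator. This is an illustrative design, not a standard rule engine; `Rule` and `apply_rules` are hypothetical names:

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

@dataclass
class Rule:
    condition: Callable[[Dict[str, Any]], bool]
    action: str    # 'recommend' or 'add_concern'
    payload: str

def apply_rules(rules: List[Rule], facts: Dict[str, Any]) -> Dict[str, List[str]]:
    """Fire every rule whose condition holds against the current facts."""
    result: Dict[str, List[str]] = {'recommend': [], 'add_concern': []}
    for rule in rules:
        if rule.condition(facts):
            result[rule.action].append(rule.payload)
    return result

rules = [
    Rule(lambda f: f['experience'] == 'beginner' and f['budget'] < 1000,
         'recommend', 'entry_level_cameras'),
    Rule(lambda f: f.get('sensor_size') == 'full_frame' and f['use_case'] == 'travel',
         'add_concern', 'Full-frame cameras are typically heavier'),
]
user = {'experience': 'beginner', 'budget': 800, 'use_case': 'travel'}
print(apply_rules(rules, user))
# {'recommend': ['entry_level_cameras'], 'add_concern': []}
```

Keeping rules as data rather than hard-coded branches makes them auditable and lets domain experts review them without reading application code.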
Knowledge Graphs:
Graph structures linking entities with typed relationships, e.g., a camera node connected to a sensor node by a has_sensor edge, or a lens connected to a mount by compatible_with, supporting path-based reasoning about which items fit together.
Standard ontology languages include OWL (Web Ontology Language) and RDF (Resource Description Framework). Tools like Protégé help build ontologies. For recommendation, lighter-weight taxonomies or graph databases often suffice.
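As a sketch of how even a lightweight taxonomy supports similarity, the following computes a Wu-Palmer-style score from the depth of the lowest common ancestor. The camera taxonomy and function names are made up for illustration:

```python
from typing import Dict, List, Optional

# Hypothetical product taxonomy: child -> parent (None marks the root)
TAXONOMY: Dict[str, Optional[str]] = {
    'camera': None,
    'mirrorless': 'camera',
    'dslr': 'camera',
    'aps_c_mirrorless': 'mirrorless',
    'full_frame_mirrorless': 'mirrorless',
}

def path_to_root(concept: str) -> List[str]:
    """Walk parent links from a concept up to the root."""
    path: List[str] = []
    node: Optional[str] = concept
    while node is not None:
        path.append(node)
        node = TAXONOMY[node]
    return path

def taxonomy_similarity(a: str, b: str) -> float:
    """Wu-Palmer-style similarity: shared-ancestor depth vs. total depths."""
    pa, pb = path_to_root(a), path_to_root(b)
    shared = set(pa) & set(pb)
    lca_depth = max(len(path_to_root(c)) for c in shared) if shared else 0
    return 2.0 * lca_depth / (len(pa) + len(pb))

print(taxonomy_similarity('aps_c_mirrorless', 'full_frame_mirrorless'))  # ~0.667
print(taxonomy_similarity('aps_c_mirrorless', 'dslr'))                   # 0.4
```

Two sibling mirrorless categories score higher than a mirrorless-DSLR pair, which is exactly the graded categorical similarity the CBR `_attribute_similarity` method above falls back to exact match for.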
Knowledge-based systems often operate through dialogue, eliciting requirements through conversation:
```
System: What type of photography do you primarily do?
User:   Mostly landscape and travel.
System: Do you prioritize image quality or portability?
User:   Image quality, but I can't carry more than 1kg.
System: Based on your needs, I recommend mirrorless cameras with
        APS-C sensors. Here are three options that excel in
        landscape photography under 1kg...
```
Critique-Based Refinement:
Users refine recommendations through critiques such as "cheaper," "lighter," or "like this one, but with a larger sensor."
The system incorporates critiques as additional constraints.
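A minimal sketch of this incorporation, assuming unit critiques like "cheaper" and "lighter" issued against the currently shown item (all names here are hypothetical):

```python
from typing import Any, Callable, Dict, List

Constraint = Callable[[Dict[str, Any]], bool]

def critique_to_constraint(critique: str, current: Dict[str, Any]) -> Constraint:
    """Translate a unit critique on the shown item into a filter constraint."""
    if critique == 'cheaper':
        return lambda item: item['price'] < current['price']
    if critique == 'lighter':
        return lambda item: item['weight_g'] < current['weight_g']
    raise ValueError(f'unknown critique: {critique}')

def refine(items: List[Dict[str, Any]],
           constraints: List[Constraint]) -> List[Dict[str, Any]]:
    """Keep only items consistent with every accumulated critique."""
    return [i for i in items if all(c(i) for c in constraints)]

catalog = [
    {'name': 'A', 'price': 1200, 'weight_g': 900},
    {'name': 'B', 'price': 900, 'weight_g': 650},
    {'name': 'C', 'price': 700, 'weight_g': 1100},
]
shown = catalog[0]
constraints = [critique_to_constraint('cheaper', shown),
               critique_to_constraint('lighter', shown)]
print([i['name'] for i in refine(catalog, constraints)])  # ['B']
```

Because each critique becomes an ordinary constraint, the accumulated set can be fed straight back into the constraint-based recommender, and relaxation machinery applies when critiques over-constrain the catalog.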
Modern LLM Integration:
Large language models enable natural language understanding of requirements and natural language generation of explanations, while structured knowledge ensures constraint satisfaction and factual accuracy.
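One hedged sketch of this division of labor, with a stubbed `parse_requirements` standing in for the LLM (it returns canned JSON here rather than calling a model), while item validity is decided by an explicit symbolic check:

```python
import json
from typing import Any, Dict, List

def parse_requirements(utterance: str) -> Dict[str, Any]:
    """Stand-in for an LLM call that extracts structured requirements.
    A real system would prompt a model to emit JSON against a fixed schema."""
    # Canned output simulating what the model would return for this utterance
    fake_model_output = '{"use_case": "landscape", "max_weight_g": 1000}'
    return json.loads(fake_model_output)

def recommend(catalog: List[Dict[str, Any]], req: Dict[str, Any]) -> List[str]:
    """Validity stays symbolic: the constraint check, not the LLM, decides."""
    return [c['name'] for c in catalog if c['weight_g'] <= req['max_weight_g']]

catalog = [
    {'name': 'CamA', 'weight_g': 557},
    {'name': 'CamB', 'weight_g': 1250},
]
req = parse_requirements("I shoot landscapes and can't carry more than 1kg")
print(recommend(catalog, req))  # ['CamA']
```

The design point is the boundary: free-form language is converted once into a structured requirement dict, after which everything downstream is deterministic and auditable.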
The most powerful systems combine domain knowledge with learned patterns:
Knowledge-Guided Learning: Domain constraints prune or re-rank the output of learned models, ensuring that statistically likely suggestions are also valid.
Learning Knowledge: Rules, constraint weights, and attribute importances can be mined from interaction data rather than hand-authored, easing the acquisition bottleneck.
Practical Patterns: Knowledge as a pre-filter before learned ranking; learned retrieval followed by knowledge-based validation; knowledge-derived features fed into a learned model.
The field is moving toward neuro-symbolic systems that combine the pattern recognition power of neural networks with the reasoning capabilities of symbolic AI. For recommendations, this means learning from data while respecting encoded domain expertise.
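The simplest such combination, a knowledge-based hard filter followed by a learned ranker, might be sketched as below; the mortgage-style attributes and the stand-in scoring function are illustrative assumptions, not a real model:

```python
from typing import Any, Callable, Dict, List

def hybrid_recommend(
    items: List[Dict[str, Any]],
    hard_constraints: List[Callable[[Dict[str, Any]], bool]],
    learned_score: Callable[[Dict[str, Any]], float],
    top_n: int = 3,
) -> List[Dict[str, Any]]:
    """Knowledge filters first (correctness), the model ranks second (preference)."""
    feasible = [i for i in items if all(c(i) for c in hard_constraints)]
    return sorted(feasible, key=learned_score, reverse=True)[:top_n]

items = [
    {'id': 1, 'apr': 0.04, 'min_income': 40_000},
    {'id': 2, 'apr': 0.03, 'min_income': 90_000},
    {'id': 3, 'apr': 0.05, 'min_income': 30_000},
]
user_income = 60_000

# Eligibility is a hard regulatory constraint; the model never overrides it
constraints = [lambda i: i['min_income'] <= user_income]

# Stand-in for a trained ranking model's score
def score(i: Dict[str, Any]) -> float:
    return -i['apr']

print([i['id'] for i in hybrid_recommend(items, constraints, score)])  # [1, 3]
```

Note that product 2 has the best rate but is filtered out before ranking ever sees it: the learned component can only reorder what the knowledge component has certified as valid.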
Knowledge Acquisition Bottleneck:
The biggest challenge is capturing domain knowledge: expert time is scarce, much expertise is tacit and hard to articulate, and different experts often disagree.
Maintenance:
Knowledge bases require ongoing maintenance: regulations and catalogs change, and accumulated rules interact in ways that must be re-validated after every edit.
Scalability: Naive constraint checking evaluates every rule against every item; large catalogs need pre-computation, indexing, or constraint propagation.
Testing: A single rule change can silently alter recommendations across the catalog, so knowledge bases need the same regression-testing discipline as code.
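Constraint semantics are easy to break silently, so pinning them down with assertions pays off. A minimal sketch, where the `within_budget` helper is hypothetical:

```python
from typing import Any, Dict

def within_budget(item: Dict[str, Any], req: Dict[str, Any]) -> bool:
    """Budget constraint: no stated maximum means no limit."""
    return item['price'] <= req.get('max_price', float('inf'))

# Regression tests pin down the intended semantics of the constraint
def test_budget_boundary() -> None:
    assert within_budget({'price': 100}, {'max_price': 100})      # inclusive edge
    assert not within_budget({'price': 101}, {'max_price': 100})  # just over
    assert within_budget({'price': 101}, {})                      # missing req = no limit

test_budget_boundary()
print('constraint tests passed')
```

Edge cases like the inclusive boundary and the missing-requirement default are exactly where hand-edited rules drift over time; encoding them as tests makes every knowledge-base change verifiable.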
| Challenge | Mitigation Strategy |
|---|---|
| Knowledge acquisition | Expert workshops, data mining, iterative refinement |
| Knowledge maintenance | Version control, change tracking, automated testing |
| Scalability | Pre-computation, indexing, constraint propagation |
| Brittleness | Soft constraints, graceful degradation, fallbacks |
| User requirement elicitation | Smart defaults, critique-based, conversational UI |
Module Complete:
You've now mastered content-based recommendation methods—from item representations and user profiles through TF-IDF and embeddings, hybrid approaches, and knowledge-based systems. These techniques form the foundation for building recommendation systems that truly understand what items offer and what users want.
Congratulations! You've completed Module 4: Content-Based Methods. You now have comprehensive knowledge of item representation, user profiling, text encoding (TF-IDF to embeddings), hybrid systems, and knowledge-based approaches. Next, explore Deep Learning for Recommendations to see how neural networks are transforming the field.