In many real-world scenarios, we need to rank items—such as players, products, or models—based on pairwise comparison outcomes rather than absolute scores. This is the foundation of competitive ranking systems, preference learning, and modern LLM evaluation frameworks.
The Pairwise Comparison Model provides an elegant probabilistic framework for this task. It assumes that each item i possesses a hidden (latent) strength parameter βᵢ that represents its underlying quality or skill level. When two items compete, the probability that item i defeats item j is modeled using the logistic (sigmoid) function:
$$P(i \text{ beats } j) = \sigma(\beta_i - \beta_j) = \frac{1}{1 + e^{-(\beta_i - \beta_j)}}$$
This formulation captures the intuition that items with higher strength parameters are more likely to win, with the probability depending on the difference in strength values. When strengths are equal, each item has exactly a 50% chance of winning.
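This win-probability formula (the core of what is often called the Bradley–Terry model) can be sketched in a few lines of Python; the function name `win_probability` is illustrative, not part of any required API:

```python
import math

def win_probability(beta_i: float, beta_j: float) -> float:
    """P(i beats j) = sigmoid of the strength difference."""
    return 1.0 / (1.0 + math.exp(-(beta_i - beta_j)))

# Equal strengths give exactly a 50% chance of winning.
print(win_probability(1.0, 1.0))  # 0.5
```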
Maximum Likelihood Estimation:
Given a collection of observed pairwise comparisons, our goal is to find the strength parameters β that maximize the likelihood of the observed outcomes. For a dataset of comparisons, the log-likelihood function is:
$$\mathcal{L}(\beta) = \sum_{(w,l) \in \text{comparisons}} \log \sigma(\beta_w - \beta_l)$$
where (w, l) denotes a comparison where item w (winner) defeated item l (loser).
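The log-likelihood above can be evaluated directly; a minimal sketch, using the numerically stable identity log σ(x) = −log(1 + e⁻ˣ):

```python
import math

def log_likelihood(beta, comparisons):
    """Sum of log sigma(beta_w - beta_l) over observed (winner, loser) pairs."""
    total = 0.0
    for w, l in comparisons:
        # log sigma(x) = -log(1 + exp(-x)); log1p avoids precision loss
        total += -math.log1p(math.exp(-(beta[w] - beta[l])))
    return total
```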
Gradient Ascent Optimization:
To maximize the log-likelihood, we use gradient ascent. The gradient with respect to each strength parameter is:
$$\frac{\partial \mathcal{L}}{\partial \beta_i} = \sum_{(w,l):\, w=i} \left(1 - \sigma(\beta_i - \beta_l)\right) - \sum_{(w,l):\, l=i} \left(1 - \sigma(\beta_w - \beta_i)\right)$$
Intuitively, each win increases the winner's gradient (pushing its strength up), while each loss decreases the loser's gradient (pushing its strength down), and the size of the adjustment is the "surprise" of the outcome: an unexpected result moves the parameters more than an expected one.
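The gradient accumulation can be sketched as follows; each comparison adds the same surprise term `1 - p` to the winner and subtracts it from the loser:

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def gradient(beta, comparisons, n_items):
    """dL/dbeta_i accumulated over all (winner, loser) comparisons."""
    grad = [0.0] * n_items
    for w, l in comparisons:
        p = sigmoid(beta[w] - beta[l])  # model's current P(w beats l)
        grad[w] += 1.0 - p              # win pushes the winner's strength up
        grad[l] -= 1.0 - p              # loss pushes the loser's strength down
    return grad
```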
Identifiability Constraint:
Since only the differences between strength parameters matter (not their absolute values), we impose a centering constraint: after each gradient update, subtract the mean from all parameters to ensure they sum to zero. This makes the parameters uniquely identifiable.
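Putting the pieces together, one possible implementation of the full procedure (gradient ascent plus mean-centering after every update) looks like this; the function name and signature are one reasonable choice, not a prescribed interface:

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def estimate_strengths(comparisons, n_items, learning_rate=0.5, n_iterations=100):
    """Gradient-ascent MLE for the pairwise comparison model."""
    beta = [0.0] * n_items
    for _ in range(n_iterations):
        # Accumulate the gradient over all observed comparisons.
        grad = [0.0] * n_items
        for w, l in comparisons:
            p = sigmoid(beta[w] - beta[l])  # current P(w beats l)
            grad[w] += 1.0 - p
            grad[l] -= 1.0 - p
        # Gradient-ascent step.
        beta = [b + learning_rate * g for b, g in zip(beta, grad)]
        # Centering constraint: subtract the mean so parameters sum to zero.
        mean = sum(beta) / n_items
        beta = [b - mean for b in beta]
    return beta
```

Centering inside the loop (rather than once at the end) keeps the iterates bounded around zero and makes intermediate values easy to inspect; it does not change the final differences between parameters.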
Your Task:
Implement a function that, given a set of pairwise comparison outcomes, estimates the strength parameters for all items by running gradient ascent on the log-likelihood and applying the mean-centering constraint after each update.
Example 1:
Input:
comparisons = [(0, 1), (0, 1), (1, 0)]
n_items = 2
learning_rate = 0.5
n_iterations = 100
Output:
[0.3466, -0.3466]
Explanation:
Item 0 defeated item 1 in 2 out of 3 comparisons (win rate ≈ 66.7%). The maximum likelihood estimate satisfies σ(β₀ - β₁) = 2/3, which gives β₀ - β₁ = log(2) ≈ 0.693. After mean-centering, we get β₀ = 0.3466 and β₁ = -0.3466. The positive parameter for item 0 correctly indicates it as the stronger competitor.
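The closed-form reasoning in this example can be checked with a line of arithmetic:

```python
import math

# Win rate 2/3 implies sigma(beta0 - beta1) = 2/3, so beta0 - beta1 = log 2.
diff = math.log(2)                    # ≈ 0.6931
beta0, beta1 = diff / 2, -diff / 2    # mean-centered split of the difference
print(round(beta0, 4), round(beta1, 4))  # 0.3466 -0.3466
```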
Example 2:
Input:
comparisons = [(0, 1), (1, 2), (0, 2)]
n_items = 3
learning_rate = 0.5
n_iterations = 100
Output:
[3.9579, 0.0, -3.9579]
Explanation:
This forms a transitive chain: item 0 beat item 1, item 1 beat item 2, and item 0 beat item 2. Because item 0 never loses and item 2 never wins, the likelihood has no finite maximizer: the optimization pushes the parameters toward ever more extreme values, and the reported numbers simply reflect where gradient ascent stops after 100 iterations. The result correctly ranks items as 0 > 1 > 2, with item 1 at the midpoint (0.0 after centering).
Example 3:
Input:
comparisons = [(0, 1), (1, 0), (1, 2), (2, 1), (0, 2), (2, 0)]
n_items = 3
learning_rate = 0.3
n_iterations = 50
Output:
[0.0, 0.0, 0.0]
Explanation:
Each pair of items has exactly one win and one loss against each other—perfectly symmetric outcomes. The maximum likelihood solution is that all items have equal strength (50% win probability against any opponent). After centering, all parameters are exactly 0.0, indicating equal competitiveness.
Constraints