Post-processing methods occupy a unique position in the fairness intervention landscape: they act on a trained model's outputs without modifying the model itself. This seemingly simple constraint has profound practical implications.
When you cannot retrain a model—perhaps due to computational cost, proprietary constraints, or regulatory requirements to preserve an audited model—post-processing is your only option. When fairness requirements differ across deployment contexts, post-processing enables context-specific adjustments without maintaining multiple models. And when you need to understand the cost of fairness separate from model quality, post-processing makes this analysis transparent.
By the end of this page, you will be able to:
1. Apply threshold optimization to satisfy demographic parity or equalized odds
2. Implement calibration-based post-processing methods
3. Use reject option classification to handle uncertain predictions fairly
4. Apply the Hardt et al. (2016) equalized odds post-processing algorithm
5. Understand the theoretical foundations and limitations of post-processing approaches
The Post-processing Setup:
Given:
- A trained model $h$ that outputs scores $h(X)$
- A protected attribute $A$
- A fairness criterion $\mathcal{F}$ (e.g., demographic parity or equalized odds)
- Held-out labeled data for estimating group-specific quantities

Find a transformation $T$ such that the post-processed classifier $\hat{Y} = T(h(X), A)$ satisfies $\mathcal{F}$ while minimizing accuracy loss.
Key Insight: Unlike in-processing, post-processing explicitly uses the protected attribute $A$ at prediction time. This is both its power (can directly condition on group membership) and its challenge (may not be legally or practically feasible to access $A$ at deployment).
The simplest post-processing approach uses group-specific classification thresholds. Rather than applying a single threshold $\tau$ to convert probabilities to predictions, we use different thresholds for different groups:
$$\hat{Y}(x, a) = \mathbb{1}[h(x) \geq \tau_a]$$
Mathematical Foundation:
For demographic parity, we want: $$P(\hat{Y} = 1 | A = 0) = P(\hat{Y} = 1 | A = 1)$$
For a given model $h$, this translates to finding thresholds $\tau_0, \tau_1$ such that: $$P(h(X) \geq \tau_0 | A = 0) = P(h(X) \geq \tau_1 | A = 1)$$
These are the $(1-p)$-quantiles of the group-conditional score distributions, where $p$ is the target positive rate.
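For a concrete (hypothetical) illustration of this quantile rule before the full implementation below:

```python
import numpy as np

# Hypothetical scores for one group; with target positive rate p, the group's
# threshold is the (1-p)-quantile of its score distribution.
scores_a = np.array([0.2, 0.4, 0.55, 0.7, 0.9])
p = 0.4
tau_a = np.quantile(scores_a, 1 - p)
print(tau_a, np.mean(scores_a >= tau_a))  # roughly 40% of the group clears tau_a
```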
Algorithm: Threshold Search for Demographic Parity
1. Pick a target positive rate $p$.
2. For each group $a$, set $\tau_a$ to the $(1-p)$-quantile of the scores $h(X)$ within group $a$.
3. Predict $\hat{Y}(x, a) = \mathbb{1}[h(x) \geq \tau_a]$.

The choice of $p$ controls the overall positive rate. Common choices:
- The positive rate of the unadjusted model at its default threshold (the default in the code below)
- The base rate $P(Y = 1)$ in the data
- A value tuned on held-out data to maximize accuracy subject to the parity constraint (see `find_optimal_target_rate` below)
```python
import numpy as np
from typing import Dict, Tuple, Optional
from scipy.optimize import minimize_scalar


class ThresholdOptimizer:
    """
    Post-processing fairness via group-specific thresholds.

    Supports demographic parity and equal opportunity constraints.
    """

    def __init__(self, fairness_criterion: str = 'demographic_parity'):
        """
        Args:
            fairness_criterion: 'demographic_parity' or 'equal_opportunity'
        """
        self.fairness_criterion = fairness_criterion
        self.thresholds_ = {}

    def fit(self, scores: np.ndarray, protected: np.ndarray,
            labels: Optional[np.ndarray] = None,
            target_rate: Optional[float] = None) -> 'ThresholdOptimizer':
        """
        Learn group-specific thresholds to satisfy the fairness criterion.

        Args:
            scores: Model probability outputs (n_samples,)
            protected: Protected attribute values (n_samples,)
            labels: True labels (required for equal_opportunity)
            target_rate: Target positive rate (if None, uses overall rate)
        """
        groups = np.unique(protected)

        if self.fairness_criterion == 'demographic_parity':
            # Find thresholds that equalize positive prediction rates
            if target_rate is None:
                # Use overall rate at default threshold 0.5
                target_rate = np.mean(scores >= 0.5)

            for group in groups:
                group_mask = protected == group
                group_scores = scores[group_mask]
                # Find threshold giving target_rate positive predictions
                self.thresholds_[group] = np.quantile(group_scores, 1 - target_rate)

        elif self.fairness_criterion == 'equal_opportunity':
            assert labels is not None, "Labels required for equal opportunity"
            # Find thresholds that equalize TPR among positive examples
            if target_rate is None:
                # Use overall TPR at threshold 0.5
                pos_mask = labels == 1
                target_rate = np.mean(scores[pos_mask] >= 0.5)

            for group in groups:
                group_mask = (protected == group) & (labels == 1)
                if group_mask.sum() == 0:
                    self.thresholds_[group] = 0.5
                    continue
                group_scores = scores[group_mask]
                self.thresholds_[group] = np.quantile(group_scores, 1 - target_rate)

        return self

    def predict(self, scores: np.ndarray, protected: np.ndarray) -> np.ndarray:
        """
        Apply group-specific thresholds to get fair predictions.

        Args:
            scores: Model probability outputs (n_samples,)
            protected: Protected attribute values (n_samples,)

        Returns:
            Binary predictions (n_samples,)
        """
        predictions = np.zeros(len(scores), dtype=int)
        for group, threshold in self.thresholds_.items():
            mask = protected == group
            predictions[mask] = (scores[mask] >= threshold).astype(int)
        return predictions

    def fit_predict(self, scores: np.ndarray, protected: np.ndarray,
                    labels: Optional[np.ndarray] = None,
                    target_rate: Optional[float] = None) -> np.ndarray:
        """Fit and predict in one step."""
        return self.fit(scores, protected, labels, target_rate).predict(scores, protected)


def find_optimal_target_rate(scores: np.ndarray, protected: np.ndarray,
                             labels: np.ndarray,
                             fairness_criterion: str = 'demographic_parity') -> Tuple[float, float]:
    """
    Find the target rate that maximizes accuracy while satisfying fairness.

    Returns:
        (optimal_rate, best_accuracy)
    """
    def neg_accuracy(rate):
        optimizer = ThresholdOptimizer(fairness_criterion)
        preds = optimizer.fit_predict(scores, protected, labels, target_rate=rate)
        return -np.mean(preds == labels)

    result = minimize_scalar(neg_accuracy, bounds=(0.01, 0.99), method='bounded')
    return result.x, -result.fun


# Demonstration
if __name__ == "__main__":
    np.random.seed(42)
    n = 3000

    # Biased model scores
    protected = np.random.binomial(1, 0.4, n)
    labels = np.random.binomial(1, 0.5, n)

    # Simulate biased model: higher scores for protected=0
    base_scores = 0.3 + 0.4 * labels + np.random.normal(0, 0.2, n)
    scores = base_scores - 0.2 * (protected == 1)  # Bias against group 1
    scores = np.clip(scores, 0, 1)

    print("Original (threshold=0.5):")
    orig_preds = (scores >= 0.5).astype(int)
    print(f"  P(Ŷ=1|A=0) = {np.mean(orig_preds[protected==0]):.3f}")
    print(f"  P(Ŷ=1|A=1) = {np.mean(orig_preds[protected==1]):.3f}")
    print(f"  Accuracy = {np.mean(orig_preds == labels):.3f}")

    # Apply fair post-processing
    optimizer = ThresholdOptimizer('demographic_parity')
    fair_preds = optimizer.fit_predict(scores, protected, labels)

    print("\nAfter threshold optimization:")
    print(f"  P(Ŷ=1|A=0) = {np.mean(fair_preds[protected==0]):.3f}")
    print(f"  P(Ŷ=1|A=1) = {np.mean(fair_preds[protected==1]):.3f}")
    print(f"  Accuracy = {np.mean(fair_preds == labels):.3f}")
    print(f"  Thresholds: {optimizer.thresholds_}")
```

Group-specific thresholds can only achieve fairness if the score distributions for different groups overlap. If one group's scores are uniformly higher, no threshold choice can equalize rates without harming accuracy severely. This occurs when the underlying model is severely biased—post-processing cannot fix fundamentally broken predictions.
The seminal work by Hardt, Price, and Srebro (2016) introduced a rigorous post-processing method for achieving equalized odds: equal true positive rates (TPR) and false positive rates (FPR) across groups.
The Key Insight:
Instead of directly modifying predictions, we learn a randomized prediction rule that mixes the original prediction with controlled randomization:
$$\hat{Y}_{fair}(x, a) = \begin{cases} \hat{Y}(x) & \text{with probability } p_{a,\hat{Y}(x)} \\ 1 - \hat{Y}(x) & \text{with probability } 1 - p_{a,\hat{Y}(x)} \end{cases}$$
The probabilities $p$ are chosen to satisfy equalized odds constraints.
Mathematical Formulation:
We parametrize the post-processing by one probability for each $(a, \hat{y})$ combination. Let:
- $p_{a,\hat{y}}$ = probability of keeping the original prediction $\hat{y}$ for group $a$
- $\text{TPR}_a = P(\hat{Y} = 1 \mid Y = 1, A = a)$ and $\text{FPR}_a = P(\hat{Y} = 1 \mid Y = 0, A = a)$, the base model's group-conditional rates estimated on held-out labeled data

The post-processed rates are then $\widetilde{\text{TPR}}_a = p_{a,1}\,\text{TPR}_a + (1 - p_{a,0})(1 - \text{TPR}_a)$ and $\widetilde{\text{FPR}}_a = p_{a,1}\,\text{FPR}_a + (1 - p_{a,0})(1 - \text{FPR}_a)$.

Constraints:
- Equal TPR: $\widetilde{\text{TPR}}_0 = \widetilde{\text{TPR}}_1$
- Equal FPR: $\widetilde{\text{FPR}}_0 = \widetilde{\text{FPR}}_1$
- Valid probabilities: $0 \leq p_{a,\hat{y}} \leq 1$
Objective: Minimize classification loss subject to constraints: $$\min_p E[L(\hat{Y}_{fair}, Y)] \quad \text{s.t. equalized odds constraints}$$
This is a linear program! The constraints are linear in $p$, and error rate objectives are linear too.
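As a sketch of how that linear program can be set up (an illustration, not the implementation that follows; the helper name `solve_equalized_odds_lp`, the variable ordering, and the example rates are assumptions of this sketch), the group-conditional rates can be fed to `scipy.optimize.linprog`:

```python
import numpy as np
from scipy.optimize import linprog

def solve_equalized_odds_lp(tpr: dict, fpr: dict, w_pos: dict, w_neg: dict) -> dict:
    """
    tpr, fpr:     base model's group-conditional rates, e.g. {0: TPR_0, 1: TPR_1}
    w_pos, w_neg: empirical weights P(A=a, Y=1) and P(A=a, Y=0)
    Returns {(a, yhat): probability of keeping the base prediction yhat}.
    Variables are ordered [p_{0,0}, p_{0,1}, p_{1,0}, p_{1,1}].
    """
    # Linear objective: expected error of the derived predictor (constants dropped)
    c = []
    for a in (0, 1):
        c.append(w_pos[a] * (1 - tpr[a]) - w_neg[a] * (1 - fpr[a]))  # coeff of p_{a,0}
        c.append(-w_pos[a] * tpr[a] + w_neg[a] * fpr[a])             # coeff of p_{a,1}

    # Equalized odds: derived TPRs equal and derived FPRs equal across groups, where
    # derived TPR_a = p_{a,1} * TPR_a + (1 - p_{a,0}) * (1 - TPR_a), and similarly for FPR.
    def equality_row(rate):
        r0, r1 = rate[0], rate[1]
        return [-(1 - r0), r0, (1 - r1), -r1], r0 - r1

    A_eq, b_eq = [], []
    for rate in (tpr, fpr):
        row, rhs = equality_row(rate)
        A_eq.append(row)
        b_eq.append(rhs)

    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * 4, method="highs")
    p = res.x
    return {(0, 0): p[0], (0, 1): p[1], (1, 0): p[2], (1, 1): p[3]}

# Hypothetical base-model rates and group weights
print(solve_equalized_odds_lp(
    tpr={0: 0.85, 1: 0.70}, fpr={0: 0.25, 1: 0.10},
    w_pos={0: 0.30, 1: 0.20}, w_neg={0: 0.30, 1: 0.20},
))
```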
The implementation below follows this recipe in simplified form:

```python
import numpy as np
from typing import Dict, Optional


class EqualizedOddsPostProcessor:
    """
    Post-processing for equalized odds following Hardt et al. (2016).

    Learns a randomized prediction rule that satisfies equalized odds
    (equal TPR and FPR across groups) while minimizing error.

    Note: this implementation is a simplified heuristic. The exact method
    solves the linear program described above and applies the resulting
    per-(group, prediction) mixing probabilities.
    """

    def __init__(self):
        self.mixing_rates_ = {}

    def fit(self, predictions: np.ndarray, protected: np.ndarray,
            labels: np.ndarray) -> 'EqualizedOddsPostProcessor':
        """
        Learn mixing rates to achieve equalized odds.

        Args:
            predictions: Binary predictions from original model (0 or 1)
            protected: Binary protected attribute
            labels: True binary labels
        """
        groups = np.unique(protected)
        assert len(groups) == 2, "Binary protected attribute required"

        # Compute confusion-matrix rates for each group
        rates = {}
        for a in groups:
            mask = protected == a
            for y in [0, 1]:
                y_mask = mask & (labels == y)
                if y_mask.sum() == 0:
                    rates[(a, y, 0)] = 0
                    rates[(a, y, 1)] = 0
                else:
                    rates[(a, y, 0)] = np.mean(predictions[y_mask] == 0)  # P(Ŷ=0|A=a,Y=y)
                    rates[(a, y, 1)] = np.mean(predictions[y_mask] == 1)  # P(Ŷ=1|A=a,Y=y)

        # For simplicity, rates are equalized heuristically toward group averages;
        # the full LP solution would find the optimal mixing probabilities.

        # Target rates: average across groups
        target_tpr = np.mean([rates[(a, 1, 1)] for a in groups])
        target_fpr = np.mean([rates[(a, 0, 1)] for a in groups])

        # For each group, record current and target rates
        for a in groups:
            current_tpr = rates[(a, 1, 1)]  # P(Ŷ=1|Y=1,A=a)
            current_fpr = rates[(a, 0, 1)]  # P(Ŷ=1|Y=0,A=a)

            self.mixing_rates_[a] = {
                'tpr_current': current_tpr,
                'tpr_target': target_tpr,
                'fpr_current': current_fpr,
                'fpr_target': target_fpr
            }

        return self

    def predict(self, predictions: np.ndarray, protected: np.ndarray,
                labels: Optional[np.ndarray] = None) -> np.ndarray:
        """
        Apply randomized post-processing toward equalized odds.

        Note: this is a heuristic that nudges each group's positive rate
        toward a common target. The exact Hardt et al. rule flips predictions
        with the (group, prediction)-specific probabilities obtained from the
        linear program; true labels are needed only in fit().
        """
        fair_predictions = predictions.copy()

        for a in np.unique(protected):
            mask = protected == a
            rates = self.mixing_rates_.get(a, None)
            if rates is None:
                continue

            group_preds = predictions[mask]

            # Adjust positive predictions: if the group over-predicts positive
            # relative to the target, randomly flip some to negative
            pos_mask = mask & (predictions == 1)
            if pos_mask.sum() > 0:
                current_rate = np.mean(group_preds == 1)
                target_rate = (rates['tpr_target'] + rates['fpr_target']) / 2
                if current_rate > target_rate:
                    flip_prob = (current_rate - target_rate) / current_rate
                    flip = np.random.rand(pos_mask.sum()) < flip_prob
                    fair_predictions[np.where(pos_mask)[0][flip]] = 0

            # Adjust negative predictions symmetrically
            neg_mask = mask & (predictions == 0)
            if neg_mask.sum() > 0:
                current_rate = np.mean(group_preds == 0)
                target_neg_rate = 1 - (rates['tpr_target'] + rates['fpr_target']) / 2
                if current_rate > target_neg_rate:
                    flip_prob = (current_rate - target_neg_rate) / current_rate
                    flip = np.random.rand(neg_mask.sum()) < flip_prob
                    fair_predictions[np.where(neg_mask)[0][flip]] = 1

        return fair_predictions


def compute_equalized_odds_violation(predictions: np.ndarray, protected: np.ndarray,
                                     labels: np.ndarray) -> Dict[str, float]:
    """Compute equalized odds metrics."""
    groups = np.unique(protected)
    metrics = {}

    for a in groups:
        mask = protected == a
        pos_mask = mask & (labels == 1)
        neg_mask = mask & (labels == 0)
        tpr = np.mean(predictions[pos_mask]) if pos_mask.sum() > 0 else 0
        fpr = np.mean(predictions[neg_mask]) if neg_mask.sum() > 0 else 0
        metrics[f'TPR_group_{a}'] = tpr
        metrics[f'FPR_group_{a}'] = fpr

    metrics['TPR_gap'] = abs(metrics['TPR_group_0'] - metrics['TPR_group_1'])
    metrics['FPR_gap'] = abs(metrics['FPR_group_0'] - metrics['FPR_group_1'])
    return metrics


# Demonstration
if __name__ == "__main__":
    np.random.seed(42)
    n = 3000

    protected = np.random.binomial(1, 0.5, n)
    labels = np.random.binomial(1, 0.5, n)

    # Biased predictions: higher TPR for group 0, higher FPR for group 1
    predictions = labels.copy()
    # Add noise
    flip = np.random.rand(n) < 0.2
    predictions[flip] = 1 - predictions[flip]
    # Add group bias
    predictions[(protected == 1) & (labels == 0)] = (
        np.random.rand(((protected == 1) & (labels == 0)).sum()) < 0.4
    ).astype(int)

    print("Original predictions:")
    metrics = compute_equalized_odds_violation(predictions, protected, labels)
    for key, val in metrics.items():
        print(f"  {key}: {val:.3f}")

    # Apply post-processing
    processor = EqualizedOddsPostProcessor()
    processor.fit(predictions, protected, labels)
    fair_predictions = processor.predict(predictions, protected)

    print("\nAfter equalized odds post-processing:")
    metrics = compute_equalized_odds_violation(fair_predictions, protected, labels)
    for key, val in metrics.items():
        print(f"  {key}: {val:.3f}")
```

In Hardt et al.'s formulation, the true label $Y$ is needed only during fitting, to estimate each group's TPR and FPR on held-out labeled data; the learned randomized rule depends only on the original prediction and the group, so it can be applied at deployment without $Y$. The implementation above is a simplified heuristic that nudges each group's rates toward common targets; the exact method solves the linear program sketched earlier and applies the resulting per-(group, prediction) mixing probabilities.
Calibration ensures that predicted probabilities reflect true outcome frequencies: when a model predicts 0.8, the actual positive rate should be 80%. Group-specific calibration extends this to require calibration within each protected group.
Why Calibration Matters for Fairness:
A model can achieve demographic parity in predictions while being poorly calibrated for minority groups—predicting 0.7 when the true rate is 0.3. This leads to unfair downstream decisions even when aggregate statistics look fair.
Formal Definition:
A classifier $h$ is perfectly calibrated if: $$P(Y = 1 | h(X) = p) = p \quad \text{for all } p \in [0, 1]$$
A classifier has group-specific calibration if: $$P(Y = 1 | h(X) = p, A = a) = p \quad \text{for all } p, a$$
Pleiss et al. (2017) - Calibration for Fairness:

This work studies the tension between group calibration and error-rate constraints such as equalized odds, showing that they generally cannot be satisfied together when base rates differ. A practical response, implemented below, is to learn a separate calibration function for each group so that a given score carries the same meaning regardless of group. The process:
1. Split held-out labeled data by protected group.
2. Fit a calibration map (e.g., Platt scaling or isotonic regression) on each group's scores and labels.
3. At prediction time, apply the map corresponding to the individual's group.
Common Calibration Methods:
Platt Scaling: Logistic regression from scores to labels $$c(s) = \sigma(w \cdot s + b)$$
Isotonic Regression: Non-parametric, monotonic calibration $$c(s) = \text{isotonic\_regression}(s, y)$$
Beta Calibration: More flexible parametric mapping $$c(s) = \frac{1}{1 + e^{a \log(s) + b \log(1-s) + c}}$$
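Beta calibration is not included in the calibrator class below, but it can be fitted by logistic regression on the features $\log(s)$ and $\log(1-s)$. A minimal sketch under that assumption (helper names are illustrative, and the original method's monotonicity constraints on the coefficients are omitted for brevity):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_beta_calibrator(scores: np.ndarray, labels: np.ndarray) -> LogisticRegression:
    """Fit the beta-calibration map as a logistic regression on
    [log s, log(1-s)]; scores are clipped away from 0 and 1."""
    s = np.clip(scores, 1e-6, 1 - 1e-6)
    X = np.column_stack([np.log(s), np.log(1 - s)])
    return LogisticRegression().fit(X, labels)

def apply_beta_calibrator(model: LogisticRegression, scores: np.ndarray) -> np.ndarray:
    """Map raw scores to calibrated probabilities with a fitted model."""
    s = np.clip(scores, 1e-6, 1 - 1e-6)
    X = np.column_stack([np.log(s), np.log(1 - s)])
    return model.predict_proba(X)[:, 1]
```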
```python
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.linear_model import LogisticRegression
from typing import Dict


class GroupSpecificCalibrator:
    """
    Post-processing that applies group-specific calibration.

    Ensures predicted probabilities are well-calibrated within each group,
    which is a form of fairness (equal meaning of probability across groups).
    """

    def __init__(self, method: str = 'isotonic'):
        """
        Args:
            method: 'isotonic' for non-parametric, 'platt' for logistic scaling
        """
        self.method = method
        self.calibrators_ = {}

    def fit(self, scores: np.ndarray, labels: np.ndarray,
            protected: np.ndarray) -> 'GroupSpecificCalibrator':
        """
        Fit group-specific calibration functions.

        Args:
            scores: Model probability outputs (n_samples,)
            labels: True binary labels (n_samples,)
            protected: Protected attribute values (n_samples,)
        """
        for group in np.unique(protected):
            mask = protected == group
            group_scores = scores[mask]
            group_labels = labels[mask]

            if self.method == 'isotonic':
                calibrator = IsotonicRegression(
                    y_min=0.0, y_max=1.0, out_of_bounds='clip'
                )
                calibrator.fit(group_scores, group_labels)
            elif self.method == 'platt':
                calibrator = LogisticRegression()
                calibrator.fit(group_scores.reshape(-1, 1), group_labels)

            self.calibrators_[group] = calibrator

        return self

    def calibrate(self, scores: np.ndarray, protected: np.ndarray) -> np.ndarray:
        """
        Apply group-specific calibration to scores.

        Returns:
            Calibrated probabilities (n_samples,)
        """
        calibrated = np.zeros_like(scores)

        for group, calibrator in self.calibrators_.items():
            mask = protected == group
            group_scores = scores[mask]

            if self.method == 'isotonic':
                calibrated[mask] = calibrator.predict(group_scores)
            elif self.method == 'platt':
                calibrated[mask] = calibrator.predict_proba(
                    group_scores.reshape(-1, 1)
                )[:, 1]

        return calibrated

    def fit_calibrate(self, scores: np.ndarray, labels: np.ndarray,
                      protected: np.ndarray) -> np.ndarray:
        """Fit and calibrate in one step."""
        return self.fit(scores, labels, protected).calibrate(scores, protected)


def expected_calibration_error(scores: np.ndarray, labels: np.ndarray,
                               n_bins: int = 10) -> float:
    """
    Compute Expected Calibration Error (ECE).

    Lower is better. 0 means perfect calibration.
    """
    bin_boundaries = np.linspace(0, 1, n_bins + 1)
    ece = 0.0

    for i in range(n_bins):
        bin_mask = (scores >= bin_boundaries[i]) & (scores < bin_boundaries[i + 1])
        if bin_mask.sum() == 0:
            continue
        bin_accuracy = np.mean(labels[bin_mask])
        bin_confidence = np.mean(scores[bin_mask])
        bin_weight = bin_mask.sum() / len(scores)
        ece += bin_weight * abs(bin_accuracy - bin_confidence)

    return ece


# Demonstration
if __name__ == "__main__":
    np.random.seed(42)
    n = 3000

    protected = np.random.binomial(1, 0.5, n)
    labels = np.random.binomial(1, 0.5, n)

    # Generate miscalibrated scores with group-specific miscalibration
    base_scores = 0.2 + 0.6 * labels + np.random.normal(0, 0.15, n)
    # Group 0: overconfident, Group 1: underconfident
    scores = base_scores.copy()
    scores[protected == 0] = np.clip(base_scores[protected == 0] * 1.3, 0, 1)
    scores[protected == 1] = np.clip(base_scores[protected == 1] * 0.7, 0, 1)

    print("Before calibration:")
    for a in [0, 1]:
        mask = protected == a
        ece = expected_calibration_error(scores[mask], labels[mask])
        print(f"  Group {a} ECE: {ece:.4f}")

    # Apply group-specific calibration
    calibrator = GroupSpecificCalibrator(method='isotonic')
    calibrated = calibrator.fit_calibrate(scores, labels, protected)

    print("\nAfter group-specific calibration:")
    for a in [0, 1]:
        mask = protected == a
        ece = expected_calibration_error(calibrated[mask], labels[mask])
        print(f"  Group {a} ECE: {ece:.4f}")
```

Group-specific calibration is sometimes called 'predictive parity' or 'sufficiency.' Due to impossibility results, it often conflicts with demographic parity or equalized odds when base rates differ. Calibration focuses on what predictions mean rather than how they're distributed—important for decision-makers interpreting probabilities.
Reject option classification allows the model to abstain from making predictions when confidence is low—and uses this abstention strategically to improve fairness. Predictions near the decision boundary (where discrimination is most likely) can be handled differently.
The Core Idea (Kamiran et al., 2012):

Predictions that fall in a 'critical region' around the decision boundary are reassigned: instances from the underprivileged group receive the favorable outcome, while instances from the privileged group receive the unfavorable one. This focuses fairness interventions on uncertain predictions where the model's decision is essentially arbitrary anyway.
Algorithm:
For a classifier with scores $s(x) \in [0, 1]$ and threshold $\tau = 0.5$:
1. Define the critical region: $[\tau - \Delta, \tau + \Delta]$
2. For each sample $x$ with group $a$:
   - If $s(x)$ is outside the critical region: use the normal prediction
   - If $s(x)$ is inside the critical region:
     - If group $a$ is underprivileged: predict positive (promote)
     - If group $a$ is privileged: predict negative (demote)
The Philosophy:
When the model is uncertain (score near 0.5), it's essentially random whether we predict positive or negative. By tilting these uncertain predictions toward the disadvantaged group, we improve fairness without overriding confident predictions.
Choosing Δ:
- A larger Δ widens the critical region, producing a stronger fairness correction at a larger accuracy cost; a smaller Δ is more conservative.
- In practice, Δ is tuned on held-out data to meet the fairness target while monitoring accuracy (the demonstration below uses Δ = 0.15).
```python
import numpy as np
from typing import Tuple


class RejectOptionClassifier:
    """
    Fairness-aware classification using reject option.

    Modifies predictions in the critical region (near the decision boundary)
    to favor the underprivileged group, improving demographic parity.
    """

    def __init__(self, critical_width: float = 0.1, threshold: float = 0.5):
        """
        Args:
            critical_width: Width of critical region (Δ)
            threshold: Classification threshold (τ)
        """
        self.critical_width = critical_width
        self.threshold = threshold
        self.underprivileged_group_ = None

    def fit(self, scores: np.ndarray, protected: np.ndarray,
            labels: np.ndarray = None) -> 'RejectOptionClassifier':
        """
        Identify the underprivileged group based on positive prediction rates.

        Args:
            scores: Model probability outputs
            protected: Protected attribute values (binary)
            labels: Optional, not used but kept for API consistency
        """
        groups = np.unique(protected)

        # Compute positive prediction rate for each group
        rates = {}
        for g in groups:
            mask = protected == g
            rates[g] = np.mean(scores[mask] >= self.threshold)

        # Underprivileged group has the lower positive rate
        self.underprivileged_group_ = min(rates.keys(), key=lambda g: rates[g])
        return self

    def predict(self, scores: np.ndarray, protected: np.ndarray) -> np.ndarray:
        """
        Make fairness-adjusted predictions using the reject option.

        Returns:
            Binary predictions with reject option applied
        """
        predictions = (scores >= self.threshold).astype(int)

        # Identify critical region
        low_bound = self.threshold - self.critical_width
        high_bound = self.threshold + self.critical_width
        in_critical = (scores >= low_bound) & (scores <= high_bound)

        # For underprivileged group in critical region: favor positive
        underprivileged_critical = in_critical & (protected == self.underprivileged_group_)
        predictions[underprivileged_critical] = 1

        # For privileged group in critical region: favor negative
        privileged_critical = in_critical & (protected != self.underprivileged_group_)
        predictions[privileged_critical] = 0

        return predictions

    def fit_predict(self, scores: np.ndarray, protected: np.ndarray,
                    labels: np.ndarray = None) -> np.ndarray:
        """Fit and predict in one step."""
        return self.fit(scores, protected, labels).predict(scores, protected)


def analyze_critical_region(scores: np.ndarray, protected: np.ndarray,
                            labels: np.ndarray, threshold: float = 0.5,
                            width: float = 0.1) -> dict:
    """
    Analyze the composition and properties of the critical region.
    """
    low = threshold - width
    high = threshold + width
    in_critical = (scores >= low) & (scores <= high)

    results = {
        'critical_fraction': np.mean(in_critical),
        'critical_count': in_critical.sum(),
    }

    for a in np.unique(protected):
        mask = (protected == a) & in_critical
        results[f'group_{a}_in_critical'] = mask.sum()
        if mask.sum() > 0:
            results[f'group_{a}_critical_positive_rate'] = np.mean(labels[mask])

    return results


# Demonstration
if __name__ == "__main__":
    np.random.seed(42)
    n = 3000

    protected = np.random.binomial(1, 0.5, n)
    labels = np.random.binomial(1, 0.5, n)

    # Biased scores
    base = 0.3 + 0.4 * labels + np.random.normal(0, 0.15, n)
    scores = np.clip(base - 0.15 * protected, 0, 1)  # Bias against group 1

    print("Original predictions (threshold=0.5):")
    orig = (scores >= 0.5).astype(int)
    print(f"  P(Ŷ=1|A=0) = {np.mean(orig[protected==0]):.3f}")
    print(f"  P(Ŷ=1|A=1) = {np.mean(orig[protected==1]):.3f}")
    print(f"  Accuracy = {np.mean(orig == labels):.3f}")

    # Analyze critical region
    print("\nCritical region analysis:")
    analysis = analyze_critical_region(scores, protected, labels, width=0.15)
    for key, val in analysis.items():
        print(f"  {key}: {val}")

    # Apply reject option
    roc = RejectOptionClassifier(critical_width=0.15)
    fair_preds = roc.fit_predict(scores, protected, labels)

    print("\nAfter reject option classification:")
    print(f"  P(Ŷ=1|A=0) = {np.mean(fair_preds[protected==0]):.3f}")
    print(f"  P(Ŷ=1|A=1) = {np.mean(fair_preds[protected==1]):.3f}")
    print(f"  Accuracy = {np.mean(fair_preds == labels):.3f}")
```

Reject option classification is philosophically appealing because it concentrates fairness interventions on cases where the model is genuinely uncertain. When a model gives 0.95 confidence, overriding that prediction feels like ignoring valuable information. When it gives 0.51, the model is essentially coin-flipping anyway—so we might as well flip toward fairness.
Many ML applications produce rankings rather than binary classifications—search results, recommendations, job candidate lists. Post-processing for fair ranking focuses on ensuring fair representation and exposure across ranked positions.
Key Fairness Concepts in Ranking:
- Representation: each group's share of the top-$k$ positions (e.g., proportional to its share of the candidate pool)
- Exposure: the attention-weighted visibility a group receives, which depends heavily on position
Position Bias in Ranking:
Users pay more attention to top positions. If Group B is systematically ranked lower, they receive less exposure even with similar qualifications. The exposure of an item at position $k$ is typically modeled as:
$$\text{exposure}(k) = \frac{1}{\log_2(k + 1)}$$
Singh & Joachims (2018) - Fairness of Exposure in Rankings:
This influential work proposes fairness constraints for rankings:
Demographic Parity in Exposure: $$\frac{\sum_{i \in G_0} \text{exposure}(\text{rank}(i))}{|G_0|} = \frac{\sum_{i \in G_1} \text{exposure}(\text{rank}(i))}{|G_1|}$$
Both groups should receive equal average exposure.
Merit-Weighted Exposure: $$\frac{\text{Exposure}(G_a)}{\text{Merit}(G_a)} = \frac{\text{Exposure}(G_b)}{\text{Merit}(G_b)}$$
Exposure should be proportional to merit (relevance scores).
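The selector implemented below reasons about counts rather than exposure; as a complement, here is a minimal sketch of how the two exposure criteria above could be measured for a given ranking (function names are illustrative; it assumes the logarithmic position-bias model and a full ranking expressed as item indices ordered best-first):

```python
import numpy as np

def group_exposure(ranking: np.ndarray, protected: np.ndarray) -> dict:
    """Average exposure per item for each group, given a full ranking
    (item indices ordered from position 1 to n)."""
    positions = np.arange(1, len(ranking) + 1)
    exposure = 1.0 / np.log2(positions + 1)   # position-bias model from above
    ranked_groups = protected[ranking]        # group of the item at each position
    return {g: exposure[ranked_groups == g].mean() for g in np.unique(protected)}

def exposure_to_merit_ratio(ranking: np.ndarray, protected: np.ndarray,
                            relevance: np.ndarray) -> dict:
    """Exposure divided by merit (average relevance) per group; equal values
    across groups correspond to the merit-weighted criterion above."""
    exp = group_exposure(ranking, protected)
    return {g: exp[g] / relevance[protected == g].mean() for g in np.unique(protected)}

# Example: ranking = np.argsort(-scores). Near-equal values across groups indicate
# demographic parity of exposure (first function) or merit-weighted exposure (second).
```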
Algorithmic Approach:

The code below implements a simple greedy top-$k$ selector: it computes a per-group quota (equal, or proportional to population share) and fills the quotas in score order within each group.
```python
import numpy as np
from typing import List, Tuple


class FairTopKSelector:
    """
    Select top-k items while ensuring fair representation of groups.

    Implements a simple greedy algorithm that alternates between groups
    to achieve proportional representation in the selected set.
    """

    def __init__(self, k: int, representation: str = 'proportional'):
        """
        Args:
            k: Number of items to select
            representation: 'proportional' (match population) or 'equal'
        """
        self.k = k
        self.representation = representation

    def select(self, scores: np.ndarray, protected: np.ndarray,
               population_weights: dict = None) -> np.ndarray:
        """
        Select top-k items with fair representation.

        Args:
            scores: Item scores (higher = better)
            protected: Group membership for each item
            population_weights: Optional target proportions per group

        Returns:
            Indices of selected items
        """
        groups = np.unique(protected)

        # Compute target counts per group
        if self.representation == 'equal':
            targets = {g: self.k // len(groups) for g in groups}
            # Distribute remainder
            remainder = self.k % len(groups)
            for i, g in enumerate(groups):
                if i < remainder:
                    targets[g] += 1
        else:  # proportional
            if population_weights is None:
                population_weights = {g: np.mean(protected == g) for g in groups}
            targets = {
                g: int(np.round(self.k * population_weights[g])) for g in groups
            }
            # Adjust to sum to k
            diff = self.k - sum(targets.values())
            if diff > 0:
                # Add to largest group
                largest = max(targets, key=lambda g: targets[g])
                targets[largest] += diff
            elif diff < 0:
                # Remove from largest group
                largest = max(targets, key=lambda g: targets[g])
                targets[largest] += diff

        # Sort each group by score
        sorted_by_group = {g: np.argsort(-scores[protected == g]) for g in groups}
        group_indices = {g: np.where(protected == g)[0] for g in groups}

        # Greedy selection: pick top from each group until targets are met
        selected = []
        counts = {g: 0 for g in groups}
        pointers = {g: 0 for g in groups}

        while len(selected) < self.k:
            # Find groups that still have items available
            remaining = {
                g: targets[g] - counts[g]
                for g in groups
                if pointers[g] < len(sorted_by_group[g])
            }
            if not remaining:
                break
            # Pick the next item from the group with the most slots to fill
            best_group = max(remaining, key=lambda g: remaining[g])
            item_local_idx = sorted_by_group[best_group][pointers[best_group]]
            item_global_idx = group_indices[best_group][item_local_idx]

            selected.append(item_global_idx)
            counts[best_group] += 1
            pointers[best_group] += 1

        return np.array(selected)


def compute_ranking_fairness(selected_indices: np.ndarray, scores: np.ndarray,
                             protected: np.ndarray) -> dict:
    """Compute fairness metrics for top-k selection."""
    groups = np.unique(protected)
    metrics = {}

    # Representation in selection vs population
    for g in groups:
        pop_rate = np.mean(protected == g)
        sel_rate = np.mean(protected[selected_indices] == g)
        metrics[f'group_{g}_population_rate'] = pop_rate
        metrics[f'group_{g}_selection_rate'] = sel_rate
        metrics[f'group_{g}_representation_ratio'] = sel_rate / pop_rate if pop_rate > 0 else 0

    # Average score of selected items by group
    for g in groups:
        sel_group = selected_indices[protected[selected_indices] == g]
        if len(sel_group) > 0:
            metrics[f'group_{g}_avg_score_selected'] = np.mean(scores[sel_group])

    return metrics


# Demonstration
if __name__ == "__main__":
    np.random.seed(42)
    n = 500
    k = 50

    # Candidates with biased scores
    protected = np.random.binomial(1, 0.4, n)  # 40% in group 1
    # True merit (unbiased)
    true_merit = np.random.normal(5, 1, n)
    # Observed scores (biased against group 1)
    scores = true_merit - 0.5 * protected + np.random.normal(0, 0.3, n)

    print("Naive top-k selection:")
    naive_top_k = np.argsort(-scores)[:k]
    naive_metrics = compute_ranking_fairness(naive_top_k, scores, protected)
    for key, val in naive_metrics.items():
        print(f"  {key}: {val:.3f}")

    print("\nFair top-k selection (proportional):")
    selector = FairTopKSelector(k=k, representation='proportional')
    fair_top_k = selector.select(scores, protected)
    fair_metrics = compute_ranking_fairness(fair_top_k, scores, protected)
    for key, val in fair_metrics.items():
        print(f"  {key}: {val:.3f}")
```

Fair ranking is particularly challenging because fairness constraints may conflict with relevance optimization. A perfectly fair ranking may place less relevant items higher than more relevant ones. Additionally, the position bias model (how much users attend to each position) is an assumption that may not hold universally.
| Method | Fairness Target | Requires Y at Prediction | Complexity | Best For |
|---|---|---|---|---|
| Threshold Optimization | Demographic Parity | No | Low | Simple adjustments, different group base rates |
| Equalized Odds (Hardt) | Equal TPR/FPR across groups | For fitting only | Medium | When error rate equality matters |
| Group Calibration | Calibration within groups | For fitting only | Low | When probability interpretation matters |
| Reject Option | Demographic Parity | No | Low | When model is uncertain (scores near 0.5) |
| Fair Ranking | Exposure fairness | No | Medium-High | Search, recommendations, selection |
Selection Guidelines:
Use Threshold Optimization when:
- You need demographic parity (or equal opportunity) and can access $A$ at prediction time
- Group base rates differ and a simple, transparent adjustment is acceptable

Use Equalized Odds when:
- Equality of error rates (TPR/FPR) across groups is the primary concern
- Labeled held-out data is available to estimate group-conditional rates

Use Calibration when:
- Downstream decision-makers interpret the scores as probabilities
- A given score must mean the same thing for every group

Use Reject Option when:
- You want to confine interventions to uncertain predictions near the decision boundary
- Overriding confident predictions is undesirable

Use Fair Ranking when:
- The output is a ranking or top-k selection (search, recommendations, candidate shortlists)
- Exposure or representation across positions is the fairness concern
Post-processing methods can be chained. For example: (1) Calibrate each group separately, (2) Apply threshold optimization on calibrated scores, (3) Use reject option for borderline cases. Experiment with combinations on held-out data to find what works for your specific application.
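A rough sketch of such a chain, assuming the classes defined earlier on this page and held-out arrays `scores`, `labels`, and `protected`:

```python
import numpy as np

# Step 1: calibrate each group separately (GroupSpecificCalibrator from above)
calibrated = GroupSpecificCalibrator(method='isotonic').fit_calibrate(scores, labels, protected)

# Step 2: group-specific thresholds on the calibrated scores (ThresholdOptimizer from above)
preds = ThresholdOptimizer('demographic_parity').fit_predict(calibrated, protected)

# Step 3: let the reject option decide only the borderline cases near 0.5
roc_preds = RejectOptionClassifier(critical_width=0.05).fit_predict(calibrated, protected)
borderline = np.abs(calibrated - 0.5) <= 0.05
preds[borderline] = roc_preds[borderline]

# Compare accuracy and group positive rates of `preds` against the unchained
# alternatives on held-out data before deploying the combined pipeline.
```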
Post-processing is powerful but has fundamental limitations that practitioners must understand:
- It cannot repair a fundamentally biased model: if group score distributions barely overlap, no threshold or mixing choice equalizes rates without severe accuracy loss.
- It typically requires the protected attribute at prediction time, which may be legally or operationally infeasible.
- Randomized rules (as in equalized odds post-processing) treat similar-looking individuals differently, which can be hard to justify to affected people.
- Because the model itself is untouched, post-processing generally achieves a worse fairness-accuracy tradeoff than retraining with an in-processing method.
Post-processing is most valuable when: (1) Models are fixed or expensive to retrain, (2) Fairness requirements vary by context and need runtime adjustment, (3) Auditing and transparency require separable fairness interventions, (4) Quick deployment of fairness improvements is needed. It's often the fastest path to fairer predictions, even if not the most fundamental.
What's Next:
The final page of this module examines fairness-accuracy tradeoffs—the fundamental tension between predictive performance and fairness, impossibility results that limit what can be achieved, and practical strategies for navigating these tradeoffs in real-world applications.
You now understand the major post-processing approaches to bias mitigation—from simple threshold optimization to sophisticated fair ranking algorithms. These techniques provide essential tools for deploying fairer ML systems, especially when model retraining is not an option.