In 2018, news reports revealed that Amazon had scrapped an internal AI recruiting tool after the company discovered it systematically penalized resumes containing the word "women's" (as in "women's chess club"). The lesson was not just about bias; it was about measurement. Without systematic fairness metrics, the bias went undetected well into development and testing.
If you can't measure fairness, you can't manage it.
Fairness metrics translate abstract fairness concepts into concrete, actionable numbers. They enable auditing, comparison, benchmarking, and improvement tracking. This page provides a practical toolkit of fairness metrics spanning binary classification, probabilistic scoring, ranking, and regression settings.
By the end of this page, you will have a working library of fairness metrics, understand which metrics correspond to which fairness definitions, be able to implement practical measurement code for real systems, and know how to select appropriate metrics for different applications.
Binary classification is the most studied setting for fairness metrics. The core metrics derive from the confusion matrix, compared across protected groups.
| Metric | Formula | Fairness Criterion | Range |
|---|---|---|---|
| Statistical Parity Diff | \|P(Ŷ=1 \| A=0) - P(Ŷ=1 \| A=1)\| | Demographic Parity | [0, 1], 0 = fair |
| Disparate Impact Ratio | min(P(Ŷ=1 \| A=0), P(Ŷ=1 \| A=1)) / max(…) | Demographic Parity | [0, 1], 1 = fair |
| Equal Opportunity Diff | \|TPR_A=0 - TPR_A=1\| | Equality of Opportunity | [0, 1], 0 = fair |
| Equalized Odds Diff | max(\|TPR diff\|, \|FPR diff\|) | Equalized Odds | [0, 1], 0 = fair |
| Calibration Diff | \|PPV_A=0 - PPV_A=1\| | Calibration | [0, 1], 0 = fair |
| Average Odds Diff | (\|TPR diff\| + \|FPR diff\|) / 2 | Equalized Odds (averaged) | [0, 1], 0 = fair |
| Theil Index | Entropy-based inequality measure | Individual fairness | [0, ∞), 0 = fair |
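As a quick worked example with made-up selection rates: if Group A is selected at a rate of 0.30 and Group B at 0.20, the statistical parity difference is |0.30 - 0.20| = 0.10 and the disparate impact ratio is 0.20 / 0.30 ≈ 0.67, which falls below the common four-fifths (0.8) rule of thumb.

```python
# Worked example with made-up selection rates (not from any real dataset).
p_a, p_b = 0.30, 0.20                     # selection rates for groups A and B
spd = abs(p_a - p_b)                      # statistical parity difference -> 0.10
di = min(p_a, p_b) / max(p_a, p_b)        # disparate impact ratio -> ~0.667
print(f"SPD = {spd:.2f}, DI = {di:.3f}")  # DI < 0.8 fails the four-fifths rule of thumb
```

The implementation below computes these confusion-matrix-based metrics, plus a simple Theil index, for a binary protected attribute.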
```python
import numpy as np
from typing import Dict
from dataclasses import dataclass


@dataclass
class FairnessReport:
    """Comprehensive fairness metrics report."""
    demographic_parity_diff: float
    disparate_impact_ratio: float
    equal_opportunity_diff: float
    equalized_odds_diff: float
    calibration_diff: float
    average_odds_diff: float
    group_metrics: Dict


def compute_classification_fairness_metrics(
    y_true: np.ndarray,
    y_pred: np.ndarray,
    protected_attr: np.ndarray,
    group_names: Dict[int, str] = None
) -> FairnessReport:
    """
    Compute comprehensive fairness metrics for binary classification.

    Args:
        y_true: Ground truth binary labels
        y_pred: Predicted binary labels
        protected_attr: Binary protected attribute
        group_names: Optional mapping of values to names

    Returns:
        FairnessReport with all metrics
    """
    y_true = np.array(y_true)
    y_pred = np.array(y_pred)
    protected_attr = np.array(protected_attr)

    if group_names is None:
        group_names = {0: "Group_0", 1: "Group_1"}

    groups = np.unique(protected_attr)
    group_metrics = {}

    for g in groups:
        mask = protected_attr == g
        g_true, g_pred = y_true[mask], y_pred[mask]

        positives = g_true == 1
        negatives = g_true == 0

        tpr = g_pred[positives].mean() if positives.sum() > 0 else 0
        fpr = g_pred[negatives].mean() if negatives.sum() > 0 else 0

        pred_positives = g_pred == 1
        ppv = g_true[pred_positives].mean() if pred_positives.sum() > 0 else 0

        selection_rate = g_pred.mean()
        accuracy = (g_pred == g_true).mean()

        group_metrics[group_names[g]] = {
            'tpr': tpr,
            'fpr': fpr,
            'ppv': ppv,
            'selection_rate': selection_rate,
            'accuracy': accuracy,
            'n': mask.sum()
        }

    # Extract metrics for comparison
    g0, g1 = group_names[groups[0]], group_names[groups[1]]

    sr_diff = abs(group_metrics[g0]['selection_rate'] - group_metrics[g1]['selection_rate'])
    sr_min = min(group_metrics[g0]['selection_rate'], group_metrics[g1]['selection_rate'])
    sr_max = max(group_metrics[g0]['selection_rate'], group_metrics[g1]['selection_rate'])
    di_ratio = sr_min / sr_max if sr_max > 0 else 1.0

    tpr_diff = abs(group_metrics[g0]['tpr'] - group_metrics[g1]['tpr'])
    fpr_diff = abs(group_metrics[g0]['fpr'] - group_metrics[g1]['fpr'])
    ppv_diff = abs(group_metrics[g0]['ppv'] - group_metrics[g1]['ppv'])

    return FairnessReport(
        demographic_parity_diff=sr_diff,
        disparate_impact_ratio=di_ratio,
        equal_opportunity_diff=tpr_diff,
        equalized_odds_diff=max(tpr_diff, fpr_diff),
        calibration_diff=ppv_diff,
        average_odds_diff=(tpr_diff + fpr_diff) / 2,
        group_metrics=group_metrics
    )


def theil_index(y_true, y_pred, protected_attr):
    """
    Compute Theil Index for individual fairness.

    Measures inequality in prediction benefits (correct positive predictions)
    across individuals. Lower values indicate more equal treatment.
    """
    y_true = np.array(y_true)
    y_pred = np.array(y_pred)

    # Compute individual benefits (1 if correctly predicted positive)
    benefits = ((y_pred == 1) & (y_true == 1)).astype(float)

    # Handle edge cases
    if benefits.sum() == 0 or benefits.mean() == 0:
        return 0.0

    # Theil T index
    mu = benefits.mean()
    n = len(benefits)
    theil = 0
    for b in benefits:
        if b > 0:
            theil += (b / mu) * np.log(b / mu)
    return theil / n


# Example usage
if __name__ == "__main__":
    np.random.seed(42)
    n = 1000
    protected = np.random.binomial(1, 0.4, n)
    y_true = np.random.binomial(1, 0.3, n)
    # Biased predictions
    bias = protected * 0.2
    y_pred = ((np.random.rand(n) + bias) > 0.7).astype(int)

    report = compute_classification_fairness_metrics(
        y_true, y_pred, protected, group_names={0: 'Majority', 1: 'Minority'}
    )

    print(f"Demographic Parity Diff: {report.demographic_parity_diff:.4f}")
    print(f"Disparate Impact Ratio: {report.disparate_impact_ratio:.4f}")
    print(f"Equal Opportunity Diff: {report.equal_opportunity_diff:.4f}")
    print(f"Equalized Odds Diff: {report.equalized_odds_diff:.4f}")
```

When models output probabilities rather than hard predictions, additional fairness metrics become relevant. These focus on whether predicted probabilities are meaningful across groups.
```python
import numpy as np
from typing import Dict
from scipy import stats
from sklearn.metrics import brier_score_loss, roc_auc_score


def probabilistic_fairness_metrics(
    y_true: np.ndarray,
    y_proba: np.ndarray,
    protected_attr: np.ndarray,
    n_bins: int = 10
) -> Dict:
    """
    Compute probabilistic fairness metrics.
    """
    y_true = np.array(y_true)
    y_proba = np.array(y_proba)
    protected_attr = np.array(protected_attr)

    groups = np.unique(protected_attr)
    results = {'groups': {}}

    def expected_calibration_error(y_true, y_proba, n_bins):
        bins = np.linspace(0, 1, n_bins + 1)
        ece = 0.0
        for i in range(n_bins):
            mask = (y_proba >= bins[i]) & (y_proba < bins[i + 1])
            if mask.sum() == 0:
                continue
            acc = y_true[mask].mean()
            conf = y_proba[mask].mean()
            ece += (mask.sum() / len(y_true)) * abs(acc - conf)
        return ece

    for g in groups:
        mask = protected_attr == g
        g_true, g_proba = y_true[mask], y_proba[mask]

        results['groups'][g] = {
            'mean_score': g_proba.mean(),
            'std_score': g_proba.std(),
            'brier_score': brier_score_loss(g_true, g_proba),
            'ece': expected_calibration_error(g_true, g_proba, n_bins),
            'auc': roc_auc_score(g_true, g_proba) if len(np.unique(g_true)) > 1 else None,
            'balance_positive': g_proba[g_true == 1].mean() if (g_true == 1).sum() > 0 else None,
            'balance_negative': g_proba[g_true == 0].mean() if (g_true == 0).sum() > 0 else None,
        }

    # Parity metrics
    g0, g1 = groups[0], groups[1]

    results['score_parity'] = {
        'mean_diff': abs(results['groups'][g0]['mean_score'] - results['groups'][g1]['mean_score']),
        'ks_statistic': stats.ks_2samp(
            y_proba[protected_attr == g0],
            y_proba[protected_attr == g1]
        ).statistic
    }

    results['calibration_parity'] = abs(
        results['groups'][g0]['ece'] - results['groups'][g1]['ece']
    )

    results['brier_parity'] = abs(
        results['groups'][g0]['brier_score'] - results['groups'][g1]['brier_score']
    )

    if results['groups'][g0]['auc'] and results['groups'][g1]['auc']:
        results['auc_parity'] = abs(
            results['groups'][g0]['auc'] - results['groups'][g1]['auc']
        )

    return results
```
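A minimal usage sketch: the data below is synthetic, and the score shift for one group is deliberately injected for illustration.

```python
# Illustrative only: synthetic labels and scores with a deliberate shift for group 1.
import numpy as np

np.random.seed(0)
n = 2000
protected = np.random.binomial(1, 0.5, n)
y_true = np.random.binomial(1, 0.3, n)
# Hypothetical model scores: label signal plus noise, shifted upward for group 1
y_proba = np.clip(0.3 * y_true + 0.1 * protected + np.random.beta(2, 5, n), 0, 1)

metrics = probabilistic_fairness_metrics(y_true, y_proba, protected)
print(f"Mean score diff:    {metrics['score_parity']['mean_diff']:.4f}")
print(f"KS statistic:       {metrics['score_parity']['ks_statistic']:.4f}")
print(f"Calibration parity: {metrics['calibration_parity']:.4f}")
```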
Ranking systems (search, recommendations, hiring pipelines) require metrics that account for position and exposure, not just binary outcomes.

| Metric | Description | Formula Concept |
|---|---|---|
| Exposure Ratio | Ratio of per-capita exposure received by groups | Σ position_weights / group_size per group |
| NDKL | Normalized Discounted KL-divergence | Position-weighted divergence from a target distribution |
| rKL | Ranking KL divergence | KL divergence at each prefix of the ranking |
| Attention Parity | Equal attention across groups in top-k | P(in top-k \| A=0) = P(in top-k \| A=1) |
| Skew@k | Group representation in top-k vs. overall | log(p_topk / p_overall) for each group |
```python
import numpy as np
from typing import Dict


def ranking_fairness_metrics(
    ranking: np.ndarray,
    protected_attr: np.ndarray,
    k_values: list = [5, 10, 20]
) -> Dict:
    """
    Compute fairness metrics for ranked lists.

    Args:
        ranking: Indices of items in ranked order
        protected_attr: Protected attribute for each item
        k_values: Top-k positions to evaluate
    """
    protected_attr = np.array(protected_attr)
    n = len(ranking)

    # Population proportions
    prop_1 = protected_attr.mean()
    prop_0 = 1 - prop_1

    results = {}

    # Exposure with logarithmic discount
    position_weights = 1 / np.log2(np.arange(2, n + 2))
    total_weight = position_weights.sum()

    exposure_0 = sum(
        position_weights[i] for i, idx in enumerate(ranking)
        if protected_attr[idx] == 0
    )
    exposure_1 = sum(
        position_weights[i] for i, idx in enumerate(ranking)
        if protected_attr[idx] == 1
    )

    # Normalize by group size to get exposure per capita
    n_0 = (protected_attr == 0).sum()
    n_1 = (protected_attr == 1).sum()
    exposure_per_capita_0 = exposure_0 / n_0 if n_0 > 0 else 0
    exposure_per_capita_1 = exposure_1 / n_1 if n_1 > 0 else 0

    results['exposure_ratio'] = (
        min(exposure_per_capita_0, exposure_per_capita_1) /
        max(exposure_per_capita_0, exposure_per_capita_1)
        if max(exposure_per_capita_0, exposure_per_capita_1) > 0 else 1.0
    )

    # Attention parity at various k
    for k in k_values:
        if k > len(ranking):
            continue
        top_k_attrs = protected_attr[ranking[:k]]
        prop_1_topk = top_k_attrs.mean()

        results[f'representation@{k}'] = {
            'group_1_prop_in_topk': prop_1_topk,
            'group_1_prop_overall': prop_1,
            'skew': np.log(prop_1_topk / prop_1) if prop_1_topk > 0 and prop_1 > 0 else 0,
            'attention_parity_diff': abs(prop_1_topk - prop_1)
        }

    # NDKL (simplified)
    target_prop = prop_1  # Target: population proportion
    ndkl = 0
    for i, idx in enumerate(ranking):
        weight = position_weights[i] / total_weight
        actual = protected_attr[idx]
        # Binary KL contribution
        if actual == 1:
            ndkl += weight * np.log(1 / target_prop) if target_prop > 0 else 0
        else:
            ndkl += weight * np.log(1 / (1 - target_prop)) if target_prop < 1 else 0

    results['ndkl'] = ndkl
    return results
```
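A minimal usage sketch, assuming a toy ranking of 20 items in which items from group 1 have their hypothetical scores artificially depressed:

```python
# Illustrative only: a toy ranking where group 1 items are pushed down.
import numpy as np

np.random.seed(1)
protected = np.array([0] * 12 + [1] * 8)       # 40% of items belong to group 1
scores = np.random.rand(20) - 0.3 * protected  # hypothetical scores biased against group 1
ranking = np.argsort(-scores)                  # rank items by descending score

metrics = ranking_fairness_metrics(ranking, protected, k_values=[5, 10])
print(f"Exposure ratio: {metrics['exposure_ratio']:.3f}")
print(f"Skew@10:        {metrics['representation@10']['skew']:.3f}")
print(f"NDKL:           {metrics['ndkl']:.3f}")
```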
For continuous predictions (salary, loan amount, risk scores), fairness metrics must capture differences in prediction magnitude and errors.

```python
import numpy as np
from typing import Dict


def regression_fairness_metrics(
    y_true: np.ndarray,
    y_pred: np.ndarray,
    protected_attr: np.ndarray
) -> Dict:
    """Compute fairness metrics for regression predictions."""
    y_true = np.array(y_true)
    y_pred = np.array(y_pred)
    protected_attr = np.array(protected_attr)

    groups = np.unique(protected_attr)
    results = {'groups': {}}

    for g in groups:
        mask = protected_attr == g
        g_true, g_pred = y_true[mask], y_pred[mask]
        residuals = g_true - g_pred

        results['groups'][g] = {
            'mean_prediction': g_pred.mean(),
            'mean_true': g_true.mean(),
            'mean_residual': residuals.mean(),
            'mae': np.abs(residuals).mean(),
            'rmse': np.sqrt((residuals ** 2).mean()),
            'std_prediction': g_pred.std(),
            'n': mask.sum()
        }

    g0, g1 = groups[0], groups[1]

    results['mean_prediction_parity'] = abs(
        results['groups'][g0]['mean_prediction'] -
        results['groups'][g1]['mean_prediction']
    )
    results['residual_parity'] = abs(
        results['groups'][g0]['mean_residual'] -
        results['groups'][g1]['mean_residual']
    )
    results['mae_parity'] = abs(
        results['groups'][g0]['mae'] -
        results['groups'][g1]['mae']
    )

    return results
```

With dozens of fairness metrics available, selecting the right ones for your application is critical. Consider these guidelines:
| Context | Primary Concern | Recommended Metrics |
|---|---|---|
| Hiring/Admissions | Equal opportunity for qualified candidates | Equal Opportunity Diff, TPR Parity |
| Lending | Equal access + equal errors | Equalized Odds Diff, Calibration |
| Criminal Justice | Protect innocents equally | FPR Parity, Predictive Equality |
| Healthcare | Equal diagnostic accuracy | TPR Parity, PPV Parity, Calibration |
| Search/Recommendations | Fair exposure | Exposure Ratio, NDKL, Attention@k |
| Salary Prediction | No systematic bias | Mean Prediction Parity, Residual Parity |
| Advertising | Equal opportunity to see ads | Demographic Parity, Exposure Ratio |
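To make the selection guidance concrete, here is a minimal sketch of a context-aware audit over the `FairnessReport` computed earlier. The contexts and thresholds in `CONTEXT_CHECKS` are illustrative assumptions (the 0.8 disparate impact cutoff is the informal four-fifths rule of thumb), not legal or regulatory standards.

```python
# Illustrative sketch: the contexts and thresholds below are assumptions, not legal standards.
CONTEXT_CHECKS = {
    'hiring': [
        ('disparate_impact_ratio', 'min', 0.8),   # four-fifths rule of thumb
        ('equal_opportunity_diff', 'max', 0.05),
    ],
    'lending': [
        ('equalized_odds_diff', 'max', 0.05),
        ('calibration_diff', 'max', 0.05),
    ],
}


def audit_report(report, context: str) -> dict:
    """Compare a FairnessReport against illustrative per-context thresholds."""
    findings = {}
    for metric, direction, threshold in CONTEXT_CHECKS[context]:
        value = getattr(report, metric)
        passed = value >= threshold if direction == 'min' else value <= threshold
        findings[metric] = {'value': value, 'threshold': threshold, 'passed': passed}
    return findings


# Example (assumes `report` from the classification example above):
# findings = audit_report(report, context='hiring')
# for metric, f in findings.items():
#     print(f"{metric}: {f['value']:.3f} ({'PASS' if f['passed'] else 'FAIL'} vs {f['threshold']})")
```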
You have completed the Fairness in ML module! You now understand fairness definitions, protected attributes, disparate impact, equality of opportunity, and comprehensive fairness metrics. This foundation prepares you for the next module on Bias Detection and Mitigation techniques.