In 2018, news reports revealed that Amazon had scrapped an internal AI recruiting tool after the company discovered it systematically penalized resumes containing the word "women's" (as in "women's chess club"). The lesson was not just about bias; it was about measurement. Without systematic fairness metrics, the bias went undetected well into development and testing.
If you can't measure fairness, you can't manage it.
Fairness metrics translate abstract fairness concepts into concrete, actionable numbers. They enable auditing, comparison, benchmarking, and improvement tracking. This page provides a practical toolkit of fairness metrics spanning binary classification, probabilistic scoring, ranking, and regression settings.
By the end of this page, you will have a working library of fairness metrics, understand which metrics correspond to which fairness definitions, be able to implement practical measurement code for real systems, and know how to select appropriate metrics for different applications.
Binary classification is the most studied setting for fairness metrics. The core metrics derive from the confusion matrix, compared across protected groups.
| Metric | Formula | Fairness Criterion | Range |
|---|---|---|---|
| Statistical Parity Diff | \|P(Ŷ=1 \| A=0) - P(Ŷ=1 \| A=1)\| | Demographic Parity | [0, 1], 0 = fair |
| Disparate Impact Ratio | min(P(Ŷ=1 \| A=0), P(Ŷ=1 \| A=1)) / max(…) | Demographic Parity | [0, 1], 1 = fair |
| Equal Opportunity Diff | \|TPR_A=0 - TPR_A=1\| | Equality of Opportunity | [0, 1], 0 = fair |
| Equalized Odds Diff | max(\|TPR diff\|, \|FPR diff\|) | Equalized Odds | [0, 1], 0 = fair |
| Calibration Diff | \|PPV_A=0 - PPV_A=1\| | Calibration | [0, 1], 0 = fair |
| Average Odds Diff | (\|TPR diff\| + \|FPR diff\|) / 2 | Equalized Odds (averaged) | [0, 1], 0 = fair |
| Theil Index | Entropy-based inequality measure | Individual fairness | [0, ∞), 0 = fair |
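As a quick worked example with made-up selection rates: if Group A is selected at a rate of 0.30 and Group B at 0.20, the statistical parity difference is |0.30 - 0.20| = 0.10 and the disparate impact ratio is 0.20 / 0.30 ≈ 0.67, which falls below the common four-fifths (0.8) rule of thumb.

```python
# Worked example with made-up selection rates (not from any real dataset).
p_a, p_b = 0.30, 0.20                     # selection rates for groups A and B
spd = abs(p_a - p_b)                      # statistical parity difference -> 0.10
di = min(p_a, p_b) / max(p_a, p_b)        # disparate impact ratio -> ~0.667
print(f"SPD = {spd:.2f}, DI = {di:.3f}")  # DI < 0.8 fails the four-fifths rule of thumb
```

The implementation below computes these confusion-matrix-based metrics, plus a simple Theil index, for a binary protected attribute.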
```python
import numpy as np
from typing import Dict
from dataclasses import dataclass


@dataclass
class FairnessReport:
    """Comprehensive fairness metrics report."""
    demographic_parity_diff: float
    disparate_impact_ratio: float
    equal_opportunity_diff: float
    equalized_odds_diff: float
    calibration_diff: float
    average_odds_diff: float
    group_metrics: Dict


def compute_classification_fairness_metrics(
    y_true: np.ndarray,
    y_pred: np.ndarray,
    protected_attr: np.ndarray,
    group_names: Dict[int, str] = None
) -> FairnessReport:
    """
    Compute comprehensive fairness metrics for binary classification.

    Args:
        y_true: Ground truth binary labels
        y_pred: Predicted binary labels
        protected_attr: Binary protected attribute
        group_names: Optional mapping of values to names

    Returns:
        FairnessReport with all metrics
    """
    y_true = np.array(y_true)
    y_pred = np.array(y_pred)
    protected_attr = np.array(protected_attr)

    if group_names is None:
        group_names = {0: "Group_0", 1: "Group_1"}

    groups = np.unique(protected_attr)
    group_metrics = {}

    for g in groups:
        mask = protected_attr == g
        g_true, g_pred = y_true[mask], y_pred[mask]

        positives = g_true == 1
        negatives = g_true == 0

        tpr = g_pred[positives].mean() if positives.sum() > 0 else 0
        fpr = g_pred[negatives].mean() if negatives.sum() > 0 else 0

        pred_positives = g_pred == 1
        ppv = g_true[pred_positives].mean() if pred_positives.sum() > 0 else 0

        selection_rate = g_pred.mean()
        accuracy = (g_pred == g_true).mean()

        group_metrics[group_names[g]] = {
            'tpr': tpr,
            'fpr': fpr,
            'ppv': ppv,
            'selection_rate': selection_rate,
            'accuracy': accuracy,
            'n': mask.sum()
        }

    # Extract metrics for comparison
    g0, g1 = group_names[groups[0]], group_names[groups[1]]

    sr_diff = abs(group_metrics[g0]['selection_rate'] - group_metrics[g1]['selection_rate'])
    sr_min = min(group_metrics[g0]['selection_rate'], group_metrics[g1]['selection_rate'])
    sr_max = max(group_metrics[g0]['selection_rate'], group_metrics[g1]['selection_rate'])
    di_ratio = sr_min / sr_max if sr_max > 0 else 1.0

    tpr_diff = abs(group_metrics[g0]['tpr'] - group_metrics[g1]['tpr'])
    fpr_diff = abs(group_metrics[g0]['fpr'] - group_metrics[g1]['fpr'])
    ppv_diff = abs(group_metrics[g0]['ppv'] - group_metrics[g1]['ppv'])

    return FairnessReport(
        demographic_parity_diff=sr_diff,
        disparate_impact_ratio=di_ratio,
        equal_opportunity_diff=tpr_diff,
        equalized_odds_diff=max(tpr_diff, fpr_diff),
        calibration_diff=ppv_diff,
        average_odds_diff=(tpr_diff + fpr_diff) / 2,
        group_metrics=group_metrics
    )


def theil_index(y_true, y_pred, protected_attr):
    """
    Compute Theil Index for individual fairness.

    Measures inequality in prediction benefits (correct positive predictions)
    across individuals. Lower values indicate more equal treatment.
    """
    y_true = np.array(y_true)
    y_pred = np.array(y_pred)

    # Compute individual benefits (1 if correctly predicted positive)
    benefits = ((y_pred == 1) & (y_true == 1)).astype(float)

    # Handle edge cases
    if benefits.sum() == 0 or benefits.mean() == 0:
        return 0.0

    # Theil T index
    mu = benefits.mean()
    n = len(benefits)
    theil = 0
    for b in benefits:
        if b > 0:
            theil += (b / mu) * np.log(b / mu)
    return theil / n


# Example usage
if __name__ == "__main__":
    np.random.seed(42)
    n = 1000
    protected = np.random.binomial(1, 0.4, n)
    y_true = np.random.binomial(1, 0.3, n)
    # Biased predictions
    bias = protected * 0.2
    y_pred = ((np.random.rand(n) + bias) > 0.7).astype(int)

    report = compute_classification_fairness_metrics(
        y_true, y_pred, protected, group_names={0: 'Majority', 1: 'Minority'}
    )

    print(f"Demographic Parity Diff: {report.demographic_parity_diff:.4f}")
    print(f"Disparate Impact Ratio: {report.disparate_impact_ratio:.4f}")
    print(f"Equal Opportunity Diff: {report.equal_opportunity_diff:.4f}")
    print(f"Equalized Odds Diff: {report.equalized_odds_diff:.4f}")
```

When models output probabilities rather than hard predictions, additional fairness metrics become relevant. These focus on whether predicted probabilities are meaningful across groups.
```python
import numpy as np
from typing import Dict
from scipy import stats
from sklearn.metrics import brier_score_loss, roc_auc_score


def probabilistic_fairness_metrics(
    y_true: np.ndarray,
    y_proba: np.ndarray,
    protected_attr: np.ndarray,
    n_bins: int = 10
) -> Dict:
    """
    Compute probabilistic fairness metrics.
    """
    y_true = np.array(y_true)
    y_proba = np.array(y_proba)
    protected_attr = np.array(protected_attr)

    groups = np.unique(protected_attr)
    results = {'groups': {}}

    def expected_calibration_error(y_true, y_proba, n_bins):
        bins = np.linspace(0, 1, n_bins + 1)
        ece = 0.0
        for i in range(n_bins):
            mask = (y_proba >= bins[i]) & (y_proba < bins[i + 1])
            if mask.sum() == 0:
                continue
            acc = y_true[mask].mean()
            conf = y_proba[mask].mean()
            ece += (mask.sum() / len(y_true)) * abs(acc - conf)
        return ece

    for g in groups:
        mask = protected_attr == g
        g_true, g_proba = y_true[mask], y_proba[mask]

        results['groups'][g] = {
            'mean_score': g_proba.mean(),
            'std_score': g_proba.std(),
            'brier_score': brier_score_loss(g_true, g_proba),
            'ece': expected_calibration_error(g_true, g_proba, n_bins),
            'auc': roc_auc_score(g_true, g_proba) if len(np.unique(g_true)) > 1 else None,
            'balance_positive': g_proba[g_true == 1].mean() if (g_true == 1).sum() > 0 else None,
            'balance_negative': g_proba[g_true == 0].mean() if (g_true == 0).sum() > 0 else None,
        }

    # Parity metrics
    g0, g1 = groups[0], groups[1]

    results['score_parity'] = {
        'mean_diff': abs(results['groups'][g0]['mean_score'] - results['groups'][g1]['mean_score']),
        'ks_statistic': stats.ks_2samp(
            y_proba[protected_attr == g0],
            y_proba[protected_attr == g1]
        ).statistic
    }

    results['calibration_parity'] = abs(
        results['groups'][g0]['ece'] - results['groups'][g1]['ece']
    )

    results['brier_parity'] = abs(
        results['groups'][g0]['brier_score'] - results['groups'][g1]['brier_score']
    )

    if results['groups'][g0]['auc'] and results['groups'][g1]['auc']:
        results['auc_parity'] = abs(
            results['groups'][g0]['auc'] - results['groups'][g1]['auc']
        )

    return results
```
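A minimal usage sketch: the data below is synthetic, and the score shift for one group is deliberately injected for illustration.

```python
# Illustrative only: synthetic labels and scores with a deliberate shift for group 1.
import numpy as np

np.random.seed(0)
n = 2000
protected = np.random.binomial(1, 0.5, n)
y_true = np.random.binomial(1, 0.3, n)
# Hypothetical model scores: label signal plus noise, shifted upward for group 1
y_proba = np.clip(0.3 * y_true + 0.1 * protected + np.random.beta(2, 5, n), 0, 1)

metrics = probabilistic_fairness_metrics(y_true, y_proba, protected)
print(f"Mean score diff:    {metrics['score_parity']['mean_diff']:.4f}")
print(f"KS statistic:       {metrics['score_parity']['ks_statistic']:.4f}")
print(f"Calibration parity: {metrics['calibration_parity']:.4f}")
```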
Ranking systems (search, recommendations, hiring pipelines) require metrics that account for position and exposure, not just binary outcomes.

| Metric | Description | Formula Concept |
|---|---|---|
| Exposure Ratio | Ratio of per-capita exposure received by groups | Σ position_weights / group_size per group |
| NDKL | Normalized Discounted KL-divergence | Position-weighted divergence from a target distribution |
| rKL | Ranking KL divergence | KL divergence at each prefix of the ranking |
| Attention Parity | Equal attention across groups in top-k | P(in top-k \| A=0) = P(in top-k \| A=1) |
| Skew@k | Group representation in top-k vs. overall | log(p_topk / p_overall) for each group |
```python
import numpy as np
from typing import Dict


def ranking_fairness_metrics(
    ranking: np.ndarray,
    protected_attr: np.ndarray,
    k_values: list = [5, 10, 20]
) -> Dict:
    """
    Compute fairness metrics for ranked lists.

    Args:
        ranking: Indices of items in ranked order
        protected_attr: Protected attribute for each item
        k_values: Top-k positions to evaluate
    """
    protected_attr = np.array(protected_attr)
    n = len(ranking)

    # Population proportions
    prop_1 = protected_attr.mean()
    prop_0 = 1 - prop_1

    results = {}

    # Exposure with logarithmic discount
    position_weights = 1 / np.log2(np.arange(2, n + 2))
    total_weight = position_weights.sum()

    exposure_0 = sum(
        position_weights[i] for i, idx in enumerate(ranking)
        if protected_attr[idx] == 0
    )
    exposure_1 = sum(
        position_weights[i] for i, idx in enumerate(ranking)
        if protected_attr[idx] == 1
    )

    # Normalize by group size to get exposure per capita
    n_0 = (protected_attr == 0).sum()
    n_1 = (protected_attr == 1).sum()
    exposure_per_capita_0 = exposure_0 / n_0 if n_0 > 0 else 0
    exposure_per_capita_1 = exposure_1 / n_1 if n_1 > 0 else 0

    results['exposure_ratio'] = (
        min(exposure_per_capita_0, exposure_per_capita_1) /
        max(exposure_per_capita_0, exposure_per_capita_1)
        if max(exposure_per_capita_0, exposure_per_capita_1) > 0 else 1.0
    )

    # Attention parity at various k
    for k in k_values:
        if k > len(ranking):
            continue
        top_k_attrs = protected_attr[ranking[:k]]
        prop_1_topk = top_k_attrs.mean()

        results[f'representation@{k}'] = {
            'group_1_prop_in_topk': prop_1_topk,
            'group_1_prop_overall': prop_1,
            'skew': np.log(prop_1_topk / prop_1) if prop_1_topk > 0 and prop_1 > 0 else 0,
            'attention_parity_diff': abs(prop_1_topk - prop_1)
        }

    # NDKL (simplified)
    target_prop = prop_1  # Target: population proportion
    ndkl = 0
    for i, idx in enumerate(ranking):
        weight = position_weights[i] / total_weight
        actual = protected_attr[idx]
        # Binary KL contribution
        if actual == 1:
            ndkl += weight * np.log(1 / target_prop) if target_prop > 0 else 0
        else:
            ndkl += weight * np.log(1 / (1 - target_prop)) if target_prop < 1 else 0

    results['ndkl'] = ndkl
    return results
```
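A minimal usage sketch, assuming a toy ranking of 20 items in which items from group 1 have their hypothetical scores artificially depressed:

```python
# Illustrative only: a toy ranking where group 1 items are pushed down.
import numpy as np

np.random.seed(1)
protected = np.array([0] * 12 + [1] * 8)       # 40% of items belong to group 1
scores = np.random.rand(20) - 0.3 * protected  # hypothetical scores biased against group 1
ranking = np.argsort(-scores)                  # rank items by descending score

metrics = ranking_fairness_metrics(ranking, protected, k_values=[5, 10])
print(f"Exposure ratio: {metrics['exposure_ratio']:.3f}")
print(f"Skew@10:        {metrics['representation@10']['skew']:.3f}")
print(f"NDKL:           {metrics['ndkl']:.3f}")
```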
For continuous predictions (salary, loan amount, risk scores), fairness metrics must capture differences in prediction magnitude and errors.

```python
import numpy as np
from typing import Dict


def regression_fairness_metrics(
    y_true: np.ndarray,
    y_pred: np.ndarray,
    protected_attr: np.ndarray
) -> Dict:
    """Compute fairness metrics for regression predictions."""
    y_true = np.array(y_true)
    y_pred = np.array(y_pred)
    protected_attr = np.array(protected_attr)

    groups = np.unique(protected_attr)
    results = {'groups': {}}

    for g in groups:
        mask = protected_attr == g
        g_true, g_pred = y_true[mask], y_pred[mask]
        residuals = g_true - g_pred

        results['groups'][g] = {
            'mean_prediction': g_pred.mean(),
            'mean_true': g_true.mean(),
            'mean_residual': residuals.mean(),
            'mae': np.abs(residuals).mean(),
            'rmse': np.sqrt((residuals ** 2).mean()),
            'std_prediction': g_pred.std(),
            'n': mask.sum()
        }

    g0, g1 = groups[0], groups[1]

    results['mean_prediction_parity'] = abs(
        results['groups'][g0]['mean_prediction'] -
        results['groups'][g1]['mean_prediction']
    )
    results['residual_parity'] = abs(
        results['groups'][g0]['mean_residual'] -
        results['groups'][g1]['mean_residual']
    )
    results['mae_parity'] = abs(
        results['groups'][g0]['mae'] -
        results['groups'][g1]['mae']
    )

    return results
```

With dozens of fairness metrics available, selecting the right ones for your application is critical. Consider these guidelines:
| Context | Primary Concern | Recommended Metrics |
|---|---|---|
| Hiring/Admissions | Equal opportunity for qualified candidates | Equal Opportunity Diff, TPR Parity |
| Lending | Equal access + equal errors | Equalized Odds Diff, Calibration |
| Criminal Justice | Protect innocents equally | FPR Parity, Predictive Equality |
| Healthcare | Equal diagnostic accuracy | TPR Parity, PPV Parity, Calibration |
| Search/Recommendations | Fair exposure | Exposure Ratio, NDKL, Attention@k |
| Salary Prediction | No systematic bias | Mean Prediction Parity, Residual Parity |
| Advertising | Equal opportunity to see ads | Demographic Parity, Exposure Ratio |
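To make the selection guidance concrete, here is a minimal sketch of a context-aware audit over the `FairnessReport` computed earlier. The contexts and thresholds in `CONTEXT_CHECKS` are illustrative assumptions (the 0.8 disparate impact cutoff is the informal four-fifths rule of thumb), not legal or regulatory standards.

```python
# Illustrative sketch: the contexts and thresholds below are assumptions, not legal standards.
CONTEXT_CHECKS = {
    'hiring': [
        ('disparate_impact_ratio', 'min', 0.8),   # four-fifths rule of thumb
        ('equal_opportunity_diff', 'max', 0.05),
    ],
    'lending': [
        ('equalized_odds_diff', 'max', 0.05),
        ('calibration_diff', 'max', 0.05),
    ],
}


def audit_report(report, context: str) -> dict:
    """Compare a FairnessReport against illustrative per-context thresholds."""
    findings = {}
    for metric, direction, threshold in CONTEXT_CHECKS[context]:
        value = getattr(report, metric)
        passed = value >= threshold if direction == 'min' else value <= threshold
        findings[metric] = {'value': value, 'threshold': threshold, 'passed': passed}
    return findings


# Example (assumes `report` from the classification example above):
# findings = audit_report(report, context='hiring')
# for metric, f in findings.items():
#     print(f"{metric}: {f['value']:.3f} ({'PASS' if f['passed'] else 'FAIL'} vs {f['threshold']})")
```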
You have completed the Fairness in ML module! You now understand fairness definitions, protected attributes, disparate impact, equality of opportunity, and comprehensive fairness metrics. This foundation prepares you for the next module on Bias Detection and Mitigation techniques.