Every probabilistic classifier produces continuous outputs—probabilities, scores, or logits—that must be converted to discrete decisions. The decision threshold is the boundary: predictions above it become positive, those below become negative.
Most practitioners accept the default threshold of 0.5 without question. Yet whenever false positives and false negatives carry different costs, which is nearly always the case, 0.5 is suboptimal.
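As a minimal illustration (using scikit-learn's `LogisticRegression` on synthetic data, both chosen here purely for demonstration), `predict` applies exactly this implicit 0.5 cutoff, and making the threshold explicit turns it into a tunable parameter:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, weights=[0.9], random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)
proba = model.predict_proba(X)[:, 1]

# predict() is an implicit 0.5 cutoff on the positive-class probability
print(np.array_equal(model.predict(X), (proba > 0.5).astype(int)))  # True

# Making the threshold explicit turns it into a tunable parameter
threshold = 0.3  # chosen for illustration only
y_pred = (proba >= threshold).astype(int)
print(f"Positives flagged at 0.5: {model.predict(X).sum()}, at 0.3: {y_pred.sum()}")
```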
The Threshold Matters Enormously:
Threshold optimization is the systematic process of finding the decision boundary that maximizes business value given your unique constraints.
By the end of this page, you will understand how threshold changes affect precision-recall trade-offs, derive cost-optimal thresholds mathematically, implement threshold optimization using ROC and PR curves, and handle operational constraints in threshold selection.
Changing the decision threshold creates a fundamental trade-off between different error types. Understanding this relationship is essential for informed threshold selection.
As threshold increases (more conservative): fewer cases are flagged positive, so false positives fall and precision rises, while false negatives grow and recall drops.
As threshold decreases (more aggressive): more cases are flagged positive, so recall rises, while false positives accumulate and precision falls.
The table below illustrates the trade-off on a hypothetical dataset with 100 actual positives and 900 actual negatives:
| Threshold | TP | FP | FN | TN | Precision | Recall |
|---|---|---|---|---|---|---|
| 0.1 (aggressive) | 95 | 400 | 5 | 500 | 19.2% | 95.0% |
| 0.3 | 85 | 150 | 15 | 750 | 36.2% | 85.0% |
| 0.5 (default) | 70 | 50 | 30 | 850 | 58.3% | 70.0% |
| 0.7 | 50 | 15 | 50 | 885 | 76.9% | 50.0% |
| 0.9 (conservative) | 20 | 2 | 80 | 898 | 90.9% | 20.0% |
You cannot simultaneously maximize both precision and recall. Higher thresholds sacrifice recall for precision; lower thresholds sacrifice precision for recall. The optimal balance depends entirely on your application's cost structure.
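A sweep like the one in the table takes only a few lines of code. The sketch below uses synthetic data (so the specific counts will differ from the illustrative table above) to show the mechanics:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

np.random.seed(42)
y_true = np.random.binomial(1, 0.1, 2000)
y_proba = np.clip(y_true * 0.6 + np.random.beta(2, 8, 2000), 0, 1)

print(f"{'Threshold':>9} {'TP':>5} {'FP':>5} {'FN':>5} {'TN':>5} {'Precision':>9} {'Recall':>7}")
for t in [0.1, 0.3, 0.5, 0.7, 0.9]:
    # Apply the threshold and tally each error type
    y_pred = (y_proba >= t).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
    recall = tp / (tp + fn)
    print(f"{t:>9.1f} {tp:>5} {fp:>5} {fn:>5} {tn:>5} {precision:>9.1%} {recall:>7.1%}")
```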
Given a cost matrix, we can derive the mathematically optimal threshold that minimizes expected cost.
Decision Theory Foundation:
For a given instance x with predicted probability P(y=1|x) = p, we should predict positive if the expected cost of predicting positive is less than predicting negative:
$$E[\text{Cost}|\text{predict positive}] < E[\text{Cost}|\text{predict negative}]$$
Expanding: $$p \cdot C_{TP} + (1-p) \cdot C_{FP} < p \cdot C_{FN} + (1-p) \cdot C_{TN}$$
Solving for p (assuming correct decisions cost nothing, $C_{TP} = C_{TN} = 0$), the inequality reduces to $(1-p) \cdot C_{FP} < p \cdot C_{FN}$, which gives:
$$p > \frac{C_{FP}}{C_{FP} + C_{FN}}$$
The Cost-Optimal Threshold:
$$t^* = \frac{C_{FP}}{C_{FP} + C_{FN}} = \frac{1}{1 + \frac{C_{FN}}{C_{FP}}} = \frac{1}{1 + CR}$$
where $CR = C_{FN} / C_{FP}$ is the cost ratio.
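As a quick sanity check of the formula, take the fraud detection costs from the table below, $C_{FP} = \$10$ and $C_{FN} = \$150$:

$$t^* = \frac{10}{10 + 150} = \frac{1}{1 + 15} = 0.0625 \approx 0.063$$

A miss costs 15 times more than a false alarm, so the threshold drops far below 0.5: transactions should be flagged even when fraud is fairly unlikely.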
| Scenario | C_FP | C_FN | Cost Ratio | Optimal Threshold |
|---|---|---|---|---|
| Equal costs | $1 | $1 | 1:1 | 0.500 |
| Fraud detection | $10 | $150 | 15:1 | 0.063 |
| Cancer screening | $100 | $10000 | 100:1 | 0.010 |
| Spam filter | $50 | $5 | 0.1:1 | 0.909 |
| Ad click prediction | $0.01 | $0.10 | 10:1 | 0.091 |
The code below computes the closed-form optimum and verifies it with an empirical search over candidate thresholds:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def cost_optimal_threshold(cost_fp, cost_fn):
    """Compute the theoretically optimal threshold given costs."""
    return cost_fp / (cost_fp + cost_fn)

def find_best_threshold_empirically(y_true, y_proba, cost_fp, cost_fn,
                                    thresholds=None):
    """
    Find the threshold that minimizes total cost empirically.

    Searches over candidate thresholds and returns the one with
    minimum expected cost on the provided data.
    """
    if thresholds is None:
        thresholds = np.linspace(0.01, 0.99, 99)

    best_threshold = 0.5
    best_cost = float('inf')
    results = []

    for t in thresholds:
        y_pred = (y_proba >= t).astype(int)
        tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
        total_cost = fp * cost_fp + fn * cost_fn
        results.append({'threshold': t, 'cost': total_cost, 'fp': fp, 'fn': fn})
        if total_cost < best_cost:
            best_cost = total_cost
            best_threshold = t

    return best_threshold, best_cost, results

# Example
np.random.seed(42)
n = 10000
y_true = np.random.binomial(1, 0.05, n)  # 5% positive
# Simulated well-calibrated probabilities
y_proba = np.clip(y_true * 0.7 + np.random.beta(2, 10, n) * 0.5, 0, 1)

cost_fp, cost_fn = 10, 150
theoretical = cost_optimal_threshold(cost_fp, cost_fn)
empirical, min_cost, _ = find_best_threshold_empirically(
    y_true, y_proba, cost_fp, cost_fn)

print(f"Theoretical optimal: {theoretical:.4f}")
print(f"Empirical optimal: {empirical:.4f}")
print(f"Minimum cost: ${min_cost:,.2f}")
```

The ROC curve provides a powerful framework for threshold selection by visualizing all possible operating points of a classifier.
Key Insight:
Each point on the ROC curve corresponds to a specific threshold. Moving along the curve from bottom-left (threshold=1) to top-right (threshold=0) trades false positive rate for true positive rate.
Common ROC-Based Threshold Selection Methods:
- Youden's J statistic: maximize J = TPR − FPR, the point farthest above the chance diagonal.
- Cost-weighted selection: minimize expected cost across ROC operating points.
- Target FPR: choose the threshold that meets a fixed false-positive-rate budget.
All three are implemented below.
```python
import numpy as np
from sklearn.metrics import roc_curve

def find_threshold_youden(y_true, y_proba):
    """Find threshold maximizing Youden's J statistic."""
    fpr, tpr, thresholds = roc_curve(y_true, y_proba)
    j_scores = tpr - fpr
    best_idx = np.argmax(j_scores)
    return thresholds[best_idx], j_scores[best_idx]

def find_threshold_cost_weighted(y_true, y_proba, cost_fp, cost_fn,
                                 prevalence=None):
    """
    Find threshold minimizing cost using the ROC curve.

    The optimal point satisfies:
        slope = (C_FP/C_FN) * ((1-π)/π)
    where π is the prevalence of positives.
    """
    if prevalence is None:
        prevalence = np.mean(y_true)
    fpr, tpr, thresholds = roc_curve(y_true, y_proba)

    # Expected cost per sample at each ROC operating point
    n_pos = prevalence
    n_neg = 1 - prevalence
    costs = []
    for i in range(len(thresholds)):
        cost = fpr[i] * n_neg * cost_fp + (1 - tpr[i]) * n_pos * cost_fn
        costs.append(cost)

    best_idx = np.argmin(costs)
    return thresholds[best_idx], costs[best_idx]

def find_threshold_at_fpr(y_true, y_proba, target_fpr):
    """Find threshold achieving a target false positive rate."""
    fpr, tpr, thresholds = roc_curve(y_true, y_proba)
    idx = np.argmin(np.abs(fpr - target_fpr))
    return thresholds[idx], fpr[idx], tpr[idx]

# Demonstration
np.random.seed(42)
y_true = np.random.binomial(1, 0.1, 5000)
y_proba = np.clip(y_true * 0.6 + np.random.beta(2, 8, 5000), 0, 1)

print("ROC-Based Threshold Selection Methods")
print("=" * 50)

t_youden, j = find_threshold_youden(y_true, y_proba)
print(f"Youden's J: threshold={t_youden:.4f}, J={j:.4f}")

t_cost, cost = find_threshold_cost_weighted(y_true, y_proba, 10, 150)
print(f"Cost-weighted: threshold={t_cost:.4f}, cost={cost:.4f}")

t_fpr, actual_fpr, tpr = find_threshold_at_fpr(y_true, y_proba, 0.05)
print(f"FPR=5%: threshold={t_fpr:.4f}, actual_fpr={actual_fpr:.4f}, tpr={tpr:.4f}")
```

For imbalanced datasets, Precision-Recall curves often provide more insight than ROC curves. Different threshold selection strategies apply.
When to Use PR-Based Selection:
Prefer PR-based methods when positives are rare and true negatives dominate. In that regime the ROC curve's false positive rate stays deceptively low, while the PR curve directly exposes how much precision a given level of recall costs.
PR-Based Selection Methods:
$$F_\beta = \frac{(1 + \beta^2) \cdot \text{precision} \cdot \text{recall}}{\beta^2 \cdot \text{precision} + \text{recall}}$$
When β = 1 (F1), precision and recall are equally weighted; β > 1 weights recall higher, β < 1 weights precision higher. Choose β based on your cost ratio.
```python
import numpy as np
from sklearn.metrics import precision_recall_curve

def find_threshold_f1_optimal(y_true, y_proba):
    """Find threshold maximizing F1 score."""
    precision, recall, thresholds = precision_recall_curve(y_true, y_proba)
    # Compute F1 at each threshold; epsilon avoids division by zero
    f1_scores = 2 * precision * recall / (precision + recall + 1e-10)
    best_idx = np.argmax(f1_scores[:-1])  # Last point has no threshold
    return thresholds[best_idx], f1_scores[best_idx]

def find_threshold_fbeta_optimal(y_true, y_proba, beta=1.0):
    """Find threshold maximizing F-beta score."""
    precision, recall, thresholds = precision_recall_curve(y_true, y_proba)
    beta_sq = beta ** 2
    fbeta = (1 + beta_sq) * precision * recall / (beta_sq * precision + recall + 1e-10)
    best_idx = np.argmax(fbeta[:-1])
    return thresholds[best_idx], fbeta[best_idx]

def find_threshold_at_recall(y_true, y_proba, target_recall):
    """Find the highest threshold achieving a target recall."""
    precision, recall, thresholds = precision_recall_curve(y_true, y_proba)
    # Recall falls as threshold rises; keep thresholds meeting the target
    valid_idx = np.where(recall[:-1] >= target_recall)[0]
    if len(valid_idx) == 0:
        return thresholds[0], precision[0], recall[0]
    idx = valid_idx[-1]  # Highest threshold meeting the requirement
    return thresholds[idx], precision[idx], recall[idx]

# Example with imbalanced data
np.random.seed(42)
y_true = np.random.binomial(1, 0.02, 10000)  # 2% positive
y_proba = np.clip(y_true * 0.7 + np.random.beta(1, 20, 10000), 0, 1)

print("PR-Based Threshold Selection")
print("=" * 50)

t_f1, f1 = find_threshold_f1_optimal(y_true, y_proba)
print(f"F1-optimal: threshold={t_f1:.4f}, F1={f1:.4f}")

# Weight recall higher
t_f2, f2 = find_threshold_fbeta_optimal(y_true, y_proba, beta=2)
print(f"F2-optimal: threshold={t_f2:.4f}, F2={f2:.4f}")

# Weight precision higher
t_f05, f05 = find_threshold_fbeta_optimal(y_true, y_proba, beta=0.5)
print(f"F0.5-optimal: threshold={t_f05:.4f}, F0.5={f05:.4f}")

t_rec, prec, rec = find_threshold_at_recall(y_true, y_proba, 0.90)
print(f"Recall≥90%: threshold={t_rec:.4f}, precision={prec:.4f}, recall={rec:.4f}")
```

Real deployments often face constraints beyond pure cost optimization:
Common Operational Constraints:
A minimum recall (e.g., a regulatory or safety floor), a maximum false positive rate, a minimum precision to preserve user trust, and a cap on total positive predictions when review capacity is limited. The function below treats each constraint as a hard filter on candidate thresholds.
```python
import numpy as np
from sklearn.metrics import confusion_matrix

def find_threshold_with_constraints(y_true, y_proba, cost_fp, cost_fn,
                                    min_recall=None, max_fpr=None,
                                    min_precision=None, max_predictions=None):
    """
    Find cost-optimal threshold subject to operational constraints.
    """
    thresholds = np.linspace(0.01, 0.99, 199)
    best_threshold = None
    best_cost = float('inf')

    for t in thresholds:
        y_pred = (y_proba >= t).astype(int)
        tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

        # Quantities the constraints are expressed in
        recall = tp / (tp + fn) if (tp + fn) > 0 else 0
        fpr = fp / (fp + tn) if (fp + tn) > 0 else 0
        precision = tp / (tp + fp) if (tp + fp) > 0 else 1
        n_predictions = tp + fp

        # Skip any threshold that violates a constraint
        if min_recall is not None and recall < min_recall:
            continue
        if max_fpr is not None and fpr > max_fpr:
            continue
        if min_precision is not None and precision < min_precision:
            continue
        if max_predictions is not None and n_predictions > max_predictions:
            continue

        # Among feasible thresholds, keep the cheapest
        cost = fp * cost_fp + fn * cost_fn
        if cost < best_cost:
            best_cost = cost
            best_threshold = t

    return best_threshold, best_cost

# Example
np.random.seed(42)
y_true = np.random.binomial(1, 0.05, 5000)
y_proba = np.clip(y_true * 0.7 + np.random.beta(2, 10, 5000), 0, 1)

print("Constrained Threshold Optimization")
print("=" * 50)

# Unconstrained
t_unc, cost_unc = find_threshold_with_constraints(y_true, y_proba, 10, 150)
print(f"Unconstrained: threshold={t_unc:.4f}, cost=${cost_unc:,.0f}")

# With recall constraint
t_rec, cost_rec = find_threshold_with_constraints(
    y_true, y_proba, 10, 150, min_recall=0.85)
print(f"Recall≥85%: threshold={t_rec:.4f}, cost=${cost_rec:,.0f}")

# With capacity constraint
t_cap, cost_cap = find_threshold_with_constraints(
    y_true, y_proba, 10, 150, max_predictions=200)
print(f"Max 200 preds: threshold={t_cap:.4f}, cost=${cost_cap:,.0f}")
```

Static thresholds assume stationary conditions. In practice, optimal thresholds may need to change over time or vary by context.
When to Use Dynamic Thresholds:
When class prevalence drifts over time, when costs or review capacity change (e.g., with season or staffing), or when different segments of traffic warrant different operating points.
Frequent threshold changes can confuse operators and make system behavior unpredictable. Balance adaptiveness with stability. Consider guardrails that limit how much thresholds can change between updates.
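As one possible sketch of such a guardrail (the `ThresholdController` class and its `max_step` parameter are hypothetical, not a standard API), the controller below re-optimizes on each batch of recent labeled data but clamps every update to a maximum step:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

class ThresholdController:
    """Re-optimize the threshold periodically, but limit per-update drift."""

    def __init__(self, initial_threshold, cost_fp, cost_fn, max_step=0.02):
        self.threshold = initial_threshold
        self.cost_fp = cost_fp
        self.cost_fn = cost_fn
        self.max_step = max_step  # Guardrail: largest allowed change per update

    def update(self, y_true, y_proba):
        """Find the cost-minimizing threshold on recent labeled data,
        then move toward it by at most max_step."""
        candidates = np.linspace(0.01, 0.99, 99)
        costs = []
        for t in candidates:
            y_pred = (y_proba >= t).astype(int)
            tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
            costs.append(fp * self.cost_fp + fn * self.cost_fn)
        target = candidates[int(np.argmin(costs))]

        # Clamp the move so operators see gradual, predictable shifts
        step = np.clip(target - self.threshold, -self.max_step, self.max_step)
        self.threshold += step
        return self.threshold

# Usage: feed each week's labeled batch; the threshold drifts slowly
np.random.seed(0)
controller = ThresholdController(0.5, cost_fp=10, cost_fn=150)
for week in range(5):
    y_true = np.random.binomial(1, 0.05, 2000)
    y_proba = np.clip(y_true * 0.7 + np.random.beta(2, 10, 2000) * 0.5, 0, 1)
    print(f"Week {week}: threshold={controller.update(y_true, y_proba):.3f}")
```

The `max_step` guardrail trades convergence speed for operational stability: the threshold still tracks the cost optimum, but no single update can move it far enough to surprise downstream teams.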
You now understand how to select decision thresholds that optimize business outcomes. Next, we'll explore how to align model metrics with broader business objectives through Business Metric Alignment.