The Precision-Recall (PR) Curve is a powerful evaluation tool for binary classification models, particularly valuable for imbalanced datasets where positive examples are rare. Unlike ROC curves, PR curves focus on the classifier's performance with respect to the positive class, making them well suited to scenarios such as fraud detection, disease diagnosis, and anomaly detection.
Core Concepts:
Precision measures the proportion of predicted positives that are actually positive: $$\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}}$$
Recall (also called Sensitivity or True Positive Rate) measures the proportion of actual positives that are correctly identified: $$\text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}}$$
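The two formulas translate directly into code. A minimal sketch (the helper name `precision_recall` is illustrative, not part of the task):

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Compute precision and recall from confusion-matrix counts."""
    # Guard against zero denominators; 0.0 is one common convention.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall

# 2 true positives, 1 false positive, 0 false negatives:
# precision = 2/3, recall = 1.0
```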
Building the PR Curve:
The PR curve is constructed by evaluating precision and recall at various decision thresholds. For each unique score in the predicted probabilities (sorted in descending order), classify every sample whose score is greater than or equal to that threshold as positive, count the true positives and false positives, and compute precision and recall from those counts.
Edge Case Handling:
If y_true contains no positive labels, the recall denominator (TP + FN) is zero; a common convention is to report recall as 0.0 in that case. Precision is safe here: because each threshold equals one of the predicted scores, at least one sample is always predicted positive, so TP + FP ≥ 1 at every threshold.
Your Task: Write a Python function that computes the precision-recall curve given true binary labels and predicted probability scores. The function should return precision values, recall values, and the corresponding thresholds used for evaluation.
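One possible implementation following the procedure above (the function name, rounding to 4 decimal places, and list return format are choices made to match the examples below, not requirements stated by the task):

```python
def precision_recall_curve(y_true, y_scores):
    """Compute precision, recall, and thresholds for a binary classifier.

    Thresholds are the unique predicted scores in descending order; at each
    threshold, samples with score >= threshold are predicted positive.
    """
    thresholds = sorted(set(y_scores), reverse=True)
    total_positives = sum(y_true)

    precisions, recalls = [], []
    for t in thresholds:
        # Count true and false positives among samples predicted positive.
        tp = sum(1 for label, score in zip(y_true, y_scores)
                 if score >= t and label == 1)
        fp = sum(1 for label, score in zip(y_true, y_scores)
                 if score >= t and label == 0)
        precisions.append(round(tp / (tp + fp), 4))
        recalls.append(round(tp / total_positives, 4) if total_positives else 0.0)
    return precisions, recalls, thresholds
```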
Input:
y_true = [1, 0, 1, 0]
y_scores = [0.9, 0.7, 0.4, 0.2]

Output:
{"precisions": [1.0, 0.5, 0.6667, 0.5], "recalls": [0.5, 0.5, 1.0, 1.0], "thresholds": [0.9, 0.7, 0.4, 0.2]}

Reasoning:
The thresholds are the unique scores sorted in descending order: [0.9, 0.7, 0.4, 0.2].
At threshold 0.9:
• Predicted positive: sample 0 (score 0.9 ≥ 0.9) → TP=1, FP=0
• Precision = 1/(1+0) = 1.0, Recall = 1/2 = 0.5

At threshold 0.7:
• Predicted positive: samples 0, 1 (scores ≥ 0.7) → TP=1, FP=1
• Precision = 1/(1+1) = 0.5, Recall = 1/2 = 0.5

At threshold 0.4:
• Predicted positive: samples 0, 1, 2 (scores ≥ 0.4) → TP=2, FP=1
• Precision = 2/(2+1) ≈ 0.6667, Recall = 2/2 = 1.0

At threshold 0.2:
• Predicted positive: all samples (scores ≥ 0.2) → TP=2, FP=2
• Precision = 2/(2+2) = 0.5, Recall = 2/2 = 1.0
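The per-threshold counting above can be reproduced with a short standalone script (variable names are illustrative):

```python
y_true = [1, 0, 1, 0]
y_scores = [0.9, 0.7, 0.4, 0.2]
total_pos = sum(y_true)  # 2 actual positives

rows = []
for t in sorted(set(y_scores), reverse=True):
    # Samples with score >= t are predicted positive at this threshold.
    tp = sum(1 for label, s in zip(y_true, y_scores) if s >= t and label == 1)
    fp = sum(1 for label, s in zip(y_true, y_scores) if s >= t and label == 0)
    rows.append((t, tp, fp))
    print(f"threshold {t}: TP={tp}, FP={fp}, "
          f"precision={tp / (tp + fp):.4f}, recall={tp / total_pos:.4f}")
```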
Input:
y_true = [1, 1, 0, 0]
y_scores = [0.8, 0.6, 0.4, 0.2]

Output:
{"precisions": [1.0, 1.0, 0.6667, 0.5], "recalls": [0.5, 1.0, 1.0, 1.0], "thresholds": [0.8, 0.6, 0.4, 0.2]}

Reasoning:
This example shows a well-separated case where positive samples have higher scores.
At threshold 0.8:
• Only sample 0 is predicted positive → TP=1, FP=0
• Precision = 1.0, Recall = 0.5 (found 1 of 2 positives)

At threshold 0.6:
• Samples 0, 1 predicted positive → TP=2, FP=0
• Precision = 1.0, Recall = 1.0 (perfect at this threshold!)

At threshold 0.4:
• Samples 0, 1, 2 predicted positive → TP=2, FP=1
• Precision = 0.6667, Recall = 1.0

At threshold 0.2:
• All samples predicted positive → TP=2, FP=2
• Precision = 0.5, Recall = 1.0
Input:
y_true = [1, 1, 1, 0, 0, 0]
y_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]

Output:
{"precisions": [1.0, 1.0, 1.0, 0.75, 0.6, 0.5], "recalls": [0.3333, 0.6667, 1.0, 1.0, 1.0, 1.0], "thresholds": [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]}

Reasoning:
With 3 positive and 3 negative samples, this shows the classic precision-recall trade-off.
At threshold 0.9: 1 positive predicted → Precision=1.0, Recall=0.3333
At threshold 0.8: 2 positives predicted → Precision=1.0, Recall=0.6667
At threshold 0.7: 3 positives predicted → Precision=1.0, Recall=1.0 (optimal threshold!)
At threshold 0.3: 4 predicted positive → Precision=0.75, Recall=1.0
At threshold 0.2: 5 predicted positive → Precision=0.6, Recall=1.0
At threshold 0.1: All predicted positive → Precision=0.5, Recall=1.0
Notice how lowering the threshold maintains recall at 1.0 but degrades precision.
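That trade-off can be confirmed with a standalone check on this example (nothing here is part of the required solution):

```python
y_true = [1, 1, 1, 0, 0, 0]
y_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
total_pos = sum(y_true)

curve = []
for t in sorted(set(y_scores), reverse=True):
    tp = sum(1 for label, s in zip(y_true, y_scores) if s >= t and label == 1)
    fp = sum(1 for label, s in zip(y_true, y_scores) if s >= t and label == 0)
    curve.append((t, tp / (tp + fp), tp / total_pos))

# Once the threshold reaches 0.7, recall is pinned at 1.0 while
# precision only degrades as the threshold drops further.
for t, precision, recall in curve:
    if t <= 0.7:
        assert recall == 1.0
```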
Constraints: