In maximum-margin classification, the goal is not just to separate classes correctly, but to do so with the widest possible decision boundary. The margin penalty loss is the foundational loss function that drives this objective by penalizing predictions that fail to achieve a sufficient separation margin.
Consider a binary classification problem where samples belong to one of two classes, represented by labels +1 and -1. A classifier produces a raw prediction score s for each sample (not a probability, but a signed distance from the decision boundary).
The key insight of margin-based classification is that we don't just want predictions to be on the correct side of the boundary—we want them to be on the correct side with confidence. Specifically, we define the functional margin as:
$$\text{margin} = y \cdot s$$
where y is the true label (+1 or -1) and s is the prediction score. When this product is positive and large, the classifier is confidently correct. When it's negative, the classifier made an error.
The margin penalty loss enforces a target margin of 1. For each sample, the loss is:
$$L_i = \max(0, 1 - y_i \cdot s_i)$$
This elegant formula captures three scenarios:
• y × s ≥ 1: The prediction is confidently correct, and loss = 0
• 0 < y × s < 1: The prediction is correct but not confident enough, incurring a small penalty
• y × s ≤ 0: The prediction is on the wrong side of the boundary, incurring a large penalty (≥ 1)

The overall loss is the arithmetic mean across all samples:
$$L = \frac{1}{n} \sum_{i=1}^{n} \max(0, 1 - y_i \cdot s_i)$$
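The three scenarios can be checked numerically with a small sketch (the values here are illustrative, not taken from the examples below):

```python
# Per-sample margin penalty (hinge) loss: max(0, 1 - y * s)
def hinge(y: int, s: float) -> float:
    return max(0.0, 1.0 - y * s)

# Confidently correct: margin y*s = 2.0 >= 1, so the loss is zero
print(hinge(1, 2.0))   # 0.0
# Correct but under the target margin: y*s = 0.4, loss = 0.6
print(hinge(1, 0.4))   # 0.6
# Wrong side of the boundary: y*s = -0.5, loss = 1.5
print(hinge(-1, 0.5))  # 1.5
```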
Write a function that computes the average margin penalty loss given an array of true labels and an array of raw prediction scores.
Function Signature:
def compute_margin_penalty_loss(actual_labels: list[int], prediction_scores: list[float]) -> float
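One possible implementation matching this signature (a sketch; the length-validation check is an assumption, not part of the specification):

```python
def compute_margin_penalty_loss(actual_labels: list[int], prediction_scores: list[float]) -> float:
    """Average margin penalty (hinge) loss: mean of max(0, 1 - y_i * s_i)."""
    # Assumption: mismatched input lengths are treated as a caller error
    if len(actual_labels) != len(prediction_scores):
        raise ValueError("actual_labels and prediction_scores must have the same length")
    total = sum(max(0.0, 1.0 - y * s) for y, s in zip(actual_labels, prediction_scores))
    return total / len(actual_labels)
```

On the first worked example below, `compute_margin_penalty_loss([1, -1, 1, -1], [0.5, -0.5, -0.2, 0.3])` gives approximately 0.875.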
Requirements:
• actual_labels: List of true class labels, where each is either -1 or +1
• prediction_scores: List of raw prediction scores (continuous values)

Example 1:
actual_labels = [1, -1, 1, -1]
prediction_scores = [0.5, -0.5, -0.2, 0.3]
Expected output: 0.875

Let's compute the margin penalty loss for each sample:
Sample 1: actual = +1, prediction = 0.5 • Margin = 1 × 0.5 = 0.5 • Loss = max(0, 1 - 0.5) = max(0, 0.5) = 0.5 • Interpretation: Correct side but insufficient margin
Sample 2: actual = -1, prediction = -0.5 • Margin = (-1) × (-0.5) = 0.5 • Loss = max(0, 1 - 0.5) = max(0, 0.5) = 0.5 • Interpretation: Correct side but insufficient margin
Sample 3: actual = +1, prediction = -0.2 • Margin = 1 × (-0.2) = -0.2 • Loss = max(0, 1 - (-0.2)) = max(0, 1.2) = 1.2 • Interpretation: Wrong side of boundary (misclassification)
Sample 4: actual = -1, prediction = 0.3 • Margin = (-1) × 0.3 = -0.3 • Loss = max(0, 1 - (-0.3)) = max(0, 1.3) = 1.3 • Interpretation: Wrong side of boundary (misclassification)
Average Loss = (0.5 + 0.5 + 1.2 + 1.3) / 4 = 3.5 / 4 = 0.875
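The walkthrough above can be reproduced in a few standalone lines to check the arithmetic:

```python
labels = [1, -1, 1, -1]
scores = [0.5, -0.5, -0.2, 0.3]

# Per-sample losses: max(0, 1 - y * s) for each (y, s) pair
losses = [max(0.0, 1.0 - y * s) for y, s in zip(labels, scores)]
print(losses)                     # 0.5, 0.5, 1.2, 1.3 as derived above
print(sum(losses) / len(losses))  # average loss, approximately 0.875
```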
Example 2:
actual_labels = [1, 1, -1, -1]
prediction_scores = [2.0, 1.5, -2.0, -1.5]
Expected output: 0.0

Let's verify each sample achieves sufficient margin:
Sample 1: actual = +1, prediction = 2.0 • Margin = 1 × 2.0 = 2.0 ≥ 1 ✓ • Loss = max(0, 1 - 2.0) = max(0, -1.0) = 0
Sample 2: actual = +1, prediction = 1.5 • Margin = 1 × 1.5 = 1.5 ≥ 1 ✓ • Loss = max(0, 1 - 1.5) = max(0, -0.5) = 0
Sample 3: actual = -1, prediction = -2.0 • Margin = (-1) × (-2.0) = 2.0 ≥ 1 ✓ • Loss = max(0, 1 - 2.0) = max(0, -1.0) = 0
Sample 4: actual = -1, prediction = -1.5 • Margin = (-1) × (-1.5) = 1.5 ≥ 1 ✓ • Loss = max(0, 1 - 1.5) = max(0, -0.5) = 0
Average Loss = (0 + 0 + 0 + 0) / 4 = 0.0
All samples are correctly classified with margins exceeding the target of 1, resulting in zero total loss.
Example 3:
actual_labels = [1, -1, 1, -1]
prediction_scores = [-1.0, 1.0, -1.0, 1.0]
Expected output: 2.0

This represents a worst-case scenario where every prediction is maximally wrong:
Sample 1: actual = +1, prediction = -1.0 • Margin = 1 × (-1.0) = -1.0 • Loss = max(0, 1 - (-1.0)) = max(0, 2.0) = 2.0
Sample 2: actual = -1, prediction = 1.0 • Margin = (-1) × 1.0 = -1.0 • Loss = max(0, 1 - (-1.0)) = max(0, 2.0) = 2.0
Sample 3: actual = +1, prediction = -1.0 • Margin = 1 × (-1.0) = -1.0 • Loss = max(0, 1 - (-1.0)) = max(0, 2.0) = 2.0
Sample 4: actual = -1, prediction = 1.0 • Margin = (-1) × 1.0 = -1.0 • Loss = max(0, 1 - (-1.0)) = max(0, 2.0) = 2.0
Average Loss = (2.0 + 2.0 + 2.0 + 2.0) / 4 = 2.0
Every prediction is not only wrong but confidently wrong in the opposite direction, yielding maximum loss.
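Both of the last two examples can be confirmed the same way (a standalone sketch; `avg_hinge` is a hypothetical helper name):

```python
def avg_hinge(labels, scores):
    # Mean of max(0, 1 - y * s) over all samples
    return sum(max(0.0, 1.0 - y * s) for y, s in zip(labels, scores)) / len(labels)

# Example 2: every margin exceeds the target of 1, so the loss vanishes
print(avg_hinge([1, 1, -1, -1], [2.0, 1.5, -2.0, -1.5]))  # 0.0
# Example 3: every prediction is confidently wrong, giving the maximum loss here
print(avg_hinge([1, -1, 1, -1], [-1.0, 1.0, -1.0, 1.0]))  # 2.0
```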
Constraints