In production machine learning systems, maintaining visibility into the health and reliability of batch prediction pipelines is critical for operational excellence. MLOps teams rely on real-time dashboards and automated alerting systems to detect anomalies, performance degradation, and failures before they impact downstream consumers.
When running large-scale batch inference jobs—such as overnight scoring of millions of customer records, daily fraud detection sweeps, or hourly recommendation updates—the system generates prediction results that need continuous monitoring. Each prediction result carries metadata about its execution status and, for successful predictions, a confidence score indicating the model's certainty in its output.
Your Task:
Given a list of prediction results from a batch inference job, implement a function that computes essential health metrics commonly tracked in MLOps observability platforms. These metrics help engineering teams assess overall job health: the success rate, the average model confidence, and the share of low-confidence predictions.
Input Format:
You will receive:
- predictions: a list of dictionaries, one per prediction result, each containing:
  - status: a string that is either 'success' (prediction completed) or 'error' (prediction failed)
  - confidence: a float between 0 and 1 representing model certainty (only present when status is 'success')
- confidence_threshold: a float used to flag low-confidence predictions

Output Format:
Return a dictionary containing three metrics, all rounded to 2 decimal places:
- success_rate: the percentage of predictions with status = 'success' out of all predictions
- avg_confidence: the mean confidence score across all successful predictions, expressed as a percentage (0-100)
- low_confidence_rate: the percentage of successful predictions whose confidence falls strictly below the threshold

Edge Cases:
- If there are no successful predictions, return success_rate as calculated, and both avg_confidence and low_confidence_rate as 0.0.

Example 1:
Input:
predictions = [{'status': 'success', 'confidence': 0.9}, {'status': 'success', 'confidence': 0.8}, {'status': 'error'}, {'status': 'success', 'confidence': 0.4}, {'status': 'success', 'confidence': 0.7}]
confidence_threshold = 0.5
Output:
{'success_rate': 80.0, 'avg_confidence': 70.0, 'low_confidence_rate': 25.0}
Explanation:
Total predictions: 5. Successful predictions: 4 (with confidences [0.9, 0.8, 0.4, 0.7]). Failed predictions: 1.
• Success Rate: 4/5 = 80.0%
• Average Confidence: (0.9 + 0.8 + 0.4 + 0.7) / 4 = 0.7 → 70.0%
• Low Confidence Rate: only 1 prediction (0.4) is below the 0.5 threshold → 1/4 = 25.0%
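The calculation walked through above can be sketched as a small function. This is a minimal reference sketch, not a prescribed solution; the function name `compute_batch_metrics` is an assumption, since the problem statement does not fix a name.

```python
def compute_batch_metrics(predictions, confidence_threshold):
    # Name is hypothetical; the problem statement does not fix a signature.
    total = len(predictions)
    confidences = [p['confidence'] for p in predictions
                   if p['status'] == 'success']

    # Success rate over all predictions; guard against an empty batch.
    success_rate = round(100.0 * len(confidences) / total, 2) if total else 0.0

    if confidences:
        avg_confidence = round(100.0 * sum(confidences) / len(confidences), 2)
        # Strictly-below comparison, per the low_confidence_rate definition.
        low = sum(1 for c in confidences if c < confidence_threshold)
        low_confidence_rate = round(100.0 * low / len(confidences), 2)
    else:
        # Edge case: no successful predictions -> both metrics fall back to 0.0.
        avg_confidence = 0.0
        low_confidence_rate = 0.0

    return {'success_rate': success_rate,
            'avg_confidence': avg_confidence,
            'low_confidence_rate': low_confidence_rate}
```

Note that the percentages are computed against different denominators: success_rate divides by all predictions, while the two confidence metrics divide by successful predictions only.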
Example 2:
Input:
predictions = [{'status': 'success', 'confidence': 0.95}, {'status': 'success', 'confidence': 0.85}, {'status': 'success', 'confidence': 0.75}]
confidence_threshold = 0.8
Output:
{'success_rate': 100.0, 'avg_confidence': 85.0, 'low_confidence_rate': 33.33}
Explanation:
Total predictions: 3. Successful predictions: 3 (all succeeded, with confidences [0.95, 0.85, 0.75]).
• Success Rate: 3/3 = 100.0%
• Average Confidence: (0.95 + 0.85 + 0.75) / 3 = 0.85 → 85.0%
• Low Confidence Rate: 1 prediction (0.75) is below the 0.8 threshold → 1/3 = 33.33%
Example 3:
Input:
predictions = [{'status': 'success', 'confidence': 0.3}, {'status': 'success', 'confidence': 0.6}, {'status': 'error'}, {'status': 'error'}]
confidence_threshold = 0.5
Output:
{'success_rate': 50.0, 'avg_confidence': 45.0, 'low_confidence_rate': 50.0}
Explanation:
Total predictions: 4. Successful predictions: 2 (with confidences [0.3, 0.6]). Failed predictions: 2.
• Success Rate: 2/4 = 50.0%
• Average Confidence: (0.3 + 0.6) / 2 = 0.45 → 45.0%
• Low Confidence Rate: 1 prediction (0.3) is below the 0.5 threshold → 1/2 = 50.0%
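The no-successful-predictions edge case deserves a concrete check, since both confidence metrics would otherwise divide by zero. A self-contained sketch (the all-error batch below is hypothetical, and 0.5 stands in for the threshold):

```python
# Hypothetical batch in which every prediction failed.
predictions = [{'status': 'error'}, {'status': 'error'}, {'status': 'error'}]
confidence_threshold = 0.5

confidences = [p['confidence'] for p in predictions if p['status'] == 'success']

metrics = {
    # success_rate is still computed normally: 0 successes out of 3.
    'success_rate': round(100.0 * len(confidences) / len(predictions), 2),
    # With no successes, both confidence metrics fall back to 0.0
    # instead of dividing by zero.
    'avg_confidence': (round(100.0 * sum(confidences) / len(confidences), 2)
                       if confidences else 0.0),
    'low_confidence_rate': (round(100.0 * sum(c < confidence_threshold
                                              for c in confidences)
                                  / len(confidences), 2)
                            if confidences else 0.0),
}
print(metrics)
# {'success_rate': 0.0, 'avg_confidence': 0.0, 'low_confidence_rate': 0.0}
```

This mirrors the edge-case rule stated earlier: success_rate is computed as usual, while avg_confidence and low_confidence_rate default to 0.0.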
Constraints