In machine learning, a model's ability to generalize from training data to unseen data is paramount to its real-world utility. Two common pathologies can undermine this ability: overfitting and underfitting. Understanding and diagnosing these conditions is a foundational skill for any machine learning practitioner.
Overfitting occurs when a model learns the training data too well—including its noise and idiosyncrasies—resulting in excellent training performance but poor performance on new, unseen data. Think of it like a student who memorizes exam answers verbatim without understanding the underlying concepts: they excel on practice tests they've seen but struggle with novel questions.
Underfitting happens when a model is too simplistic to capture the underlying patterns in the data, performing poorly on both training and test datasets. This is analogous to a student who hasn't studied enough and lacks fundamental understanding to answer any questions well.
A well-fitted model (good fit) strikes a balance between these extremes: it captures the essential patterns in the data without memorizing noise, resulting in strong performance on both training and test data.
Diagnostic Criteria: The diagnosis is based on comparing training accuracy and test accuracy using these rules:
• Overfitting: The training accuracy significantly exceeds the test accuracy, with a difference greater than 0.2 (20 percentage points). This indicates the model has memorized the training data but fails to generalize.
• Underfitting: Both training accuracy AND test accuracy fall below 0.7 (70%). This suggests the model lacks the capacity or has not been trained sufficiently to learn the underlying patterns.
• Good Fit: Neither of the above conditions applies. The model demonstrates adequate learning capacity and reasonable generalization.
Your Task: Implement a function that takes the training and test accuracy values as inputs and returns 1 for overfitting, -1 for underfitting, or 0 for a good fit.
Example 1
Input: training_accuracy = 0.95, test_accuracy = 0.65
Output: 1
Explanation: The training accuracy (95%) dramatically outperforms the test accuracy (65%), with a difference of 0.30 (30 percentage points), which exceeds the 0.2 threshold.
This classic overfitting signature indicates the model has essentially memorized the training data but fails to generalize to new examples. The model likely has too many parameters relative to the training data size, or needs regularization techniques such as dropout, L1/L2 penalties, or early stopping.
Example 2
Input: training_accuracy = 0.5, test_accuracy = 0.45
Output: -1
Explanation: Both training accuracy (50%) and test accuracy (45%) are well below the 0.7 threshold.
This underfitting pattern suggests the model is too simple to capture meaningful patterns in the data—it performs nearly as poorly as random guessing on a binary classification task. Remedies include using a more complex model architecture, adding relevant features, reducing regularization, or training for more epochs.
Example 3
Input: training_accuracy = 0.85, test_accuracy = 0.8
Output: 0
Explanation: Training accuracy (85%) and test accuracy (80%) are both above the 0.7 threshold, and their difference (0.05) is well within the acceptable limit of 0.2.
This represents a well-balanced model that has learned the underlying patterns effectively while maintaining good generalization. The modest 5-percentage-point gap is normal and indicates healthy model behavior.
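The diagnostic rules and return codes above could be implemented along these lines (the function name diagnose_fit is illustrative, not prescribed by the task; the rules are checked in the order they are listed, so the overfitting check takes precedence if both conditions somehow held):

```python
def diagnose_fit(training_accuracy, test_accuracy):
    """Classify model fit from training and test accuracy.

    Returns 1 (overfitting), -1 (underfitting), or 0 (good fit),
    matching the output codes shown in the examples.
    """
    # Overfitting: training accuracy exceeds test accuracy by more than 0.2.
    if training_accuracy - test_accuracy > 0.2:
        return 1
    # Underfitting: both accuracies fall below the 0.7 threshold.
    if training_accuracy < 0.7 and test_accuracy < 0.7:
        return -1
    # Good fit: neither condition applies.
    return 0


print(diagnose_fit(0.95, 0.65))  # 1  (overfitting, Example 1)
print(diagnose_fit(0.5, 0.45))   # -1 (underfitting, Example 2)
print(diagnose_fit(0.85, 0.8))   # 0  (good fit, Example 3)
```

Note that strict floating-point comparison against 0.2 is fine for the example values here, but in production code one might tolerate rounding error near the threshold (e.g. via math.isclose).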
Constraints