In machine learning operations (MLOps) and production data systems, monitoring the stability of feature distributions is paramount for ensuring that deployed models continue to perform as expected. When the statistical properties of incoming data deviate significantly from the data used during model training, this phenomenon—known as distribution shift or data drift—can lead to degraded model accuracy, biased predictions, and unreliable outputs.
The Stability Divergence Index (SDI), also referred to as the Population Stability Index (PSI), is a widely adopted quantitative metric for measuring the degree to which a distribution has changed over time. It provides a single scalar value that captures the magnitude of distributional divergence between two datasets.
Given two distributions represented as sample sets: a baseline (typically the model's training data) and a current set (typically incoming production data).
The SDI is calculated through the following procedure:
1. Create n equal-width buckets (bins) spanning the combined range of both distributions:

   ```
   min_val = min(min(baseline), min(current))
   max_val = max(max(baseline), max(current))
   width = (max_val - min_val) / n
   ```

2. For each bucket, calculate the proportion of samples from each distribution that fall within that bucket:

   ```
   baseline_proportion[i] = (count of baseline samples in bucket i) / (total baseline samples)
   current_proportion[i] = (count of current samples in bucket i) / (total current samples)
   ```

   To avoid numerical instability with logarithms, replace any zero proportion with a small epsilon value (ε = 0.0001).

3. Sum the per-bucket divergence terms:
$$SDI = \sum_{i=1}^{n} \left(\text{current\_proportion}_i - \text{baseline\_proportion}_i\right) \times \ln\left(\frac{\text{current\_proportion}_i}{\text{baseline\_proportion}_i}\right)$$
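The three steps above can be sketched in pure Python, using the data from the first worked example below (the helper name `proportions` and the floor-and-clamp bucket indexing are illustrative choices, not part of the specification):

```python
import math

baseline = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
current = [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
n = 5

# Step 1: n equal-width buckets over the combined range of both sets
lo = min(min(baseline), min(current))
hi = max(max(baseline), max(current))
width = (hi - lo) / n

def proportions(samples):
    counts = [0] * n
    for x in samples:
        # clamp the index so x == hi falls in the last bucket
        counts[min(int((x - lo) / width), n - 1)] += 1
    return [c / len(samples) for c in counts]

# Step 2: per-bucket proportions, with zeros replaced by epsilon
eps = 0.0001
b = [p if p > 0 else eps for p in proportions(baseline)]
c = [p if p > 0 else eps for p in proportions(current)]

# Step 3: the SDI formula
sdi = sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))
print(round(sdi, 4))  # 6.6336
```

Only the first and last buckets contribute here: the baseline has no mass in the last bucket and the current set none in the first, so both terms are epsilon-smoothed and dominate the sum.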
The SDI value indicates the severity of distribution shift:
| SDI Value | Interpretation | Action Required |
|---|---|---|
| < 0.1 | No significant shift | Continue monitoring |
| 0.1 – 0.25 | Moderate shift | Investigate and monitor closely |
| ≥ 0.25 | Significant shift | Immediate investigation required; consider model retraining |
Write a Python function that computes the Stability Divergence Index between a baseline distribution and a current distribution, and returns a comprehensive drift assessment.
Function Requirements:
- Accept the baseline samples, the current samples, and a `bucket_count` parameter
- Apply epsilon smoothing (ε = 0.0001) to replace zero proportions
- Return a dictionary with the keys `'psi'`, `'drift_detected'`, and `'drift_level'`

Edge Case:
If either input list is empty, return an empty dictionary {}.
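One way these requirements might be implemented as a sketch (the function name `compute_sdi`, the `'moderate'` label for the middle band, treating that band as detected drift, and rounding to four decimals are assumptions inferred from the examples, not fixed by the statement):

```python
import math

def compute_sdi(baseline_samples, current_samples, bucket_count, epsilon=0.0001):
    # Edge case: if either input list is empty, return an empty dictionary
    if not baseline_samples or not current_samples:
        return {}

    min_val = min(min(baseline_samples), min(current_samples))
    max_val = max(max(baseline_samples), max(current_samples))
    width = (max_val - min_val) / bucket_count

    def bucket_proportions(samples):
        counts = [0] * bucket_count
        for x in samples:
            # Clamp the index so the maximum value lands in the last bucket;
            # if all values are identical, width is 0 and everything goes to bucket 0
            idx = min(int((x - min_val) / width), bucket_count - 1) if width > 0 else 0
            counts[idx] += 1
        # Epsilon smoothing: replace zero proportions to keep ln() finite
        return [k / len(samples) if k > 0 else epsilon for k in counts]

    b = bucket_proportions(baseline_samples)
    c = bucket_proportions(current_samples)
    sdi = sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))

    # Thresholds from the interpretation table; the 'moderate' label and
    # drift_detected=True for that band are assumptions
    if sdi < 0.1:
        drift_detected, drift_level = False, 'none'
    elif sdi < 0.25:
        drift_detected, drift_level = True, 'moderate'
    else:
        drift_detected, drift_level = True, 'significant'

    return {'psi': round(sdi, 4),
            'drift_detected': drift_detected,
            'drift_level': drift_level}
```

This reproduces the dictionaries shown in the worked examples below.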
baseline_samples = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
current_samples = [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
bucket_count = 5

Output:

{'psi': 6.6336, 'drift_detected': True, 'drift_level': 'significant'}

The baseline data is concentrated in the range [1, 5] while the current data has shifted to [3, 7]. This represents a notable rightward shift in the distribution.
Bucket Analysis:
When we compute proportions for each of the 5 buckets and apply the SDI formula, the resulting value of 6.6336 far exceeds the 0.25 threshold, indicating a significant distribution shift that warrants immediate investigation into why production data has diverged so dramatically from the training baseline.
baseline_samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
current_samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
bucket_count = 5

Output:

{'psi': 0.0, 'drift_detected': False, 'drift_level': 'none'}

When the baseline and current distributions are identical, every bucket contains the same proportion of samples from both distributions.
Bucket Analysis:
The SDI of 0.0 confirms that there is no distribution shift whatsoever. This is the ideal scenario in production—the incoming data closely matches what the model was trained on.
baseline_samples = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3]
current_samples = [8, 8, 8, 8, 9, 9, 9, 9, 10, 10]
bucket_count = 5

Output:

{'psi': 17.2439, 'drift_detected': True, 'drift_level': 'significant'}

This example demonstrates an extreme case of distribution shift where the baseline and current data occupy completely non-overlapping regions of the feature space.
Bucket Analysis:
The epsilon smoothing (0.0001) is applied to prevent division by zero, but the massive disparity results in an extremely high SDI of 17.2439. This indicates a catastrophic distribution shift—the production data is fundamentally different from training data, and model predictions are likely unreliable.
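To see where 17.2439 comes from, the per-bucket terms can be evaluated directly. The proportions below follow from applying the procedure to this example (5 buckets of width 1.8 over the combined range [1, 10], with zero proportions replaced by ε); this is a sanity check, not part of the required function:

```python
import math

eps = 0.0001
# baseline mass sits entirely in the first two buckets,
# current mass entirely in the last two; the middle bucket is empty for both
b = [0.8, 0.2, eps, eps, eps]
c = [eps, eps, eps, 0.4, 0.6]

terms = [(ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c)]
print(round(sum(terms), 4))  # 17.2439
```

The middle bucket contributes exactly zero (both proportions equal ε), while the four epsilon-smoothed mismatched buckets each contribute a large positive term such as (0.8 − ε) · ln(0.8 / ε) ≈ 7.19.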
Constraints