Imagine trying to understand why a deep neural network classified a particular image as a "husky" rather than a "wolf." The network has millions of parameters, nonlinear activations, and complex internal representations. How can we possibly explain its decision?
LIME (Local Interpretable Model-agnostic Explanations), introduced by Ribeiro, Singh, and Guestrin in 2016, offers an elegant solution: Don't try to interpret the complex model globally. Instead, approximate it locally with a simple, interpretable model.
The key insight is that even highly nonlinear models behave approximately linearly in small neighborhoods. Around any specific prediction, we can fit a simple model (like linear regression) that mimics the complex model's behavior locally. This simple surrogate inherits the original model's predictions nearby while being inherently interpretable.
LIME became wildly popular because of its simplicity, flexibility, and the publication's memorable name. It works on any model—neural networks, gradient boosting, random forests, SVMs—without requiring access to model internals.
This page covers LIME comprehensively: the mathematical framework, implementation details, variants for different data types, critical evaluation of its limitations, and practical guidelines for production use. You'll understand when LIME is appropriate and when alternatives like SHAP are preferable.
LIME operates on a simple but powerful principle: complex models are locally simple.
LIME seeks an explanation $g$ from a class of interpretable models $G$ (e.g., linear models) that minimizes:
$$\xi(x) = \underset{g \in G}{\text{argmin}} \; \mathcal{L}(f, g, \pi_x) + \Omega(g)$$
Where:
- $f$ is the black-box model being explained
- $g$ is the interpretable surrogate, drawn from the class $G$ (e.g., sparse linear models)
- $\pi_x$ is a proximity kernel that weights samples by their closeness to $x$
- $\mathcal{L}$ measures how poorly $g$ approximates $f$ in the neighborhood defined by $\pi_x$
- $\Omega(g)$ penalizes the complexity of $g$ (e.g., the number of nonzero weights)
LIME is conceptually similar to Taylor series approximation. Just as any smooth function can be approximated linearly in a small neighborhood (f(x+δ) ≈ f(x) + ∇f·δ), any model can be approximated by a linear model near any point. LIME finds that linear approximation empirically.
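The analogy can be checked numerically: a first-order approximation built from a finite-difference gradient tracks a nonlinear function closely near the expansion point and poorly far away. This is a toy sketch of the idea, not LIME itself:

```python
import numpy as np

# Nonlinear "model": f(x) = sin(3x) + x^2
f = lambda x: np.sin(3 * x) + x ** 2

x0 = 0.5
# Numerical gradient at x0 (central difference)
h = 1e-5
grad = (f(x0 + h) - f(x0 - h)) / (2 * h)

# First-order Taylor approximation around x0
linear = lambda x: f(x0) + grad * (x - x0)

# Near x0 the linear surrogate is accurate...
near_err = abs(f(0.51) - linear(0.51))
# ...far from x0 it is not
far_err = abs(f(2.0) - linear(2.0))
print(near_err < 1e-3, far_err > 0.5)  # True True
```

LIME plays the same game empirically: instead of computing a gradient, it samples points around $x$ and fits the linear approximation by weighted regression.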
A crucial aspect of LIME is the interpretable representation $x' \in \{0,1\}^{d'}$: a binary vector over $d'$ human-understandable components, which may differ from the model's raw input features.
For tabular data, this might be: binary indicators for whether each feature falls in the same discretized bin (e.g., quartile) as the original instance's value.
For text: presence or absence of individual words (a binary bag-of-words over the words in the document).
For images: presence or absence of superpixels (contiguous patches of visually similar pixels).
The interpretable representation must be: binary (each component is on or off), meaningful to a human, and convertible back to the original input space so the model can be queried on perturbations.
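As a minimal illustration for text, a perturbation in the interpretable space is a binary mask over words that maps back to a shortened document. The `perturb_text` helper below is illustrative, not part of the lime package:

```python
import numpy as np

# Sketch: text interpretable representation as word presence/absence
text = "the quick brown fox jumps"
words = text.split()          # d' = 5 interpretable components

def perturb_text(z_prime):
    """Map binary vector z' back to raw text by dropping absent words."""
    return " ".join(w for w, keep in zip(words, z_prime) if keep)

z_prime = np.array([1, 0, 1, 1, 0])   # drop "quick" and "jumps"
print(perturb_text(z_prime))          # the brown fox
```

The model is then queried on the perturbed text, while the surrogate is fit on the binary vector `z_prime`.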
Let's formalize the LIME algorithm step by step.
For tabular data, LIME typically:
1. Discretizes continuous features (e.g., into quartiles) to form the interpretable representation
2. Samples perturbations around the instance using training-data statistics (per-feature mean and standard deviation)
3. Maps each perturbation back to the original feature space so the black-box model can be queried
Each perturbation $z$ receives a weight based on its distance from the original instance $x$:
$$\pi_x(z) = \exp\left(-\frac{D(x, z)^2}{\sigma^2}\right)$$
This exponential kernel ensures:
- perturbations identical to $x$ receive weight 1
- nearby perturbations dominate the surrogate fit
- distant perturbations are smoothly down-weighted toward 0, at a rate set by the kernel width $\sigma$
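A quick numeric sketch of the kernel's decay (values chosen for illustration):

```python
import numpy as np

def proximity_weight(d, sigma):
    """Exponential kernel: pi_x(z) = exp(-d^2 / sigma^2)."""
    return np.exp(-(d ** 2) / (sigma ** 2))

sigma = 1.0
for d in [0.0, 0.5, 1.0, 2.0]:
    print(f"distance={d}: weight={proximity_weight(d, sigma):.4f}")
# distance 0 gives weight 1.0; weight falls to ~0.37 at d=sigma
# and is nearly 0 at twice the kernel width
```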
For each perturbation $z_i$, query the model to get prediction $f(z_i)$. This is the expensive step—requires many model evaluations.
Solve the weighted least-squares problem:
$$\min_{w} \sum_{i} \pi_x(z_i) \cdot (f(z_i) - w^\top z'_i)^2 + \lambda \|w\|_1$$
where:
- $z'_i$ is the interpretable representation of perturbation $z_i$
- $\pi_x(z_i)$ is its proximity weight
- $w$ are the surrogate's coefficients (the explanation)
- $\lambda$ controls the strength of the L1 sparsity penalty
The fitted weights $w$ are the feature attributions. Large positive $w_j$ means feature $j$ pushed the prediction higher. Large negative means it pushed lower.
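To make the surrogate-fitting step concrete, here is a minimal sketch that solves the weighted regression in closed form on toy data. It uses an L2 penalty in place of LIME's L1, since ridge has a closed-form solution; the data and numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 200 perturbations, 3 features, a nonlinear "model" f
Z = rng.normal(size=(200, 3))
f_z = Z[:, 0] + 0.5 * Z[:, 1] ** 2      # black-box outputs
x = np.zeros(3)                          # instance to explain
d = np.linalg.norm(Z - x, axis=1)
pi = np.exp(-d ** 2)                     # proximity weights

# Weighted ridge via the normal equations
lam = 1e-2
W = np.diag(pi)
w = np.linalg.solve(Z.T @ W @ Z + lam * np.eye(3), Z.T @ W @ f_z)
print(np.round(w, 2))   # feature 0 dominates, as expected
```

Feature 0 recovers a coefficient near 1 (its true linear effect), while feature 2, which the model ignores, gets a coefficient near 0.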
```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics.pairwise import euclidean_distances
from typing import Callable, List


class SimpleLIME:
    """
    Simplified LIME implementation for educational purposes.
    For production, use the 'lime' package.
    """

    def __init__(
        self,
        model_predict: Callable,
        feature_names: List[str],
        kernel_width: float = 0.75,
        num_samples: int = 5000,
        random_state: int = 42
    ):
        """
        Parameters
        ----------
        model_predict : callable returning predictions
            (probabilities for classification)
        feature_names : list of feature names
        kernel_width : controls locality (as fraction of sqrt(num_features))
        num_samples : number of perturbations to generate
        """
        self.model_predict = model_predict
        self.feature_names = feature_names
        self.kernel_width = kernel_width
        self.num_samples = num_samples
        self.rng = np.random.RandomState(random_state)

    def explain(
        self,
        instance: np.ndarray,
        training_data: np.ndarray,
        num_features: int = 10
    ) -> dict:
        """
        Explain a single prediction.

        Parameters
        ----------
        instance : 1D array of feature values
        training_data : 2D array used for computing statistics
        num_features : number of top features to return

        Returns
        -------
        dict with explanation components
        """
        instance = instance.flatten()
        n_features = len(instance)

        # Compute training data statistics for sampling
        mean = training_data.mean(axis=0)
        std = training_data.std(axis=0) + 1e-8  # Avoid division by zero

        # Generate perturbations by sampling from a Normal distribution
        # centered at the instance
        perturbations = self.rng.normal(
            loc=instance,
            scale=std,
            size=(self.num_samples, n_features)
        )

        # First perturbation is the original instance
        perturbations[0] = instance

        # Query the black-box model
        predictions = self.model_predict(perturbations)
        if predictions.ndim > 1:
            # Classification: take probability of the predicted class
            pred_class = self.model_predict(instance.reshape(1, -1)).argmax()
            predictions = predictions[:, pred_class]

        # Compute distances from the instance
        distances = euclidean_distances(
            perturbations, instance.reshape(1, -1)
        ).flatten()

        # Compute proximity weights using the exponential kernel
        kernel_width = self.kernel_width * np.sqrt(n_features)
        weights = np.exp(-(distances ** 2) / (kernel_width ** 2))

        # Standardize features so coefficients are comparable
        perturbations_scaled = (perturbations - mean) / std
        instance_scaled = (instance - mean) / std

        # Weighted ridge regression as the local surrogate
        # (LIME's original formulation uses Lasso for sparsity; here
        # sparsity comes from keeping only the top coefficients)
        model = Ridge(alpha=1.0)
        model.fit(perturbations_scaled, predictions, sample_weight=weights)

        # Extract top features by absolute coefficient
        coefficients = model.coef_
        top_indices = np.argsort(np.abs(coefficients))[::-1][:num_features]

        explanation = {
            'instance': instance,
            'prediction': self.model_predict(instance.reshape(1, -1)),
            'intercept': model.intercept_,
            'local_prediction': model.predict(
                instance_scaled.reshape(1, -1)
            )[0],
            'top_features': [
                {
                    'feature': self.feature_names[i],
                    'coefficient': coefficients[i],
                    'value': instance[i],
                    'contribution': coefficients[i] * instance_scaled[i]
                }
                for i in top_indices
            ],
            'all_coefficients': {
                self.feature_names[i]: coefficients[i]
                for i in range(n_features)
            },
            'r2_score': model.score(
                perturbations_scaled, predictions, sample_weight=weights
            )
        }
        return explanation


# Example usage
if __name__ == "__main__":
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split

    # Load data and train model
    data = load_breast_cancer()
    X_train, X_test, y_train, y_test = train_test_split(
        data.data, data.target, test_size=0.2, random_state=42
    )

    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)

    # Create LIME explainer
    lime_explainer = SimpleLIME(
        model_predict=model.predict_proba,
        feature_names=list(data.feature_names),
        num_samples=5000
    )

    # Explain a prediction
    idx = 0
    explanation = lime_explainer.explain(
        X_test[idx], training_data=X_train, num_features=5
    )

    print(f"Prediction: {explanation['prediction']}")
    print(f"Local R² Score: {explanation['r2_score']:.4f}")
    print("Top Features:")
    for feat in explanation['top_features']:
        direction = "↑" if feat['coefficient'] > 0 else "↓"
        print(f"  {feat['feature']}: {feat['coefficient']:+.4f} {direction}")
```

The official lime Python package provides robust, well-tested implementations for tabular data, text, and images.
```bash
pip install lime
```
```python
import lime
import lime.lime_tabular
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load data
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)

# Train model
model = GradientBoostingClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Create LIME explainer
explainer = lime.lime_tabular.LimeTabularExplainer(
    training_data=X_train,
    feature_names=data.feature_names,
    class_names=['malignant', 'benign'],
    mode='classification',
    discretize_continuous=True,  # Discretize into quartiles
    random_state=42
)

# Explain a single prediction
idx = 0
instance = X_test[idx]

explanation = explainer.explain_instance(
    data_row=instance,
    predict_fn=model.predict_proba,
    num_features=10,
    num_samples=5000
)

# Print explanation
print(f"Instance {idx}")
print(f"True label: {y_test[idx]} ({data.target_names[y_test[idx]]})")
print(f"Predicted: {model.predict([instance])[0]}")
print(f"Predicted proba: {model.predict_proba([instance])[0]}")

print("Top contributing features:")
for feature, weight in explanation.as_list():
    direction = "→ benign" if weight > 0 else "→ malignant"
    print(f"  {feature}: {weight:+.4f} {direction}")

# Show local prediction accuracy
print(f"Local model R² (fidelity): {explanation.score:.4f}")

# Visualize in notebook
# explanation.show_in_notebook()

# Or save to HTML
# explanation.save_to_file('lime_explanation.html')

# Get raw map of feature index to weight
feature_weights = dict(explanation.local_exp[1])  # Class 1 (benign)
print("Feature weights (raw):", feature_weights)
```
```python
import lime
import lime.lime_text
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import fetch_20newsgroups

# Load text data
categories = ['alt.atheism', 'soc.religion.christian']
newsgroups = fetch_20newsgroups(
    subset='train',
    categories=categories,
    remove=('headers', 'footers', 'quotes')
)

# Train text classifier
vectorizer = TfidfVectorizer(max_features=5000)
classifier = LogisticRegression(max_iter=1000, random_state=42)

pipeline = make_pipeline(vectorizer, classifier)
pipeline.fit(newsgroups.data, newsgroups.target)

# Create LIME text explainer
explainer = lime.lime_text.LimeTextExplainer(
    class_names=newsgroups.target_names,
    random_state=42
)

# Explain a prediction
idx = 0
text = newsgroups.data[idx]

explanation = explainer.explain_instance(
    text_instance=text,
    classifier_fn=pipeline.predict_proba,
    num_features=10,
    num_samples=1000
)

print(f"Text: {text[:200]}...")
print(f"Predicted class: {newsgroups.target_names[pipeline.predict([text])[0]]}")
print(f"True class: {newsgroups.target_names[newsgroups.target[idx]]}")

print("Explanation (words that influenced prediction):")
for word, weight in explanation.as_list():
    class_direction = (newsgroups.target_names[1] if weight > 0
                       else newsgroups.target_names[0])
    print(f"  '{word}': {weight:+.4f} → {class_direction}")
```

For images, LIME works with superpixels—contiguous regions of similar pixels. It explains which superpixels contributed to the prediction.
```python
import lime
from lime import lime_image
from skimage.segmentation import mark_boundaries
import numpy as np
import matplotlib.pyplot as plt

# Assume we have a trained image classifier
# from tensorflow.keras.applications import ResNet50
# model = ResNet50(weights='imagenet')

# Create LIME image explainer
explainer = lime_image.LimeImageExplainer(random_state=42)

# Explain a prediction
# image = load_and_preprocess_image('cat.jpg')  # Shape: (224, 224, 3)

def explain_image_prediction(model, image, explainer):
    """
    Explain an image classification prediction.

    Parameters
    ----------
    model : image classifier with predict method
    image : numpy array of shape (H, W, 3)
    explainer : LimeImageExplainer instance
    """
    # Explain instance
    explanation = explainer.explain_instance(
        image,
        model.predict,       # Prediction function
        top_labels=3,        # Explain top 3 predictions
        hide_color=0,        # Black for hidden superpixels
        num_samples=1000
    )

    # Get explanation for top predicted class
    top_label = explanation.top_labels[0]

    # Get image showing positive contributions (green)
    temp, mask = explanation.get_image_and_mask(
        label=top_label,
        positive_only=True,
        num_features=5,      # Top 5 superpixels
        hide_rest=False
    )

    # Visualize
    fig, axes = plt.subplots(1, 3, figsize=(15, 5))

    axes[0].imshow(image)
    axes[0].set_title('Original Image')

    axes[1].imshow(mark_boundaries(temp, mask))
    axes[1].set_title(f'Positive regions for class {top_label}')

    # Show both positive (green) and negative (red) contributions
    temp2, mask2 = explanation.get_image_and_mask(
        label=top_label,
        positive_only=False,
        num_features=10,
        hide_rest=False
    )
    axes[2].imshow(mark_boundaries(temp2, mask2))
    axes[2].set_title('Positive (green) & Negative (red)')

    plt.tight_layout()
    plt.savefig('lime_image_explanation.png', dpi=150)
    plt.show()

    return explanation

# Example usage (with actual model):
# explanation = explain_image_prediction(model, image, explainer)
```

The kernel function in LIME controls locality—how quickly influence decays with distance. Understanding and tuning it is crucial for quality explanations.
LIME's default kernel:
$$\pi_x(z) = \exp\left(-\frac{D(x, z)^2}{\sigma^2}\right)$$
Kernel width $\sigma$ has profound effects:
| $\sigma$ Value | Locality | Behavior |
|---|---|---|
| Very small | Very local | Explanation only valid for near-identical instances |
| Small | Local | Captures local decision boundary |
| Medium | Moderate | Balances locality with stability |
| Large | Global | Approaches global linear approximation |
| Very large | Global | Ignores locality entirely |
LIME's default: `kernel_width = 0.75 * sqrt(num_features)`
This heuristic works reasonably but isn't optimal for all cases: it ignores the local density of the data and the curvature of the decision boundary near the instance, so points near sharp boundaries often need a smaller width while sparse regions may need a larger one.
LIME computes distances in different spaces:
Tabular data: Euclidean distance on standardized features
Text: Cosine similarity on word presence vectors (or Euclidean on binary vectors)
Images: Distance on superpixel presence/absence
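As a sketch of the difference between these distance spaces, the snippet below computes a Euclidean distance on (already standardized) tabular features and a cosine distance between binary presence vectors; the specific numbers are illustrative:

```python
import numpy as np

# Tabular: Euclidean distance on standardized features
x = np.array([1.2, -0.3, 0.8])
z = np.array([1.0,  0.1, 0.5])
euclid = np.linalg.norm(x - z)

# Text/images: distance on binary presence vectors, with the original
# instance represented as the all-ones vector (everything present)
x_bin = np.ones(5)                    # original: all words/superpixels on
z_bin = np.array([1, 0, 1, 1, 0])     # perturbation: two components removed
cosine_dist = 1 - (x_bin @ z_bin) / (
    np.linalg.norm(x_bin) * np.linalg.norm(z_bin)
)
print(round(euclid, 3), round(cosine_dist, 3))
```

Either distance is then passed through the exponential kernel to produce the sample weights.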
```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
import lime.lime_tabular

# Create dataset with clear decision boundary
X, y = make_moons(n_samples=500, noise=0.15, random_state=42)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X, y)

# Point to explain (near decision boundary)
test_point = np.array([0.0, 0.25])

# Test different kernel widths
kernel_widths = [0.1, 0.5, 1.0, 2.0, 5.0]

fig, axes = plt.subplots(1, len(kernel_widths), figsize=(20, 4))

for ax, kw in zip(axes, kernel_widths):
    explainer = lime.lime_tabular.LimeTabularExplainer(
        training_data=X,
        feature_names=['x1', 'x2'],
        class_names=['Class 0', 'Class 1'],
        mode='classification',
        kernel_width=kw,
        random_state=42
    )

    explanation = explainer.explain_instance(
        test_point,
        model.predict_proba,
        num_features=2,
        num_samples=5000
    )

    # Get coefficients
    weights = dict(explanation.as_map()[1])
    coef = [weights.get(i, 0) for i in range(2)]

    # Plot decision boundary
    xx, yy = np.meshgrid(
        np.linspace(-1.5, 2.5, 100),
        np.linspace(-1, 1.5, 100)
    )
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)

    ax.contourf(xx, yy, Z, alpha=0.3, cmap='RdBu')
    ax.scatter(X[:, 0], X[:, 1], c=y, cmap='RdBu', s=10, alpha=0.5)
    ax.scatter(*test_point, s=200, c='yellow', edgecolor='black', zorder=5)

    ax.set_title(f'kernel_width = {kw}\nR² = {explanation.score:.3f}')
    ax.set_xlim(-1.5, 2.5)
    ax.set_ylim(-1, 1.5)

plt.suptitle('Effect of Kernel Width on LIME Explanations')
plt.tight_layout()
plt.savefig('kernel_width_analysis.png', dpi=150)
plt.show()
```

LIME explanations can change significantly with kernel width. Always verify that your chosen kernel width produces high local fidelity (R² score near 1.0) and conduct sensitivity analysis if explanations will be used for important decisions.
LIME and SHAP both provide feature attributions but differ fundamentally in their approach, theoretical guarantees, and practical behavior.
SHAP: Grounded in Shapley values from game theory. Unique attribution satisfying efficiency, symmetry, null player, and consistency axioms. Mathematically principled.
LIME: Grounded in local surrogate modeling. Optimizes local fidelity + simplicity. No uniqueness theorem; results depend on hyperparameters.
SHAP: Exact decomposition. SHAP values + base value = prediction. Always, by construction.
LIME: Approximate. The local linear model approximates the prediction with some error (measured by R² score). May not equal prediction exactly.
SHAP (TreeSHAP): Deterministic. Same input gives identical explanation.
LIME: Stochastic. Random sampling means repeated runs give slightly different explanations. Can be significant near decision boundaries.
| Aspect | LIME | SHAP |
|---|---|---|
| Theoretical basis | Local linear approximation | Shapley values (game theory) |
| Uniqueness | No (depends on hyperparameters) | Yes (unique fair attribution) |
| Local accuracy | Approximate (R² score) | Exact (sums to prediction) |
| Stability | Stochastic (varies between runs) | Deterministic (for TreeSHAP) |
| Computational cost | O(num_samples × inference) | O(2^p) exact, efficient for trees |
| Model-specific versions | No (always perturb+fit) | Yes (TreeSHAP, DeepSHAP) |
| Hyperparameter sensitivity | High (kernel width, num_samples) | Lower (background distribution) |
| Interpretable to humans | Linear weights + feature conditions | Additive contributions |
```python
import numpy as np
import shap
import lime.lime_tabular
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load data and train model
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Instance to explain
idx = 0
instance = X_test[idx]

# SHAP explanation
shap_explainer = shap.TreeExplainer(model)
shap_values = shap_explainer.shap_values(instance.reshape(1, -1))

# LIME explanation (run multiple times to assess stability)
lime_explainer = lime.lime_tabular.LimeTabularExplainer(
    training_data=X_train,
    feature_names=data.feature_names,
    class_names=['malignant', 'benign'],
    mode='classification',
    random_state=42
)

lime_explanation = lime_explainer.explain_instance(
    instance,
    model.predict_proba,
    num_features=10,
    num_samples=5000
)

# Compare top features
print("=" * 60)
print("COMPARISON: SHAP vs LIME")
print("=" * 60)

print("SHAP Top 10 Features (for class 1 - benign):")
shap_class1 = shap_values[1][0]  # Class 1 SHAP values
shap_sorted = sorted(
    zip(data.feature_names, shap_class1),
    key=lambda x: abs(x[1]), reverse=True
)[:10]
for name, val in shap_sorted:
    print(f"  {name:>30}: {val:+.4f}")

print("LIME Top 10 Features (for class 1 - benign):")
lime_features = lime_explanation.as_list(label=1)[:10]
for condition, weight in lime_features:
    print(f"  {condition:>40}: {weight:+.4f}")

# Check stability of LIME across runs
print("\n" + "=" * 60)
print("LIME STABILITY CHECK (5 runs)")
print("=" * 60)

lime_runs = []
for seed in range(5):
    # explain_instance has no random_state argument; vary the seed on
    # the explainer itself so each run draws different perturbations
    seeded_explainer = lime.lime_tabular.LimeTabularExplainer(
        training_data=X_train,
        feature_names=data.feature_names,
        class_names=['malignant', 'benign'],
        mode='classification',
        random_state=seed
    )
    exp = seeded_explainer.explain_instance(
        instance,
        model.predict_proba,
        num_features=10,
        num_samples=5000
    )
    weights = dict(exp.as_map()[1])
    lime_runs.append(weights)

# Check variance in top feature rankings
print("Top feature by importance across runs:")
for run_idx, weights in enumerate(lime_runs):
    top_feat = max(weights.items(), key=lambda x: abs(x[1]))
    print(f"  Run {run_idx}: Feature {top_feat[0]} = {top_feat[1]:.4f}")

print("SHAP is deterministic - same result every time")
print(f"LIME R² (local fidelity): {lime_explanation.score:.4f}")
```

LIME has known issues with stability (consistency across runs) and fidelity (how well the explanation matches the model). Understanding these is crucial for responsible use.
LIME's randomness comes from: the random sampling of perturbations around the instance, and (when enabled) the random feature-selection step used to build the sparse surrogate.
Consequence: Run LIME twice, get different explanations. This is problematic for: regulatory settings that require reproducible justifications, user-facing explanations where inconsistency erodes trust, and debugging workflows that compare explanations across model versions.
One approach: Jaccard similarity of top-k features across runs.
$$\text{Stability} = \frac{1}{\binom{R}{2}} \sum_{i<j} \frac{|F_i \cap F_j|}{|F_i \cup F_j|}$$
where $F_i$ is the set of top-k features in run $i$.
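A small worked example of this stability score, using hypothetical top-5 feature sets from three runs:

```python
from itertools import combinations

# Top-5 feature sets from three hypothetical LIME runs
runs = [
    {"radius", "texture", "area", "smoothness", "symmetry"},
    {"radius", "texture", "area", "concavity", "symmetry"},
    {"radius", "texture", "perimeter", "concavity", "symmetry"},
]

# Average pairwise Jaccard similarity over the C(3,2) = 3 pairs
scores = [len(a & b) / len(a | b) for a, b in combinations(runs, 2)]
stability = sum(scores) / len(scores)
print(round(stability, 3))  # 0.587
```

A score near 1.0 means the same features surface on every run; values much below that signal an unstable explanation.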
```python
import numpy as np
from itertools import combinations


def lime_stability_analysis(
    explainer, instance, predict_fn,
    num_runs=10, top_k=5, num_samples=5000
):
    """
    Analyze stability of LIME explanations across multiple runs.
    """
    all_top_features = []
    all_weights = []
    all_fidelity = []

    for run in range(num_runs):
        # Each call draws a fresh set of perturbations, so repeated
        # calls naturally vary (explain_instance has no seed argument)
        exp = explainer.explain_instance(
            instance,
            predict_fn,
            num_features=top_k,
            num_samples=num_samples
        )

        # Extract top feature indices and weights
        # (as_map returns label -> list of (feature, weight) pairs)
        weights = dict(exp.as_map()[1])
        all_top_features.append(set(weights.keys()))
        all_weights.append(weights)

        # Store local fidelity
        all_fidelity.append(exp.score)

    # Compute pairwise Jaccard similarity
    jaccard_scores = []
    for f1, f2 in combinations(all_top_features, 2):
        intersection = len(f1 & f2)
        union = len(f1 | f2)
        jaccard_scores.append(intersection / union if union > 0 else 0)

    # Compute weight correlation across runs
    # First, get all unique features
    all_feat_indices = set()
    for weights in all_weights:
        all_feat_indices.update(weights.keys())

    weight_matrix = np.zeros((num_runs, len(all_feat_indices)))
    feat_to_idx = {f: i for i, f in enumerate(all_feat_indices)}

    for run, weights in enumerate(all_weights):
        for feat, weight in weights.items():
            weight_matrix[run, feat_to_idx[feat]] = weight

    # Pairwise correlations
    correlations = []
    for i, j in combinations(range(num_runs), 2):
        corr = np.corrcoef(weight_matrix[i], weight_matrix[j])[0, 1]
        if not np.isnan(corr):
            correlations.append(corr)

    results = {
        'mean_jaccard': np.mean(jaccard_scores),
        'std_jaccard': np.std(jaccard_scores),
        'min_jaccard': np.min(jaccard_scores),
        'mean_correlation': np.mean(correlations) if correlations else 0,
        'mean_fidelity': np.mean(all_fidelity),
        'std_fidelity': np.std(all_fidelity),
        'num_runs': num_runs
    }
    return results


# Usage
# stability = lime_stability_analysis(lime_explainer, instance, model.predict_proba)
# print(f"Mean Jaccard (top feature overlap): {stability['mean_jaccard']:.3f}")
# print(f"Mean Weight Correlation: {stability['mean_correlation']:.3f}")
# print(f"Mean Local Fidelity: {stability['mean_fidelity']:.3f}")
```

Fidelity measures how well the local linear model approximates the black-box model. Poor fidelity means the explanation doesn't accurately reflect the model's behavior.
Causes of low fidelity: a highly nonlinear decision surface near the instance, strong feature interactions that a linear surrogate cannot capture, a kernel width that is too large (averaging over dissimilar behavior), or too few perturbation samples.
Best practice: Always check the R² score (reported as explanation.score). Treat explanations with R² < 0.8 skeptically.
Deploying LIME in production requires addressing computational cost, stability guarantees, and explanation formatting.
LIME requires `num_samples` model evaluations per explanation. For:
- fast models (linear, small trees), this is negligible (milliseconds)
- large ensembles, each explanation can take seconds
- deep networks without batched GPU inference, a single explanation can take minutes

Batch the perturbations through the model and cache explanations where possible.
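A back-of-envelope sketch of the cost; the per-batch timings below are illustrative assumptions, not measurements:

```python
# Estimate explanation latency from assumed inference costs
num_samples = 5000
per_batch_ms = {           # assumed cost per 1000-row prediction batch
    "logistic regression": 1,
    "gradient boosting": 20,
    "deep neural network": 200,
}
batch_size = 1000
batches = num_samples / batch_size

for model_name, ms in per_batch_ms.items():
    print(f"{model_name}: ~{batches * ms / 1000:.2f} s per explanation")
```

At these assumed rates, explaining 10,000 instances with a deep network would take hours, so batching, sampling a subset, or switching to a model-specific method becomes necessary.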
For regulatory and auditing purposes, explanations must be reproducible: fix the random seed, log all hyperparameters (`num_samples`, kernel width, `num_features`), record the model version, and store a hash of the explained instance.
```python
import lime.lime_tabular
import hashlib
import json
from datetime import datetime
from dataclasses import dataclass, asdict
from typing import List, Dict, Any


@dataclass
class LIMEExplanationRecord:
    """Audit record for LIME explanation."""
    instance_id: str
    instance_hash: str
    timestamp: str
    prediction: float
    explanation: List[Dict[str, Any]]
    fidelity_score: float
    hyperparameters: Dict[str, Any]
    model_version: str

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)


class ProductionLIMEExplainer:
    """Production-ready LIME wrapper with reproducibility and logging."""

    def __init__(
        self,
        training_data,
        feature_names,
        class_names,
        model_version: str,
        num_samples: int = 5000,
        random_state: int = 42
    ):
        self.explainer = lime.lime_tabular.LimeTabularExplainer(
            training_data=training_data,
            feature_names=feature_names,
            class_names=class_names,
            mode='classification',
            discretize_continuous=True,
            random_state=random_state
        )
        self.model_version = model_version
        self.num_samples = num_samples
        self.random_state = random_state
        self.feature_names = feature_names
        self.class_names = class_names

    def explain_with_record(
        self,
        instance,
        predict_fn,
        instance_id: str,
        num_features: int = 10,
        fidelity_threshold: float = 0.7
    ) -> LIMEExplanationRecord:
        """Generate a LIME explanation with a full audit record."""
        # Hash instance for reproducibility verification
        instance_hash = hashlib.md5(instance.tobytes()).hexdigest()

        # Generate explanation
        explanation = self.explainer.explain_instance(
            instance,
            predict_fn,
            num_features=num_features,
            num_samples=self.num_samples
        )

        # Check fidelity
        fidelity = explanation.score
        if fidelity < fidelity_threshold:
            print(f"WARNING: Low fidelity ({fidelity:.3f}) "
                  f"for instance {instance_id}")

        # Format explanation
        formatted_explanation = [
            {
                'condition': cond,
                'weight': weight,
                'direction': 'positive' if weight > 0 else 'negative'
            }
            for cond, weight in explanation.as_list()
        ]

        # Get prediction
        prediction = predict_fn(instance.reshape(1, -1))[0]
        if hasattr(prediction, '__len__'):
            prediction = float(max(prediction))

        # Create audit record
        record = LIMEExplanationRecord(
            instance_id=instance_id,
            instance_hash=instance_hash,
            timestamp=datetime.utcnow().isoformat(),
            prediction=float(prediction),
            explanation=formatted_explanation,
            fidelity_score=fidelity,
            hyperparameters={
                'num_samples': self.num_samples,
                'num_features': num_features,
                'random_state': self.random_state
            },
            model_version=self.model_version
        )
        return record

    def batch_explain(
        self,
        instances,
        instance_ids,
        predict_fn,
        num_features: int = 10
    ) -> List[LIMEExplanationRecord]:
        """Explain multiple instances with progress tracking."""
        records = []
        for i, (instance, inst_id) in enumerate(zip(instances, instance_ids)):
            record = self.explain_with_record(
                instance, predict_fn, inst_id, num_features
            )
            records.append(record)
            if (i + 1) % 10 == 0:
                print(f"Explained {i+1}/{len(instances)} instances")
        return records


# Usage
# prod_explainer = ProductionLIMEExplainer(
#     training_data=X_train,
#     feature_names=feature_names,
#     class_names=class_names,
#     model_version="v1.2.3",
#     num_samples=5000,
#     random_state=42  # Fixed for reproducibility
# )
# record = prod_explainer.explain_with_record(instance, model.predict_proba, "user_123")
# print(record.to_json())
```

For regulated industries (finance, healthcare), store: instance hash, explanation, hyperparameters, model version, and timestamp. This enables reproduction and verification of past explanations.
LIME pioneered practical model-agnostic explanations and remains valuable despite its limitations. Let's consolidate the key insights:
- LIME fits an interpretable surrogate on weighted perturbations around a single prediction; explanations are local, approximate, and stochastic.
- Kernel width is the most consequential hyperparameter: too small gives unstable fits, too large loses locality.
- Always check local fidelity (the R² score) and stability across runs before acting on an explanation.
- Prefer SHAP when theoretical guarantees, exactness, or determinism matter; LIME's flexibility shines for arbitrary black boxes, text, and images.
You now understand LIME—a widely-used but theoretically weaker alternative to SHAP. Next, we'll explore Integrated Gradients, a gradient-based attribution method specifically designed for differentiable models like neural networks, which satisfies important theoretical properties while being computationally efficient.