Consider two questions a loan officer might ask about an ML-based credit decision system:
"Why was this specific applicant denied?" — The officer needs to explain to John Smith why his application was rejected. What factors specific to his application led to this outcome?
"How does the model generally work?" — The compliance team needs to audit the system. What factors does the model consider important overall? What patterns drive its decisions across all applicants?
These are fundamentally different questions requiring fundamentally different answers. The first asks for a local explanation—understanding a single prediction. The second asks for a global explanation—understanding the model's overall behavior.
This local vs. global distinction is the second major dimension of interpretability, orthogonal to the intrinsic/post-hoc distinction we explored earlier.
By the end of this page, you will understand: the precise distinction between local and global interpretability, the key methods for each scope, how local explanations can be aggregated to create global understanding, the different stakeholder needs each serves, and how to combine both for complete model understanding.
Local interpretability explains why a model made a specific prediction for a specific input. It answers the question: "Given this particular input, why did the model produce this particular output?"
Local explanations are instance-specific. The explanation for one input may be completely different from the explanation for another, because different features may be relevant in different contexts.
Example: Medical diagnosis
Consider a model that predicts diabetes risk. For two patients:
Patient A (predicted high risk): the prediction is driven primarily by elevated fasting glucose.
Patient B (also predicted high risk): fasting glucose is normal; instead, factors such as BMI and family history drive the prediction.
Both patients receive high-risk predictions, but for completely different reasons. A global explanation saying 'fasting glucose is the most important feature' would be misleading for Patient B. Local interpretability captures this instance-specific nuance.
Key insight: Local explanations reveal that a model may behave very differently in different regions of the input space. What matters for one prediction may be irrelevant for another.
Local interpretability can be thought of as describing the model's decision boundary in the vicinity of the specific input. Even if the global decision boundary is complex and non-linear, it can often be approximated locally by a simple linear function—this is the core insight behind methods like LIME.
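This local-linear idea can be sketched from scratch in a few lines. Everything below (the toy `black_box` function, the kernel width, the sample count) is an illustrative assumption, not LIME's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def black_box(X):
    """A deliberately non-linear 'model' returning a probability-like score."""
    return 1.0 / (1.0 + np.exp(-(np.sin(3 * X[:, 0]) + X[:, 1] ** 2)))

x0 = np.array([0.5, -0.2])  # the instance we want to explain

# 1. Perturb: sample points in the neighborhood of x0
Z = x0 + rng.normal(scale=0.1, size=(500, 2))
# 2. Query: get black-box predictions for the perturbed points
y = black_box(Z)
# 3. Weight: closer samples count more (Gaussian proximity kernel)
w = np.exp(-np.sum((Z - x0) ** 2, axis=1) / 0.1 ** 2)
# 4. Fit a weighted linear model; its coefficients are the local explanation
A = np.hstack([Z, np.ones((len(Z), 1))])   # features plus intercept column
sw = np.sqrt(w)
coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)

print("Local slopes:", np.round(coef[:2], 3))
print("Local fit at x0:", round(float(coef[:2] @ x0 + coef[2]), 3),
      "vs black box:", round(float(black_box(x0[None, :])[0]), 3))
```

Even though `black_box` is globally non-linear, the weighted linear fit reproduces its behavior closely in the neighborhood of `x0`, which is exactly the approximation local methods exploit.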
Several methods have been developed specifically for local interpretability. Each takes a different approach to explaining individual predictions.
LIME (Local Interpretable Model-agnostic Explanations)
Core idea: Approximate the complex model locally with a simple, interpretable linear model. Even if the global decision boundary is complex, it can often be approximated by a linear function in the neighborhood of any specific point.
Algorithm:

1. Sample perturbed points in the neighborhood of the instance to explain.
2. Query the black-box model for its predictions on those points.
3. Weight each point by its proximity to the original instance.
4. Fit a sparse, weighted linear model; its coefficients are the local explanation.

Strengths:

- Model-agnostic: works with any model that exposes a prediction function.
- Fast enough for interactive use.
- Produces sparse, intuitive explanations.

Limitations:

- Explanations can be unstable: rerunning with different random perturbations may change the result.
- Sensitive to the choice of neighborhood size (kernel width).
- The linear approximation can be unfaithful where the decision boundary is highly curved.
Best for: Quick, model-agnostic local explanations when perfect faithfulness isn't critical.
```python
# Assumes X_train, y_train, X_test, feature_names, train_df, and
# continuous_features are already defined for a binary classification task.
import numpy as np
import pandas as pd
import shap
import lime
import lime.lime_tabular
from sklearn.ensemble import GradientBoostingClassifier

# Train model
model = GradientBoostingClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Select instance to explain
instance_idx = 42
instance = X_test[instance_idx:instance_idx + 1]
prediction = model.predict(instance)[0]
probability = model.predict_proba(instance)[0, 1]

print(f"Instance {instance_idx}: Prediction = {prediction}, P(positive) = {probability:.3f}")
print("=" * 60)

# Method 1: SHAP Local Explanation
# For GradientBoostingClassifier, TreeExplainer returns a single array of
# log-odds contributions; some other models return one array per class.
explainer_shap = shap.TreeExplainer(model)
shap_values = explainer_shap.shap_values(instance)[0]
base_value = float(np.ravel(explainer_shap.expected_value)[0])

print("📊 SHAP Local Explanation:")
print(f"Base value (expected): {base_value:.4f}")
print(f"Sum of SHAP values: {shap_values.sum():.4f}")
print(f"Prediction log-odds: {base_value + shap_values.sum():.4f}")
print("Top 5 feature contributions:")
contributions = list(zip(feature_names, instance[0], shap_values))
for name, val, shap_val in sorted(contributions, key=lambda x: -abs(x[2]))[:5]:
    direction = "↑" if shap_val > 0 else "↓"
    print(f"  {name}={val:.2f}: {shap_val:+.4f} {direction}")

# Method 2: LIME Local Explanation
explainer_lime = lime.lime_tabular.LimeTabularExplainer(
    X_train,
    feature_names=feature_names,
    class_names=['Negative', 'Positive'],
    mode='classification',
)

lime_exp = explainer_lime.explain_instance(
    instance[0], model.predict_proba, num_features=5
)

print("📊 LIME Local Explanation:")
print(f"Local model intercept: {lime_exp.intercept[1]:.4f}")
print(f"Local prediction: {lime_exp.local_pred[0]:.4f}")
print("Top 5 local linear coefficients:")
for feature_rule, weight in lime_exp.as_list()[:5]:
    direction = "↑" if weight > 0 else "↓"
    print(f"  {feature_rule}: {weight:+.4f} {direction}")

# Method 3: Counterfactual Explanation (using DiCE)
try:
    import dice_ml

    # Set up DiCE
    data = dice_ml.Data(
        dataframe=train_df,
        continuous_features=continuous_features,
        outcome_name='target',
    )
    dice_model = dice_ml.Model(model=model, backend='sklearn')
    exp = dice_ml.Dice(data, dice_model, method='random')

    # Generate counterfactuals
    query_instance = pd.DataFrame([instance[0]], columns=feature_names)
    cf = exp.generate_counterfactuals(
        query_instance, total_CFs=3, desired_class='opposite'
    )
    print("📊 Counterfactual Explanations:")
    print("To flip the prediction, consider these changes:")
    cf.visualize_as_dataframe()
except ImportError:
    print("(DiCE not installed - skipping counterfactual example)")

# Comparison: Do methods agree?
print("\n" + "=" * 60)
print("📊 Method Agreement Analysis")
print("=" * 60)

# Get top 3 features from each method
# (LIME returns rule strings like "income <= 1.2"; splitting on whitespace is
# a crude way to recover the feature name)
shap_top3 = {name for name, _, _ in sorted(contributions, key=lambda x: -abs(x[2]))[:3]}
lime_top3 = {feat.split()[0] for feat, _ in lime_exp.as_list()[:3]}

overlap = shap_top3.intersection(lime_top3)
print(f"SHAP top 3: {shap_top3}")
print(f"LIME top 3: {lime_top3}")
print(f"Agreement: {len(overlap)}/3 features overlap")

# Some disagreement is expected because the methods answer subtly different
# questions; high disagreement may indicate instability or complex local behavior.
```

Global interpretability explains the model's overall behavior across all inputs. It answers questions like: "What features does this model generally consider important? What patterns has it learned? How does it behave across the input space?"
Global explanations provide a bird's-eye view of the model, abstracting away instance-specific details to reveal general principles.
The abstraction challenge:
Global explanations necessarily involve abstraction. A complex model may behave differently in different regions of the input space. A global explanation must somehow summarize this heterogeneous behavior.
Consider a credit model: for applicants with thin credit files, income may dominate the prediction, while for established borrowers, payment history may matter far more.
A global explanation might say 'income is the most important feature on average,' but this average obscures the heterogeneous local behavior. This is both the power and the limitation of global interpretability—it simplifies, which aids comprehension but can mislead.
Types of global explanation:

- Feature importance rankings (which features matter most overall)
- Feature-effect curves such as partial dependence plots (how a feature's value changes predictions on average)
- Global surrogate models (a simple model trained to mimic the complex one)
- Extracted rule sets (human-readable decision rules that approximate the model)
Global explanations can be dangerously misleading when local behavior varies significantly. Simpson's paradox applies: a feature might have positive effect globally but negative effect for specific subgroups. Always combine global explanations with local analysis for high-stakes applications.
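To make the Simpson's paradox warning concrete, here is a small synthetic demonstration (all numbers are invented): within each of two subgroups the feature's effect on the outcome is negative, yet pooling the groups produces a positive global slope.

```python
import numpy as np

rng = np.random.default_rng(42)

def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    return np.cov(x, y, bias=True)[0, 1] / np.var(x)

# Within each subgroup, y DECREASES with x ...
x_a = rng.normal(0.0, 1.0, 500)
y_a = -0.5 * x_a + rng.normal(0.0, 0.3, 500)
# ... but group B sits at both higher x AND higher y
x_b = rng.normal(4.0, 1.0, 500)
y_b = -0.5 * (x_b - 4.0) + 3.0 + rng.normal(0.0, 0.3, 500)

x = np.concatenate([x_a, x_b])
y = np.concatenate([y_a, y_b])

print(f"Global slope:  {ols_slope(x, y):+.2f}")      # positive: driven by group baselines
print(f"Group A slope: {ols_slope(x_a, y_a):+.2f}")  # negative
print(f"Group B slope: {ols_slope(x_b, y_b):+.2f}")  # negative
```

A global importance or effect estimate computed on the pooled data would report the opposite sign from what every individual experiences, which is exactly why subgroup-level analysis matters in high-stakes settings.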
Global interpretability methods provide various lenses for understanding overall model behavior.
Global Feature Importance Methods
Feature importance methods rank features by their overall contribution to model predictions.
Permutation Importance: randomly shuffle one feature's values and measure the resulting drop in model performance; a large drop means the model relies heavily on that feature.
Impurity-based Importance (Trees): sum the impurity reduction attributable to each feature across all splits; computed essentially for free during training, but biased toward high-cardinality features.
SHAP Global Importance: average the absolute SHAP values of each feature across a dataset; axiomatically grounded and consistent with local explanations, but the slowest to compute.
Comparison:
| Method | Speed | Bias Issues | Handles Interactions | Theory |
|---|---|---|---|---|
| Permutation | Medium | Low | Yes | Sound |
| Impurity | Fast | High-cardinality bias | Partial | Heuristic |
| SHAP | Slow | Low | Yes | Axiomatic |
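The permutation method in the table can be implemented from scratch in a few lines. The synthetic data, threshold rule, and repeat count below are illustrative assumptions, and for brevity the importance is measured on the training data (in practice, use a held-out set):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
# Feature 0 dominates the label, feature 2 contributes weakly, feature 1 is noise
y = (X[:, 0] + 0.1 * X[:, 2] > 0).astype(int)

model = GradientBoostingClassifier(random_state=0).fit(X, y)
baseline = accuracy_score(y, model.predict(X))

importances = []
for j in range(X.shape[1]):
    scores = []
    for _ in range(10):                            # repeat to average out shuffle noise
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])       # break the feature-target association
        scores.append(accuracy_score(y, model.predict(Xp)))
    importances.append(baseline - np.mean(scores)) # performance drop = importance

for j, imp in enumerate(importances):
    print(f"feature {j}: importance drop = {imp:.3f}")
```

Shuffling the dominant feature destroys most of the model's accuracy, while shuffling the noise feature barely moves it, which is the intuition the method formalizes.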
```python
# Assumes X_train, y_train, X_test, y_test, and feature_names are already defined.
import numpy as np
import matplotlib.pyplot as plt
import shap
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance, PartialDependenceDisplay
from sklearn.tree import DecisionTreeClassifier, export_text

# Train complex model
model = GradientBoostingClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# ============================================
# Method 1: Permutation Importance
# ============================================
print("📊 Permutation Feature Importance")
print("=" * 50)

perm_imp = permutation_importance(
    model, X_test, y_test, n_repeats=30, random_state=42
)

for name, imp, std in sorted(
        zip(feature_names, perm_imp.importances_mean, perm_imp.importances_std),
        key=lambda x: -x[1]):
    print(f"  {name}: {imp:.4f} ± {std:.4f}")

# ============================================
# Method 2: SHAP Global Importance
# ============================================
print("📊 SHAP Global Feature Importance")
print("=" * 50)

# For GradientBoostingClassifier, TreeExplainer returns a single array of
# log-odds contributions with shape (n_samples, n_features).
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Mean absolute SHAP value per feature
shap_importance = np.abs(shap_values).mean(axis=0)
for name, imp in sorted(zip(feature_names, shap_importance), key=lambda x: -x[1]):
    print(f"  {name}: {imp:.4f}")

# SHAP summary plot (beeswarm) - shows the distribution of local effects
shap.summary_plot(shap_values, X_test, feature_names=feature_names, show=False)
plt.title("SHAP Summary Plot")
plt.tight_layout()
plt.savefig("shap_summary.png", dpi=150)
plt.close()

# ============================================
# Method 3: Partial Dependence Plots
# ============================================
print("📊 Generating Partial Dependence Plots...")

fig, axes = plt.subplots(2, 3, figsize=(15, 10))
top_features = [name for name, _ in sorted(
    zip(feature_names, shap_importance), key=lambda x: -x[1])[:6]]

for ax, feature in zip(axes.flatten(), top_features):
    feature_idx = list(feature_names).index(feature)
    PartialDependenceDisplay.from_estimator(
        model, X_test, [feature_idx],
        feature_names=feature_names, ax=ax
    )
    ax.set_title(f"PDP: {feature}")

plt.tight_layout()
plt.savefig("pdp_plots.png", dpi=150)
plt.close()
print("  Saved to pdp_plots.png")

# ============================================
# Method 4: Global Surrogate Model
# ============================================
print("📊 Global Surrogate Model Analysis")
print("=" * 50)

# Use black-box predictions (not the true labels) as the surrogate's target
bb_predictions = model.predict(X_train)

# Train an interpretable surrogate
surrogate = DecisionTreeClassifier(max_depth=5, random_state=42)
surrogate.fit(X_train, bb_predictions)

# Fidelity: how often does the surrogate agree with the black box?
train_fidelity = (surrogate.predict(X_train) == bb_predictions).mean()
test_fidelity = (surrogate.predict(X_test) == model.predict(X_test)).mean()

print(f"  Surrogate fidelity (train): {train_fidelity:.3f}")
print(f"  Surrogate fidelity (test): {test_fidelity:.3f}")

if test_fidelity > 0.85:
    print("  ✓ High fidelity - surrogate explanations are reasonably faithful")
    print("  Decision tree rules (simplified global explanation):")
    print(export_text(surrogate, feature_names=list(feature_names), max_depth=3))
else:
    print("  ⚠ Low fidelity - interpret surrogate with caution")

# ============================================
# Comparing Global Methods
# ============================================
print("📊 Cross-Method Agreement")
print("=" * 50)

# Get top 5 from each method
perm_top5 = {name for name, _ in sorted(
    zip(feature_names, perm_imp.importances_mean), key=lambda x: -x[1])[:5]}
shap_top5 = {name for name, _ in sorted(
    zip(feature_names, shap_importance), key=lambda x: -x[1])[:5]}

print(f"  Permutation top 5: {perm_top5}")
print(f"  SHAP top 5: {shap_top5}")
print(f"  Agreement: {len(perm_top5.intersection(shap_top5))}/5 features")
```

A powerful approach to global interpretability is aggregating local explanations.
This preserves the faithfulness of local methods while providing global insight.
The key insight: If we have a local explanation for every prediction, we can analyze patterns in these explanations to understand global behavior.
Why aggregation beats direct global methods:
Preserves heterogeneity: By keeping individual explanations before aggregating, we can analyze variation, not just averages.
Consistent interpretation: Local and global explanations use the same units and logic. SHAP values mean the same thing locally and globally.
Faithful to complex models: Direct global methods sometimes oversimplify. Aggregating faithful local explanations maintains accuracy.
Enables stratified analysis: We can aggregate differently for different subgroups, revealing heterogeneous treatment.
Supports auditing: Regulators can examine both individual explanations and aggregate patterns.
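The aggregation idea can be demonstrated without any explanation library. For a linear model with weights w (and independent features), the exact SHAP value of feature j for instance i is simply w_j (x_ij − mean_j), so a plain NumPy sketch suffices (model and data invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
w = np.array([2.0, -1.0, 0.0])        # linear model weights

# Local explanations: one attribution vector per instance
phi = w * (X - X.mean(axis=0))        # shape (200, 3)

# Each row explains its prediction exactly: attributions + base value = prediction
pred = X @ w
assert np.allclose(phi.sum(axis=1) + X.mean(axis=0) @ w, pred)

# Global importance = mean |attribution| across all instances
global_importance = np.abs(phi).mean(axis=0)
print("Global importance:", np.round(global_importance, 2))
# Feature 0 (|w|=2) ranks above feature 1 (|w|=1); feature 2 contributes nothing
```

Because every individual attribution vector is retained, the same `phi` matrix also supports subgroup averages, variance analysis, and clustering, not just the single global ranking.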
SHAP's summary plot (beeswarm plot) is the quintessential local-to-global visualization. Each dot is a local SHAP value, positioned by feature importance and magnitude. The distribution of dots reveals both feature rankings AND how feature values affect predictions. This single visualization bridges local and global interpretability beautifully.
```python
# Assumes model, X_test, X_test_df (a DataFrame with a 'demographic' column),
# and feature_names are already defined.
import numpy as np
import shap
from sklearn.cluster import KMeans

# Compute SHAP values for all test instances
# (for GradientBoostingClassifier this is a single (n_samples, n_features)
# array of log-odds contributions toward the positive class)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# ============================================
# Aggregation 1: Mean Absolute SHAP (Global Importance)
# ============================================
print("📊 Global Feature Importance (Mean |SHAP|)")
global_importance = np.abs(shap_values).mean(axis=0)
for name, imp in sorted(zip(feature_names, global_importance), key=lambda x: -x[1]):
    print(f"  {name}: {imp:.4f}")

# ============================================
# Aggregation 2: SHAP Distribution Analysis
# ============================================
print("📊 SHAP Value Distribution Analysis")
print("(Heterogeneity indicator: std/mean)")

for i, name in enumerate(feature_names):
    mean_abs = np.abs(shap_values[:, i]).mean()
    std = np.abs(shap_values[:, i]).std()
    heterogeneity = std / mean_abs if mean_abs > 0 else 0
    print(f"  {name}: mean={mean_abs:.4f}, std={std:.4f}, CV={heterogeneity:.2f}")

# High CV = heterogeneous effect (feature matters much more for some instances)
# Low CV = homogeneous effect (feature matters similarly across instances)

# ============================================
# Aggregation 3: Subgroup Analysis
# ============================================
print("📊 Subgroup SHAP Analysis (by demographic)")

# Reveals whether the model applies different logic to different groups
for group in ['A', 'B']:
    mask = (X_test_df['demographic'] == group).to_numpy()
    group_importance = np.abs(shap_values[mask]).mean(axis=0)
    print(f"  Group {group} (n={mask.sum()}):")
    for name, imp in sorted(zip(feature_names, group_importance), key=lambda x: -x[1])[:3]:
        print(f"    {name}: {imp:.4f}")

# If the top features differ between groups, the model uses different decision logic

# ============================================
# Aggregation 4: Explanation Clustering
# ============================================
print("📊 Explanation Clustering (finding decision modes)")

# Cluster instances by their SHAP profiles
n_clusters = 3
kmeans = KMeans(n_clusters=n_clusters, random_state=42)
cluster_labels = kmeans.fit_predict(shap_values)

for c in range(n_clusters):
    mask = cluster_labels == c
    mean_shap = shap_values[mask].mean(axis=0)
    print(f"  Cluster {c} (n={mask.sum()}, {100 * mask.mean():.1f}% of data):")
    # Top features for this cluster
    for idx in np.argsort(-np.abs(mean_shap))[:3]:
        direction = "↑" if mean_shap[idx] > 0 else "↓"
        print(f"    {feature_names[idx]}: {mean_shap[idx]:+.4f} {direction}")
    # Characterize the cluster by its average prediction
    print(f"    Avg prediction: {model.predict_proba(X_test[mask])[:, 1].mean():.3f}")

# Each cluster represents a distinct "explanation mode": one might be
# "approved due to high income", another "denied due to poor credit history".
```

Neither local nor global interpretability alone tells the complete story. A comprehensive interpretability strategy combines both perspectives, using each where it's most valuable.
| Scenario | Best Approach | Why |
|---|---|---|
| Explaining individual decisions to users | Local | Users need personalized, specific explanations |
| Regulatory compliance (ECOA reason codes) | Local | Regulations require per-decision explanations |
| Debugging specific misclassifications | Local | Need to understand what went wrong for that instance |
| Model documentation and model cards | Global | Stakeholders need overall behavior summary |
| Bias detection across populations | Global + Subgroup | Need to compare treatment across groups globally |
| Scientific discovery of patterns | Global | Looking for general relationships, not individual cases |
| Human-in-the-loop decision support | Both | Global for mental model, local for specific decision |
| Model comparison and selection | Global | Need to compare overall behavior, not individual predictions |
| Edge case analysis | Local with sampling | Aggregate local explanations for outlier instances |
| Stakeholder presentation | Both | Global overview + specific illustrative examples |
A comprehensive interpretability workflow:
Start with global analysis: identify the most important features, inspect their average effects (e.g., partial dependence), and form a mental model of overall behavior.
Stratify by subgroups: repeat the global analysis within relevant subgroups to check whether the model applies the same logic to each.
Deep dive with local analysis: explain representative individual predictions and verify that they are consistent with the global picture.
Examine edge cases: generate local explanations for outliers, near-boundary instances, and misclassifications, where global summaries are least reliable.
Synthesize and communicate: combine the global summary with illustrative local examples, tailored to each stakeholder's needs.
Think of global and local interpretability like a map. Global interpretability is like viewing a city from above—you see neighborhoods, major roads, and overall layout. Local interpretability is like street view—you see specific buildings, signs, and details. Both perspectives are needed to truly understand the territory.
Choosing between local and global methods—and how to combine them—depends on practical constraints and stakeholder needs.
| Constraint | Recommended Approach |
|---|---|
| Real-time explanations needed | LIME or cached SHAP; avoid expensive exact computation |
| Regulatory audit required | Global analysis + sampled local explanations; document methodology |
| User-facing explanation API | Local SHAP or counterfactuals with caching |
| Model comparison during development | Global importance + PDP; consistent across models |
| Bias detection | Subgroup global analysis + local examples of disparate treatment |
| Scientific publication | Both global and local with statistical significance tests |
We've explored the scope dimension of interpretability. Let's consolidate the key insights:

- Local interpretability explains a single prediction; global interpretability summarizes behavior across the entire input space.
- Local methods (LIME, SHAP, counterfactuals) capture instance-specific reasons that global averages can obscure.
- Global methods (feature importance, partial dependence, surrogates) support auditing, documentation, and model comparison, but necessarily abstract away heterogeneity.
- Aggregating faithful local explanations (e.g., SHAP values) yields global insight while preserving the ability to analyze variation and subgroups.
- The right scope depends on the stakeholder and the question being asked; high-stakes applications typically need both.
What's next:
We've now covered two major taxonomic dimensions: intrinsic vs. post-hoc (when interpretability is built) and local vs. global (what scope of behavior is explained). Next, we'll explore another important distinction: model-specific vs. model-agnostic methods. Should we use methods tailored to specific model types, or general methods that work across all models?
You now understand the local vs. global interpretability dimension. Local explanations answer 'why this prediction?' while global explanations answer 'how does the model work?' Mastering both perspectives—and knowing when to use each—is essential for comprehensive model understanding.