A point prediction—no matter how accurate—is fundamentally incomplete. When a weather app says "Tomorrow's high: 72°F," you naturally wonder: How sure are they? Could it be 65°F or 80°F? A prediction without uncertainty quantification is like a measurement reported without error bars.
Prediction intervals address this need by providing a range that is expected to contain the true outcome with a specified probability. Traditional approaches assume parametric error distributions (typically Gaussian), but quantile regression offers a distribution-free alternative that adapts to heteroscedastic, asymmetric, and heavy-tailed data.
The Appeal of Quantile-Based Prediction Intervals:
Quantile-based intervals adapt to asymmetry and heteroscedasticity, require no distributional assumptions, and—when combined with conformal calibration—come with finite-sample coverage guarantees.
By the end of this page, you will understand how to construct prediction intervals using quantile regression, evaluate their calibration, compare to Gaussian-based intervals, and leverage conformal prediction for finite-sample guarantees.
Let's clarify the distinction between point predictions, confidence intervals, and prediction intervals.
Point Prediction:
A single best guess, typically $\hat{y}(x) = \hat{\mathbb{E}}[Y \mid X = x]$ or $\hat{y}(x) = \hat{Q}_{0.5}(Y \mid X = x)$.
Confidence Interval for the Mean:
An interval $[L(x), U(x)]$ such that: $$P(\mathbb{E}[Y \mid X = x] \in [L(x), U(x)]) \geq 1 - \alpha$$
This covers the true conditional mean, which is a fixed (but unknown) quantity.
Prediction Interval:
An interval $[L(x), U(x)]$ such that: $$P(Y_{new} \in [L(x), U(x)] \mid X_{new} = x) \geq 1 - \alpha$$
This covers a future observation $Y_{new}$—a random variable.
| Aspect | Confidence Interval | Prediction Interval |
|---|---|---|
| Target | True conditional mean E[Y|X=x] | Future observation Y_new |
| Width | Decreases with n → ∞ | Converges to nonzero limit (irreducible noise) |
| Uncertainty source | Estimation uncertainty only | Estimation + inherent noise |
| Typical width | Narrow (with enough data) | Wider (includes noise variance) |
| OLS formula | ŷ ± t_{α/2} · SE(ŷ) | ŷ ± t_{α/2} · √(σ̂² + SE²(ŷ)) |
Key Insight:
Even with infinite data, prediction intervals remain wide because they must account for irreducible noise. Confidence intervals shrink toward zero width as sample size increases, but prediction intervals converge to a band reflecting the fundamental variability in $Y \mid X$.
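This asymmetry is easy to see numerically. The sketch below (illustrative, not from the text) fits OLS on simulated Gaussian data at two sample sizes and compares the half-widths of the confidence interval for the mean and the prediction interval at a fixed point:

```python
import numpy as np

rng = np.random.default_rng(0)

# Known model: Y = 2x + 5 + N(0, 1); evaluate intervals at x = 5
sigma = 1.0
for n in [100, 10_000]:
    x = rng.uniform(0, 10, n)
    y = 2 * x + 5 + rng.normal(0, sigma, n)
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - 2)          # residual variance estimate
    x0 = np.array([1.0, 5.0])
    # Standard error of the fitted mean at x0
    se_mean = np.sqrt(s2 * x0 @ np.linalg.inv(X.T @ X) @ x0)
    ci_half = 1.96 * se_mean                    # CI for the mean: shrinks with n
    pi_half = 1.96 * np.sqrt(s2 + se_mean**2)   # PI for a new obs: floored by sigma
    print(f"n={n:>6}: CI half-width {ci_half:.3f}, PI half-width {pi_half:.3f}")
```

As $n$ grows, the CI half-width collapses toward zero while the PI half-width settles near $1.96\,\sigma$—the irreducible-noise floor.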
Why Standard Prediction Intervals Fail:
Classic OLS-based prediction intervals assume: $$Y \mid X = x \sim N(x^\top \beta, \sigma^2)$$
This fails when:
- Errors are heteroscedastic (variance changes with $x$)
- The error distribution is skewed or asymmetric
- Errors are heavy-tailed, producing more extreme outcomes than a Gaussian predicts
Quantile regression intervals make none of these assumptions.
Many practitioners confuse confidence intervals with prediction intervals and report intervals that are far too narrow. A 95% prediction interval for individual outcomes will always be wider than a 95% confidence interval for the mean.
The fundamental principle is elegant: a $(1-\alpha)$ prediction interval is bounded by the $\alpha/2$ and $(1-\alpha/2)$ conditional quantiles.
Construction:
For a $(1-\alpha) \times 100\%$ prediction interval:
- Fit a quantile regression at level $\tau = \alpha/2$ for the lower bound
- Fit a quantile regression at level $\tau = 1 - \alpha/2$ for the upper bound
- Report $\text{PI}(x) = [\hat{Q}_{\alpha/2}(Y \mid X = x),\; \hat{Q}_{1-\alpha/2}(Y \mid X = x)]$
Example: 90% Prediction Interval
With $\alpha = 0.10$, fit models at $\tau = 0.05$ and $\tau = 0.95$; the interval is $[\hat{Q}_{0.05}(x), \hat{Q}_{0.95}(x)]$.
Interpretation: If the quantile models are correctly specified, 90% of future observations at covariate value $x$ should fall within this interval.
Theoretical Justification:
Under correct model specification:
$$P(Y \in [Q_{\alpha/2}(Y|X), Q_{1-\alpha/2}(Y|X)] \mid X = x) = 1 - \alpha$$
This follows directly from the definition of quantiles:
$$P(Y \leq Q_\tau(Y|X)) = \tau$$
So: $$P(Q_{\alpha/2}(Y|X) < Y \leq Q_{1-\alpha/2}(Y|X)) = (1-\alpha/2) - \alpha/2 = 1 - \alpha$$
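This identity can be checked numerically with empirical quantiles of any continuous distribution—a minimal sketch using a skewed exponential sample:

```python
import numpy as np

rng = np.random.default_rng(1)

# For a skewed distribution, check P(q_{0.05} < Y <= q_{0.95}) = 0.90
y = rng.exponential(scale=2.0, size=200_000)
lo, hi = np.quantile(y, [0.05, 0.95])
coverage = np.mean((y > lo) & (y <= hi))
print(f"[{lo:.2f}, {hi:.2f}]  empirical coverage: {coverage:.3f}")
```

Note the interval is asymmetric around the mean—the upper tail of the exponential pushes the 95th percentile far out—yet coverage is still 90%.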
Key Advantages:
Asymmetric intervals: The lower and upper bounds are fit independently, naturally adapting to skewed distributions
Heteroscedastic coverage: If variance increases with $x$, the quantile predictions automatically diverge, widening the interval
No normality requirement: Works for any continuous distribution
```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import QuantileRegressor
from sklearn.model_selection import train_test_split

np.random.seed(42)

# Generate heteroscedastic data with skewed errors
n = 1000
X = np.random.uniform(0, 10, n).reshape(-1, 1)

# Skewed, heteroscedastic errors (exponential-like)
# Variance and skewness increase with X
errors = np.random.exponential(scale=1 + 0.3 * X.ravel()) - (1 + 0.3 * X.ravel())
y = 2 * X.ravel() + 5 + errors

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit quantile regression models for 90% prediction interval
alpha = 0.10  # For 90% interval

model_lower = QuantileRegressor(quantile=alpha/2, alpha=0, solver='highs')
model_median = QuantileRegressor(quantile=0.5, alpha=0, solver='highs')
model_upper = QuantileRegressor(quantile=1-alpha/2, alpha=0, solver='highs')

model_lower.fit(X_train, y_train)
model_median.fit(X_train, y_train)
model_upper.fit(X_train, y_train)

# Predictions on test set
lower = model_lower.predict(X_test)
median = model_median.predict(X_test)
upper = model_upper.predict(X_test)

# Compute coverage
coverage = np.mean((y_test >= lower) & (y_test <= upper))
avg_width = np.mean(upper - lower)

print(f"90% Prediction Interval Evaluation:")
print(f"  Empirical Coverage: {coverage:.1%}")
print(f"  Average Width: {avg_width:.2f}")
print(f"  Expected Coverage: 90%")

# Visualization
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Plot 1: Data and prediction intervals
ax1 = axes[0]
sort_idx = np.argsort(X_test.ravel())
X_sorted = X_test.ravel()[sort_idx]

ax1.scatter(X_test, y_test, alpha=0.5, s=20, c='gray', label='Test Data')
ax1.plot(X_sorted, lower[sort_idx], 'b--', linewidth=2, label='5th percentile')
ax1.plot(X_sorted, median[sort_idx], 'r-', linewidth=2, label='Median')
ax1.plot(X_sorted, upper[sort_idx], 'b--', linewidth=2, label='95th percentile')
ax1.fill_between(X_sorted, lower[sort_idx], upper[sort_idx],
                 alpha=0.2, color='blue', label='90% PI')
ax1.set_xlabel('X', fontsize=12)
ax1.set_ylabel('y', fontsize=12)
ax1.set_title(f'Quantile Regression Prediction Intervals\n(Coverage: {coverage:.1%})', fontsize=14)
ax1.legend(loc='upper left')
ax1.grid(True, alpha=0.3)

# Plot 2: Interval width as function of X
ax2 = axes[1]
widths = upper - lower
ax2.scatter(X_test, widths, alpha=0.5, s=30)
ax2.set_xlabel('X', fontsize=12)
ax2.set_ylabel('Interval Width', fontsize=12)
ax2.set_title('Prediction Interval Width vs. Covariate\n(Captures Heteroscedasticity)', fontsize=14)
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
```

Notice how the interval width increases with X—exactly matching the increasing variance in the data. This happens automatically without explicitly modeling variance. Traditional normal-theory intervals would have constant width, providing incorrect coverage.
Let's formally compare quantile-based intervals to traditional Gaussian assumptions.
Standard (Gaussian) Prediction Interval:
For OLS with homoscedastic Gaussian errors: $$\text{PI}_{1-\alpha}(x) = \hat{y}(x) \pm t_{n-p,\,1-\alpha/2} \cdot \hat{\sigma} \sqrt{1 + x^\top(X^\top X)^{-1}x}$$
where:
- $\hat{\sigma}$ is the residual standard deviation
- $t_{n-p,\,1-\alpha/2}$ is a Student-$t$ critical value with $n - p$ degrees of freedom
- $x^\top(X^\top X)^{-1}x$ accounts for estimation uncertainty in $\hat{y}(x)$
Assumptions Embedded in This Formula:
- Gaussian errors
- Constant variance (homoscedasticity) across all $x$
- A correctly specified linear mean
```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression, QuantileRegressor

np.random.seed(42)
n = 500

# Generate heavily skewed, heteroscedastic data
X = np.random.uniform(0, 10, n).reshape(-1, 1)

# Skewed Chi-squared errors with increasing variance
df_chi = 3
errors = (np.random.chisquare(df_chi, n) - df_chi) * (0.5 + 0.2 * X.ravel())
y = 3 * X.ravel() + 10 + errors

# Method 1: Gaussian-based prediction interval
ols = LinearRegression().fit(X, y)
y_pred_ols = ols.predict(X)
residuals = y - y_pred_ols
sigma_hat = np.std(residuals, ddof=2)

# For new predictions (ignoring estimation uncertainty for simplicity)
X_grid = np.linspace(0, 10, 100).reshape(-1, 1)
y_pred_grid = ols.predict(X_grid)
gaussian_lower = y_pred_grid - 1.645 * sigma_hat  # 90% interval
gaussian_upper = y_pred_grid + 1.645 * sigma_hat

# Method 2: Quantile regression prediction interval
qr_lower = QuantileRegressor(quantile=0.05, alpha=0, solver='highs').fit(X, y)
qr_upper = QuantileRegressor(quantile=0.95, alpha=0, solver='highs').fit(X, y)
qr_lower_pred = qr_lower.predict(X_grid)
qr_upper_pred = qr_upper.predict(X_grid)

# Evaluate coverage
qr_coverage = np.mean((y >= qr_lower.predict(X)) & (y <= qr_upper.predict(X)))
gaussian_coverage = np.mean((y >= y_pred_ols - 1.645 * sigma_hat) &
                            (y <= y_pred_ols + 1.645 * sigma_hat))

print("90% Prediction Interval Coverage Comparison:")
print(f"  Quantile Regression: {qr_coverage:.1%}")
print(f"  Gaussian-based: {gaussian_coverage:.1%}")
print(f"  Target: 90%")

# Visualization
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

ax1 = axes[0]
ax1.scatter(X, y, alpha=0.3, s=20, c='gray')
ax1.plot(X_grid, y_pred_grid, 'k-', linewidth=2, label='OLS Mean')
ax1.fill_between(X_grid.ravel(), gaussian_lower, gaussian_upper, alpha=0.3, color='red',
                 label=f'Gaussian 90% PI (Cov: {gaussian_coverage:.0%})')
ax1.set_xlabel('X')
ax1.set_ylabel('y')
ax1.set_title('Gaussian Prediction Interval')
ax1.legend()
ax1.grid(True, alpha=0.3)

ax2 = axes[1]
ax2.scatter(X, y, alpha=0.3, s=20, c='gray')
qr_median = QuantileRegressor(quantile=0.5, alpha=0, solver='highs').fit(X, y)
ax2.plot(X_grid, qr_median.predict(X_grid), 'k-', linewidth=2, label='QR Median')
ax2.fill_between(X_grid.ravel(), qr_lower_pred, qr_upper_pred, alpha=0.3, color='blue',
                 label=f'Quantile 90% PI (Cov: {qr_coverage:.0%})')
ax2.set_xlabel('X')
ax2.set_ylabel('y')
ax2.set_title('Quantile Regression Prediction Interval')
ax2.legend()
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
```

Gaussian intervals perform well when data are truly normal and homoscedastic. For large samples from approximately normal populations, the difference may be negligible. The quantile approach shines in challenging cases: skewed data, heavy tails, or heteroscedasticity.
A prediction interval is only as good as its calibration—the match between stated and actual coverage.
Definition (Coverage):
The empirical coverage of a $(1-\alpha)$ prediction interval on a test set is:
$$\hat{C} = \frac{1}{n_{test}} \sum_{i=1}^{n_{test}} \mathbb{1}\{Y_i \in [L(X_i), U(X_i)]\}$$
A well-calibrated interval has $\hat{C} \approx 1 - \alpha$.
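The coverage statistic is one line of code. A tiny sketch (function name is illustrative) with a hand-checkable toy case:

```python
import numpy as np

def empirical_coverage(y_test, lower, upper):
    """Fraction of test outcomes falling inside their intervals."""
    y_test, lower, upper = map(np.asarray, (y_test, lower, upper))
    return np.mean((y_test >= lower) & (y_test <= upper))

# Toy check: intervals cover 3 of 4 points -> 0.75
y = np.array([1.0, 2.0, 3.0, 10.0])
lo = np.array([0.0, 1.5, 2.5, 3.0])
hi = np.array([1.5, 2.5, 3.5, 4.0])
print(empirical_coverage(y, lo, hi))  # 0.75
```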
Coverage Plots:
To assess calibration across multiple coverage levels:
- Construct intervals at several nominal levels (e.g., 10%, 20%, ..., 90%)
- Compute the empirical coverage of each on a held-out test set
- Plot empirical against nominal coverage; a well-calibrated model tracks the 45° line
Miscoverage Patterns:
| Pattern | Diagnosis | Cause |
|---|---|---|
| Empirical < Nominal | Under-coverage | Intervals too narrow; model overconfident |
| Empirical > Nominal | Over-coverage | Intervals too wide; model underconfident |
| Miscoverage at tails only | Tail misspecification | Model fails at extreme quantiles |
| Miscoverage depends on X | Conditional miscalibration | Model wrong for some covariate values |
Conditional Coverage:
Strong calibration requires covering correctly conditional on $X$: $$P(Y \in [L(X), U(X)] \mid X = x) = 1 - \alpha \quad \forall x$$
Marginal coverage (average over $X$) is weaker but easier to achieve.
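A quick way to probe conditional coverage is to bin the test set on $X$ and compute coverage within each bin. This sketch (the helper name is illustrative) shows how a constant-width interval can look fine marginally in some bins while badly under-covering where the noise is large:

```python
import numpy as np

def binned_coverage(x, y, lower, upper, n_bins=4):
    """Empirical coverage within equal-count bins of x
    (a rough diagnostic for conditional miscalibration)."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    covered = (y >= lower) & (y <= upper)
    out = []
    for i in range(n_bins):
        if i == n_bins - 1:
            mask = (x >= edges[i]) & (x <= edges[i + 1])
        else:
            mask = (x >= edges[i]) & (x < edges[i + 1])
        out.append(covered[mask].mean())
    return np.array(out)

# Constant-width intervals on heteroscedastic data: coverage drops for large x
rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 50_000)
y = 2 * x + rng.normal(0, 0.2 + 0.3 * x)
lower, upper = 2 * x - 2.0, 2 * x + 2.0   # fixed ±2 band
cov = binned_coverage(x, y, lower, upper)
print(cov)
```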
```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import QuantileRegressor
from sklearn.model_selection import train_test_split

def calibration_plot(X_train, y_train, X_test, y_test):
    """
    Generate calibration plot for quantile regression prediction intervals.
    """
    # Nominal coverage levels to evaluate
    nominal_levels = np.arange(0.1, 1.0, 0.1)  # 10%, 20%, ..., 90%
    empirical_coverage = []

    for level in nominal_levels:
        alpha = 1 - level
        tau_lower = alpha / 2
        tau_upper = 1 - alpha / 2

        # Fit quantile models
        model_lower = QuantileRegressor(quantile=tau_lower, alpha=0, solver='highs')
        model_upper = QuantileRegressor(quantile=tau_upper, alpha=0, solver='highs')
        model_lower.fit(X_train, y_train)
        model_upper.fit(X_train, y_train)

        # Predict on test set
        lower = model_lower.predict(X_test)
        upper = model_upper.predict(X_test)

        # Compute coverage
        covered = (y_test >= lower) & (y_test <= upper)
        empirical_coverage.append(np.mean(covered))

    # Plot
    fig, ax = plt.subplots(figsize=(8, 8))
    ax.plot([0, 1], [0, 1], 'k--', linewidth=2, label='Perfect Calibration')
    ax.plot(nominal_levels, empirical_coverage, 'bo-', markersize=10,
            linewidth=2, label='Quantile Regression')

    # Shaded region for acceptable deviation
    ax.fill_between([0, 1], [0 - 0.05, 1 - 0.05], [0 + 0.05, 1 + 0.05],
                    alpha=0.2, color='gray', label='±5% Tolerance')

    ax.set_xlabel('Nominal Coverage Level', fontsize=12)
    ax.set_ylabel('Empirical Coverage', fontsize=12)
    ax.set_title('Prediction Interval Calibration Plot', fontsize=14)
    ax.legend(loc='lower right')
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    ax.grid(True, alpha=0.3)
    ax.set_aspect('equal')

    return empirical_coverage

# Generate data
np.random.seed(42)
n = 2000
X = np.random.uniform(0, 10, n).reshape(-1, 1)
y = 2 * X.ravel() + np.random.normal(0, 1 + 0.3 * X.ravel(), n)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Generate calibration plot
coverage = calibration_plot(X_train, y_train, X_test, y_test)

print("Calibration Summary:")
print("-" * 40)
for level, emp in zip(np.arange(0.1, 1.0, 0.1), coverage):
    print(f"  Nominal {level:.0%}: Empirical {emp:.1%}")
```

If the linear quantile model is misspecified (the true conditional quantile function is nonlinear), calibration will suffer. Use flexible models (splines, gradient boosting, neural networks) when relationships are complex.
Standard quantile regression provides asymptotically valid intervals—coverage is correct as $n \to \infty$. But in finite samples, coverage may deviate from the nominal level due to estimation error.
Conformal prediction provides a framework for constructing prediction intervals with exact finite-sample coverage guarantees under minimal assumptions.
The Conformal Idea:
Instead of directly predicting quantiles, conformal prediction:
- Fits any point or quantile predictor on a training set
- Measures its errors (conformity scores) on a held-out calibration set
- Uses an empirical quantile of those scores to size the prediction interval
Split Conformal Prediction:
1. Split the data into a training set and a calibration set of size $n_{cal}$
2. Fit a predictor $\hat{\mu}$ on the training set
3. Compute scores $s_i = |y_i - \hat{\mu}(x_i)|$ on the calibration set
4. Set $\hat{q}$ to the $\lceil (1-\alpha)(n_{cal}+1) \rceil / n_{cal}$ empirical quantile of the scores
5. Report $\text{PI}(x) = [\hat{\mu}(x) - \hat{q},\; \hat{\mu}(x) + \hat{q}]$
Theorem (Finite-Sample Validity):
If $(X_i, Y_i)$ are exchangeable, the conformal interval has: $$P(Y_{new} \in \text{PI}(X_{new})) \geq 1 - \alpha$$
This holds exactly, not just asymptotically!
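A concrete instance of the theorem is split conformal with absolute-residual scores around an OLS fit—a symmetric, non-adaptive baseline (variable names and the simulated data are illustrative):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
n = 3000
X = rng.uniform(0, 10, (n, 1))
y = 2 * X.ravel() + 5 + rng.normal(0, 1, n)

# Split: fit on one part, calibrate on another, evaluate on the rest
X_fit, y_fit = X[:n//2], y[:n//2]
X_cal, y_cal = X[n//2:2*n//3], y[n//2:2*n//3]
X_new, y_new = X[2*n//3:], y[2*n//3:]

model = LinearRegression().fit(X_fit, y_fit)

# Conformity scores: absolute residuals on the calibration set
scores = np.abs(y_cal - model.predict(X_cal))

# Finite-sample-corrected quantile of the scores
alpha = 0.1
n_cal = len(scores)
q_hat = np.quantile(scores, min(1.0, np.ceil((1 - alpha) * (n_cal + 1)) / n_cal))

# Symmetric interval: prediction ± q_hat
pred = model.predict(X_new)
coverage = np.mean((y_new >= pred - q_hat) & (y_new <= pred + q_hat))
print(f"Split conformal coverage (target 90%): {coverage:.1%}")
```

The $(n_{cal}+1)/n_{cal}$ correction is what turns asymptotic coverage into an exact finite-sample guarantee. The limitation—constant interval width—is exactly what CQR (next) removes.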
Conformalized Quantile Regression (CQR):
CQR combines the adaptivity of quantile regression with conformal guarantees:
1. Fit lower and upper quantile regressors on a training set
2. On a separate calibration set, compute scores $s_i = \max(\hat{Q}_{\alpha/2}(x_i) - y_i,\; y_i - \hat{Q}_{1-\alpha/2}(x_i))$
3. Widen (or tighten) both bounds by the conformal quantile $\hat{q}$ of the scores
Advantages of CQR:
- Interval widths adapt to $x$ via the quantile regressors
- Marginal coverage holds exactly in finite samples under exchangeability
- Any base quantile estimator (linear, boosted, neural) can be conformalized
```python
import numpy as np
from sklearn.linear_model import QuantileRegressor

def conformalized_quantile_regression(X_train, y_train, X_cal, y_cal, X_test, alpha=0.1):
    """
    Conformalized Quantile Regression (CQR) for prediction intervals.

    Parameters
    ----------
    X_train : array (n_train, p)
        Training features
    y_train : array (n_train,)
        Training targets
    X_cal : array (n_cal, p)
        Calibration features
    y_cal : array (n_cal,)
        Calibration targets
    X_test : array (n_test, p)
        Test features
    alpha : float
        Miscoverage level (1 - alpha = coverage)

    Returns
    -------
    lower, upper : arrays
        Prediction interval bounds
    q_hat : float
        Conformal correction applied to both bounds
    """
    # Step 1: Fit quantile regressors on training data
    tau_lo = alpha / 2
    tau_hi = 1 - alpha / 2
    qr_lo = QuantileRegressor(quantile=tau_lo, alpha=0, solver='highs')
    qr_hi = QuantileRegressor(quantile=tau_hi, alpha=0, solver='highs')
    qr_lo.fit(X_train, y_train)
    qr_hi.fit(X_train, y_train)

    # Step 2: Compute conformity scores on calibration data
    q_lo_cal = qr_lo.predict(X_cal)
    q_hi_cal = qr_hi.predict(X_cal)
    scores = np.maximum(q_lo_cal - y_cal, y_cal - q_hi_cal)

    # Step 3: Find conformal quantile
    n_cal = len(y_cal)
    quantile_level = (1 - alpha) * (1 + 1/n_cal)
    q_hat = np.quantile(scores, min(quantile_level, 1.0))

    # Step 4: Construct conformalized intervals on test data
    q_lo_test = qr_lo.predict(X_test)
    q_hi_test = qr_hi.predict(X_test)
    lower = q_lo_test - q_hat
    upper = q_hi_test + q_hat

    return lower, upper, q_hat

# Example usage
np.random.seed(42)
n = 2000

# Generate heteroscedastic data
X = np.random.uniform(0, 10, n).reshape(-1, 1)
y = 2 * X.ravel() + np.random.normal(0, 0.5 + 0.3 * X.ravel(), n)

# Three-way split
n_train = int(0.5 * n)
n_cal = int(0.25 * n)

X_train, y_train = X[:n_train], y[:n_train]
X_cal, y_cal = X[n_train:n_train+n_cal], y[n_train:n_train+n_cal]
X_test, y_test = X[n_train+n_cal:], y[n_train+n_cal:]

# Apply CQR
alpha = 0.1
lower, upper, q_hat = conformalized_quantile_regression(
    X_train, y_train, X_cal, y_cal, X_test, alpha)

# Evaluate
coverage = np.mean((y_test >= lower) & (y_test <= upper))
avg_width = np.mean(upper - lower)

print(f"Conformalized Quantile Regression Results:")
print(f"  Target Coverage: {1-alpha:.0%}")
print(f"  Empirical Coverage: {coverage:.1%}")
print(f"  Average Interval Width: {avg_width:.2f}")
print(f"  Conformal Correction q̂: {q_hat:.3f}")
```

CQR provides: (1) adaptive interval widths from quantile regression, and (2) finite-sample coverage guarantees from conformal prediction. The conformal correction q̂ adjusts for any finite-sample miscalibration of the base quantile regressor.
Prediction intervals become more challenging in complex settings.
Multi-Output Prediction Regions:
When $Y \in \mathbb{R}^d$ is multivariate, we need prediction regions rather than intervals: a set $R(x) \subseteq \mathbb{R}^d$ with $P(Y_{new} \in R(x) \mid X_{new} = x) \geq 1 - \alpha$. Naively combining per-coordinate intervals miscalibrates the joint coverage unless the levels are adjusted (e.g., via a Bonferroni correction).
High-Dimensional Covariates:
With many features:
- Linear quantile models become more prone to misspecification
- Extreme-quantile estimates grow unstable as data thin out in the tails
Cautions:
- Separately fitted quantile levels can cross, producing invalid (inverted) intervals
- Flexible models can overfit, making training-set calibration look better than it is
Strategies for High Dimensions:
Regularized Quantile Regression: Add L1 or L2 penalty to quantile objective $$\min_\beta \sum_{i=1}^n \rho_\tau(y_i - x_i^\top \beta) + \lambda |\beta|_1$$
Gradient Boosting Quantile Regression: LightGBM and XGBoost support quantile loss
Deep Quantile Networks: Neural networks with pinball loss; can predict multiple quantiles simultaneously
Dimension Reduction: Apply PCA or other techniques before quantile regression
For high-dimensional problems, gradient boosting with quantile loss often outperforms linear quantile regression without requiring explicit regularization tuning—the ensemble structure provides implicit regularization.
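As one of the strategies above, gradient boosting with the pinball loss is available directly in scikit-learn via `GradientBoostingRegressor(loss='quantile')` (here `alpha` is the target quantile, not a regularization strength). A minimal sketch on simulated nonlinear, heteroscedastic data:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(4)
n = 3000
X = rng.uniform(0, 10, (n, 1))
# Nonlinear signal with heteroscedastic noise
y = 3 * np.sin(X.ravel()) + 0.5 * X.ravel() + rng.normal(0, 0.3 + 0.2 * X.ravel())

X_train, y_train = X[:2000], y[:2000]
X_test, y_test = X[2000:], y[2000:]

# One boosted model per interval bound, using the quantile (pinball) loss
gb_lo = GradientBoostingRegressor(loss='quantile', alpha=0.05, max_depth=3).fit(X_train, y_train)
gb_hi = GradientBoostingRegressor(loss='quantile', alpha=0.95, max_depth=3).fit(X_train, y_train)

lower, upper = gb_lo.predict(X_test), gb_hi.predict(X_test)
coverage = np.mean((y_test >= lower) & (y_test <= upper))
print(f"Boosted 90% PI empirical coverage: {coverage:.1%}")
```

A linear quantile regression could not track the sinusoidal signal here; the boosted bounds follow both the nonlinearity and the growing spread. The same boosted models can serve as the base estimators inside CQR.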
Deploying prediction intervals in production requires attention to several practical issues.
Interval Width as a Feature:
The width of prediction intervals can itself be informative: narrow intervals signal confident predictions, while wide intervals flag cases where the model is uncertain.
This enables adaptive decision-making systems that escalate uncertain cases.
Example: Loan Underwriting
| Predicted Range | Interval Width | Action |
|---|---|---|
| [$50K, $55K] | $5K (narrow) | Auto-approve if in policy |
| [$40K, $70K] | $30K (wide) | Flag for manual review |
| [$30K, $100K] | $70K (very wide) | Request additional documentation |
Use interval width to build intelligent systems that know when they don't know. This is crucial for safe deployment of ML models—confident predictions can be automated while uncertain predictions get human oversight.
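The routing logic in the table above reduces to a simple thresholding rule. A sketch (function name and dollar thresholds are illustrative, mirroring the table):

```python
def route_application(lower, upper, narrow=10_000, wide=50_000):
    """Map prediction-interval width to an action tier.

    Thresholds are illustrative, not from the text."""
    width = upper - lower
    if width <= narrow:
        return "auto-approve"
    elif width <= wide:
        return "manual review"
    return "request documentation"

print(route_application(50_000, 55_000))   # $5K width  -> auto-approve
print(route_application(40_000, 70_000))   # $30K width -> manual review
print(route_application(30_000, 100_000))  # $70K width -> request documentation
```

In production the thresholds would be tuned to the cost of manual review versus the cost of an erroneous automated decision.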
We have explored how quantile regression enables construction of prediction intervals with principled uncertainty quantification.
What's Next:
In the final page of this module, we'll explore robust estimation with quantile regression—how it naturally resists outliers, the connection to median regression, and when robustness becomes critical for reliable inference.
You now understand how to construct, evaluate, and deploy prediction intervals using quantile regression. These intervals adapt to heteroscedasticity, make no distributional assumptions, and—with conformal calibration—provide finite-sample coverage guarantees. Next, we'll complete the module by examining the robustness properties of quantile regression.