Standard LLE provides an elegant approach to nonlinear dimensionality reduction, but it has limitations. The algorithm can produce distorted embeddings in certain geometric configurations, struggles with noisy data, and lacks theoretical guarantees on the quality of the recovered parameterization.
Over the years, researchers have developed numerous LLE variants that address these shortcomings while preserving the core philosophy of local linear reconstruction. These variants offer improved theoretical properties, better robustness, and specialized capabilities for different data characteristics.
This page surveys the most important LLE variants: Hessian LLE (HLLE), Modified LLE (MLLE), Local Tangent Space Alignment (LTSA), and other extensions. Understanding these variants allows you to select the most appropriate algorithm for your specific manifold learning task.
By the end of this page, you will:
• Understand the limitations of standard LLE that motivated variants
• Master Hessian LLE and its improved theoretical guarantees
• Apply Modified LLE for multiple weight solutions
• Use LTSA for explicit tangent space alignment
• Select the appropriate variant for different scenarios
Before exploring variants, we must understand what problems they solve. Standard LLE has several important limitations:
Limitation 1: Ill-Posed Reconstruction (When k > D)
When the number of neighbors $k$ exceeds the input dimension $D$, the local Gram matrix is rank-deficient, requiring regularization. This regularization introduces bias and can distort the embedding.
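The rank deficiency is easy to see numerically. The toy sketch below (made-up data, illustrative values) shows that the local Gram matrix built from $k = 10$ neighbors in $D = 3$ dimensions has rank at most 3, and how the usual ridge term makes the weight system solvable at the cost of some bias.

```python
import numpy as np

# Toy illustration: with k = 10 neighbors in D = 3 dimensions, the local
# Gram matrix C has rank at most D < k, so C w = 1 is singular without
# regularization.
rng = np.random.default_rng(0)
D, k = 3, 10
x_i = rng.normal(size=D)
neighbors = x_i + 0.1 * rng.normal(size=(k, D))

eta = x_i - neighbors                 # (k, D) displacement vectors
C = eta @ eta.T                       # (k, k) local Gram matrix
print(np.linalg.matrix_rank(C))       # at most D = 3, far below k = 10

# Standard fix: add a small ridge before solving, at the cost of some bias
C_reg = C + 1e-3 * np.trace(C) / k * np.eye(k)
w = np.linalg.solve(C_reg, np.ones(k))
w /= w.sum()
```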
Limitation 2: Sensitivity to Neighborhood Selection
LLE is sensitive to how neighbors are chosen. If a neighborhood spans high curvature or includes points from different manifold "branches," the linear approximation fails.
Limitation 3: Non-Convex Neighborhoods
When a point lies on the boundary of a folded manifold, its Euclidean neighbors may not form a convex set in manifold coordinates. LLE's barycentric reconstruction assumes (implicitly) that the point can be represented as a convex combination.
Limitation 4: Lack of Isometry Guarantees
Standard LLE preserves local reconstruction weights but does not explicitly preserve geometric quantities like distances or angles. The embedding may distort these properties.
Limitation 5: Mapping to d ≠ Intrinsic Dimension
If we embed into dimension $d$ different from the true intrinsic dimension, LLE can produce collapsed or split embeddings.
| Failure Mode | Symptom | Cause | Variant Solution |
|---|---|---|---|
| Collapsed embedding | All points map near same location | d > intrinsic dim, or disconnected graph | Check eigenvalue spectrum |
| Distorted unfolding | Swiss roll 'twists' in embedding | k too large, curvature issues | HLLE, smaller k |
| Spurious connections | Distant manifold points merge | Short-circuits in neighborhood graph | Smaller k, better sampling |
| Boundary effects | Edge points distorted | Non-convex local geometry | MLLE handles multiple solutions |
| Unstable results | Different runs give different embeddings | Numerical sensitivity, regularization | LTSA more stable |
Signs that standard LLE is failing:
• Very small second eigenvalue (near zero)
• Many negative weights
• Embedding looks "crumpled" or distorted
• Results change dramatically with small parameter changes
• Reconstruction error is high relative to data variance
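A quick way to check some of these symptoms is to fit scikit-learn's `LocallyLinearEmbedding` at a few neighborhood sizes and inspect its `reconstruction_error_` attribute together with the spread of the embedding. The values of `k` below are illustrative, not a recommendation.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

X, color = make_swiss_roll(n_samples=1000, noise=0.1, random_state=42)

# Compare reconstruction error and embedding spread across neighborhood sizes.
for k in (5, 12, 30):
    lle = LocallyLinearEmbedding(n_neighbors=k, n_components=2, method='standard')
    Y = lle.fit_transform(X)
    # A collapsed embedding shows up as near-zero variance along a component.
    print(f"k={k}: reconstruction_error={lle.reconstruction_error_:.2e}, "
          f"component std={Y.std(axis=0)}")
```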
Hessian Locally Linear Embedding (HLLE) was introduced by Donoho and Grimes (2003) to address a fundamental limitation of standard LLE: the lack of guaranteed isometry.
The Key Insight:
While standard LLE preserves local reconstruction weights, it doesn't explicitly preserve the manifold's intrinsic geometry. HLLE instead minimizes a functional based on the Hessian of the mapping from manifold to embedding—ensuring that the mapping locally preserves the manifold's Riemannian structure.
Mathematical Foundation:
For a smooth embedding $f: \mathcal{M} \to \mathbb{R}^d$ of the manifold $\mathcal{M}$, the Hessian measures how the embedding curves:
$$H_f(\mathbf{x}) = \nabla^2 f(\mathbf{x})$$
A "flat" embedding (isometry to Euclidean space) has zero Hessian everywhere. HLLE seeks an embedding that minimizes the integrated squared Hessian.
The HLLE Functional:
$$\mathcal{E}_{HLLE} = \int_\mathcal{M} \|H_f\|_F^2 \, d\mu$$
In practice, this is estimated from the data using local PCA to estimate the tangent space at each point, then constructing discrete Hessian estimators.
Algorithm Outline:
1. Compute neighborhoods: Find $k$ nearest neighbors for each point
2. Estimate tangent spaces: Perform local PCA in each neighborhood to find the $d$-dimensional tangent basis
3. Construct local Hessian estimators: Build matrices $\mathbf{H}_i$ that estimate the Hessian at each point using a Taylor expansion in tangent coordinates
4. Form global quadratic form: Assemble the local estimators into a global matrix $\mathbf{M}_{HLLE}$
5. Eigen-decomposition: Find the $d$ smallest non-trivial eigenvectors of $\mathbf{M}_{HLLE}$
Local Hessian Construction:
For point $i$ with tangent basis $\mathbf{T}_i \in \mathbb{R}^{D \times d}$, project neighbors onto tangent coordinates: $$\mathbf{t}_j = \mathbf{T}_i^T (\mathbf{x}_j - \mathbf{x}_i)$$
Construct the local design matrix including constant, linear, and quadratic terms: $$\mathbf{Z}_i = [\mathbf{1}, \mathbf{t}, \text{vec}(\mathbf{t}\mathbf{t}^T)]$$
The orthogonal complement of the column space of $\mathbf{Z}_i$ (obtained from a full QR decomposition) provides the Hessian estimator $\mathbf{H}_i$.
```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from scipy.linalg import eigh, svd


def hessian_lle(X, n_neighbors=10, n_components=2):
    """
    Hessian Locally Linear Embedding (HLLE).

    Parameters:
    -----------
    X : ndarray of shape (N, D)
        Input data
    n_neighbors : int
        Number of neighbors (must be >= 1 + d + d*(d+1)/2)
    n_components : int
        Target dimensionality (d)

    Returns:
    --------
    Y : ndarray of shape (N, n_components)
        Embedding coordinates
    """
    N, D = X.shape
    d = n_components
    k = n_neighbors

    # Minimum neighbors for HLLE: 1 + d + d(d+1)/2
    dp = 1 + d + d * (d + 1) // 2
    if k < dp:
        raise ValueError(f"n_neighbors must be >= {dp} for d={d} components")

    # Find neighbors
    nn = NearestNeighbors(n_neighbors=k + 1)
    nn.fit(X)
    _, indices = nn.kneighbors(X)
    indices = indices[:, 1:]  # Remove self

    # Build the cost matrix
    M = np.zeros((N, N))

    for i in range(N):
        neighbor_idx = indices[i]

        # Center the neighborhood
        X_nbrs = X[neighbor_idx] - X[i]

        # SVD to get tangent space
        U, s, Vt = svd(X_nbrs, full_matrices=False)

        # Use top d right singular vectors as tangent basis
        tangent = Vt[:d].T  # Shape: (D, d)

        # Project onto tangent coordinates
        nbrs_tangent = X_nbrs @ tangent  # Shape: (k, d)

        # Build design matrix for the Hessian:
        # constant (1), linear terms (t), quadratic terms (t_p * t_q)

        # Constant term
        ones = np.ones((k, 1))

        # Linear terms
        linear = nbrs_tangent  # Shape: (k, d)

        # Quadratic terms: t_1^2, t_1*t_2, ..., t_d^2
        # Number of unique quadratic terms: d*(d+1)/2
        n_quad = d * (d + 1) // 2
        quadratic = np.zeros((k, n_quad))
        idx = 0
        for p in range(d):
            for q in range(p, d):
                quadratic[:, idx] = nbrs_tangent[:, p] * nbrs_tangent[:, q]
                idx += 1

        # Full design matrix
        Z = np.hstack([ones, linear, quadratic])  # Shape: (k, 1+d+n_quad)

        # QR decomposition to find the orthogonal complement of col(Z)
        Q, R = np.linalg.qr(Z, mode='complete')

        # The last columns of Q span the complement of the fitted
        # constant/linear/quadratic terms: these are the Hessian estimators
        n_null = k - (1 + d + n_quad)
        if n_null <= 0:
            # Not enough neighbors for a Hessian estimate
            continue

        W = Q[:, -n_null:]  # Null space basis

        # Local contribution to M
        H = W @ W.T  # Local Hessian estimator

        # Add to global matrix
        ix = np.ix_(neighbor_idx, neighbor_idx)
        M[ix] += H

    # Symmetrize
    M = (M + M.T) / 2

    # Eigendecomposition
    eigenvalues, eigenvectors = eigh(M)

    # Sort by eigenvalue
    idx = np.argsort(eigenvalues)
    eigenvalues = eigenvalues[idx]
    eigenvectors = eigenvectors[:, idx]

    # The null space of M has dimension d+1 (constants plus linear functions).
    # Discard the constant eigenvector and take the next d as the embedding.
    Y = eigenvectors[:, 1:d + 1] * np.sqrt(N)

    return Y


# Comparison demonstration
if __name__ == "__main__":
    from sklearn.datasets import make_swiss_roll
    from sklearn.manifold import LocallyLinearEmbedding
    import matplotlib.pyplot as plt

    # Generate data
    X, color = make_swiss_roll(n_samples=1000, noise=0.1, random_state=42)

    # Standard LLE
    lle = LocallyLinearEmbedding(n_neighbors=12, n_components=2, method='standard')
    Y_lle = lle.fit_transform(X)

    # HLLE from sklearn
    hlle = LocallyLinearEmbedding(n_neighbors=12, n_components=2, method='hessian')
    Y_hlle = hlle.fit_transform(X)

    # Plot comparison
    fig, axes = plt.subplots(1, 2, figsize=(12, 5))
    axes[0].scatter(Y_lle[:, 0], Y_lle[:, 1], c=color, cmap='viridis', s=10)
    axes[0].set_title('Standard LLE')
    axes[1].scatter(Y_hlle[:, 0], Y_hlle[:, 1], c=color, cmap='viridis', s=10)
    axes[1].set_title('Hessian LLE')
    plt.tight_layout()
    plt.show()
```

HLLE is preferred when:
• Theoretical guarantees on isometry are needed
• The manifold is (locally) isometric to an open subset of Euclidean space
• Standard LLE produces distorted unfoldings
• You need better preservation of local geometry (angles, areas)
However, HLLE requires more neighbors (k ≥ 1 + d + d(d+1)/2) and is more computationally expensive.
Modified Locally Linear Embedding (MLLE) addresses the case when the reconstruction weights are not unique. This happens when $k > d$ (more neighbors than intrinsic dimension)—the weight solution becomes underdetermined.
The Problem with Non-Unique Weights:
In standard LLE with regularization, we pick one particular solution from the space of valid weights. But different choices lead to different embeddings. MLLE explicitly uses the entire space of valid weight vectors to construct a more robust embedding.
Key Insight:
When the local Gram matrix is rank-deficient, there's a subspace of weight vectors that achieve zero reconstruction error. MLLE constructs an embedding that respects all these valid weight vectors, not just a regularized single solution.
Mathematical Formulation:
For point $i$ with neighbors, let $\mathbf{S}_i$ denote the subspace of weight vectors that perfectly reconstruct $\mathbf{x}_i$. MLLE constructs local matrices $\mathbf{M}_i$ that encode this entire subspace:
$$\mathbf{M}_i = \mathbf{I} - \mathbf{V}_i \mathbf{V}_i^T$$
where $\mathbf{V}_i$ is the orthonormal basis for the right singular vectors of the local reconstruction problem with non-zero singular values.
The global cost matrix is: $$\mathbf{M}_{MLLE} = \sum_i \mathbf{M}_i^{\text{(expanded)}}$$
where "expanded" means extending the local matrix to the full N×N dimensions.
```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from scipy.linalg import eigh, svd


def modified_lle(X, n_neighbors=10, n_components=2, reg=1e-3):
    """
    Modified Locally Linear Embedding (MLLE), simplified sketch.

    Handles the case when weights are not unique by considering
    the full space of valid weight vectors.

    Parameters:
    -----------
    X : ndarray of shape (N, D)
        Input data
    n_neighbors : int
        Number of neighbors
    n_components : int
        Target dimensionality
    reg : float
        Regularization for numerical stability

    Returns:
    --------
    Y : ndarray of shape (N, n_components)
        Embedding coordinates
    """
    N, D = X.shape
    d = n_components
    k = n_neighbors

    # Find neighbors
    nn = NearestNeighbors(n_neighbors=k + 1)
    nn.fit(X)
    _, indices = nn.kneighbors(X)
    indices = indices[:, 1:]  # Remove self

    # Build cost matrix
    M = np.zeros((N, N))

    for i in range(N):
        neighbor_idx = indices[i]

        # Center at x_i
        X_nbrs = X[neighbor_idx] - X[i]

        # SVD of the neighborhood; for X_nbrs = U S V^T, the weight vectors
        # live in the column space of U (they combine the k rows of X_nbrs)
        U, s, Vt = svd(X_nbrs, full_matrices=True)

        # Determine effective rank (number of significant singular values)
        # This estimates the intrinsic dimension of the local neighborhood
        tol = max(X_nbrs.shape) * np.finfo(float).eps * s[0]
        rank = int(np.sum(s > tol))

        # If rank < k, we have a subspace of valid weights
        if rank < k:
            # Weight directions with (near-)zero singular values achieve zero
            # reconstruction error; MLLE uses this whole subspace rather than
            # one regularized solution
            U_null = U[:, rank:]  # Shape: (k, k - rank)

            # A reference weight vector from the regularized Gram matrix,
            # as in standard LLE
            G = X_nbrs @ X_nbrs.T  # Shape: (k, k)
            G += reg * np.trace(G) / k * np.eye(k)
            ones = np.ones(k)
            w = np.linalg.solve(G, ones)
            w = w / np.sum(w)

            # Local cost matrix: project onto the orthogonal complement of
            # the principal weight directions (at least d of them)
            U_valid = U[:, :max(rank, d)]
            H = np.eye(k) - U_valid @ U_valid.T
        else:
            # Full rank: unique weight solution
            G = X_nbrs @ X_nbrs.T
            G += reg * np.trace(G) / k * np.eye(k)
            ones = np.ones(k)
            w = np.linalg.solve(G, ones)
            w = w / np.sum(w)

            # Local cost matrix from weights
            w = w.reshape(-1, 1)
            H = w @ w.T

        # Add to global matrix
        ix = np.ix_(neighbor_idx, neighbor_idx)
        M[ix] += H

    # Diagonal contributions
    # (This is simplified; full MLLE has a more careful construction)
    for i in range(N):
        M[i, i] += 1.0

    # Symmetrize
    M = (M + M.T) / 2

    # Eigendecomposition
    eigenvalues, eigenvectors = eigh(M)

    # Skip the smallest eigenvector (null space) and take the next d
    idx = np.argsort(eigenvalues)
    eigenvalues = eigenvalues[idx]
    eigenvectors = eigenvectors[:, idx]

    Y = eigenvectors[:, 1:d + 1] * np.sqrt(N)

    return Y


# For reliable results, use sklearn's implementation
from sklearn.manifold import LocallyLinearEmbedding


def sklearn_mlle(X, n_neighbors=10, n_components=2):
    """Use sklearn's Modified LLE implementation."""
    mlle = LocallyLinearEmbedding(
        n_neighbors=n_neighbors,
        n_components=n_components,
        method='modified'
    )
    return mlle.fit_transform(X)
```

MLLE is particularly useful when:
• k > intrinsic dimension (common in practice)
• Results are sensitive to the regularization parameter
• You observe instability across different runs
• The manifold has varying local dimensionality
MLLE provides more stable results by not arbitrarily choosing from multiple valid weight solutions.
Local Tangent Space Alignment (LTSA) takes a more direct approach to manifold learning: explicitly compute local tangent spaces and then align them globally.
Philosophy:
While LLE implicitly encodes tangent space information through reconstruction weights, LTSA makes it explicit: it estimates a tangent basis at each point via local PCA and then aligns the resulting local coordinate systems into a single global embedding.
The Algorithm:
Step 1: Local Tangent Space Estimation
For each point $i$, perform PCA on the k-nearest neighbors to find the local tangent space $T_i$ (spanned by the top $d$ principal directions).
Step 2: Local Coordinates
Project neighbors onto the tangent space: $$\boldsymbol{\theta}_j^{(i)} = \mathbf{T}_i^T (\mathbf{x}_j - \bar{\mathbf{x}}_i)$$
Step 3: Alignment
Construct a matrix $\mathbf{W}_i$ that maps local tangent coordinates to global embedding coordinates. Minimize: $$\sum_i \|\mathbf{Y}_i - \mathbf{W}_i \boldsymbol{\Theta}_i\|_F^2$$
subject to constraints preventing trivial solutions.
```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from scipy.linalg import eigh, svd


def ltsa(X, n_neighbors=10, n_components=2):
    """
    Local Tangent Space Alignment (LTSA).

    Explicitly estimates tangent spaces and aligns them globally.

    Parameters:
    -----------
    X : ndarray of shape (N, D)
        Input data
    n_neighbors : int
        Number of neighbors
    n_components : int
        Target dimensionality

    Returns:
    --------
    Y : ndarray of shape (N, n_components)
        Embedding coordinates
    """
    N, D = X.shape
    d = n_components
    k = n_neighbors

    # Find neighbors
    nn = NearestNeighbors(n_neighbors=k + 1)
    nn.fit(X)
    _, indices = nn.kneighbors(X)
    indices = indices[:, 1:]  # Remove self

    # Initialize cost matrix
    B = np.zeros((N, N))

    for i in range(N):
        neighbor_idx = indices[i]

        # Include x_i in the neighborhood for LTSA
        full_idx = np.concatenate([[i], neighbor_idx])
        X_local = X[full_idx]

        # Center
        X_centered = X_local - np.mean(X_local, axis=0)

        # Local PCA to find tangent space
        U, s, Vt = svd(X_centered, full_matrices=False)

        # Local tangent coordinates (first d dimensions)
        theta = U[:, :d]  # Shape: (k+1, d)

        # Centering matrix for alignment
        H = np.eye(k + 1) - np.ones((k + 1, k + 1)) / (k + 1)

        # Projection onto the orthogonal complement of the tangent space;
        # this is what the global embedding must minimize
        G = H - theta @ theta.T @ H

        # The alignment cost for this neighborhood
        S = G.T @ G

        # Add to global matrix
        ix = np.ix_(full_idx, full_idx)
        B[ix] += S

    # Eigendecomposition
    eigenvalues, eigenvectors = eigh(B)

    # Sort
    idx = np.argsort(eigenvalues)
    eigenvalues = eigenvalues[idx]
    eigenvectors = eigenvectors[:, idx]

    # Skip the constant eigenvector (smallest eigenvalue) and take the
    # next d eigenvectors as the embedding coordinates
    Y = eigenvectors[:, 1:d + 1] * np.sqrt(N)

    return Y


def compare_lle_variants(X, n_neighbors=12, n_components=2):
    """
    Compare all LLE variants on the same dataset.
    """
    from sklearn.manifold import LocallyLinearEmbedding

    variants = {
        'standard': 'standard',
        'hessian': 'hessian',
        'modified': 'modified',
        'ltsa': 'ltsa',
    }

    results = {}
    for name, method in variants.items():
        try:
            lle = LocallyLinearEmbedding(
                n_neighbors=n_neighbors,
                n_components=n_components,
                method=method
            )
            results[name] = lle.fit_transform(X)
        except Exception as e:
            print(f"{name} failed: {e}")
            results[name] = None

    return results


if __name__ == "__main__":
    from sklearn.datasets import make_swiss_roll
    import matplotlib.pyplot as plt

    # Generate data
    X, color = make_swiss_roll(n_samples=1000, noise=0.1, random_state=42)

    # Compare variants
    results = compare_lle_variants(X, n_neighbors=12, n_components=2)

    # Plot
    fig, axes = plt.subplots(2, 2, figsize=(12, 10))
    axes = axes.flatten()

    for ax, (name, Y) in zip(axes, results.items()):
        if Y is not None:
            ax.scatter(Y[:, 0], Y[:, 1], c=color, cmap='viridis', s=10)
        ax.set_title(f'{name.upper()} LLE')
        ax.set_xlabel('Component 1')
        ax.set_ylabel('Component 2')

    plt.tight_layout()
    plt.show()
```

Each LLE variant has its strengths and appropriate use cases. Here's a comparison to guide selection:
| Variant | Key Feature | Minimum k | Complexity | Best For |
|---|---|---|---|---|
| Standard LLE | Reconstruction weights | k > d | O(DNk² + Nk³) | Simple manifolds, quick exploration |
| HLLE | Hessian-based isometry | k > 1+d+d(d+1)/2 | O(DNk² + Nk³) | Theoretical guarantees, isometry |
| MLLE | Multiple weight solutions | k > d | O(DNk² + Nk³) | Ambiguous local geometry |
| LTSA | Explicit tangent alignment | k > d | O(DNk² + Nk³) | Clear tangent structure |
For most applications, the comparison above is a practical guide: start with standard LLE for quick exploration, and reach for MLLE, HLLE, or LTSA when you need more stability or stronger geometric guarantees.
In scikit-learn, all four variants are available through the 'method' parameter of LocallyLinearEmbedding, as the `compare_lle_variants` helper above demonstrates.
Beyond the core variants, several extensions address robustness and special use cases:
1. Robust LLE (RLLE)
Standard LLE is sensitive to outliers—a single corrupted point can distort the entire embedding. Robust LLE addresses this by replacing the squared reconstruction error with a robust loss (such as the Huber loss) and solving for the weights with iteratively reweighted least squares, so that outlying neighbors are down-weighted; see the `robust_lle_weights` sketch below.
2. Sparse LLE
When interpretability is important, Sparse LLE adds L1 regularization to the weight computation: $$\min_\mathbf{w} \|\mathbf{x}_i - \mathbf{X}_{\mathcal{N}(i)} \mathbf{w}\|^2 + \lambda \|\mathbf{w}\|_1$$
subject to $\mathbf{1}^T \mathbf{w} = 1$
This produces sparse weights where each point depends on only a few key neighbors.
3. Semi-Supervised LLE
When some points have known coordinates (landmarks), Semi-Supervised LLE incorporates this information by fixing those points' embedding coordinates and solving for the remaining points under the same reconstruction-weight cost, as sketched below.
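To make the idea concrete, here is a minimal sketch that assumes the standard LLE cost matrix $\mathbf{M} = (\mathbf{I}-\mathbf{W})^T(\mathbf{I}-\mathbf{W})$ has already been assembled. Fixing the labeled rows of $\mathbf{Y}$ and minimizing $\operatorname{tr}(\mathbf{Y}^T\mathbf{M}\mathbf{Y})$ over the unlabeled rows reduces to a linear solve; the helper name and the small ridge are illustrative choices, not a published implementation.

```python
import numpy as np


def semi_supervised_lle(M, labeled_idx, Y_labeled, ridge=1e-10):
    """Sketch: solve for unlabeled embedding coordinates given a precomputed
    LLE cost matrix M and fixed (landmark) coordinates for labeled points.

    Minimizing tr(Y^T M Y) with Y_L fixed gives M_UU @ Y_U = -M_UL @ Y_L.
    """
    N = M.shape[0]
    labeled_idx = np.asarray(labeled_idx)
    unlabeled_idx = np.setdiff1d(np.arange(N), labeled_idx)

    M_UU = M[np.ix_(unlabeled_idx, unlabeled_idx)]
    M_UL = M[np.ix_(unlabeled_idx, labeled_idx)]

    # Small ridge guards against a singular block (assumption for this sketch)
    M_UU = M_UU + ridge * np.eye(len(unlabeled_idx))
    Y_unlabeled = np.linalg.solve(M_UU, -M_UL @ Y_labeled)

    Y = np.zeros((N, Y_labeled.shape[1]))
    Y[labeled_idx] = Y_labeled
    Y[unlabeled_idx] = Y_unlabeled
    return Y
```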
4. Incremental LLE
For streaming data or very large datasets, Incremental LLE updates the neighborhood graph, the reconstruction weights, and the embedding as new points arrive, rather than recomputing the full eigendecomposition from scratch.
```python
import numpy as np
from sklearn.neighbors import NearestNeighbors


def robust_lle_weights(X, k=10, max_iter=10, huber_delta=1.0):
    """
    Compute LLE weights with Huber loss for robustness.

    Uses iteratively reweighted least squares (IRLS)
    to down-weight the influence of outliers.
    """
    N, D = X.shape

    nn = NearestNeighbors(n_neighbors=k + 1)
    nn.fit(X)
    _, indices = nn.kneighbors(X)
    indices = indices[:, 1:]

    W = np.zeros((N, N))

    for i in range(N):
        neighbor_idx = indices[i]
        x_i = X[i]
        X_nbrs = X[neighbor_idx]

        # Initialize weights uniformly
        w = np.ones(k) / k

        for iteration in range(max_iter):
            # Compute residuals
            reconstruction = X_nbrs.T @ w
            residual = x_i - reconstruction
            r_norm = np.linalg.norm(residual)

            # Compute Huber weight for the overall residual
            if r_norm < huber_delta:
                # Quadratic region - unit weight
                rho = 1.0
            else:
                # Linear region - reduced weight
                rho = huber_delta / r_norm

            # Weighted Gram matrix
            eta = x_i - X_nbrs

            # Robustness weights per neighbor
            neighbor_residuals = np.linalg.norm(eta, axis=1)
            neighbor_weights = np.where(
                neighbor_residuals < huber_delta,
                1.0,
                huber_delta / neighbor_residuals
            )

            # Weight the Gram matrix
            G = (eta * neighbor_weights[:, None]) @ eta.T

            # Regularize
            G += 1e-3 * np.trace(G) / k * np.eye(k)

            # Solve
            ones = np.ones(k)
            w_new = np.linalg.solve(G, ones)
            w_new = w_new / np.sum(w_new)

            # Check convergence
            if np.linalg.norm(w_new - w) < 1e-6:
                break
            w = w_new

        W[i, neighbor_idx] = w

    return W


def sparse_lle_weights(X, k=10, sparsity=0.5, max_iter=100):
    """
    Compute sparse LLE weights with L1 regularization.

    Uses proximal gradient descent to solve:
        min ||x - Xw||^2 + lambda * ||w||_1   s.t. sum(w) = 1
    """
    N, D = X.shape

    nn = NearestNeighbors(n_neighbors=k + 1)
    nn.fit(X)
    _, indices = nn.kneighbors(X)
    indices = indices[:, 1:]

    W = np.zeros((N, N))

    for i in range(N):
        neighbor_idx = indices[i]
        x_i = X[i]
        X_nbrs = X[neighbor_idx]  # Shape: (k, D)

        # Gram matrix for gradient
        G = X_nbrs @ X_nbrs.T
        b = X_nbrs @ x_i

        # Initialize
        w = np.ones(k) / k
        step_size = 1.0 / (np.linalg.norm(G, 2) + 1e-6)
        lam = sparsity

        for _ in range(max_iter):
            # Gradient step
            grad = G @ w - b
            w_temp = w - step_size * grad

            # Soft thresholding (proximal operator for L1)
            w_temp = np.sign(w_temp) * np.maximum(np.abs(w_temp) - step_size * lam, 0)

            # Project onto sum-to-one constraint
            w_new = w_temp + (1 - np.sum(w_temp)) / k

            # Check convergence
            if np.linalg.norm(w_new - w) < 1e-6:
                break
            w = w_new

        W[i, neighbor_idx] = w

    return W
```

A significant limitation of LLE and its variants is that they are transductive: they provide embeddings only for the training points. Embedding new points requires re-running the entire algorithm.
Out-of-Sample Extension addresses this by learning a mapping from high-dimensional space to the embedding that can be applied to new points.
Method 1: Nearest Neighbor Interpolation
For a new point $\mathbf{x}_{new}$: find its $k$ nearest neighbors among the training points, compute reconstruction weights exactly as in LLE, and apply those weights to the neighbors' embedding coordinates.
Method 2: Kernel Extension
View LLE as implicitly defining a kernel, then use kernel regression: $$\mathbf{y}_{new} = \sum_i \alpha_i K(\mathbf{x}_{new}, \mathbf{x}_i)$$
where $K$ is derived from the LLE weight structure and $\alpha$ is fit on training data.
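A minimal version of this idea is to regress the training embedding on the training inputs with kernel ridge regression. In the sketch below, the RBF kernel and its hyperparameters are illustrative stand-ins for a kernel actually derived from the LLE weights.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.kernel_ridge import KernelRidge
from sklearn.manifold import LocallyLinearEmbedding

# Fit LLE on training data, then regress the embedding onto the inputs.
X, _ = make_swiss_roll(n_samples=1000, noise=0.1, random_state=42)
X_train, X_new = X[:800], X[800:]

Y_train = LocallyLinearEmbedding(n_neighbors=12, n_components=2).fit_transform(X_train)

# Kernel regression from input space to embedding space; alpha and gamma
# are illustrative values, not tuned.
kr = KernelRidge(kernel='rbf', alpha=1e-3, gamma=0.1)
kr.fit(X_train, Y_train)
Y_new = kr.predict(X_new)
```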
Method 3: Neural Network Mapping
Train a neural network on the input–embedding pairs $(\mathbf{x}_i, \mathbf{y}_i)$ produced by LLE, then apply the trained network to embed new points.
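A corresponding sketch with a small multilayer perceptron; the architecture and hyperparameters are illustrative, not tuned.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.neural_network import MLPRegressor

X, _ = make_swiss_roll(n_samples=1000, noise=0.1, random_state=42)
X_train, X_new = X[:800], X[800:]
Y_train = LocallyLinearEmbedding(n_neighbors=12, n_components=2).fit_transform(X_train)

# A small network regresses embedding coordinates on inputs
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
net.fit(X_train, Y_train)
Y_new = net.predict(X_new)
```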
Method 4: Parametric LLE
Learn a parametric mapping during the LLE optimization itself, constraining the embedding to be a function of the input: $$\mathbf{y}_i = f_\theta(\mathbf{x}_i)$$
Optimize over both the embedding and the parameters $\theta$.
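One concrete instance is to restrict $f_\theta$ to a linear map $\mathbf{y} = \boldsymbol{\Theta}^T\mathbf{x}$, which turns the problem into a generalized eigenproblem (this linear special case is essentially Neighborhood Preserving Embedding). The sketch below assumes the LLE cost matrix $\mathbf{M}$ has already been computed and is illustrative only.

```python
import numpy as np
from scipy.linalg import eigh


def linear_parametric_lle(X, M, n_components=2, reg=1e-6):
    """Sketch of a linear parametric LLE: restrict y = Theta^T x and minimize
    trace(Theta^T X^T M X Theta) subject to Theta^T X^T X Theta = I.

    `M` is assumed to be a precomputed LLE cost matrix (I - W)^T (I - W).
    """
    A = X.T @ M @ X                          # (D, D) objective matrix
    B = X.T @ X + reg * np.eye(X.shape[1])   # (D, D) constraint matrix, regularized

    # Generalized eigenproblem; the smallest eigenvalues give the projection
    eigenvalues, eigenvectors = eigh(A, B)
    Theta = eigenvectors[:, :n_components]   # (D, d) linear map

    Y_train = X @ Theta                      # embedding of training points
    return Theta, Y_train


# New points are embedded by the same linear map: Y_new = X_new @ Theta
```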
```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.manifold import LocallyLinearEmbedding


class LLEWithExtension:
    """
    LLE with out-of-sample extension capability.
    """

    def __init__(self, n_neighbors=10, n_components=2, reg=1e-3):
        self.n_neighbors = n_neighbors
        self.n_components = n_components
        self.reg = reg

    def fit_transform(self, X):
        """Fit LLE and store training data for extension."""
        self.X_train_ = X.copy()

        # Fit standard LLE
        lle = LocallyLinearEmbedding(
            n_neighbors=self.n_neighbors,
            n_components=self.n_components,
            reg=self.reg
        )
        self.Y_train_ = lle.fit_transform(X)

        # Store neighbor finder
        self.nn_ = NearestNeighbors(n_neighbors=self.n_neighbors + 1)
        self.nn_.fit(X)

        return self.Y_train_

    def transform(self, X_new):
        """
        Embed new points using nearest neighbor interpolation.
        """
        n_new = X_new.shape[0]
        Y_new = np.zeros((n_new, self.n_components))

        # Find neighbors in training set
        distances, indices = self.nn_.kneighbors(X_new)

        for i in range(n_new):
            # Get neighbors (may include self if X_new contains training points)
            neighbor_idx = indices[i, :self.n_neighbors]

            # Compute reconstruction weights
            x_i = X_new[i]
            X_nbrs = self.X_train_[neighbor_idx]

            # Gram matrix
            eta = x_i - X_nbrs
            G = eta @ eta.T
            G += self.reg * np.trace(G) / self.n_neighbors * np.eye(self.n_neighbors)

            # Solve for weights
            ones = np.ones(self.n_neighbors)
            w = np.linalg.solve(G, ones)
            w = w / np.sum(w)

            # Apply weights to embedded coordinates
            Y_nbrs = self.Y_train_[neighbor_idx]
            Y_new[i] = Y_nbrs.T @ w

        return Y_new

    def reconstruction_error(self, X_new, Y_new):
        """
        Compute reconstruction error for new points.
        """
        error = 0.0
        distances, indices = self.nn_.kneighbors(X_new)

        for i in range(len(X_new)):
            neighbor_idx = indices[i, :self.n_neighbors]
            x_i = X_new[i]
            X_nbrs = self.X_train_[neighbor_idx]

            # Compute weights
            eta = x_i - X_nbrs
            G = eta @ eta.T
            G += self.reg * np.trace(G) / self.n_neighbors * np.eye(self.n_neighbors)
            ones = np.ones(self.n_neighbors)
            w = np.linalg.solve(G, ones)
            w = w / np.sum(w)

            # Reconstruction in embedding space
            Y_nbrs = self.Y_train_[neighbor_idx]
            y_reconstructed = Y_nbrs.T @ w
            error += np.linalg.norm(Y_new[i] - y_reconstructed) ** 2

        return error


# Example
if __name__ == "__main__":
    from sklearn.datasets import make_swiss_roll

    # Generate data
    X, color = make_swiss_roll(n_samples=1000, noise=0.1, random_state=42)

    # Split into train and test
    X_train, X_test = X[:800], X[800:]
    color_train, color_test = color[:800], color[800:]

    # Fit LLE with extension
    lle = LLEWithExtension(n_neighbors=12, n_components=2)
    Y_train = lle.fit_transform(X_train)

    # Extend to test points
    Y_test = lle.transform(X_test)

    print(f"Training embedding shape: {Y_train.shape}")
    print(f"Test embedding shape: {Y_test.shape}")
```

We have explored the rich landscape of LLE variants, each addressing specific limitations of the standard algorithm.
What's Next:
In the final page of this module, we examine the limitations of LLE in detail—understanding when the algorithm fails, why these failures occur, and how to diagnose and mitigate them. This knowledge is essential for applying LLE effectively in practice and knowing when alternative methods are more appropriate.
You now understand the major LLE variants and when to use each. This knowledge enables you to select the most appropriate algorithm for your specific manifold learning task and troubleshoot issues with standard LLE.