We've established that high-dimensional data often lies on lower-dimensional manifolds—but how do we actually discover and exploit this structure? Given a finite, noisy sample from an unknown data distribution, how do we recover the underlying manifold's geometry, topology, and coordinates?
This page surveys the major families of estimation approaches: global spectral methods, local neighborhood-preserving methods, neighbor-embedding visualization methods, and deep learning approaches.
Each family makes different trade-offs between computational cost, geometric fidelity, and robustness. Understanding these trade-offs enables you to choose the right tool for your specific problem.
By completing this page, you will:
• Understand the taxonomy of manifold estimation methods
• Compare global vs local preservation approaches
• Recognize when to apply each family of methods
• Connect classical methods to modern deep learning approaches
• Make informed choices for dimensionality reduction and visualization tasks
Before exploring solutions, let's precisely define what we're trying to estimate.
Given: a finite, possibly noisy sample X = {x₁, ..., xₙ} ⊂ ℝᴰ drawn from a distribution concentrated near an unknown d-dimensional manifold M, with d ≪ D.
Goals (various, not all simultaneously achievable):
Estimate intrinsic dimension d — How many independent parameters describe the data?
Learn the embedding φ: M → ℝᵈ — Find intrinsic coordinates for each point
Preserve geometry — Ensure the embedding preserves distances, angles, or other relationships
Enable interpolation — Given new points, find their intrinsic coordinates
Enable generation — Given intrinsic coordinates, produce new data points
Different methods prioritize different goals.
| Method Family | Primary Goal | Preserves | Requires |
|---|---|---|---|
| PCA | Linear subspace | Variance | Linearity assumption |
| Isomap / MDS | Global embedding | Geodesic distances | Connected manifold |
| LLE / Laplacian | Local embedding | Neighborhood structure | Local linearity |
| t-SNE / UMAP | Visualization | Cluster structure / topology | Choosing hyperparameters |
| Autoencoders | Reconstruction | Information for reconstruction | Sufficient capacity & data |
| VAEs / Flows | Generative model | Probability density | Likelihood model specification |
Key Trade-offs:
Global vs Local: Global methods preserve large-scale structure but may distort local details. Local methods preserve neighborhoods but may miss global topology.
Distances vs Neighborhoods: Some methods try to preserve specific distance values; others only preserve which points are neighbors.
Deterministic vs Probabilistic: Deterministic embeddings give point estimates; probabilistic methods model uncertainty and enable generation.
Out-of-sample Extension: Some methods only embed the training data; others provide formulas for new points.
Computational Cost: Methods range from O(n²) or O(n³) for classical approaches to near-linear in n for modern approximate methods.
No single method works best for all manifolds. A method that excels on Swiss roll may fail on a torus. Always validate on synthetic data with known structure before trusting results on real data. Multiple methods providing similar insights is stronger evidence than any one method alone.
Global methods aim to preserve the geodesic distance structure of the manifold. If two points are far apart as measured along the manifold surface, they should remain far apart in the embedding.
Multidimensional Scaling (MDS):
Classical MDS finds an embedding that preserves a given distance matrix. Given pairwise distances D = (dᵢⱼ), find coordinates Y = {y₁, ..., yₙ} ⊂ ℝᵈ minimizing:
$$\text{Stress} = \sum_{i < j} (d_{ij} - |y_i - y_j|)^2$$
For Euclidean distances, MDS reduces to PCA. But MDS can use any distance matrix, including geodesic distances.
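That equivalence is easy to check numerically: classical MDS (double centering of the squared Euclidean distances, then an eigendecomposition) recovers the same 2-D configuration as PCA up to sign flips. A minimal sketch, where `classical_mds` is our own illustrative function, not a library API:

```python
import numpy as np

def classical_mds(D, n_components=2):
    """Classical MDS from a distance matrix D via double centering."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * H @ (D ** 2) @ H           # Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:n_components]
    # Coordinates: Y = V * sqrt(Lambda)
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
Y = classical_mds(D, n_components=2)
# Pairwise distances in Y match those of the 2-D PCA projection of X
```

Feeding geodesic rather than Euclidean distances into the same routine is exactly what Isomap does below.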
Isomap:
Isomap (Tenenbaum et al., 2000) combines geodesic distance estimation with MDS in three steps: build a k-nearest-neighbor graph on the data, compute all-pairs shortest-path distances in that graph, then apply classical MDS to the resulting distance matrix.
The key insight: shortest paths in a neighborhood graph approximate geodesic distances on the manifold.
```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from scipy.sparse.csgraph import shortest_path
import matplotlib.pyplot as plt


def isomap_from_scratch(X, n_neighbors=10, n_components=2):
    """
    Implement Isomap for manifold learning.

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
        Input data
    n_neighbors : int
        Number of neighbors for graph construction
    n_components : int
        Dimension of the embedding

    Returns
    -------
    embedding : ndarray of shape (n_samples, n_components)
        Low-dimensional embedding
    """
    n_samples = X.shape[0]

    # Step 1: Build k-NN graph with Euclidean distances
    knn = NearestNeighbors(n_neighbors=n_neighbors + 1)  # +1 for self
    knn.fit(X)
    distances, indices = knn.kneighbors(X)

    # Build graph matrix (inf = no edge)
    graph = np.full((n_samples, n_samples), np.inf)
    for i in range(n_samples):
        for j, d in zip(indices[i, 1:], distances[i, 1:]):
            graph[i, j] = d
            graph[j, i] = d  # Make symmetric

    # Step 2: Compute shortest-path (geodesic) distances
    geodesic_distances = shortest_path(graph, method='D', directed=False)

    # Check for disconnected graph
    if np.isinf(geodesic_distances).any():
        print("Warning: Graph is disconnected. Some distances are infinite.")
        # Replace inf with a large value for MDS
        finite_max = geodesic_distances[~np.isinf(geodesic_distances)].max()
        geodesic_distances[np.isinf(geodesic_distances)] = finite_max * 2

    # Step 3: Apply classical MDS to geodesic distances
    # (eigendecomposition of the double-centered squared-distance matrix)
    n = geodesic_distances.shape[0]
    D_sq = geodesic_distances ** 2
    H = np.eye(n) - np.ones((n, n)) / n  # Centering matrix
    B = -0.5 * H @ D_sq @ H              # Gram matrix

    eigenvalues, eigenvectors = np.linalg.eigh(B)

    # Sort by descending eigenvalue, keep the top n_components
    idx = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[idx][:n_components]
    eigenvectors = eigenvectors[:, idx][:, :n_components]

    # Embedding: Y = V * sqrt(Lambda)
    embedding = eigenvectors * np.sqrt(np.maximum(eigenvalues, 0))
    return embedding


# Test on the Swiss roll -- the canonical Isomap example
def test_isomap_on_swiss_roll():
    from sklearn.datasets import make_swiss_roll
    from sklearn.decomposition import PCA

    X, t = make_swiss_roll(n_samples=1000, noise=0.1, random_state=42)

    embedding = isomap_from_scratch(X, n_neighbors=12, n_components=2)

    fig = plt.figure(figsize=(14, 5))

    ax1 = fig.add_subplot(131, projection='3d')
    ax1.scatter(X[:, 0], X[:, 1], X[:, 2], c=t, cmap='viridis', s=10)
    ax1.set_title('Original Swiss Roll (3D)')

    ax2 = fig.add_subplot(132)
    ax2.scatter(embedding[:, 0], embedding[:, 1], c=t, cmap='viridis', s=10)
    ax2.set_title('Isomap Embedding (2D)\nUnrolled!')
    ax2.axis('equal')

    # Compare with PCA (fails to unroll)
    pca_embedding = PCA(n_components=2).fit_transform(X)

    ax3 = fig.add_subplot(133)
    ax3.scatter(pca_embedding[:, 0], pca_embedding[:, 1], c=t, cmap='viridis', s=10)
    ax3.set_title('PCA Embedding (2D)\nFails to unroll')
    ax3.axis('equal')

    plt.tight_layout()
    plt.show()

    print("Isomap preserves the geodesic structure (colors flow smoothly)")
    print("PCA only sees the linear structure (colors are mixed)")


test_isomap_on_swiss_roll()
```

Isomap Assumptions and Limitations:

• The manifold must be geodesically convex (isometric to a convex region of ℝᵈ); holes or loops cannot be flattened without distortion
• The neighborhood graph must be connected and free of 'short-circuit' edges that jump across folds
• All-pairs shortest paths plus full MDS cost roughly O(n² log n) to O(n³), limiting scalability
Local methods focus on preserving small-scale relationships—which points are neighbors of which—rather than exact distances. This makes them more robust to manifold curvature and global topology.
Locally Linear Embedding (LLE):
LLE (Roweis & Saul, 2000) exploits the local linearity of manifolds:
Key assumption: Each point can be reconstructed as a linear combination of its neighbors. This linear relationship is preserved in the embedding.
Algorithm:
Find neighbors: For each point xᵢ, find its k nearest neighbors
Compute reconstruction weights: Find weights Wᵢⱼ minimizing: $$\varepsilon(W) = \sum_i |x_i - \sum_j W_{ij} x_j|^2$$ subject to Wᵢⱼ = 0 if xⱼ is not a neighbor of xᵢ, and Σⱼ Wᵢⱼ = 1
Find embedding: Find low-dimensional coordinates Y minimizing: $$\Phi(Y) = \sum_i |y_i - \sum_j W_{ij} y_j|^2$$ using the same weights W (which encode local geometry)
The weights W capture how each point relates to its local tangent plane. The embedding preserves these relationships.
```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
import matplotlib.pyplot as plt


def lle_from_scratch(X, n_neighbors=10, n_components=2, reg=1e-3):
    """
    Implement Locally Linear Embedding (LLE).

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
        Input data
    n_neighbors : int
        Number of neighbors
    n_components : int
        Dimension of embedding
    reg : float
        Regularization parameter

    Returns
    -------
    embedding : ndarray of shape (n_samples, n_components)
    """
    n_samples, n_features = X.shape

    # Step 1: Find k nearest neighbors
    knn = NearestNeighbors(n_neighbors=n_neighbors + 1)  # +1 for self
    knn.fit(X)
    _, indices = knn.kneighbors(X)
    indices = indices[:, 1:]  # Remove self

    # Step 2: Compute reconstruction weights
    W = np.zeros((n_samples, n_samples))
    for i in range(n_samples):
        neighbors = X[indices[i]]
        z = neighbors - X[i]          # Center on x_i
        C = z @ z.T                   # Local covariance matrix
        # Regularize for numerical stability
        C += reg * np.eye(n_neighbors) * np.trace(C)
        # Solve C w = 1, then normalize so weights sum to 1
        w = np.linalg.solve(C, np.ones(n_neighbors))
        w = w / w.sum()
        W[i, indices[i]] = w

    # Step 3: Compute embedding
    # Minimizing ||y_i - sum_j W_ij y_j||^2 is equivalent to an
    # eigenvalue problem for M = (I - W)^T (I - W)
    M = (np.eye(n_samples) - W).T @ (np.eye(n_samples) - W)

    # Find smallest eigenvalues/eigenvectors
    eigenvalues, eigenvectors = np.linalg.eigh(M)

    # Skip the smallest (should be ~0, the constant vector);
    # take the next n_components
    embedding = eigenvectors[:, 1:n_components + 1]
    return embedding


# Test on the Swiss roll
def test_lle_on_swiss_roll():
    from sklearn.datasets import make_swiss_roll

    X, t = make_swiss_roll(n_samples=1000, noise=0.1, random_state=42)

    embedding = lle_from_scratch(X, n_neighbors=12, n_components=2)

    fig = plt.figure(figsize=(10, 4))

    ax1 = fig.add_subplot(121, projection='3d')
    ax1.scatter(X[:, 0], X[:, 1], X[:, 2], c=t, cmap='viridis', s=10)
    ax1.set_title('Original Swiss Roll')

    ax2 = fig.add_subplot(122)
    ax2.scatter(embedding[:, 0], embedding[:, 1], c=t, cmap='viridis', s=10)
    ax2.set_title('LLE Embedding')
    ax2.axis('equal')

    plt.tight_layout()
    plt.show()


test_lle_on_swiss_roll()
```

Laplacian Eigenmaps:
Another local method based on spectral graph theory:
Build weighted neighborhood graph G with edge weights Wᵢⱼ = exp(-‖xᵢ - xⱼ‖²/2σ²) or Wᵢⱼ = 1 for k-NN
Compute graph Laplacian: L = D - W, where D is diagonal with Dᵢᵢ = Σⱼ Wᵢⱼ
Solve generalized eigenvalue problem: Lf = λDf
Take eigenvectors corresponding to smallest non-zero eigenvalues as embedding coordinates
Intuition: The Laplacian measures how much a function varies over the graph. Minimizing Σᵢⱼ Wᵢⱼ(yᵢ - yⱼ)² = yᵀLy encourages connected points to have similar embeddings.
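To make the steps above concrete, here is a minimal Laplacian Eigenmaps sketch using the unweighted k-NN variant of W; `laplacian_eigenmaps` is our own illustrative function, not a library API:

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import kneighbors_graph
from sklearn.datasets import make_swiss_roll

def laplacian_eigenmaps(X, n_neighbors=10, n_components=2):
    # Step 1: symmetric k-NN graph (unweighted: W_ij = 1 for neighbors)
    W = kneighbors_graph(X, n_neighbors, mode='connectivity').toarray()
    W = np.maximum(W, W.T)                  # symmetrize
    # Step 2: graph Laplacian L = D - W
    D = np.diag(W.sum(axis=1))
    L = D - W
    # Step 3: generalized eigenproblem L f = lambda D f
    eigenvalues, eigenvectors = eigh(L, D)
    # Step 4: drop the trivial constant eigenvector (eigenvalue ~ 0),
    # keep the next n_components eigenvectors as coordinates
    return eigenvectors[:, 1:n_components + 1]

X, _ = make_swiss_roll(n_samples=300, noise=0.05, random_state=0)
Y = laplacian_eigenmaps(X, n_neighbors=10, n_components=2)
print(Y.shape)  # (300, 2)
```

The dense `eigh` call is fine at this scale; real implementations use sparse solvers on the sparse Laplacian.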
Comparison of Local Methods:
| Method | Key Idea | Strengths | Weaknesses |
|---|---|---|---|
| LLE | Preserve linear reconstruction | Simple, one hyperparameter (k) | Sensitive to noise; can collapse dimensions |
| Laplacian Eigenmaps | Minimize graph variation | Principled spectral foundation | Needs bandwidth σ; sensitive to graph construction |
| Hessian LLE | Preserve local Hessian (curvature) | More robust than basic LLE | More complex to implement; higher k needed |
| LTSA | Local tangent alignment | Good for varying curvature | Computationally more expensive |
Modern visualization methods like t-SNE and UMAP prioritize creating interpretable 2D/3D visualizations over geometric fidelity. They excel at revealing cluster structure but should be interpreted carefully.
t-Distributed Stochastic Neighbor Embedding (t-SNE):
t-SNE (van der Maaten & Hinton, 2008) uses a probabilistic approach:
Step 1: Convert distances to probabilities (high-D) $$p_{j|i} = \frac{\exp(-|x_i - x_j|^2 / 2\sigma_i^2)}{\sum_{k \neq i} \exp(-|x_i - x_k|^2 / 2\sigma_i^2)}$$
Symmetrize: pᵢⱼ = (p_{j|i} + p_{i|j}) / 2n
Step 2: Define similar probabilities in low-D (using Student-t distribution) $$q_{ij} = \frac{(1 + |y_i - y_j|^2)^{-1}}{\sum_{k \neq l} (1 + |y_k - y_l|^2)^{-1}}$$
Step 3: Minimize KL divergence between P and Q: $$KL(P \| Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}$$
The Student-t distribution in low-D has heavier tails than Gaussian, preventing 'crowding' of points.
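Step 1 hides one detail: each σᵢ is found by binary search so that the conditional distribution has a user-specified perplexity (2 raised to its entropy). A small sketch of that search, with our own illustrative function names:

```python
import numpy as np

def conditional_probs(dists_sq, sigma):
    # p_{j|i} over one row of squared distances (self excluded by the caller)
    logits = -dists_sq / (2 * sigma ** 2)
    logits -= logits.max()                 # numerical stability
    p = np.exp(logits)
    return p / p.sum()

def sigma_for_perplexity(dists_sq, target_perplexity, tol=1e-4, max_iter=100):
    # Bisect on sigma: perplexity 2^H(P_i) grows monotonically with sigma
    lo, hi = 1e-10, 1e4
    for _ in range(max_iter):
        sigma = (lo + hi) / 2
        p = conditional_probs(dists_sq, sigma)
        perplexity = 2 ** (-np.sum(p * np.log2(p + 1e-12)))
        if abs(perplexity - target_perplexity) < tol:
            break
        if perplexity > target_perplexity:
            hi = sigma                     # too diffuse: shrink sigma
        else:
            lo = sigma                     # too peaked: grow sigma
    return sigma

rng = np.random.default_rng(0)
dists_sq = rng.uniform(0.1, 4.0, size=50)  # squared distances to 50 neighbors
sigma = sigma_for_perplexity(dists_sq, target_perplexity=10)
```

This is why perplexity acts as an effective neighborhood size: each point gets its own bandwidth adapted to the local density.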
t-SNE produces beautiful visualizations but requires careful interpretation:
• Cluster sizes are meaningless: t-SNE can expand or contract clusters arbitrarily
• Distances between clusters are unreliable: Distant clusters in t-SNE may not be distant in the original space
• Results depend on hyperparameters: Perplexity, learning rate, and initialization affect outcomes
• Multiple runs can differ: The optimization is non-convex; different runs yield different layouts
• Don't draw too many conclusions: t-SNE is for exploration, not for quantitative claims
Uniform Manifold Approximation and Projection (UMAP):
UMAP (McInnes et al., 2018) is a more recent method with theoretical grounding in topological data analysis:
Key ideas:
Fuzzy topological representation: Build a weighted graph where edge weights represent the probability that two points are 'connected' on the manifold
Optimize cross-entropy: Find low-D representation whose fuzzy topological structure matches the high-D structure
UMAP advantages over t-SNE:
```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE
# UMAP requires: pip install umap-learn
import umap
from sklearn.datasets import fetch_openml


def compare_visualization_methods(X, y, sample_size=3000):
    """Compare t-SNE and UMAP on a real dataset."""
    # Subsample for speed
    np.random.seed(42)
    idx = np.random.choice(len(X), size=min(sample_size, len(X)), replace=False)
    X_sub = X[idx]
    y_sub = y[idx]

    print("Running t-SNE...")
    tsne = TSNE(n_components=2, perplexity=30, learning_rate='auto',
                init='pca', random_state=42, n_iter=1000)
    X_tsne = tsne.fit_transform(X_sub)

    print("Running UMAP...")
    reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, n_components=2,
                        random_state=42)
    X_umap = reducer.fit_transform(X_sub)

    # Visualize
    fig, axes = plt.subplots(1, 2, figsize=(14, 6))
    unique_labels = np.unique(y_sub)
    colors = plt.cm.tab10(np.linspace(0, 1, len(unique_labels)))

    for ax, embedding, title in [(axes[0], X_tsne, 't-SNE'),
                                 (axes[1], X_umap, 'UMAP')]:
        for i, label in enumerate(unique_labels):
            mask = y_sub == label
            ax.scatter(embedding[mask, 0], embedding[mask, 1],
                       c=[colors[i]], label=str(label), s=5, alpha=0.6)
        ax.set_title(f'{title} Visualization')
        ax.legend(loc='best', markerscale=3, fontsize=8)
        ax.axis('equal')

    plt.tight_layout()
    plt.show()

    return X_tsne, X_umap


# Example on MNIST (if available)
def visualize_mnist():
    try:
        print("Loading MNIST...")
        mnist = fetch_openml('mnist_784', version=1, as_frame=False)
        X = mnist.data.astype(np.float32) / 255.0
        y = mnist.target.astype(int)
        compare_visualization_methods(X, y, sample_size=5000)
    except Exception as e:
        print(f"MNIST not available: {e}")
        print("Using synthetic data instead...")
        from sklearn.datasets import make_blobs
        X, y = make_blobs(n_samples=3000, centers=10, n_features=50,
                          random_state=42)
        compare_visualization_methods(X, y)


visualize_mnist()
```

| Aspect | t-SNE | UMAP |
|---|---|---|
| Speed | Slow (O(n²) baseline) | Fast (approximate NN + SGD) |
| Global structure | Often lost | Better preserved |
| Local structure | Excellent | Excellent |
| Scalability | Hard above ~10K points | Handles millions |
| Reproducibility | Varies with init | More consistent |
| Theory | Probabilistic | Topological (fuzzy simplicial sets) |
| Out-of-sample | Not built-in | Supported via transform() |
Deep learning offers powerful tools for manifold learning, particularly when we need to learn parametric mappings that generalize to new data or when we want to generate new samples from the manifold.
Autoencoders:
Autoencoders learn encoder-decoder pairs: an encoder f_θ: ℝᴰ → ℝᵈ that maps data to a low-dimensional code z, and a decoder g_φ: ℝᵈ → ℝᴰ that maps codes back, trained to minimize reconstruction error ‖x − g_φ(f_θ(x))‖².
The bottleneck dimension of Z determines how many coordinates are learned. If bottleneck = intrinsic dimension d, the encoder learns manifold coordinates and decoder learns the embedding.
Variational Autoencoders (VAEs):
VAEs add probabilistic structure: the encoder outputs a distribution q(z|x) over latent codes, the decoder defines a likelihood p(x|z), and training maximizes the evidence lower bound (ELBO): expected reconstruction log-likelihood minus KL(q(z|x) ‖ p(z)).
VAEs enable generation: sample z ~ p(z), then generate x = gφ(z). The regularized latent space is smoother and more suitable for interpolation.
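For concreteness, here is a hedged PyTorch sketch of the two VAE-specific ingredients: the negative-ELBO loss (assuming an MSE reconstruction term; Bernoulli likelihoods are also common) and the reparameterization trick that keeps sampling differentiable. Function names are ours, for illustration:

```python
import torch
import torch.nn.functional as F

def vae_loss(x_recon, x, mu, log_var):
    # Negative ELBO (up to constants): reconstruction + KL regularizer
    recon = F.mse_loss(x_recon, x, reduction='sum')
    # KL( N(mu, sigma^2) || N(0, I) ), closed form for diagonal Gaussians
    kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
    return recon + kl

def reparameterize(mu, log_var):
    # z = mu + sigma * eps keeps sampling differentiable w.r.t. mu, log_var
    eps = torch.randn_like(mu)
    return mu + torch.exp(0.5 * log_var) * eps
```

The KL term is what pulls q(z|x) toward the prior and produces the smoother latent space mentioned above.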
```python
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt


class ManifoldAutoencoder(nn.Module):
    """Simple autoencoder for manifold learning."""

    def __init__(self, input_dim, latent_dim, hidden_dims=[64, 32]):
        super().__init__()

        # Encoder
        encoder_layers = []
        prev_dim = input_dim
        for dim in hidden_dims:
            encoder_layers.extend([nn.Linear(prev_dim, dim), nn.ReLU()])
            prev_dim = dim
        encoder_layers.append(nn.Linear(prev_dim, latent_dim))
        self.encoder = nn.Sequential(*encoder_layers)

        # Decoder (mirror of encoder)
        decoder_layers = []
        prev_dim = latent_dim
        for dim in reversed(hidden_dims):
            decoder_layers.extend([nn.Linear(prev_dim, dim), nn.ReLU()])
            prev_dim = dim
        decoder_layers.append(nn.Linear(prev_dim, input_dim))
        self.decoder = nn.Sequential(*decoder_layers)

    def encode(self, x):
        return self.encoder(x)

    def decode(self, z):
        return self.decoder(z)

    def forward(self, x):
        z = self.encode(x)
        return self.decode(z), z


def train_autoencoder(model, X, epochs=500, lr=1e-3, batch_size=128):
    """Train autoencoder on data X."""
    X_tensor = torch.FloatTensor(X)
    dataset = torch.utils.data.TensorDataset(X_tensor)
    loader = torch.utils.data.DataLoader(dataset, batch_size=batch_size,
                                         shuffle=True)

    optimizer = optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()

    losses = []
    for epoch in range(epochs):
        epoch_loss = 0
        for batch in loader:
            x = batch[0]
            x_recon, z = model(x)
            loss = criterion(x_recon, x)

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()

        losses.append(epoch_loss / len(loader))
        if (epoch + 1) % 100 == 0:
            print(f"Epoch {epoch + 1}: Loss = {losses[-1]:.6f}")

    return losses


# Example: Learn manifold structure of the Swiss roll
def demo_autoencoder_manifold():
    from sklearn.datasets import make_swiss_roll

    X, t = make_swiss_roll(n_samples=2000, noise=0.1, random_state=42)

    # Train autoencoder with 2D latent space (matching intrinsic dim)
    model = ManifoldAutoencoder(input_dim=3, latent_dim=2,
                                hidden_dims=[64, 32])
    losses = train_autoencoder(model, X, epochs=500)

    # Get latent representations
    model.eval()
    with torch.no_grad():
        _, Z = model(torch.FloatTensor(X))
    Z = Z.numpy()

    # Visualize
    fig = plt.figure(figsize=(14, 4))

    ax1 = fig.add_subplot(131, projection='3d')
    ax1.scatter(X[:, 0], X[:, 1], X[:, 2], c=t, cmap='viridis', s=5)
    ax1.set_title('Original Swiss Roll (3D)')

    ax2 = fig.add_subplot(132)
    ax2.scatter(Z[:, 0], Z[:, 1], c=t, cmap='viridis', s=5)
    ax2.set_title('Autoencoder Latent Space (2D)')
    ax2.axis('equal')

    ax3 = fig.add_subplot(133)
    ax3.plot(losses)
    ax3.set_xlabel('Epoch')
    ax3.set_ylabel('Reconstruction Loss')
    ax3.set_title('Training Loss')

    plt.tight_layout()
    plt.show()

    print("The autoencoder learns a 2D parameterization of the manifold.")
    print("Unlike Isomap/LLE, it provides a parametric mapping usable for new points.")


demo_autoencoder_manifold()
```

Contrastive Learning:
Contrastive methods learn representations by:
• Pulling embeddings of different augmented views of the same sample together
• Pushing embeddings of different samples apart
This implicitly learns the data manifold by mapping augmentations—which should lie nearby on the manifold—to nearby embeddings.
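A minimal NumPy sketch of an InfoNCE-style objective (the loss family used by methods such as SimCLR) illustrates the pull-together/push-apart mechanics. `info_nce_loss` is our own illustrative name; rows of `z1` and `z2` are assumed to be embeddings of two augmented views of the same samples:

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.5):
    """InfoNCE over cosine similarities; row i of z1 and z2 is a positive pair."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature                  # (n, n) similarity matrix
    # Softmax cross-entropy with the positive pair on the diagonal;
    # all other rows act as negatives
    log_probs = logits - np.log(np.sum(np.exp(logits), axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
print(info_nce_loss(z, z))                            # low: views agree
print(info_nce_loss(z, rng.normal(size=(8, 16))))     # higher: views unrelated
```

Minimizing this loss forces points that should be nearby on the manifold (augmented views) to have nearby embeddings.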
Normalizing Flows:
Flows learn invertible transformations between data and latent space:
$$x = f_\theta(z), \quad z \sim p(z)$$
With invertibility, we can:
• Compute exact log-likelihoods via the change-of-variables formula
• Generate samples by drawing z ~ p(z) and applying f_θ
• Encode any data point exactly as z = f_θ⁻¹(x)
Flows explicitly model the manifold as a warped version of simple prior space.
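The change-of-variables formula behind flow likelihoods is easy to verify in one dimension. For an affine flow x = a·z + b with z ~ N(0, 1), log p(x) = log p(z) + log|dz/dx| must agree with the analytic density N(b, a²):

```python
import numpy as np
from scipy.stats import norm

# 1-D affine flow: x = a*z + b, with base density z ~ N(0, 1)
a, b = 2.0, 1.0

def flow_log_likelihood(x):
    z = (x - b) / a                  # inverse transform f^{-1}
    log_det = -np.log(abs(a))        # log |dz/dx| = -log|a|
    return norm.logpdf(z) + log_det  # change of variables

x = np.array([0.0, 1.0, 3.0])
print(flow_log_likelihood(x))
print(norm.logpdf(x, loc=b, scale=a))  # analytic N(b, a^2): identical
```

Real flows stack many such invertible layers with tractable Jacobian determinants, but the bookkeeping is exactly this.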
Use classical methods (Isomap, LLE, UMAP) when:
• Visualizing or exploring data structure
• The dataset is small to medium (< 100K points)
• You don't need out-of-sample generalization
• Interpretability of the algorithm matters
Use deep learning methods when:
• You need parametric mappings for new data
• You want to generate new samples
• The dataset is large enough to train neural networks
• You have domain-specific architectures (images, sequences)
Applying manifold learning in practice requires careful thought about method selection, hyperparameter tuning, and result validation.
Method Selection Workflow: start with PCA as a cheap linear baseline; estimate the intrinsic dimension; if the data looks nonlinear, compare a global method (Isomap) against a local one (LLE, UMAP); then validate the embeddings quantitatively rather than by eye.
Hyperparameter Selection:
Most manifold learning methods have critical hyperparameters:
Number of neighbors k (Isomap, LLE, UMAP): Too small → disconnected graph, noise sensitivity. Too large → smooths over local structure, 'short-circuits' across manifold folds.
Perplexity (t-SNE): Related to effective neighborhood size. 5-50 typical; larger for larger datasets.
Target dimension d: Should match intrinsic dimension (which you should estimate first).
min_dist (UMAP): Controls how tightly points cluster. Lower = tighter clusters; higher = more spread.
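The disconnection risk for small k is easy to check directly: count the connected components of the k-NN graph before running Isomap or LLE. A quick diagnostic sketch, not part of any library API:

```python
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import connected_components

X, _ = make_swiss_roll(n_samples=500, noise=0.1, random_state=42)

component_counts = []
for k in [2, 5, 10, 20]:
    graph = kneighbors_graph(X, n_neighbors=k, mode='connectivity')
    # Treat the k-NN graph as undirected when counting components
    n_comp, _ = connected_components(graph, directed=False)
    component_counts.append(n_comp)
    print(f"k={k:2d}: {n_comp} connected component(s)")
```

Increasing k only adds edges, so the component count can only shrink; pick the smallest k that yields a single component, then sanity-check for short-circuit edges.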
Validation Strategies:
```python
import numpy as np
from sklearn.neighbors import NearestNeighbors


def trustworthiness(X_high, X_low, k=10):
    """
    Compute trustworthiness: are low-D neighbors also neighbors in high-D?

    Trustworthiness = 1 means all k nearest neighbors in the embedding
    were also among the k nearest neighbors in the original space.
    """
    n = X_high.shape[0]

    # Find k-NN in both spaces
    nbrs_high = NearestNeighbors(n_neighbors=k + 1).fit(X_high)
    nbrs_low = NearestNeighbors(n_neighbors=k + 1).fit(X_low)
    _, idx_high = nbrs_high.kneighbors(X_high)
    _, idx_low = nbrs_low.kneighbors(X_low)
    idx_high = idx_high[:, 1:]  # Remove self
    idx_low = idx_low[:, 1:]

    # For each point, penalize low-D neighbors that are not high-D neighbors
    violations = 0
    for i in range(n):
        set_high = set(idx_high[i])
        ranks = np.argsort(np.linalg.norm(X_high - X_high[i], axis=1))
        for neighbor in idx_low[i]:
            if neighbor not in set_high:
                # Rank of this 'intruder' in the original space
                rank = np.where(ranks == neighbor)[0][0]
                violations += rank - k  # Penalize by how far out it was

    # Normalize so a perfect embedding scores 1
    # (standard normalizer: n * k * (2n - 3k - 1) / 2)
    norm = n * k * (2 * n - 3 * k - 1) / 2
    trust = 1 - violations / norm if norm > 0 else 1.0
    return max(0.0, min(1.0, trust))


def continuity(X_high, X_low, k=10):
    """
    Compute continuity: are high-D neighbors also neighbors in low-D?
    This is the 'inverse' of trustworthiness: swap the roles of the spaces.
    """
    return trustworthiness(X_low, X_high, k)


def shepard_correlation(X_high, X_low, sample_size=1000):
    """
    Approximate correlation between pairwise distances in high-D and low-D.
    A perfect distance-preserving embedding would have correlation 1.
    """
    n = X_high.shape[0]

    # Sample points to avoid O(n^2) computation on large datasets
    if n > sample_size:
        idx = np.random.choice(n, size=sample_size, replace=False)
        X_high = X_high[idx]
        X_low = X_low[idx]
        n = len(X_high)

    d_high, d_low = [], []
    for i in range(n):
        for j in range(i + 1, n):
            d_high.append(np.linalg.norm(X_high[i] - X_high[j]))
            d_low.append(np.linalg.norm(X_low[i] - X_low[j]))

    return np.corrcoef(d_high, d_low)[0, 1]


# Demo
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap, LocallyLinearEmbedding

X, t = make_swiss_roll(n_samples=500, noise=0.1, random_state=42)

print("=== Embedding Quality Comparison ===")
print("Method          | Trust | Contin | Shepard r")
print("-" * 50)

for name, method in [('Isomap', Isomap(n_components=2, n_neighbors=12)),
                     ('LLE', LocallyLinearEmbedding(n_components=2,
                                                    n_neighbors=12))]:
    X_embed = method.fit_transform(X)
    trust = trustworthiness(X, X_embed, k=10)
    cont = continuity(X, X_embed, k=10)
    shep = shepard_correlation(X, X_embed)
    print(f"{name:15} | {trust:.3f} | {cont:.3f} | {shep:.3f}")
```

This page surveyed the major approaches to estimating and exploiting manifold structure in data. Each method has its place in the practitioner's toolkit.
Module Complete:
You've now completed the Manifold Hypothesis module. You understand:
The subsequent modules in this chapter will dive deep into specific algorithms—MDS, Isomap, LLE, t-SNE, UMAP—building on this foundation.
You now possess the theoretical and practical foundations for manifold learning. The next modules will transform this understanding into mastery of specific algorithms, enabling you to apply the right tool to any high-dimensional data challenge.