When a linear transformation acts on a vector, it typically changes both the direction and magnitude. But for certain special vectors—the eigenvectors—the transformation only scales, leaving the direction unchanged (or exactly reversed). These invariant directions are the key to understanding what a matrix does.
Eigendecomposition factorizes a square matrix into its eigenvectors and eigenvalues, revealing the fundamental structure of the linear transformation. Unlike SVD which works for any matrix, eigendecomposition requires square matrices and may involve complex numbers—but in exchange, it provides unique insights into dynamics, stability, and long-term behavior.
From PageRank's random walk analysis to quantum mechanics' measurement operators, from differential equation solutions to Markov chain steady states, eigendecomposition is the analytical tool that translates matrix algebra into interpretable, actionable understanding.
By the end of this page, you will understand: (1) The definition of eigenvalues, eigenvectors, and eigenspaces, (2) When eigendecomposition exists and when it doesn't, (3) Special properties for symmetric and positive definite matrices, (4) How to compute eigendecomposition, (5) The spectral theorem and diagonalization, and (6) Key applications in ML including PCA, spectral clustering, and Markov chains.
The core definition:
For a square matrix A ∈ ℝⁿˣⁿ (or ℂⁿˣⁿ), a scalar λ is an eigenvalue of A if there exists a non-zero vector v such that:
$$Av = \lambda v$$
The vector v is called an eigenvector corresponding to eigenvalue λ.
Interpretation: Multiplying v by A has the same effect as multiplying by the scalar λ. The direction of v is preserved (or exactly reversed if λ < 0); only the magnitude changes.
The eigenspace:
For each eigenvalue λ, the set of all eigenvectors (plus the zero vector) forms a subspace called the eigenspace: $$E_\lambda = \{v : Av = \lambda v\} = \text{Null}(A - \lambda I)$$
The dimension of Eλ is called the geometric multiplicity of λ.
Finding eigenvalues:
Rearranging Av = λv: $$(A - \lambda I)v = 0$$
For a non-zero solution v to exist, (A - λI) must be singular, which means: $$\det(A - \lambda I) = 0$$
This polynomial equation in λ is called the characteristic polynomial. For an n×n matrix, it's a degree-n polynomial with exactly n roots (counting multiplicity, possibly complex).
The algebraic multiplicity of λ is its multiplicity as a root of the characteristic polynomial. The geometric multiplicity is the dimension of the eigenspace. We always have: geometric ≤ algebraic. When they're unequal, the matrix is called defective and cannot be diagonalized.
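To make the distinction concrete, here is a minimal sketch (assuming SciPy is available for `scipy.linalg.null_space`) that computes both multiplicities for a small defective matrix:

```python
import numpy as np
from scipy.linalg import null_space

# A defective matrix: eigenvalue 1 repeated, but only one independent eigenvector
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

eigenvalues = np.linalg.eigvals(A)
print(f"Eigenvalues: {eigenvalues}")            # [1. 1.] -> algebraic multiplicity 2

lam = 1.0
# Eigenspace E_1 = Null(A - 1*I); its dimension is the geometric multiplicity
E = null_space(A - lam * np.eye(2))
print(f"Basis of eigenspace E_1:\n{E}")
print(f"Geometric multiplicity: {E.shape[1]}")  # 1 < 2, so A is defective
```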
Key properties of eigenvalues:
- The sum of the eigenvalues equals the trace: Σᵢ λᵢ = tr(A).
- The product of the eigenvalues equals the determinant: Πᵢ λᵢ = det(A).
- A is singular if and only if 0 is an eigenvalue.
- The eigenvalues of a triangular matrix are its diagonal entries.
Complex eigenvalues:
For real matrices, complex eigenvalues always come in conjugate pairs: if λ = a + bi is an eigenvalue, so is λ̄ = a - bi. The corresponding eigenvectors are also complex conjugates.
```python
import numpy as np

# Example: a 3x3 matrix with real eigenvalues
A = np.array([
    [4, -2, 0],
    [1, 1, 0],
    [0, 0, 3]
])

# Compute eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

print("Matrix A:")
print(A)
print(f"\nEigenvalues: {eigenvalues}")
print(f"\nEigenvectors (columns):")
print(eigenvectors)

# Verify the eigenvalue equation: Av = λv
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    lam = eigenvalues[i]
    Av = A @ v
    lam_v = lam * v
    print(f"\nλ_{i+1} = {lam:.4f}")
    print(f"Av = {Av}")
    print(f"λv = {lam_v}")
    print(f"Match: {np.allclose(Av, lam_v)}")

# Verify trace and determinant properties
print(f"\nSum of eigenvalues: {np.sum(eigenvalues):.4f}")
print(f"Trace of A: {np.trace(A):.4f}")
print(f"Product of eigenvalues: {np.prod(eigenvalues):.4f}")
print(f"Determinant of A: {np.linalg.det(A):.4f}")

# Example with complex eigenvalues (rotation matrix)
theta = np.pi / 4  # 45 degrees
R = np.array([
    [np.cos(theta), -np.sin(theta)],
    [np.sin(theta), np.cos(theta)]
])

eig_R, _ = np.linalg.eig(R)
print(f"\nRotation matrix (45°) eigenvalues: {eig_R}")
print("Note: Complex eigenvalues indicate rotation without stretching")
```

If we can find n linearly independent eigenvectors for an n×n matrix A, we can express A in a remarkably simple form.
The eigendecomposition:
Let v₁, v₂, ..., vₙ be linearly independent eigenvectors with eigenvalues λ₁, λ₂, ..., λₙ. Form:
- Q = [v₁ | v₂ | ... | vₙ], the matrix whose columns are the eigenvectors
- Λ = diag(λ₁, λ₂, ..., λₙ), the diagonal matrix of eigenvalues
Then: $$A = Q \Lambda Q^{-1}$$
Why this works:
Since Avᵢ = λᵢvᵢ for each eigenvector: $$AQ = A[v_1 | v_2 | ... | v_n] = [\lambda_1 v_1 | \lambda_2 v_2 | ... | \lambda_n v_n] = Q\Lambda$$
Multiplying both sides by Q⁻¹: A = QΛQ⁻¹
When is diagonalization possible?
A matrix is diagonalizable if and only if:
- it has n linearly independent eigenvectors, or equivalently
- the geometric multiplicity equals the algebraic multiplicity for every eigenvalue.

Having n distinct eigenvalues is a sufficient (but not necessary) condition.
Matrices that are NOT diagonalizable (defective matrices) have at least one eigenvalue whose geometric multiplicity is strictly less than its algebraic multiplicity.
The classic example: $$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$$
This has eigenvalue λ = 1 with algebraic multiplicity 2, but geometric multiplicity 1 (only one independent eigenvector). It cannot be diagonalized over ℝ or ℂ.
Once you have A = QΛQ⁻¹, computing matrix powers becomes trivial: Aᵏ = QΛᵏQ⁻¹. Since Λᵏ just raises each diagonal entry to the k-th power, the power step costs only O(n) operations, compared with the repeated O(n³) matrix multiplications of the naive approach. This is essential for analyzing dynamical systems and solving recurrences.
Applications of A = QΛQ⁻¹:
1. Matrix powers: $$A^k = Q\Lambda^k Q^{-1}$$ where Λᵏ = diag(λ₁ᵏ, λ₂ᵏ, ..., λₙᵏ)
2. Matrix exponential: $$e^A = Qe^\Lambda Q^{-1}$$ where exp(Λ) = diag(e^(λ₁), e^(λ₂), ..., e^(λₙ))
This is crucial for solving systems of linear ODEs: dx/dt = Ax.
3. Matrix functions in general: For any function f that can be applied to scalars: $$f(A) = Qf(\Lambda)Q^{-1} = Q \cdot \text{diag}(f(\lambda_1), ..., f(\lambda_n)) \cdot Q^{-1}$$ (a short sketch follows this list).
4. Understanding long-term behavior: If |λᵢ| < 1 for all eigenvalues, then Aᵏ → 0 as k → ∞. If |λᵢ| ≤ 1, and any eigenvalues on the unit circle are semisimple (geometric multiplicity equal to algebraic), the powers Aᵏ stay bounded and the system is marginally stable.
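As a quick illustration of these formulas, here is a minimal sketch (assuming SciPy is available for the reference implementations `scipy.linalg.expm` and `scipy.linalg.sqrtm`) that applies scalar functions to a symmetric matrix through its eigendecomposition:

```python
import numpy as np
from scipy.linalg import expm, sqrtm

# A small symmetric (and positive definite) matrix, so A = Q Λ Q^T
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
eigenvalues, Q = np.linalg.eigh(A)

def matrix_function(f):
    """Apply a scalar function f to A via f(A) = Q diag(f(λ_i)) Q^T."""
    return Q @ np.diag(f(eigenvalues)) @ Q.T

# Matrix exponential: compare against SciPy's expm
print(np.allclose(matrix_function(np.exp), expm(A)))     # True

# Matrix square root (valid here because A is positive definite)
print(np.allclose(matrix_function(np.sqrt), sqrtm(A)))   # True

# Matrix power A^5 via eigenvalues vs. repeated multiplication
print(np.allclose(matrix_function(lambda lam: lam**5),
                  np.linalg.matrix_power(A, 5)))          # True
```

The same pattern works for any scalar function, as long as it is defined on the spectrum of A.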
```python
import numpy as np

def verify_diagonalization(A):
    """
    Verify eigendecomposition A = Q Λ Q^(-1)
    """
    n = A.shape[0]
    eigenvalues, Q = np.linalg.eig(A)

    # Create diagonal matrix
    Lambda = np.diag(eigenvalues)

    # Reconstruct A
    Q_inv = np.linalg.inv(Q)
    A_reconstructed = Q @ Lambda @ Q_inv

    print("Original A:")
    print(A)
    print(f"\nEigenvalues: {eigenvalues}")
    print("\nQ (eigenvectors):")
    print(Q)
    print("\nΛ (eigenvalue diagonal):")
    print(Lambda)
    print(f"\nReconstruction error: {np.linalg.norm(A - A_reconstructed):.2e}")

    return eigenvalues, Q

# Diagonalizable example
A1 = np.array([
    [4, 1],
    [2, 3]
])
print("=== Diagonalizable Matrix ===")
verify_diagonalization(A1)

# Computing matrix powers efficiently
print("\n=== Matrix Powers via Eigendecomposition ===")
eigenvalues, Q = np.linalg.eig(A1)
Lambda = np.diag(eigenvalues)
Q_inv = np.linalg.inv(Q)

k = 100  # Compute A^100

# Method 1: Direct (inefficient for large k)
A_power_direct = np.linalg.matrix_power(A1, k)

# Method 2: Via eigendecomposition (efficient)
Lambda_k = np.diag(eigenvalues ** k)
A_power_eigen = Q @ Lambda_k @ Q_inv

print(f"A^{k} via direct computation:")
print(A_power_direct)
print(f"\nA^{k} via eigendecomposition:")
print(np.real(A_power_eigen))  # Real part (imaginary should be ~0)
print(f"\nDifference: {np.linalg.norm(A_power_direct - A_power_eigen):.2e}")

# Non-diagonalizable (defective) example
print("\n=== Non-Diagonalizable Matrix ===")
A_defective = np.array([
    [1, 1],
    [0, 1]
])
eigenvalues_d, Q_d = np.linalg.eig(A_defective)
print("Matrix:")
print(A_defective)
print(f"Eigenvalues: {eigenvalues_d}")
print("Q (note: columns are NOT independent):")
print(Q_d)
print(f"Condition number of Q: {np.linalg.cond(Q_d):.2e}")
print("(Very high condition number indicates near-singularity)")
```

Symmetric matrices (A = Aᵀ) have exceptionally nice properties that make eigendecomposition particularly powerful in machine learning.
The Spectral Theorem:
For any real symmetric matrix A ∈ ℝⁿˣⁿ with A = Aᵀ:
- All eigenvalues are real.
- Eigenvectors corresponding to distinct eigenvalues are orthogonal.
- There exists a full orthonormal basis of eigenvectors, even when eigenvalues repeat.
This means every symmetric matrix can be written as: $$A = Q\Lambda Q^T$$
where Q is orthogonal (QᵀQ = QQᵀ = I) and Λ is diagonal with real entries.
Why this matters for ML:
Many matrices in ML are symmetric by construction:
- Covariance matrices (XᵀX of centered data)
- Kernel / Gram matrices
- Hessians of smooth loss functions
- Graph Laplacians and adjacency matrices of undirected graphs
For all of these, eigendecomposition is numerically stable, produces real results, and yields orthogonal eigenvectors for easy interpretation.
When Q is orthogonal: Q⁻¹ = Qᵀ. This means we never need to compute a matrix inverse—just transpose! The decomposition A = QΛQᵀ involves only well-conditioned operations, making it numerically robust.
Positive definite matrices:
A symmetric matrix A is positive definite if xᵀAx > 0 for all non-zero x. Equivalently, all of its eigenvalues satisfy λᵢ > 0.
A is positive semi-definite (PSD) if xᵀAx ≥ 0, which means λᵢ ≥ 0.
Properties of PSD matrices:
- All eigenvalues are non-negative, so A = QΛQᵀ with Λ ≥ 0.
- A has a unique PSD square root A^(1/2) = QΛ^(1/2)Qᵀ.
- A can be factored as A = BᵀB (e.g., via Cholesky when A is positive definite).
- All diagonal entries are non-negative.
The Rayleigh quotient:
For symmetric A, the Rayleigh quotient is: $$R(x) = \frac{x^T A x}{x^T x}$$
This function is minimized by the eigenvector of the smallest eigenvalue and maximized by the eigenvector of the largest eigenvalue, with the extreme values being those eigenvalues themselves: $$\lambda_{min} = \min_{x \neq 0} R(x), \quad \lambda_{max} = \max_{x \neq 0} R(x)$$
This is the foundation of power iteration and many optimization algorithms.
```python
import numpy as np

def analyze_symmetric_matrix(A):
    """
    Analyze properties of a symmetric matrix using spectral theorem.
    """
    # Verify symmetry
    assert np.allclose(A, A.T), "Matrix is not symmetric!"

    # Use eigh for symmetric matrices (more stable, guaranteed real)
    eigenvalues, Q = np.linalg.eigh(A)

    # eigh returns in ascending order, often we want descending
    idx = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[idx]
    Q = Q[:, idx]

    print("Symmetric Matrix Analysis")
    print("=" * 50)
    print("Matrix A:")
    print(A)
    print(f"\nEigenvalues: {eigenvalues}")
    print(f"All real: {np.all(np.isreal(eigenvalues))}")

    # Check positive definiteness
    if np.all(eigenvalues > 0):
        print("Definiteness: Positive Definite")
    elif np.all(eigenvalues >= 0):
        print("Definiteness: Positive Semi-Definite")
    elif np.all(eigenvalues < 0):
        print("Definiteness: Negative Definite")
    else:
        print("Definiteness: Indefinite")

    # Verify orthogonality of eigenvectors
    print(f"\nQ (eigenvectors):")
    print(Q)
    print(f"\nQ^T Q (should be identity):")
    print(np.round(Q.T @ Q, 10))

    # Verify reconstruction
    Lambda = np.diag(eigenvalues)
    A_reconstructed = Q @ Lambda @ Q.T
    print(f"\nReconstruction error ||A - QΛQ^T||: {np.linalg.norm(A - A_reconstructed):.2e}")

    return eigenvalues, Q

# Example 1: Positive definite matrix (covariance)
np.random.seed(42)
X = np.random.randn(5, 3)
Cov = X.T @ X  # Always PSD
print("=== Positive Definite Covariance Matrix ===")
analyze_symmetric_matrix(Cov)

# Example 2: Graph Laplacian
print("\n=== Graph Laplacian ===")
# Adjacency matrix for a simple graph
Adj = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 0]
])
Degree = np.diag(Adj.sum(axis=1))
Laplacian = Degree - Adj
analyze_symmetric_matrix(Laplacian)

# Rayleigh quotient demonstration
print("\n=== Rayleigh Quotient ===")
A = np.array([
    [4, 1, 1],
    [1, 3, 0],
    [1, 0, 2]
])
eigenvalues, Q = np.linalg.eigh(A)

for i, (lam, v) in enumerate(zip(eigenvalues, Q.T)):
    rayleigh = (v @ A @ v) / (v @ v)
    print(f"Eigenvector {i+1}: eigenvalue = {lam:.4f}, Rayleigh quotient = {rayleigh:.4f}")
```

Computing eigenvalues and eigenvectors is a fundamental numerical problem with well-developed algorithms.
The challenge:
While the characteristic polynomial det(A - λI) = 0 defines the eigenvalues, computing them by finding polynomial roots is numerically ill-conditioned, and by the Abel–Ruffini theorem there is no general closed-form solution for degree ≥ 5. Modern algorithms therefore avoid forming the characteristic polynomial explicitly.
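The following sketch (illustrative only; the exact error magnitudes depend on the random seed) hints at why: recovering the eigenvalues of a 20×20 matrix by rooting its characteristic polynomial is far less accurate than calling an eigensolver directly.

```python
import numpy as np

np.random.seed(0)

# Symmetric matrix with known eigenvalues 1, 2, ..., 20
true_eigs = np.arange(1.0, 21.0)
V, _ = np.linalg.qr(np.random.randn(20, 20))   # random orthogonal matrix
A = V @ np.diag(true_eigs) @ V.T

# Route 1: dedicated eigensolver
eigs_direct = np.sort(np.linalg.eigvalsh(A))

# Route 2: characteristic polynomial coefficients, then polynomial root-finding
coeffs = np.poly(A)                 # coefficients of det(A - λI), a degree-20 polynomial
eigs_poly = np.sort(np.roots(coeffs).real)

print(f"Max error via eigvalsh: {np.max(np.abs(eigs_direct - true_eigs)):.2e}")
print(f"Max error via char. polynomial roots: {np.max(np.abs(eigs_poly - true_eigs)):.2e}")
```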
Core algorithms:
1. Power Iteration (for largest eigenvalue): Repeatedly multiply a vector by A and normalize; the iterate converges to the dominant eigenvector, at a rate governed by the ratio |λ₂/λ₁|.
2. Inverse Iteration (for eigenvalue near μ): Apply power iteration to (A − μI)⁻¹; this converges to the eigenvector whose eigenvalue is closest to the shift μ.
3. QR Algorithm (for all eigenvalues): Iterate Aₖ = QₖRₖ, Aₖ₊₁ = RₖQₖ; the sequence converges to an upper (quasi-)triangular form whose diagonal holds the eigenvalues. With a Hessenberg reduction and shifts, this is the workhorse behind np.linalg.eig; a minimal unshifted version is sketched after this list.
4. Symmetric QR / Jacobi: Specialized variants for symmetric matrices: tridiagonalize first and then apply QR iterations, or use Jacobi rotations to systematically zero out off-diagonal entries. Both preserve symmetry and return orthogonal eigenvectors.
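Here is a minimal, unshifted QR-iteration sketch for a symmetric matrix (production implementations add tridiagonal reduction and shifts, which this toy version omits):

```python
import numpy as np

def qr_algorithm(A, num_iterations=500):
    """Unshifted QR iteration: A_{k+1} = R_k Q_k where A_k = Q_k R_k.

    For a symmetric matrix with distinct eigenvalue magnitudes, A_k
    converges toward a diagonal matrix holding the eigenvalues.
    """
    Ak = A.copy().astype(float)
    for _ in range(num_iterations):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q          # similar to A_k, so eigenvalues are preserved
    return np.sort(np.diag(Ak))

np.random.seed(1)
B = np.random.randn(5, 5)
A = B + B.T                  # symmetric test matrix

print("QR iteration:      ", qr_algorithm(A).round(6))
print("np.linalg.eigvalsh:", np.sort(np.linalg.eigvalsh(A)).round(6))
```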
NumPy provides np.linalg.eig (general matrices) and np.linalg.eigh (symmetric/Hermitian). The 'h' version is faster, more stable, and guarantees real eigenvalues—always use it for symmetric matrices! SciPy adds even more specialized routines for banded, sparse, and tridiagonal matrices.
Computational complexity:
| Algorithm | Time | Notes |
|---|---|---|
| Full eigendecomposition (dense) | O(n³) | Standard for small-medium |
| QR algorithm | O(n³) | Dominant for general matrices |
| Power iteration (1 eigenpair) | O(n²) per iteration | Simple, parallelizable |
| Lanczos (k eigenpairs, symmetric) | O(kn²) | For large sparse matrices |
| Arnoldi (k eigenpairs, general) | O(kn²) | For large sparse matrices |
For large sparse matrices:
When n is large but the matrix is sparse (few non-zero entries), iterative methods like Lanczos and Arnoldi compute only the dominant eigenpairs without forming dense intermediate matrices. These methods are essential for large-scale applications such as PCA on high-dimensional data, spectral methods on massive graphs, and PageRank-style computations over the web graph.
```python
import numpy as np
from scipy.sparse.linalg import eigsh

def power_iteration(A, num_iterations=100, tol=1e-10):
    """
    Power iteration to find the largest eigenvalue/eigenvector.

    Returns:
        eigenvalue: Largest eigenvalue (by magnitude)
        eigenvector: Corresponding unit eigenvector
        history: List of eigenvalue estimates per iteration
    """
    n = A.shape[0]

    # Random initialization
    x = np.random.randn(n)
    x = x / np.linalg.norm(x)

    history = []

    for i in range(num_iterations):
        # Power step
        y = A @ x

        # Estimate eigenvalue via Rayleigh quotient
        eigenvalue = x @ y  # Since x is unit norm
        history.append(eigenvalue)

        # Normalize
        x_new = y / np.linalg.norm(y)

        # Check convergence
        if np.linalg.norm(x_new - x) < tol:
            print(f"Converged in {i+1} iterations")
            return eigenvalue, x_new, history

        x = x_new

    print(f"Reached max iterations ({num_iterations})")
    return eigenvalue, x, history

# Test on a symmetric matrix
np.random.seed(42)
n = 100
A = np.random.randn(n, n)
A = A @ A.T  # Make symmetric positive semidefinite

# Power iteration
lam_pow, v_pow, history = power_iteration(A)

# Compare with numpy
eigenvalues_np, eigenvectors_np = np.linalg.eigh(A)
lam_max = eigenvalues_np[-1]  # Largest (eigh returns ascending)

print(f"\nPower iteration eigenvalue: {lam_pow:.6f}")
print(f"NumPy largest eigenvalue: {lam_max:.6f}")
print(f"Relative error: {abs(lam_pow - lam_max) / lam_max:.2e}")

# For sparse matrices, use scipy.sparse.linalg.eigsh
from scipy.sparse import random as sparse_random

print("\n=== Sparse Eigenvalue Computation ===")
n_large = 1000
k_eigenpairs = 5
A_sparse = sparse_random(n_large, n_large, density=0.01, format='csr')
A_sparse = A_sparse @ A_sparse.T  # Symmetric

# Get top k eigenvalues
eigenvalues_sparse, eigenvectors_sparse = eigsh(A_sparse, k=k_eigenpairs, which='LM')
print(f"Matrix size: {n_large} x {n_large}")
print(f"Top {k_eigenpairs} eigenvalues: {eigenvalues_sparse[::-1]}")
```

Beyond algebraic manipulation, eigendecomposition reveals deep geometric structure.
Eigenvectors as invariant directions:
When A acts on space, most vectors get rotated. Eigenvectors are the special directions that only get scaled:
- λ > 1: the direction is stretched
- 0 < λ < 1: the direction is shrunk
- λ < 0: the direction is flipped (and scaled by |λ|)
- λ = 0: the direction is collapsed onto the origin (A is singular)
For symmetric matrices, eigenvectors are mutually orthogonal, creating a natural coordinate system aligned with the matrix's action.
The ellipsoid picture for symmetric A:
Consider the quadratic form q(x) = xᵀAx for positive definite A.
The level set {x : xᵀAx = 1} is an ellipsoid:
- its principal axes point along the eigenvectors of A
- the semi-axis length along eigenvector vᵢ is 1/√λᵢ (large eigenvalue ⇒ short axis)
This is fundamental to understanding:
- the shape of Gaussian distributions (covariance ellipsoids)
- level sets of quadratic loss functions and the behavior of gradient descent
- Mahalanobis distance and whitening transforms
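A minimal sketch of this picture for a 2×2 positive definite matrix (the specific matrix is just an example) checks that the point at distance 1/√λᵢ along each eigenvector lies exactly on the level set xᵀAx = 1:

```python
import numpy as np

# Example positive definite matrix
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
eigenvalues, Q = np.linalg.eigh(A)

for lam, v in zip(eigenvalues, Q.T):
    semi_axis = 1.0 / np.sqrt(lam)       # semi-axis length along eigenvector v
    x = semi_axis * v                    # point on the ellipse along that axis
    print(f"λ = {lam:.4f}, semi-axis = {semi_axis:.4f}, x^T A x = {x @ A @ x:.6f}")
    # x^T A x = (1/λ) v^T A v = (1/λ) λ = 1, so every printed value is 1
```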
Condition number:
The ratio κ(A) = |λₘₐₓ|/|λₘᵢₙ| (condition number) measures how "stretched" the ellipsoid is:
- κ ≈ 1: nearly spherical, well-conditioned
- κ ≫ 1: a long, thin ellipsoid, ill-conditioned
Ill-conditioned matrices cause numerical instability and slow convergence in iterative methods.
The condition number matters whenever you solve Ax = b or optimize quadratics. For κ = 10⁶, solving the linear system can lose about 6 digits of precision. Preconditioning (transforming to reduce κ) is a critical technique for large-scale ML optimization.
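To make the precision loss concrete, the following sketch (using the notoriously ill-conditioned Hilbert matrix as an example) solves Ax = b with a known solution and reports how much accuracy survives:

```python
import numpy as np
from scipy.linalg import hilbert

for n in (5, 10, 12):
    A = hilbert(n)                        # Hilbert matrices are famously ill-conditioned
    kappa = np.linalg.cond(A)
    x_true = np.ones(n)
    b = A @ x_true
    x_solved = np.linalg.solve(A, b)
    err = np.max(np.abs(x_solved - x_true))
    print(f"n={n:2d}  κ(A) ≈ {kappa:.2e}  max solution error ≈ {err:.2e}")

# Rough rule of thumb: with κ ≈ 10^k, expect to lose about k of the ~16 double-precision digits.
```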
Eigenvalues and matrix dynamics:
For discrete dynamical systems xₖ₊₁ = Axₖ, expanding x₀ in the eigenbasis shows that the component along each eigenvector vᵢ evolves as λᵢᵏ: it decays if |λᵢ| < 1 and grows if |λᵢ| > 1.
The dominant eigenvalue (largest |λ|) determines long-term behavior: $$x_k \approx \lambda_1^k (c_1 v_1) \text{ as } k \to \infty$$
where v₁ is the dominant eigenvector and c₁ is determined by initial conditions.
Complex eigenvalues indicate oscillation:
If λ = re^(iθ) with θ ≠ 0, π:
- r = |λ| controls whether the oscillation grows (r > 1) or decays (r < 1)
- θ is the rotation angle per step, so the oscillation period is roughly 2π/θ steps
This is why complex eigenvalues come in conjugate pairs for real matrices—they represent oscillatory modes.
```python
import numpy as np
import matplotlib.pyplot as plt

def visualize_matrix_dynamics(A, x0, num_steps=50):
    """
    Visualize trajectory of discrete dynamical system x_{k+1} = Ax_k
    """
    eigenvalues, _ = np.linalg.eig(A)

    # Simulate trajectory
    trajectory = [x0]
    x = x0.copy()
    for _ in range(num_steps):
        x = A @ x
        trajectory.append(x.copy())
    trajectory = np.array(trajectory)

    # Stability analysis
    max_eig = np.max(np.abs(eigenvalues))
    if max_eig < 1:
        stability = "Stable (converges to 0)"
    elif max_eig == 1:
        stability = "Marginally stable"
    else:
        stability = "Unstable (diverges)"

    fig, axes = plt.subplots(1, 2, figsize=(14, 5))

    # Plot trajectory
    axes[0].plot(trajectory[:, 0], trajectory[:, 1], 'b.-', alpha=0.7)
    axes[0].plot(x0[0], x0[1], 'go', markersize=10, label='Start')
    axes[0].plot(trajectory[-1, 0], trajectory[-1, 1], 'r*', markersize=15, label='End')
    axes[0].axhline(y=0, color='k', linewidth=0.5)
    axes[0].axvline(x=0, color='k', linewidth=0.5)
    axes[0].set_xlabel('x₁')
    axes[0].set_ylabel('x₂')
    axes[0].set_title(f'Trajectory: {stability}')
    axes[0].legend()
    axes[0].axis('equal')

    # Plot eigenvalues in complex plane
    theta = np.linspace(0, 2*np.pi, 100)
    axes[1].plot(np.cos(theta), np.sin(theta), 'k--', alpha=0.3, label='Unit circle')
    axes[1].scatter(eigenvalues.real, eigenvalues.imag, s=100, c='red', zorder=5)
    for i, lam in enumerate(eigenvalues):
        axes[1].annotate(f'λ{i+1}={lam:.2f}', (lam.real, lam.imag),
                         textcoords="offset points", xytext=(10, 5))
    axes[1].axhline(y=0, color='k', linewidth=0.5)
    axes[1].axvline(x=0, color='k', linewidth=0.5)
    axes[1].set_xlim(-2, 2)
    axes[1].set_ylim(-2, 2)
    axes[1].set_aspect('equal')
    axes[1].set_xlabel('Real')
    axes[1].set_ylabel('Imaginary')
    axes[1].set_title('Eigenvalues in Complex Plane')
    axes[1].legend()

    plt.tight_layout()
    plt.show()

    print(f"Eigenvalues: {eigenvalues}")
    print(f"Max |eigenvalue|: {max_eig:.4f}")

# Example 1: Stable (contraction)
A_stable = 0.9 * np.array([[np.cos(0.3), -np.sin(0.3)],
                           [np.sin(0.3), np.cos(0.3)]])
x0 = np.array([2, 1])
print("=== Stable System (Rotation + Contraction) ===")
visualize_matrix_dynamics(A_stable, x0)

# Example 2: Unstable (expansion)
A_unstable = 1.1 * np.eye(2) + 0.1 * np.array([[0, 1], [-1, 0]])
print("\n=== Unstable System (Spiral outward) ===")
visualize_matrix_dynamics(A_unstable, x0, num_steps=30)

# Example 3: Saddle (one stable, one unstable direction)
A_saddle = np.array([[1.2, 0], [0, 0.8]])
print("\n=== Saddle Point (Mixed stability) ===")
visualize_matrix_dynamics(A_saddle, x0)
```

Eigendecomposition is not just mathematical elegance—it's the computational engine behind many ML methods.
1. Principal Component Analysis (PCA):
PCA finds orthogonal directions of maximum variance. Given a centered data matrix X ∈ ℝⁿˣᵈ:
- Form the covariance matrix C = XᵀX / (n − 1).
- Eigendecompose C = QΛQᵀ and sort eigenvalues in decreasing order.
- The top-k eigenvectors are the principal components; projecting X onto them gives the reduced representation.
- Each eigenvalue λᵢ is the variance captured along its component.
Eigendecomposition of the symmetric PSD matrix C gives ordered directions of decreasing importance—exactly what PCA needs.
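A minimal from-scratch sketch of this recipe (compared against scikit-learn's PCA, which is assumed to be installed) looks like this:

```python
import numpy as np
from sklearn.decomposition import PCA

np.random.seed(42)
X = np.random.randn(200, 5) @ np.random.randn(5, 5)   # correlated synthetic data
Xc = X - X.mean(axis=0)                                # center the data

# Eigendecomposition of the covariance matrix
C = Xc.T @ Xc / (Xc.shape[0] - 1)
eigenvalues, Q = np.linalg.eigh(C)
idx = np.argsort(eigenvalues)[::-1]                    # sort descending
eigenvalues, Q = eigenvalues[idx], Q[:, idx]

k = 2
Z_ours = Xc @ Q[:, :k]                                 # project onto top-k components

# Reference: scikit-learn
pca = PCA(n_components=k)
Z_sklearn = pca.fit_transform(X)

print("Explained variance (ours):   ", eigenvalues[:k].round(4))
print("Explained variance (sklearn):", pca.explained_variance_.round(4))
# Components are defined only up to sign, so compare absolute values
print("Projections match up to sign:",
      np.allclose(np.abs(Z_ours), np.abs(Z_sklearn), atol=1e-8))
```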
2. Spectral Clustering:
To cluster data based on graph structure:
- Build a similarity graph with weight matrix W (e.g., an RBF kernel between points).
- Form the normalized Laplacian L = I − D^(−1/2) W D^(−1/2), where D is the degree matrix.
- Compute the k eigenvectors of L with the smallest eigenvalues and stack them as columns of Z.
- Run k-means on the rows of Z.
The eigenvectors of L encode cluster structure—points in the same cluster have similar eigenvector coordinates.
3. Kernel PCA:
For nonlinear dimensionality reduction:
- Compute the kernel matrix K with Kᵢⱼ = k(xᵢ, xⱼ) for a chosen kernel k.
- Center K in feature space.
- Eigendecompose the centered K and keep the top eigenvectors.
- The projections onto these (scaled) eigenvectors give the nonlinear components.
The eigenvectors of K correspond to principal components in the kernel feature space.
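A compact sketch of this recipe (the RBF bandwidth gamma=2 and the comparison against sklearn.decomposition.KernelPCA are illustrative choices; small sign flips between implementations are expected):

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.datasets import make_moons
from sklearn.decomposition import KernelPCA

X, _ = make_moons(n_samples=200, noise=0.05, random_state=0)
gamma, k = 2.0, 2

# Step 1: RBF kernel matrix K_ij = exp(-gamma ||x_i - x_j||^2)
K = np.exp(-gamma * cdist(X, X, 'sqeuclidean'))

# Step 2: double-center K (centering in feature space)
n = K.shape[0]
one_n = np.ones((n, n)) / n
K_c = K - one_n @ K - K @ one_n + one_n @ K @ one_n

# Step 3: top-k eigenpairs of the centered kernel
eigenvalues, eigenvectors = np.linalg.eigh(K_c)
eigenvalues, eigenvectors = eigenvalues[::-1][:k], eigenvectors[:, ::-1][:, :k]

# Step 4: projections of the training points onto the kernel components
Z_ours = eigenvectors * np.sqrt(eigenvalues)

# Reference: scikit-learn (components are defined up to sign)
Z_sk = KernelPCA(n_components=k, kernel='rbf', gamma=gamma).fit_transform(X)
print("Match up to sign:", np.allclose(np.abs(Z_ours), np.abs(Z_sk), atol=1e-6))
```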
Google's PageRank algorithm finds the dominant eigenvector of a modified web graph transition matrix. The eigenvector with eigenvalue 1 gives the stationary distribution of a random surfer—web pages with higher eigenvector components are more 'important'. This made eigendecomposition central to web search.
4. Markov Chains and Stationary Distributions:
For a transition matrix P (rows sum to 1):
- λ = 1 is always an eigenvalue, and all eigenvalues satisfy |λ| ≤ 1.
- A stationary distribution π satisfies πP = π, i.e., π is a left eigenvector of P (equivalently, an eigenvector of Pᵀ) with eigenvalue 1.
- For an irreducible, aperiodic chain the stationary distribution is unique and the chain converges to it from any start; the second-largest |λ| controls the mixing rate.
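A minimal sketch (the 3-state transition matrix is an arbitrary example) extracts the stationary distribution from the eigenvector of Pᵀ with eigenvalue 1 and checks it against long-run behavior:

```python
import numpy as np

# Example 3-state transition matrix (rows sum to 1)
P = np.array([[0.9, 0.05, 0.05],
              [0.1, 0.8,  0.1 ],
              [0.2, 0.2,  0.6 ]])

# Stationary distribution: eigenvector of P^T for eigenvalue 1, normalized to sum to 1
eigenvalues, eigenvectors = np.linalg.eig(P.T)
idx = np.argmin(np.abs(eigenvalues - 1.0))
pi = np.real(eigenvectors[:, idx])
pi = pi / pi.sum()

print("Stationary distribution:", pi.round(4))
print("Check pi P = pi:", np.allclose(pi @ P, pi))

# Long-run check: every row of P^k converges to pi
P_100 = np.linalg.matrix_power(P, 100)
print("Rows of P^100:\n", P_100.round(4))
```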
5. Linear Dynamical Systems:
The system dx/dt = Ax has solution x(t) = exp(At)x₀.
Eigendecomposition gives: $$x(t) = Qe^{\Lambda t}Q^{-1}x_0 = \sum_i c_i e^{\lambda_i t} v_i$$ where the coefficients cᵢ come from expanding x₀ in the eigenbasis. The real parts of the eigenvalues determine growth or decay of each mode; the imaginary parts produce oscillation.
6. Hessian Analysis for Optimization:
At a critical point of f(x), the Hessian H = ∇²f determines the local geometry:
- all eigenvalues positive ⇒ local minimum
- all eigenvalues negative ⇒ local maximum
- mixed signs ⇒ saddle point
- eigenvalues near zero ⇒ flat directions (common in deep-learning loss landscapes)
The smallest eigenvalue direction indicates where the function curves least—important for understanding loss surface geometry.
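A small sketch (using hand-coded Hessians of two illustrative two-variable functions) classifies critical points from the eigenvalue signs:

```python
import numpy as np

def classify_critical_point(H):
    """Classify a critical point from the eigenvalues of its Hessian."""
    eigenvalues = np.linalg.eigvalsh(H)
    if np.all(eigenvalues > 0):
        kind = "local minimum"
    elif np.all(eigenvalues < 0):
        kind = "local maximum"
    elif np.any(eigenvalues > 0) and np.any(eigenvalues < 0):
        kind = "saddle point"
    else:
        kind = "degenerate (zero eigenvalue, second-order test inconclusive)"
    print(f"Hessian eigenvalues: {eigenvalues.round(4)} -> {kind}")

# f(x, y) = 3x^2 + 2xy + 2y^2 has Hessian [[6, 2], [2, 4]] at its critical point (0, 0)
classify_critical_point(np.array([[6.0, 2.0], [2.0, 4.0]]))

# f(x, y) = x^2 - y^2 has Hessian [[2, 0], [0, -2]] at (0, 0): a saddle
classify_critical_point(np.array([[2.0, 0.0], [0.0, -2.0]]))
```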
```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.cluster import KMeans, SpectralClustering
import matplotlib.pyplot as plt

def spectral_clustering_from_scratch(X, n_clusters, sigma=0.5):
    """
    Implement spectral clustering using eigendecomposition.

    Steps:
    1. Build similarity graph (RBF kernel)
    2. Compute normalized Laplacian
    3. Find bottom k eigenvectors
    4. Cluster in eigenvector space
    """
    n = X.shape[0]

    # Step 1: Compute similarity matrix (RBF kernel)
    from scipy.spatial.distance import cdist
    distances = cdist(X, X, 'sqeuclidean')
    W = np.exp(-distances / (2 * sigma**2))
    np.fill_diagonal(W, 0)  # No self-loops

    # Step 2: Compute normalized Laplacian L = I - D^(-1/2) W D^(-1/2)
    D = np.diag(W.sum(axis=1))
    D_inv_sqrt = np.diag(1.0 / np.sqrt(W.sum(axis=1) + 1e-10))
    L_norm = np.eye(n) - D_inv_sqrt @ W @ D_inv_sqrt

    # Step 3: Eigendecomposition (get smallest eigenvalues)
    eigenvalues, eigenvectors = np.linalg.eigh(L_norm)

    # Take the k smallest eigenvectors
    # For the normalized Laplacian, the smallest eigenvalue is 0
    Z = eigenvectors[:, :n_clusters]

    # Normalize rows
    Z_normalized = Z / np.linalg.norm(Z, axis=1, keepdims=True)

    # Step 4: K-means clustering in eigenvector space
    kmeans = KMeans(n_clusters=n_clusters, random_state=42, n_init=10)
    labels = kmeans.fit_predict(Z_normalized)

    return labels, eigenvalues, Z

# Generate non-linearly separable data
X, y_true = make_moons(n_samples=300, noise=0.05, random_state=42)

# Our implementation
labels_scratch, eigenvalues, Z = spectral_clustering_from_scratch(X, n_clusters=2)

# Sklearn implementation for comparison
spectral = SpectralClustering(n_clusters=2, affinity='rbf', gamma=2, random_state=42)
labels_sklearn = spectral.fit_predict(X)

# Visualization
fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# Original data with true labels
axes[0, 0].scatter(X[:, 0], X[:, 1], c=y_true, cmap='coolwarm', alpha=0.7)
axes[0, 0].set_title('Original Data (True Labels)')

# Our spectral clustering result
axes[0, 1].scatter(X[:, 0], X[:, 1], c=labels_scratch, cmap='coolwarm', alpha=0.7)
axes[0, 1].set_title('Spectral Clustering (Our Implementation)')

# Eigenvector embedding
axes[1, 0].scatter(Z[:, 0], Z[:, 1], c=labels_scratch, cmap='coolwarm', alpha=0.7)
axes[1, 0].set_xlabel('Eigenvector 1')
axes[1, 0].set_ylabel('Eigenvector 2')
axes[1, 0].set_title('Eigenvector Space (Linearly Separable!)')

# Eigenvalue spectrum
axes[1, 1].bar(range(len(eigenvalues[:20])), eigenvalues[:20])
axes[1, 1].set_xlabel('Index')
axes[1, 1].set_ylabel('Eigenvalue')
axes[1, 1].set_title('Eigenvalue Spectrum (Eigengap at k=2)')
axes[1, 1].axhline(y=eigenvalues[2], color='r', linestyle='--', label='Gap after λ₂')

plt.tight_layout()
plt.show()

# Account for possible label permutation between the two clusters
accuracy = max(np.mean(labels_scratch == y_true), np.mean(labels_scratch != y_true))
print(f"Clustering accuracy: {accuracy:.2%}")
print(f"First few eigenvalues: {eigenvalues[:5].round(4)}")
```

We've explored eigendecomposition comprehensively—from definitions through spectral theory to practical ML applications. Let's consolidate the essential insights.
What's next:
We'll explore QR Decomposition—a fundamental factorization that underlies eigenvalue algorithms, least squares solvers, and numerical stability techniques. While less glamorous than SVD or eigendecomposition, QR is the computational workhorse that makes many algorithms practical.
You now understand eigendecomposition in depth—when it exists, how to compute it, what it reveals geometrically, and how it powers core ML algorithms. Combined with SVD, you have the two most important matrix decompositions for machine learning.