Mathematics becomes powerful when abstract symbols acquire geometric meaning. The eigenvalue equation Av = λv is more than algebra—it describes a profound geometric relationship: eigenvectors are the directions along which a linear transformation acts most simply.
Imagine a rubber sheet being stretched, compressed, and sheared. Most points move in complicated ways. But certain directions—the eigenvectors—experience only pure stretching or compression, no rotation or shearing. These are the 'natural axes' of the transformation.
Understanding this geometry transforms eigenanalysis from symbol manipulation into visual intuition. When you see a covariance matrix in PCA, you'll visualize ellipsoids and principal axes. When you analyze graph Laplacians, you'll see smooth functions over clusters. This page builds that vision.
By the end of this page, you will visualize eigenvectors as invariant directions, interpret eigenvalues as scaling factors, understand how different eigenvalue types (positive, negative, zero, complex) produce different geometric effects, and connect this geometry to the ellipsoid interpretation of positive definite matrices—the foundation of PCA visualization.
Consider what happens when we apply a matrix A to various vectors. Most vectors rotate as well as scale—their direction changes. But eigenvectors are special: they only scale, staying on the same line through the origin.
Definition revisited geometrically:
An eigenvector v is a direction in space that remains invariant under the transformation A. The vector may stretch (|λ| > 1), compress (|λ| < 1), flip (λ < 0), or vanish (λ = 0), but it never rotates off its original line.
Think of a matrix as a machine that takes input vectors and outputs transformed vectors. Eigenvectors are the fixed directions of this machine—the only inputs whose outputs point the same way (or exactly opposite).
```python
import numpy as np
import matplotlib.pyplot as plt

def visualize_transformation(A, title="Linear Transformation"):
    """
    Visualize how a matrix transforms vectors, highlighting
    eigenvectors as invariant directions.
    """
    fig, axes = plt.subplots(1, 2, figsize=(14, 6))

    # Compute eigenvectors and eigenvalues
    eigenvalues, eigenvectors = np.linalg.eig(A)

    # Create a grid of test vectors (unit circle)
    theta = np.linspace(0, 2*np.pi, 100)
    circle_x = np.cos(theta)
    circle_y = np.sin(theta)

    # Transform the circle
    circle_points = np.vstack([circle_x, circle_y])
    transformed = A @ circle_points

    # Left plot: Before transformation
    ax1 = axes[0]
    ax1.plot(circle_x, circle_y, 'b-', linewidth=2, label='Unit circle')
    ax1.set_xlim(-3, 3)
    ax1.set_ylim(-3, 3)
    ax1.set_aspect('equal')
    ax1.axhline(y=0, color='gray', linestyle='-', linewidth=0.5)
    ax1.axvline(x=0, color='gray', linestyle='-', linewidth=0.5)
    ax1.set_title('Before Transformation', fontsize=12)
    ax1.grid(True, alpha=0.3)

    # Plot eigenvectors on left
    colors = ['red', 'green']
    for i in range(len(eigenvalues)):
        if np.isreal(eigenvalues[i]):
            v = eigenvectors[:, i].real
            ax1.arrow(0, 0, v[0], v[1], head_width=0.1, head_length=0.1,
                      fc=colors[i], ec=colors[i], linewidth=2)
            ax1.text(v[0]*1.2, v[1]*1.2, f'v{i+1}', fontsize=12, color=colors[i])

    # Right plot: After transformation
    ax2 = axes[1]
    ax2.plot(transformed[0], transformed[1], 'b-', linewidth=2,
             label='Transformed ellipse')
    ax2.set_xlim(-3, 3)
    ax2.set_ylim(-3, 3)
    ax2.set_aspect('equal')
    ax2.axhline(y=0, color='gray', linestyle='-', linewidth=0.5)
    ax2.axvline(x=0, color='gray', linestyle='-', linewidth=0.5)
    ax2.set_title('After Transformation', fontsize=12)
    ax2.grid(True, alpha=0.3)

    # Plot transformed eigenvectors (should be scaled versions)
    for i in range(len(eigenvalues)):
        if np.isreal(eigenvalues[i]):
            v = eigenvectors[:, i].real
            Av = (A @ v).real
            # Original direction (shown faintly for comparison)
            ax2.arrow(0, 0, v[0], v[1], head_width=0.1, head_length=0.1,
                      fc=colors[i], ec=colors[i], linewidth=1, alpha=0.3)
            # Transformed (should be parallel!)
            ax2.arrow(0, 0, Av[0], Av[1], head_width=0.1, head_length=0.1,
                      fc=colors[i], ec=colors[i], linewidth=2)
            lam = eigenvalues[i].real
            ax2.text(Av[0]*1.1, Av[1]*1.1,
                     f'Av{i+1} = λ{i+1}v{i+1} (λ={lam:.2f})',
                     fontsize=10, color=colors[i])

    plt.suptitle(f'{title}\nA = {A.tolist()}', fontsize=14)
    plt.tight_layout()
    plt.savefig('transformation_visualization.png', dpi=150)
    plt.show()

    print(f"Eigenvalues: {eigenvalues}")
    print(f"Eigenvectors (columns):\n{eigenvectors}")

# Example: Stretching matrix
A = np.array([
    [2, 1],
    [1, 2]
])
visualize_transformation(A, "Symmetric Stretching")
```

The unit circle test:
A powerful way to visualize a 2D linear transformation is to see what it does to the unit circle. Every invertible 2D linear transformation maps the unit circle to an ellipse. For a symmetric matrix, the eigenvectors point along the major and minor axes of this ellipse, and the absolute values of the eigenvalues give the lengths of the semi-axes. (For non-symmetric matrices, the ellipse axes come from the singular vectors instead; the eigenvector picture applies exactly in the symmetric case, which includes every covariance matrix.)
This is exactly what PCA exploits: the covariance matrix's eigenvectors define the ellipsoid that best fits the data cloud, and eigenvalues measure the spread along each axis.
If eigenvectors are the 'special directions,' eigenvalues tell us how much the transformation stretches or compresses along those directions. The magnitude and sign of eigenvalues have distinct geometric meanings:
| Eigenvalue (λ) | Geometric Effect | Visual Interpretation |
|---|---|---|
| λ > 1 | Stretching | Eigenvector direction expands; points move away from origin |
| 0 < λ < 1 | Compression | Eigenvector direction shrinks; points move toward origin |
| λ = 1 | Preservation | No change along this direction; identity behavior |
| λ = 0 | Collapse | Entire direction collapses to zero; dimension lost |
| -1 < λ < 0 | Flip + Compress | Direction reverses and shrinks |
| λ = -1 | Reflection | Direction reverses but maintains length |
| λ < -1 | Flip + Stretch | Direction reverses and expands |
|λ| determines the scaling magnitude. If |λ| > 1, the direction stretches; if |λ| < 1, it compresses. The sign determines whether direction is preserved (λ > 0) or flipped (λ < 0). This is crucial for understanding stability in dynamical systems.
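As a quick numerical check of the table, here is a minimal sketch using diagonal matrices, so the standard basis vector e₁ is an eigenvector and the effect of each λ is easy to read off:

```python
import numpy as np

v = np.array([1.0, 0.0])  # e1: an eigenvector of every diagonal matrix

# λ = 2 stretches, λ = 0.5 compresses, λ = -1 reflects, λ = 0 collapses
for lam in [2.0, 0.5, -1.0, 0.0]:
    A = np.diag([lam, 1.0])
    Av = A @ v
    # Av stays on the line through v, scaled by exactly λ
    assert np.allclose(Av, lam * v)
    print(f"λ = {lam:>4}: Av = {Av}")
```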
Example: Diagonal matrix interpretation
The simplest case is a diagonal matrix:
$$D = \begin{pmatrix} 3 & 0 \\ 0 & 0.5 \end{pmatrix}$$
The eigenvectors are the standard basis vectors e₁ = [1, 0]ᵀ and e₂ = [0, 1]ᵀ, with eigenvalues 3 and 0.5 respectively.
Geometrically: D stretches the x-axis by a factor of 3 and compresses the y-axis by a factor of 0.5. The unit circle becomes an axis-aligned ellipse with semi-axes 3 and 0.5.
Every matrix with real eigenvalues can be understood this way—we just need to find the 'natural' coordinate system (eigenvectors) where the action becomes this simple.
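A short NumPy check of the diagonal example above (for a diagonal matrix, `np.linalg.eig` returns the diagonal entries as eigenvalues and the standard basis vectors as eigenvectors):

```python
import numpy as np

D = np.array([[3.0, 0.0],
              [0.0, 0.5]])
eigenvalues, eigenvectors = np.linalg.eig(D)

# Eigenvalues of a diagonal matrix are its diagonal entries
assert np.allclose(sorted(eigenvalues), [0.5, 3.0])
# Eigenvectors are the standard basis vectors (up to sign)
assert np.allclose(np.abs(eigenvectors), np.eye(2))
```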
```python
import numpy as np
import matplotlib.pyplot as plt

def demonstrate_eigenvalue_effects():
    """
    Demonstrate how different eigenvalue combinations
    produce different geometric transformations.
    """
    fig, axes = plt.subplots(2, 3, figsize=(15, 10))

    # Different transformation matrices
    transformations = [
        ("Uniform Stretch\nλ₁=λ₂=2", np.array([[2, 0], [0, 2]])),
        ("Non-uniform Stretch\nλ₁=3, λ₂=0.5", np.array([[3, 0], [0, 0.5]])),
        ("Reflection (y-axis)\nλ₁=-1, λ₂=1", np.array([[-1, 0], [0, 1]])),
        ("Symmetric (non-diagonal)\nλ₁=3, λ₂=1", np.array([[2, 1], [1, 2]])),
        ("Collapse to 1D\nλ₁=2, λ₂=0", np.array([[2, 0], [0, 0]])),
        ("Rotation (complex λ)\n90° rotation", np.array([[0, -1], [1, 0]])),
    ]

    theta = np.linspace(0, 2*np.pi, 100)
    circle = np.vstack([np.cos(theta), np.sin(theta)])

    for ax, (title, A) in zip(axes.flat, transformations):
        # Transform unit circle
        transformed = A @ circle

        # Get eigenvalues for annotation
        eigvals = np.linalg.eigvals(A)

        # Plot
        ax.plot(circle[0], circle[1], 'b--', alpha=0.3, label='Original')
        ax.plot(transformed[0], transformed[1], 'r-', linewidth=2,
                label='Transformed')
        ax.set_xlim(-4, 4)
        ax.set_ylim(-4, 4)
        ax.set_aspect('equal')
        ax.axhline(y=0, color='gray', linestyle='-', linewidth=0.5)
        ax.axvline(x=0, color='gray', linestyle='-', linewidth=0.5)
        ax.grid(True, alpha=0.3)
        ax.set_title(title, fontsize=11)

        # Annotate eigenvalues
        eig_str = ', '.join(f'{e:.2f}' for e in eigvals)
        ax.text(0.05, 0.95, f'λ: {eig_str}', transform=ax.transAxes,
                fontsize=9, verticalalignment='top',
                bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5))

    plt.suptitle('How Different Eigenvalues Transform the Unit Circle', fontsize=14)
    plt.tight_layout()
    plt.savefig('eigenvalue_effects.png', dpi=150)
    plt.show()

demonstrate_eigenvalue_effects()
```

Not all matrices have real eigenvalues. When the characteristic polynomial has complex roots, the eigenvalues come as conjugate pairs: λ = a ± bi. These indicate rotation mixed with scaling.
The geometry of complex eigenvalues:
A matrix with complex eigenvalues λ = a ± bi can be understood as a rotation combined with a uniform scaling: each application of the matrix rotates vectors by a fixed angle and scales them by a fixed factor.
Pure rotation (like 90° counterclockwise) has eigenvalues ±i—no real eigenvalues at all! This makes sense geometrically: a rotation has no invariant directions (except the trivial zero vector or the rotation axis in 3D).
Every n×n real matrix has exactly n eigenvalues in ℂ (counting multiplicities). If some eigenvalues are complex, they appear in conjugate pairs (since the characteristic polynomial has real coefficients). A real 2×2 matrix with complex eigenvalues represents rotation; a real 3×3 matrix can have one real eigenvalue (rotation axis) and a conjugate pair (rotation in perpendicular plane).
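A quick check that the 90° rotation really has the conjugate pair ±i as its eigenvalues:

```python
import numpy as np

# 90° counterclockwise rotation
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])
eigvals = np.linalg.eigvals(R)

# The eigenvalues are purely imaginary: the conjugate pair ±i...
assert np.allclose(eigvals.real, 0.0)
assert np.allclose(np.sort(eigvals.imag), [-1.0, 1.0])
# ...with modulus 1, since a rotation preserves lengths
assert np.allclose(np.abs(eigvals), 1.0)
```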
Polar form interpretation:
Write the complex eigenvalue as λ = re^(iθ), where r = |λ| = √(a² + b²) is the scaling factor per application and θ is the rotation angle per application.
Repeated application of the matrix (Aⁿ) has eigenvalues λⁿ = rⁿe^(inθ): if r < 1 the trajectory spirals inward and decays, if r = 1 it rotates at a constant radius, and if r > 1 it spirals outward and grows.
This is why complex eigenvalues matter for dynamical systems and recurrent neural networks—they determine whether oscillatory behavior decays, persists, or explodes.
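The three regimes (decay, persistence, explosion) can be checked numerically. A small sketch using the scaled rotation matrix r·R(θ), whose eigenvalues are r·e^(±iθ):

```python
import numpy as np

def spectral_radius(A):
    """Largest eigenvalue modulus; governs the long-run behavior of A^n."""
    return np.max(np.abs(np.linalg.eigvals(A)))

theta = np.pi / 6  # 30° of rotation per step
for r, behavior in [(0.9, "decays"), (1.0, "persists"), (1.1, "explodes")]:
    A = r * np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
    # Eigenvalues are r·e^(±iθ), so their modulus is r
    assert np.isclose(spectral_radius(A), r)
    # After 50 steps the radius of the trajectory is r^50
    x = np.linalg.matrix_power(A, 50) @ np.array([1.0, 0.0])
    print(f"r = {r}: |A^50 x| = {np.linalg.norm(x):.4f} ({behavior})")
```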
```python
import numpy as np
import matplotlib.pyplot as plt

def visualize_complex_eigenvalue_dynamics():
    """
    Visualize how complex eigenvalues create spiral/rotational behavior
    when a transformation is applied repeatedly.
    """
    fig, axes = plt.subplots(1, 3, figsize=(15, 5))

    # Three cases: damped, pure rotation, growing
    cases = [
        ("Damped Oscillation\n|λ| < 1", 0.9, np.pi/6),   # r=0.9, θ=30°
        ("Pure Rotation\n|λ| = 1", 1.0, np.pi/6),        # r=1.0, θ=30°
        ("Growing Oscillation\n|λ| > 1", 1.1, np.pi/6),  # r=1.1, θ=30°
    ]

    for ax, (title, r, theta) in zip(axes, cases):
        # Rotation matrix with scaling; this has eigenvalues r*e^(±iθ)
        A = r * np.array([
            [np.cos(theta), -np.sin(theta)],
            [np.sin(theta),  np.cos(theta)]
        ])

        # Start with a point and iterate
        point = np.array([[1], [0]])
        trajectory = [point.flatten()]
        for _ in range(50):
            point = A @ point
            trajectory.append(point.flatten())
        trajectory = np.array(trajectory)

        # Plot trajectory
        ax.plot(trajectory[:, 0], trajectory[:, 1], 'b-', alpha=0.7)
        ax.scatter(trajectory[:, 0], trajectory[:, 1],
                   c=range(len(trajectory)), cmap='viridis', s=20)
        ax.scatter([1], [0], c='red', s=100, zorder=5, label='Start')
        ax.scatter(trajectory[-1, 0], trajectory[-1, 1], c='green', s=100,
                   zorder=5, marker='s', label='End')

        # Formatting
        ax.set_xlim(-3, 3)
        ax.set_ylim(-3, 3)
        ax.set_aspect('equal')
        ax.axhline(y=0, color='gray', linestyle='-', linewidth=0.5)
        ax.axvline(x=0, color='gray', linestyle='-', linewidth=0.5)
        ax.grid(True, alpha=0.3)
        ax.set_title(title, fontsize=12)
        ax.legend(loc='upper right')

        # Show eigenvalues
        eigvals = np.linalg.eigvals(A)
        ax.text(0.05, 0.05, f'λ = {eigvals[0]:.2f}', transform=ax.transAxes,
                fontsize=9,
                bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5))

    plt.suptitle('Effect of Complex Eigenvalues on Repeated Transformation',
                 fontsize=14)
    plt.tight_layout()
    plt.savefig('complex_eigenvalue_dynamics.png', dpi=150)
    plt.show()

# Also visualize eigenvalues in the complex plane
def plot_eigenvalues_complex_plane():
    """
    Show eigenvalues as points in the complex plane,
    with the unit circle indicating the stability boundary.
    """
    fig, ax = plt.subplots(figsize=(8, 8))

    # Unit circle (stability boundary for discrete systems)
    theta = np.linspace(0, 2*np.pi, 100)
    ax.plot(np.cos(theta), np.sin(theta), 'k--', alpha=0.5,
            label='|λ| = 1 (stability boundary)')

    # Sample matrices and their eigenvalues
    matrices = [
        ("Stable system", np.array([[0.8, 0.1], [-0.1, 0.8]])),
        ("Marginally stable", np.array([[0, -1], [1, 0]])),
        ("Unstable system", np.array([[1.1, 0.1], [-0.1, 1.1]])),
        ("Mixed stability", np.array([[0.5, 0], [0, 1.5]])),
    ]

    colors = ['green', 'blue', 'red', 'orange']
    for (name, A), color in zip(matrices, colors):
        eigvals = np.linalg.eigvals(A)
        ax.scatter(eigvals.real, eigvals.imag, c=color, s=150,
                   label=name, edgecolors='black', linewidths=2)

    ax.axhline(y=0, color='gray', linestyle='-', linewidth=0.5)
    ax.axvline(x=0, color='gray', linestyle='-', linewidth=0.5)
    ax.set_xlim(-2, 2)
    ax.set_ylim(-2, 2)
    ax.set_aspect('equal')
    ax.set_xlabel('Real(λ)', fontsize=12)
    ax.set_ylabel('Imag(λ)', fontsize=12)
    ax.set_title('Eigenvalues in the Complex Plane', fontsize=14)
    ax.legend(loc='upper right')
    ax.grid(True, alpha=0.3)

    plt.savefig('complex_plane_eigenvalues.png', dpi=150)
    plt.show()

visualize_complex_eigenvalue_dynamics()
plot_eigenvalues_complex_plane()
```

One of the most important geometric interpretations connects eigenanalysis to ellipsoids. This is the visual foundation for understanding Principal Component Analysis.
The key insight:
For a symmetric positive definite matrix A (like a covariance matrix), the equation:
$$\mathbf{x}^T \mathbf{A}^{-1} \mathbf{x} = 1$$
defines an ellipsoid in ℝⁿ. The eigenvectors of A give the principal axes of this ellipsoid, and the square roots of the eigenvalues give the lengths of the semi-axes.
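This semi-axis relationship is easy to verify numerically: the point √λᵢ·vᵢ should satisfy xᵀA⁻¹x = 1 for each eigenpair.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])  # symmetric positive definite (λ = 1, 3)
eigenvalues, V = np.linalg.eigh(A)
A_inv = np.linalg.inv(A)

# The point sqrt(λᵢ)·vᵢ lies on the ellipsoid xᵀA⁻¹x = 1,
# so the semi-axis along eigenvector vᵢ has length sqrt(λᵢ)
for lam, v in zip(eigenvalues, V.T):
    x = np.sqrt(lam) * v
    assert np.isclose(x @ A_inv @ x, 1.0)
```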
In PCA, the covariance matrix Σ describes the 'shape' of the data cloud. Its eigenvectors point along the principal axes of the data ellipsoid. Eigenvalues quantify how spread out the data is along each axis. The largest eigenvalue corresponds to the direction of maximum variance—the first principal component!
Visual construction: start with the unit sphere, then scale it by √λᵢ along each eigenvector direction vᵢ. The result is the ellipsoid xᵀA⁻¹x = 1, with its principal axes along the eigenvectors.
For the covariance matrix, this ellipsoid represents the 'confidence region' of the data distribution. Points inside are more likely; points outside are less likely. The eigenvector with largest eigenvalue points in the direction of greatest data spread.
Example: 2D Gaussian distribution
A 2D Gaussian with covariance matrix Σ has probability contours that are ellipses. These ellipses are level sets of x^T Σ^(-1) x = c. The principal axes of these ellipses are exactly the eigenvectors of Σ.
```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse

def draw_covariance_ellipse(cov, ax, n_std=2.0, **kwargs):
    """
    Draw an ellipse representing a 2D covariance matrix.
    The ellipse shows n_std standard deviations.
    """
    eigenvalues, eigenvectors = np.linalg.eig(cov)

    # Sort by eigenvalue (largest first)
    order = eigenvalues.argsort()[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]

    # Width and height are "diameters", so 2 * sqrt(eigenvalue) * n_std
    width = 2 * n_std * np.sqrt(eigenvalues[0])
    height = 2 * n_std * np.sqrt(eigenvalues[1])

    # Rotation angle
    angle = np.degrees(np.arctan2(eigenvectors[1, 0], eigenvectors[0, 0]))

    ellipse = Ellipse(xy=(0, 0), width=width, height=height,
                      angle=angle, **kwargs)
    ax.add_patch(ellipse)
    return eigenvalues, eigenvectors

def visualize_covariance_geometry():
    """
    Visualize how the covariance matrix defines an ellipse
    and how eigenvectors/eigenvalues relate to its axes.
    """
    fig, axes = plt.subplots(1, 3, figsize=(15, 5))

    # Three different covariance matrices
    covariances = [
        ("Uncorrelated\nequal variance", np.array([[1, 0], [0, 1]])),
        ("Uncorrelated\ndifferent variance", np.array([[4, 0], [0, 1]])),
        ("Correlated\n(tilted ellipse)", np.array([[2, 1.5], [1.5, 2]])),
    ]

    for ax, (title, cov) in zip(axes, covariances):
        # Generate sample data
        np.random.seed(42)
        data = np.random.multivariate_normal([0, 0], cov, 500)

        # Plot data points
        ax.scatter(data[:, 0], data[:, 1], alpha=0.3, s=10, c='blue')

        # Draw covariance ellipse
        eigenvalues, eigenvectors = draw_covariance_ellipse(
            cov, ax, n_std=2.0,
            facecolor='none', edgecolor='red', linewidth=2, label='2σ ellipse'
        )

        # Draw eigenvector directions (scaled by sqrt of eigenvalue)
        colors = ['green', 'orange']
        for i in range(2):
            v = eigenvectors[:, i]
            scale = 2 * np.sqrt(eigenvalues[i])  # 2σ length
            ax.arrow(0, 0, v[0]*scale, v[1]*scale,
                     head_width=0.15, head_length=0.1,
                     fc=colors[i], ec=colors[i], linewidth=2)
            ax.text(v[0]*scale*1.1, v[1]*scale*1.1,
                    f'PC{i+1}\nλ={eigenvalues[i]:.2f}',
                    fontsize=9, color=colors[i])

        ax.set_xlim(-6, 6)
        ax.set_ylim(-6, 6)
        ax.set_aspect('equal')
        ax.axhline(y=0, color='gray', linestyle='-', linewidth=0.5)
        ax.axvline(x=0, color='gray', linestyle='-', linewidth=0.5)
        ax.set_title(f'{title}\nΣ = {cov.tolist()}', fontsize=11)
        ax.grid(True, alpha=0.3)

    plt.suptitle('Covariance Matrix Determines Data Ellipse\n'
                 'Eigenvectors = Principal Axes, '
                 'Eigenvalues = Variance along Each Axis',
                 fontsize=13)
    plt.tight_layout()
    plt.savefig('covariance_ellipse_geometry.png', dpi=150)
    plt.show()

visualize_covariance_geometry()
```

If a matrix A is diagonalizable, we can write:
$$\mathbf{A} = \mathbf{V} \mathbf{\Lambda} \mathbf{V}^{-1}$$
where V is the matrix of eigenvectors (as columns) and Λ is the diagonal matrix of eigenvalues. This decomposition has a beautiful geometric interpretation:
The three-step transformation: first V⁻¹ changes coordinates into the eigenvector basis, then Λ applies simple axis-aligned scaling by the eigenvalues, and finally V maps the result back to standard coordinates.
In other words, every diagonalizable transformation is just scaling along special axes (eigenvectors), disguised by a coordinate transformation.
This decomposition simplifies matrix powers: A^n = V Λ^n V⁻¹. Since Λ is diagonal, Λ^n just raises each diagonal entry to the nth power. This is why eigenanalysis is crucial for analyzing dynamical systems, Markov chains, and any iterative process.
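A short numerical verification of the matrix-power identity:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigenvalues, V = np.linalg.eig(A)

# A^10 via the decomposition: only the diagonal of Λ gets powered
A10_eig = V @ np.diag(eigenvalues**10) @ np.linalg.inv(V)
A10_direct = np.linalg.matrix_power(A, 10)
assert np.allclose(A10_eig, A10_direct)
```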
```python
import numpy as np
import matplotlib.pyplot as plt

def visualize_eigendecomposition_steps():
    """
    Visualize the eigenvalue decomposition A = VΛV⁻¹
    as a three-step geometric transformation.
    """
    # Define a matrix with distinct eigenvalues
    A = np.array([
        [2, 1],
        [1, 2]
    ])
    eigenvalues, V = np.linalg.eig(A)
    Lambda = np.diag(eigenvalues)
    V_inv = np.linalg.inv(V)

    # Unit square vertices
    square = np.array([
        [0, 1, 1, 0, 0],
        [0, 0, 1, 1, 0]
    ])

    fig, axes = plt.subplots(1, 4, figsize=(16, 4))

    # Step 0: Original
    ax = axes[0]
    ax.plot(square[0], square[1], 'b-', linewidth=2)
    ax.fill(square[0], square[1], alpha=0.3)
    ax.set_xlim(-3, 3)
    ax.set_ylim(-3, 3)
    ax.set_aspect('equal')
    ax.set_title('Original Shape', fontsize=11)
    ax.axhline(y=0, color='gray', linestyle='-', linewidth=0.5)
    ax.axvline(x=0, color='gray', linestyle='-', linewidth=0.5)
    ax.grid(True, alpha=0.3)

    # Step 1: Apply V⁻¹ (change to eigenvector coordinates)
    step1 = V_inv @ square
    ax = axes[1]
    ax.plot(step1[0], step1[1], 'g-', linewidth=2)
    ax.fill(step1[0], step1[1], alpha=0.3, color='green')
    ax.set_xlim(-3, 3)
    ax.set_ylim(-3, 3)
    ax.set_aspect('equal')
    ax.set_title('Step 1: V⁻¹x\n(Eigenvector coordinates)', fontsize=11)
    ax.axhline(y=0, color='gray', linestyle='-', linewidth=0.5)
    ax.axvline(x=0, color='gray', linestyle='-', linewidth=0.5)
    ax.grid(True, alpha=0.3)

    # Step 2: Apply Λ (simple scaling)
    step2 = Lambda @ step1
    ax = axes[2]
    ax.plot(step2[0], step2[1], 'orange', linewidth=2)
    ax.fill(step2[0], step2[1], alpha=0.3, color='orange')
    ax.set_xlim(-3, 3)
    ax.set_ylim(-3, 3)
    ax.set_aspect('equal')
    ax.set_title(f'Step 2: Λ(V⁻¹x)\nScale by λ₁={eigenvalues[0]:.2f}, '
                 f'λ₂={eigenvalues[1]:.2f}', fontsize=11)
    ax.axhline(y=0, color='gray', linestyle='-', linewidth=0.5)
    ax.axvline(x=0, color='gray', linestyle='-', linewidth=0.5)
    ax.grid(True, alpha=0.3)

    # Step 3: Apply V (back to standard coordinates)
    step3 = V @ step2  # This equals A @ square
    ax = axes[3]
    ax.plot(step3[0], step3[1], 'r-', linewidth=2)
    ax.fill(step3[0], step3[1], alpha=0.3, color='red')
    ax.set_xlim(-3, 3)
    ax.set_ylim(-3, 3)
    ax.set_aspect('equal')
    ax.set_title('Step 3: VΛV⁻¹x = Ax\n(Final result)', fontsize=11)
    ax.axhline(y=0, color='gray', linestyle='-', linewidth=0.5)
    ax.axvline(x=0, color='gray', linestyle='-', linewidth=0.5)
    ax.grid(True, alpha=0.3)

    plt.suptitle(f'Eigenvalue Decomposition: A = VΛV⁻¹\nA = {A.tolist()}',
                 fontsize=13)
    plt.tight_layout()
    plt.savefig('eigendecomposition_steps.png', dpi=150)
    plt.show()

    # Verify
    print("Verification:")
    print("V @ Λ @ V⁻¹ =")
    print(V @ Lambda @ V_inv)
    print("Should equal A =")
    print(A)

visualize_eigendecomposition_steps()
```

For symmetric matrices:
When A is symmetric (A = Aᵀ), the eigenvectors are orthogonal, so V is an orthogonal matrix (V⁻¹ = Vᵀ). The decomposition becomes:
$$\mathbf{A} = \mathbf{V} \mathbf{\Lambda} \mathbf{V}^T$$
This is the spectral decomposition, the foundation of PCA. Since V is orthogonal, it acts as a pure rotation (or reflection): a symmetric matrix rotates into its eigenbasis, scales along orthogonal axes, and rotates back, with no shearing anywhere in the process.
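Both properties are easy to verify with `np.linalg.eigh`, which is specialized for symmetric matrices and returns orthonormal eigenvectors:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])  # symmetric
eigenvalues, V = np.linalg.eigh(A)

# The eigenvectors are orthonormal: VᵀV = I, i.e. V⁻¹ = Vᵀ
assert np.allclose(V.T @ V, np.eye(2))
# Spectral decomposition: A = VΛVᵀ
assert np.allclose(V @ np.diag(eigenvalues) @ V.T, A)
```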
While 2D visualizations are intuitive, real ML applications involve high-dimensional spaces. The geometric intuition extends beautifully:
In n-dimensional space:
| Dimension | Unit Object | Transforms To | Principal Axes |
|---|---|---|---|
| 1D | Interval [-1, 1] | Scaled interval | 1 eigenvector |
| 2D | Unit circle | Ellipse | 2 orthogonal eigenvectors |
| 3D | Unit sphere | Ellipsoid | 3 orthogonal eigenvectors |
| nD | Unit hypersphere | n-Ellipsoid | n orthogonal eigenvectors |
In high dimensions, data often lies near a lower-dimensional subspace. Eigenanalysis reveals this structure: if many eigenvalues are near zero, the data is (approximately) confined to the subspace spanned by eigenvectors with large eigenvalues. This is how PCA achieves dimensionality reduction—we keep only the directions that 'matter.'
Eigenvalue spectrum and intrinsic dimensionality:
For a covariance matrix of high-dimensional data, the eigenvalue spectrum (the eigenvalues plotted in decreasing order of magnitude) reveals the data's intrinsic structure: a sharp drop after the first few eigenvalues indicates the data lies near a low-dimensional subspace, while a flat spectrum indicates no low-dimensional structure.
In PCA, we often keep eigenvectors until we capture some fraction (e.g., 95%) of total variance. The eigenvalue spectrum tells us how many components that requires.
```python
import numpy as np
import matplotlib.pyplot as plt

def analyze_eigenvalue_spectrum():
    """
    Demonstrate how the eigenvalue spectrum reveals
    the intrinsic dimensionality of data.
    """
    np.random.seed(42)

    # Create three types of data with different intrinsic dimensions
    n_samples = 1000
    n_features = 50

    fig, axes = plt.subplots(2, 3, figsize=(15, 10))
    datasets = []

    # Dataset 1: Truly 2D data embedded in 50D (data lies on a 2D plane)
    latent_2d = np.random.randn(n_samples, 2)
    projection_2d = np.random.randn(2, n_features)
    data_2d = (latent_2d @ projection_2d
               + 0.1 * np.random.randn(n_samples, n_features))
    datasets.append(("2D Subspace + Noise", data_2d))

    # Dataset 2: 10D data embedded in 50D
    latent_10d = np.random.randn(n_samples, 10)
    projection_10d = np.random.randn(10, n_features)
    data_10d = (latent_10d @ projection_10d
                + 0.1 * np.random.randn(n_samples, n_features))
    datasets.append(("10D Subspace + Noise", data_10d))

    # Dataset 3: Full rank (no low-dimensional structure)
    data_full = np.random.randn(n_samples, n_features)
    datasets.append(("No Low-D Structure", data_full))

    for idx, (title, data) in enumerate(datasets):
        # Compute covariance matrix
        data_centered = data - data.mean(axis=0)
        cov = data_centered.T @ data_centered / (n_samples - 1)

        # Get eigenvalues, sorted descending
        eigenvalues = np.linalg.eigvalsh(cov)[::-1]

        # Cumulative variance explained
        total_var = eigenvalues.sum()
        cumulative_var = np.cumsum(eigenvalues) / total_var

        # Top plot: Eigenvalue spectrum
        ax = axes[0, idx]
        ax.semilogy(range(1, len(eigenvalues)+1), eigenvalues, 'b-', linewidth=2)
        ax.scatter(range(1, len(eigenvalues)+1), eigenvalues, c='blue', s=20)
        ax.set_xlabel('Component Index')
        ax.set_ylabel('Eigenvalue (log scale)')
        ax.set_title(f'{title}\nEigenvalue Spectrum')
        ax.grid(True, alpha=0.3)

        # Highlight the "elbow" for the low-dimensional datasets
        if idx < 2:
            intrinsic_dim = [2, 10][idx]
            ax.axvline(x=intrinsic_dim, color='red', linestyle='--',
                       label=f'Intrinsic dim ≈ {intrinsic_dim}')
            ax.legend()

        # Bottom plot: Cumulative variance
        ax = axes[1, idx]
        ax.plot(range(1, len(eigenvalues)+1), cumulative_var * 100,
                'g-', linewidth=2)
        ax.scatter(range(1, len(eigenvalues)+1), cumulative_var * 100,
                   c='green', s=20)
        ax.axhline(y=95, color='red', linestyle='--', alpha=0.7,
                   label='95% variance')
        ax.set_xlabel('Number of Components')
        ax.set_ylabel('Cumulative Variance Explained (%)')
        ax.set_title('Cumulative Variance')
        ax.set_ylim(0, 105)
        ax.grid(True, alpha=0.3)

        # Find how many components are needed for 95% variance
        n_components_95 = np.argmax(cumulative_var >= 0.95) + 1
        ax.axvline(x=n_components_95, color='orange', linestyle=':',
                   label=f'95% at {n_components_95} components')
        ax.legend()

    plt.suptitle('Eigenvalue Spectrum Reveals Intrinsic Dimensionality',
                 fontsize=14)
    plt.tight_layout()
    plt.savefig('eigenvalue_spectrum_analysis.png', dpi=150)
    plt.show()

analyze_eigenvalue_spectrum()
```

Let's consolidate the geometric intuitions that will serve you throughout machine learning:
When you see a matrix in ML context, train yourself to think: 'What are its eigenvectors? What do its eigenvalues tell me?' For a covariance matrix, this reveals principal directions. For a Markov transition matrix, this reveals steady states. For a Hessian, this reveals curvature directions. The eigenvalue lens applies universally.
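As one illustration of this lens, here is a sketch of finding the steady state of a Markov chain by eigenanalysis. The 2-state transition matrix P below is invented for illustration; the stationary distribution is the eigenvector of Pᵀ with eigenvalue 1.

```python
import numpy as np

# Hypothetical 2-state transition matrix (rows sum to 1)
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

# Stationary distribution: eigenvector of Pᵀ with eigenvalue 1
eigenvalues, eigenvectors = np.linalg.eig(P.T)
i = np.argmin(np.abs(eigenvalues - 1.0))
pi = np.real(eigenvectors[:, i])
pi = pi / pi.sum()  # normalize so it is a probability distribution

assert np.allclose(pi @ P, pi)  # π is invariant under the chain
print(pi)  # π = [5/6, 1/6] for this particular P
```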
We've developed the geometric intuition to visualize and understand eigenvalues and eigenvectors. Let's consolidate: eigenvectors are invariant directions, eigenvalues are the scaling factors along them, complex eigenvalues encode rotation mixed with scaling, symmetric positive definite matrices correspond to ellipsoids, and the eigenvalue spectrum reveals intrinsic dimensionality.
What's Next:
Now that we can visualize what eigenvalues and eigenvectors mean, the next page focuses on computing them. We'll explore the characteristic polynomial approach, numerical algorithms like the power method and QR iteration, and practical considerations for implementing eigenanalysis in ML applications.
You now have geometric intuition for eigenvalues and eigenvectors—invariant directions, scaling factors, ellipsoids, and spectra. This visual understanding will make PCA, spectral clustering, and stability analysis intuitive. Next, we'll learn how to actually compute eigenvalues and eigenvectors.