The matrix inverse A⁻¹ only exists for square, non-singular matrices. But in machine learning, we routinely encounter:

- Overdetermined systems (m > n): more equations than unknowns, e.g. tall data matrices
- Underdetermined systems (m < n): more unknowns than equations
- Rank-deficient matrices with linearly dependent columns or rows
The Moore-Penrose pseudoinverse A⁺ generalizes the inverse to all matrices. It provides:

- A unique generalized inverse that exists for every matrix, square or not, full rank or not
- The minimum-norm least squares solution x̂ = A⁺b to any linear system Ax = b
For matrices with full column rank, A⁺ = (AᵀA)⁻¹Aᵀ (a left inverse); for full row rank, A⁺ = Aᵀ(AAᵀ)⁻¹ (a right inverse). For general matrices, the SVD provides the definition.
By the end of this page, you will understand the pseudoinverse definition via SVD, its four defining properties (Moore-Penrose conditions), when and why it gives the minimum-norm least squares solution, and its computational implementation.
The Singular Value Decomposition of any m×n matrix A is:
A = UΣVᵀ
where:

- U is an m×m orthogonal matrix whose columns are the left singular vectors
- Σ is an m×n diagonal matrix holding the singular values σ₁ ≥ σ₂ ≥ … ≥ σᵣ > 0
- V is an n×n orthogonal matrix whose columns are the right singular vectors
The pseudoinverse A⁺ is defined as:
A⁺ = VΣ⁺Uᵀ
where Σ⁺ is the n×m matrix with entries:

Σ⁺ᵢᵢ = 1/σᵢ if σᵢ > 0, and Σ⁺ᵢᵢ = 0 otherwise

In other words: transpose Σ's shape and invert each non-zero singular value.
```python
import numpy as np
from numpy.linalg import svd, pinv

def compute_pseudoinverse_svd(A: np.ndarray, tol: float = 1e-10) -> np.ndarray:
    """
    Compute the pseudoinverse A⁺ using SVD.
    A = UΣVᵀ  =>  A⁺ = VΣ⁺Uᵀ
    """
    U, sigma, Vt = svd(A, full_matrices=False)

    # Invert non-zero singular values, zero out the rest
    sigma_inv = np.array([1/s if s > tol else 0 for s in sigma])

    # Construct Σ⁺ (diagonal matrix)
    Sigma_pinv = np.diag(sigma_inv)

    # A⁺ = VΣ⁺Uᵀ
    return Vt.T @ Sigma_pinv @ U.T

# Example: Overdetermined full-rank (m > n)
A1 = np.array([[1, 2], [3, 4], [5, 6]])
print("=== Overdetermined System ===")
print(f"A (3×2):\n{A1}")
A1_pinv = compute_pseudoinverse_svd(A1)
print(f"A⁺ (2×3):\n{np.round(A1_pinv, 4)}")
print(f"NumPy pinv:\n{np.round(pinv(A1), 4)}")
print(f"Match: {np.allclose(A1_pinv, pinv(A1))}")

# Verify: A⁺ = (AᵀA)⁻¹Aᵀ for full column rank
A1_pinv_formula = np.linalg.inv(A1.T @ A1) @ A1.T
print(f"(AᵀA)⁻¹Aᵀ:\n{np.round(A1_pinv_formula, 4)}")

# Example: Underdetermined (m < n)
print("\n=== Underdetermined System ===")
A2 = np.array([[1, 2, 3], [4, 5, 6]])
print(f"A (2×3):\n{A2}")
A2_pinv = compute_pseudoinverse_svd(A2)
print(f"A⁺ (3×2):\n{np.round(A2_pinv, 4)}")
```

The pseudoinverse A⁺ is uniquely characterized by four conditions:

1. AA⁺A = A
2. A⁺AA⁺ = A⁺
3. (AA⁺)ᵀ = AA⁺
4. (A⁺A)ᵀ = A⁺A
These conditions are necessary and sufficient. Any matrix X satisfying all four equals A⁺.
Interpretation: conditions 1 and 2 say that A⁺ acts as an inverse wherever A acts non-trivially, while conditions 3 and 4 say that AA⁺ and A⁺A are symmetric, which makes them orthogonal projections onto Col(A) and Row(A) respectively.
```python
import numpy as np
from numpy.linalg import pinv

def verify_moore_penrose(A: np.ndarray, tol: float = 1e-10) -> dict:
    """Verify all four Moore-Penrose conditions for A⁺."""
    A_pinv = pinv(A)
    results = {}

    # Condition 1: AA⁺A = A
    cond1 = A @ A_pinv @ A
    results['cond1'] = np.allclose(cond1, A, atol=tol)

    # Condition 2: A⁺AA⁺ = A⁺
    cond2 = A_pinv @ A @ A_pinv
    results['cond2'] = np.allclose(cond2, A_pinv, atol=tol)

    # Condition 3: (AA⁺)ᵀ = AA⁺
    AA_pinv = A @ A_pinv
    results['cond3'] = np.allclose(AA_pinv.T, AA_pinv, atol=tol)

    # Condition 4: (A⁺A)ᵀ = A⁺A
    A_pinvA = A_pinv @ A
    results['cond4'] = np.allclose(A_pinvA.T, A_pinvA, atol=tol)

    print("=== Moore-Penrose Conditions ===")
    print(f"1. AA⁺A = A: {results['cond1']}")
    print(f"2. A⁺AA⁺ = A⁺: {results['cond2']}")
    print(f"3. (AA⁺)ᵀ = AA⁺: {results['cond3']}")
    print(f"4. (A⁺A)ᵀ = A⁺A: {results['cond4']}")

    return results

# Test with a tall full-rank matrix
A = np.array([[1, 2], [3, 4], [5, 6]])
verify_moore_penrose(A)

print("\n=== Projection Properties ===")
A_pinv = pinv(A)
P_col = A @ A_pinv   # Projects onto Col(A)
P_row = A_pinv @ A   # Projects onto Row(A)
print(f"AA⁺ (proj onto Col(A)):\n{np.round(P_col, 4)}")
print(f"A⁺A (proj onto Row(A)):\n{np.round(P_row, 4)}")
```

For the system Ax = b, the pseudoinverse gives:
x̂ = A⁺b
This solution has a profound property:
A⁺b is the minimum-norm vector that minimizes ||Ax - b||
Two optimality conditions in one:

1. Least squares: x̂ minimizes the residual ||Ax − b|| over all x
2. Minimum norm: among all such minimizers, x̂ has the smallest ||x||
Cases:
| Type | Dimensions | Solution | Property |
|---|---|---|---|
| Full rank, square | m = n = r | A⁺ = A⁻¹ | Exact inverse |
| Full column rank | m > n = r | A⁺ = (AᵀA)⁻¹Aᵀ | Left inverse, least squares |
| Full row rank | m = r < n | A⁺ = Aᵀ(AAᵀ)⁻¹ | Right inverse, min norm |
| Rank deficient | r < min(m,n) | A⁺ via SVD | Min norm least squares |
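Each row of this table can be checked numerically. A minimal sketch (the example matrices are arbitrary choices, not from the original page):

```python
import numpy as np
from numpy.linalg import inv, pinv

# Square, full rank: A⁺ coincides with A⁻¹
A_sq = np.array([[2.0, 1.0], [1.0, 3.0]])
print(np.allclose(pinv(A_sq), inv(A_sq)))  # True

# Full column rank (m > n): left inverse (AᵀA)⁻¹Aᵀ
A_tall = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(np.allclose(pinv(A_tall), inv(A_tall.T @ A_tall) @ A_tall.T))  # True

# Full row rank (m < n): right inverse Aᵀ(AAᵀ)⁻¹
A_wide = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
print(np.allclose(pinv(A_wide), A_wide.T @ inv(A_wide @ A_wide.T)))  # True

# Rank deficient: only the SVD definition applies
A_def = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])  # rank 1
print(np.linalg.matrix_rank(A_def), pinv(A_def).shape)  # 1 (2, 3)
```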
Why minimum norm?
When a system is underdetermined or rank-deficient, infinitely many solutions exist. The pseudoinverse selects the solution with smallest Euclidean norm—the one closest to the origin.
Geometric intuition:
The solution set is an affine subspace (plane, line, etc.). The minimum-norm solution is the point on this subspace nearest to the origin—found by orthogonal projection of 0 onto the affine subspace.
Connection to regularization:
Minimum-norm solutions connect to regularization. As λ → 0 in ridge regression:
(AᵀA + λI)⁻¹Aᵀb → A⁺b
The pseudoinverse is the limit of regularized solutions—no explicit regularization, but inherent preference for small coefficients.
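This limit is easy to observe numerically. The sketch below (an illustration with an arbitrarily chosen rank-deficient design matrix) shrinks λ and watches the ridge solution converge to A⁺b:

```python
import numpy as np
from numpy.linalg import pinv, solve, norm

# Rank-deficient design: third column = first + second
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 0.0, 2.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])

x_pinv = pinv(A) @ b
n = A.shape[1]

# Ridge solutions approach the pseudoinverse solution as λ → 0⁺
for lam in [1.0, 1e-2, 1e-4, 1e-6, 1e-8]:
    x_ridge = solve(A.T @ A + lam * np.eye(n), A.T @ b)
    print(f"λ = {lam:.0e}: ||x_ridge - x⁺|| = {norm(x_ridge - x_pinv):.2e}")
```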
```python
import numpy as np
from numpy.linalg import pinv, norm
from scipy.linalg import null_space

def demonstrate_minimum_norm():
    """
    Show that A⁺b gives the minimum-norm solution
    for underdetermined systems.
    """
    # Underdetermined: 2 equations, 4 unknowns
    A = np.array([[1, 2, 1, 0],
                  [0, 1, 2, 1]])
    b = np.array([4, 3])

    # Pseudoinverse solution
    x_pinv = pinv(A) @ b

    print("=== Minimum Norm Property ===")
    print(f"A (2×4):\n{A}")
    print(f"b: {b}")
    print(f"\nPseudoinverse solution x⁺ = A⁺b: {np.round(x_pinv, 4)}")
    print(f"||x⁺|| = {norm(x_pinv):.4f}")
    print(f"Ax⁺ = {A @ x_pinv} (should equal b)")

    # Any other solution is x⁺ plus a null-space component
    N = null_space(A)  # Columns span Null(A)
    print(f"\nNull space dimension: {N.shape[1]}")

    # Adding null-space components preserves Ax = b but grows ||x||
    print("\nComparing with other solutions:")
    for alpha in [1, -1, 2]:
        x_other = x_pinv + alpha * N[:, 0]
        print(f"  x = x⁺ + {alpha}*n: {np.round(x_other, 4)}")
        print(f"  ||x|| = {norm(x_other):.4f} >= {norm(x_pinv):.4f}")
        print(f"  Ax = {np.round(A @ x_other, 4)}")

demonstrate_minimum_norm()
```

Computing A⁺ in practice:
- NumPy: np.linalg.pinv(A), which uses SVD with a default tolerance
- SVD-based: compute the SVD yourself and invert the singular values above a threshold
- lstsq: np.linalg.lstsq(A, b), which solves the least squares problem without forming A⁺ explicitly (compared in the sketch below)
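Both routes give the same minimum-norm least squares answer. A quick comparison (the example matrix is an arbitrary choice):

```python
import numpy as np
from numpy.linalg import pinv, lstsq

A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
b = np.array([1.0, 2.0, 4.0])

# Route 1: form A⁺ explicitly, then multiply
x_via_pinv = pinv(A) @ b

# Route 2: solve the least squares problem directly
x_via_lstsq, residuals, rank, sigma = lstsq(A, b, rcond=None)

print(np.allclose(x_via_pinv, x_via_lstsq))  # True
print(f"rank = {rank}, singular values = {np.round(sigma, 4)}")
```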
Tolerance for small singular values:
Numerically, singular values may be tiny but non-zero due to floating-point error. The pseudoinverse treats σᵢ < ε as zero; a common choice is a relative cutoff ε = rcond · σ_max, which NumPy's pinv exposes as the rcond argument.
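To see why the cutoff matters, the sketch below builds a nearly rank-1 matrix (an illustrative construction, not from the original page) and compares two rcond settings:

```python
import numpy as np
from numpy.linalg import pinv, svd

rng = np.random.default_rng(0)

# Nearly rank-1: second singular value is ~1e-12 instead of exactly 0
A = np.outer([1.0, 2.0, 3.0], [1.0, 1.0]) + 1e-12 * rng.standard_normal((3, 2))

_, sigma, _ = svd(A)
print(f"singular values: {sigma}")

# Keeping the tiny σ inverts it: entries of A⁺ blow up to ~1e12
print(f"max|A⁺| keeping tiny σ:  {np.abs(pinv(A, rcond=1e-15)).max():.2e}")

# Treating it as zero (cutoff 1e-8·σ_max) stays well-behaved
print(f"max|A⁺| dropping tiny σ: {np.abs(pinv(A, rcond=1e-8)).max():.2e}")
```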
SVD is the gold standard for numerical stability in rank-deficient cases.
Never compute A⁺ via (AᵀA)⁻¹Aᵀ for rank-deficient matrices. AᵀA will be singular, and numerical errors can cause catastrophic results. Always use SVD-based computation (pinv) for safety.
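The failure is easy to reproduce. In this sketch, AᵀA is exactly singular, so the normal-equations route raises an error while pinv succeeds:

```python
import numpy as np
from numpy.linalg import inv, pinv, LinAlgError

# Exactly rank-deficient: second column = 2 × first column
A = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])

try:
    A_pinv_bad = inv(A.T @ A) @ A.T  # AᵀA is singular here
except LinAlgError as err:
    print(f"Normal-equations route fails: {err}")

# The SVD route handles rank deficiency gracefully
print(np.round(pinv(A), 4))
```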
Connection to regularization:

The pseudoinverse relates to ridge regression through the limit shown above. As λ → 0⁺:

(AᵀA + λI)⁻¹Aᵀ → A⁺

The pseudoinverse gives the solution that regularization methods converge to as the regularization strength approaches zero.
Congratulations! You've mastered Projections and Least Squares. You understand orthogonal projection geometrically and algebraically, projection matrices and their properties, normal equations, the geometric view of least squares, and the pseudoinverse for general solutions. This foundation is essential for linear regression, PCA, and optimization throughout ML.