Given a set of vectors, two fundamental questions arise: What other vectors can we reach by combining them? And does every vector in the set actually contribute something new?
The answers to these questions are span and linear independence—two of the most important concepts in all of linear algebra.
Span tells us the reach of our vectors: the complete set of destinations accessible via linear combinations. Linear independence tells us about efficiency: whether we're using the minimum number of vectors to achieve that reach, or whether we have unnecessary redundancy.
These concepts directly determine whether systems of equations have solutions, whether transformations are invertible, whether datasets have redundant features, and how neural networks can (or cannot) represent data.
By the end of this page, you will:

- Understand span as the set of all reachable vectors
- Master linear independence and dependence with geometric intuition
- Learn to test for linear independence computationally
- Connect these concepts to machine learning (feature redundancy, rank, representability)
- Understand why these concepts are essential for basis and dimension
Definition of span:
The span of a set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ is the set of all possible linear combinations of those vectors:
$$\text{span}(\mathbf{v}_1, \ldots, \mathbf{v}_k) = \left\{ c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_k\mathbf{v}_k \;\middle|\; c_1, c_2, \ldots, c_k \in \mathbb{R} \right\}$$
In other words, span is every vector you can "reach" by combining the given vectors with any coefficients.
Key insight: The span is always a subspace—it includes the zero vector and is closed under addition and scalar multiplication.
If someone gives you vectors v₁ and v₂, the span is the answer to: 'What destinations can I reach by traveling any amount in the v₁ direction and any amount in the v₂ direction?' The span is your complete travel map.
Examples of span:
Single non-zero vector: The span of $\mathbf{v} = (1, 2)$ is all vectors of the form $c(1, 2) = (c, 2c)$—a line through the origin in the direction of $\mathbf{v}$.
Two non-parallel vectors in $\mathbb{R}^2$: The span of $\mathbf{e}_1 = (1, 0)$ and $\mathbf{e}_2 = (0, 1)$ is all of $\mathbb{R}^2$—any 2D vector can be written as $c_1\mathbf{e}_1 + c_2\mathbf{e}_2$.
Two parallel vectors: The span of $(1, 2)$ and $(2, 4)$ is just the line through $(1, 2)$—the second vector adds no new reach because it's a scalar multiple of the first.
Three vectors in $\mathbb{R}^3$ that aren't coplanar: The span is all of $\mathbb{R}^3$.
Three vectors in $\mathbb{R}^3$ that ARE coplanar: The span is just the plane containing them—less than all of $\mathbb{R}^3$.
| Vectors | Space | Span | Dimension of Span |
|---|---|---|---|
| $(1, 2)$ | $\mathbb{R}^2$ | Line through origin | 1 |
| $(1, 0), (0, 1)$ | $\mathbb{R}^2$ | All of $\mathbb{R}^2$ | 2 |
| $(1, 2), (2, 4)$ | $\mathbb{R}^2$ | Line (parallel vectors) | 1 |
| $(1, 0, 0), (0, 1, 0)$ | $\mathbb{R}^3$ | xy-plane | 2 |
| $(1, 0, 0), (0, 1, 0), (0, 0, 1)$ | $\mathbb{R}^3$ | All of $\mathbb{R}^3$ | 3 |
```python
import numpy as np

def is_in_span(target, vectors, tol=1e-10):
    """
    Check if target vector is in the span of given vectors.
    Returns (is_in_span, coefficients).
    """
    # Stack vectors as columns of matrix A
    A = np.column_stack(vectors)
    # Solve A @ c = target using least squares
    coeffs, residuals, rank, s = np.linalg.lstsq(A, target, rcond=None)
    # Check if residual is essentially zero
    reconstruction = A @ coeffs
    error = np.linalg.norm(target - reconstruction)
    return error < tol, coeffs

# Example 1: v is in span of e1, e2
e1 = np.array([1, 0])
e2 = np.array([0, 1])
v = np.array([3, 4])

in_span, coeffs = is_in_span(v, [e1, e2])
print(f"Is {v} in span of e1, e2? {in_span}")
print(f"Coefficients: {coeffs}")  # [3, 4]
print(f"Verification: {coeffs[0]}*e1 + {coeffs[1]}*e2 = {coeffs[0]*e1 + coeffs[1]*e2}")

# Example 2: Check if [1, 1, 1] is in span of [1,0,0] and [0,1,0]
v1 = np.array([1, 0, 0])
v2 = np.array([0, 1, 0])
target = np.array([1, 1, 1])

in_span, coeffs = is_in_span(target, [v1, v2])
print(f"\nIs {target} in span of xy-plane? {in_span}")
# It's not! The z-component can't be reached.

target2 = np.array([1, 1, 0])  # In the xy-plane
in_span2, coeffs2 = is_in_span(target2, [v1, v2])
print(f"Is {target2} in span of xy-plane? {in_span2}")
print(f"Coefficients: {coeffs2}")
```

Definition of linear independence:
A set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ is linearly independent if the only solution to:
$$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_k\mathbf{v}_k = \mathbf{0}$$
is the trivial solution $c_1 = c_2 = \cdots = c_k = 0$.
If there exists a non-trivial solution (at least one $c_i \neq 0$), the vectors are linearly dependent.
Intuitive meaning:
Linear dependence means there's redundancy: some vector is 'saying' what could already be said with the others. Linear independence means every vector contributes something unique—the set is minimal for its span.
Equivalent characterizations of linear independence:
These are all equivalent ways to say vectors are linearly independent:

- No vector in the set can be written as a linear combination of the others.
- The only solution to $c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k = \mathbf{0}$ is the trivial one.
- Removing any vector from the set strictly shrinks the span.
- The matrix with these vectors as columns has full column rank $k$.
Geometric interpretation: two vectors are independent exactly when they are not parallel; three vectors are independent exactly when they do not all lie in one plane through the origin.
| Vectors | Independent? | Reason |
|---|---|---|
| $(1, 0), (0, 1)$ | Yes | Standard basis, not parallel |
| $(1, 2), (2, 4)$ | No | $(2, 4) = 2(1, 2)$—parallel |
| $(1, 0), (0, 1), (1, 1)$ | No | $(1, 1) = (1, 0) + (0, 1)$—third is sum of first two |
| $(1, 0, 0), (0, 1, 0), (0, 0, 1)$ | Yes | Standard basis of $\mathbb{R}^3$ |
| $(1, 2, 3), (4, 5, 6), (7, 8, 9)$ | No | Third is linear combination of first two |
```python
import numpy as np

def check_linear_independence(vectors, tol=1e-10):
    """
    Check if vectors are linearly independent using matrix rank.
    Returns (is_independent, rank, num_vectors).
    """
    # Stack vectors as columns
    A = np.column_stack(vectors)
    rank = np.linalg.matrix_rank(A, tol=tol)
    num_vectors = len(vectors)
    is_independent = (rank == num_vectors)
    return is_independent, rank, num_vectors

# Example 1: Standard basis (independent)
e1, e2 = np.array([1, 0]), np.array([0, 1])
ind, rank, n = check_linear_independence([e1, e2])
print(f"Standard basis: independent={ind}, rank={rank}, n={n}")

# Example 2: Parallel vectors (dependent)
v1, v2 = np.array([1, 2]), np.array([2, 4])
ind, rank, n = check_linear_independence([v1, v2])
print(f"Parallel vectors: independent={ind}, rank={rank}, n={n}")

# Example 3: Three vectors in R^2 (must be dependent)
a, b, c = np.array([1, 0]), np.array([0, 1]), np.array([1, 1])
ind, rank, n = check_linear_independence([a, b, c])
print(f"3 vectors in R^2: independent={ind}, rank={rank}, n={n}")

# Example 4: Check if arithmetic sequence is independent
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])
v3 = np.array([7, 8, 9])
ind, rank, n = check_linear_independence([v1, v2, v3])
print(f"Arithmetic sequence: independent={ind}, rank={rank}, n={n}")

# Find the dependence relation
A = np.column_stack([v1, v2, v3])
# v3 = -1*v1 + 2*v2? Let's check
print(f"\nv3 = {v3}")
print(f"-1*v1 + 2*v2 = {-1*v1 + 2*v2}")
# So v1 - 2*v2 + v3 = 0 (non-trivial solution!)
```

Span and linear independence are deeply connected—they're two sides of the same coin.
Key relationship: a set is linearly independent exactly when no vector in it lies in the span of the others; equivalently, every vector genuinely extends the span.

Together, they determine the concept of a basis: a set of vectors that is both:

- spanning: its span is the entire space, and
- linearly independent: it contains no redundant vectors.
The dimension connection:
For a vector space of dimension $n$:

- Any set of more than $n$ vectors is linearly dependent.
- Any set of fewer than $n$ vectors cannot span the space.
- A set of exactly $n$ vectors spans the space if and only if it is linearly independent; such a set is a basis.
Adding more vectors to a set can fail in two ways: (1) The new vector is already in the span of existing ones—no new reach, creates dependence. (2) You're trying to fit more than n vectors in an n-dimensional space—impossible to keep independence. Both failures indicate redundancy.
Practical implications:
For solving equations $A\mathbf{x} = \mathbf{b}$: a solution exists precisely when $\mathbf{b}$ lies in the span of the columns of $A$, and the solution is unique precisely when those columns are linearly independent.
For machine learning:
| Situation | Span Implications | Independence Implications |
|---|---|---|
| Feature matrix | What predictions can be made | Are all features necessary? |
| Weight matrix | What outputs are reachable | Are weights uniquely determined? |
| Embedding space | What concepts can be represented | Is dimensionality appropriate? |
| PCA components | How much variance is captured | Each component is orthogonal (independent) |
Computational detection:
The rank of a matrix equals the number of linearly independent columns (which equals the number of linearly independent rows), i.e., the dimension of the span of its columns.
If a matrix has fewer independent columns than total columns, the columns are dependent.
```python
import numpy as np

def analyze_vectors(vectors, space_dim):
    """Analyze span and independence of a set of vectors."""
    A = np.column_stack(vectors)
    rank = np.linalg.matrix_rank(A)
    num_vectors = len(vectors)

    is_independent = (rank == num_vectors)
    spans_space = (rank == space_dim)
    is_basis = is_independent and spans_space

    print(f"Number of vectors: {num_vectors}")
    print(f"Ambient space dimension: {space_dim}")
    print(f"Rank (dimension of span): {rank}")
    print(f"Linearly independent: {is_independent}")
    print(f"Spans the space: {spans_space}")
    print(f"Forms a basis: {is_basis}")
    return rank, is_independent, spans_space, is_basis

# Case 1: Fewer vectors than dimension (can't span)
print("Case 1: Two vectors in R^3")
v1, v2 = np.array([1, 0, 0]), np.array([0, 1, 0])
analyze_vectors([v1, v2], 3)

# Case 2: Right number, but dependent (can't do either)
print("\nCase 2: Three dependent vectors in R^3")
v1 = np.array([1, 0, 0])
v2 = np.array([0, 1, 0])
v3 = np.array([1, 1, 0])  # v3 = v1 + v2
analyze_vectors([v1, v2, v3], 3)

# Case 3: Right number, independent = basis
print("\nCase 3: Three independent vectors in R^3 (basis)")
v1 = np.array([1, 0, 0])
v2 = np.array([0, 1, 0])
v3 = np.array([0, 0, 1])
analyze_vectors([v1, v2, v3], 3)

# Case 4: More vectors than dimension (must be dependent)
print("\nCase 4: Four vectors in R^3")
v1, v2, v3, v4 = (np.array([1, 0, 0]), np.array([0, 1, 0]),
                  np.array([0, 0, 1]), np.array([1, 1, 1]))
analyze_vectors([v1, v2, v3, v4], 3)
```

Several methods exist to test whether vectors are linearly independent.
Method 1: Determinant (square case)
For $n$ vectors in $\mathbb{R}^n$, form the matrix with the vectors as columns: the vectors are linearly independent if and only if its determinant is non-zero.
This only works when the number of vectors equals the space dimension.
Method 2: Rank comparison
For any number of vectors: stack them as columns of a matrix and compute its rank; the vectors are independent if and only if the rank equals the number of vectors.
Method 3: Row reduction (Gaussian elimination)
Reduce the matrix to row echelon form: the vectors are independent if and only if every column contains a pivot (no free variables).
Method 4: Solve the homogeneous system
Find all solutions to $A\mathbf{c} = \mathbf{0}$: if the only solution is $\mathbf{c} = \mathbf{0}$, the vectors are independent; any non-trivial solution is an explicit dependence relation among them.
In practice, determinants and ranks are affected by floating-point errors. A determinant might be 1e-15 instead of exactly 0. Always use tolerance thresholds (e.g., np.linalg.matrix_rank uses SVD with tolerance). Near-dependent vectors cause numerical instability.
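To make this concrete, here is a minimal sketch (the perturbation size 1e-15 and the strict tolerance 1e-18 are arbitrary illustrative choices, not prescribed values) of how near-dependent vectors interact with the tolerance used by `np.linalg.matrix_rank`:

```python
import numpy as np

# Two nearly parallel vectors in R^2: v2 is almost exactly 2 * v1
v1 = np.array([1.0, 2.0])
v2 = np.array([2.0, 4.0 + 1e-15])

A = np.column_stack([v1, v2])

# The determinant is tiny but not exactly zero
print(f"Determinant: {np.linalg.det(A):.3e}")

# With its default SVD-based tolerance, matrix_rank treats the columns as dependent
print(f"Rank (default tolerance): {np.linalg.matrix_rank(A)}")

# An unrealistically strict tolerance counts the negligible singular value
# and reports full rank instead
print(f"Rank (tol=1e-18): {np.linalg.matrix_rank(A, tol=1e-18)}")
```

The verdict depends entirely on where the threshold is drawn, which is why rank tests should always be read together with a sensible tolerance.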
```python
import numpy as np
from scipy.linalg import null_space

# Test vectors
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])
v3 = np.array([7, 8, 9])

A = np.column_stack([v1, v2, v3])

# Method 1: Determinant (only for square matrices)
det = np.linalg.det(A)
print(f"Method 1 - Determinant: {det:.6f}")
print(f"Independent (det != 0): {not np.isclose(det, 0)}")

# Method 2: Rank comparison
rank = np.linalg.matrix_rank(A)
n_vectors = A.shape[1]
print(f"\nMethod 2 - Rank: {rank}, Number of vectors: {n_vectors}")
print(f"Independent (rank == n): {rank == n_vectors}")

# Method 3: Row reduction (via QR or similar)
# The rank reveals this automatically

# Method 4: Null space
# If null space has dimension > 0, vectors are dependent
null = null_space(A)
print(f"\nMethod 4 - Null space:")
print(f"Null space dimension: {null.shape[1]}")
if null.shape[1] > 0:
    # A null-space vector c satisfies c[0]*v1 + c[1]*v2 + c[2]*v3 = 0
    c = null[:, 0]
    print(f"Non-trivial null vector: {c}")
    verification = c[0]*v1 + c[1]*v2 + c[2]*v3
    print(f"Verification (should be ~0): {verification}")

# Compare with truly independent vectors
print("\n--- Independent vectors ---")
u1 = np.array([1, 0, 0])
u2 = np.array([0, 1, 0])
u3 = np.array([0, 0, 1])

B = np.column_stack([u1, u2, u3])
print(f"Determinant: {np.linalg.det(B)}")
print(f"Rank: {np.linalg.matrix_rank(B)}")
print(f"Null space dimension: {null_space(B).shape[1]}")  # 0
```

In machine learning, we work with vectors in hundreds or thousands of dimensions. Geometric intuition from 2D/3D still applies, but we must think more abstractly.
High-dimensional span:
In $\mathbb{R}^{1000}$, a set of 10 independent vectors spans a 10-dimensional subspace: a thin slice of the full space, and almost every vector in $\mathbb{R}^{1000}$ lies outside it.
High-dimensional independence:
In $\mathbb{R}^n$, you can have at most $n$ independent vectors. Random vectors are almost always independent (probability 1 in continuous distributions) until you exceed the space dimension.
In 3D: 2 independent vectors span a plane (2D subspace). In 1000D: 2 independent vectors still span a 2D subspace—it's just a 2D 'plane' embedded in a much larger space. The concepts are identical; only our ability to visualize changes.
ML implications:
Feature spaces: If you have 100 features but only 20 are independent (80 can be expressed from the 20), your effective dimensionality is 20. This is what PCA exploits—finding the intrinsic dimension of data.
Overparameterized models: Neural networks often have more parameters than training samples. The weight vectors live in a space where many solutions exist (underdetermined system). Regularization helps pick among equivalent solutions.
Embedding spaces: Word embeddings in $\mathbb{R}^{300}$ represent a vocabulary of 100,000 words. The 100,000 word vectors must be dependent (can't have 100,000 independent vectors in $\mathbb{R}^{300}$). This isn't a bug—it's how embeddings capture relationships (similar words have similar vectors).
```python
import numpy as np

# In high dimensions, random vectors are almost always independent
np.random.seed(42)
dim = 100
n_vectors = 50

# Generate random vectors
random_vectors = [np.random.randn(dim) for _ in range(n_vectors)]
A = np.column_stack(random_vectors)

rank = np.linalg.matrix_rank(A)
print(f"Dimension: {dim}")
print(f"Number of random vectors: {n_vectors}")
print(f"Rank: {rank}")
print(f"All independent: {rank == n_vectors}")  # Almost certainly True

# What happens at the boundary?
n_vectors = 100  # Same as dimension
random_vectors = [np.random.randn(dim) for _ in range(n_vectors)]
A = np.column_stack(random_vectors)
rank = np.linalg.matrix_rank(A)
print(f"\n{n_vectors} vectors in {dim} dimensions: rank = {rank}")

# Exceeding dimension forces dependence
n_vectors = 101  # More than dimension
random_vectors = [np.random.randn(dim) for _ in range(n_vectors)]
A = np.column_stack(random_vectors)
rank = np.linalg.matrix_rank(A)
print(f"{n_vectors} vectors in {dim} dimensions: rank = {rank}")
print(f"Must be dependent: {rank < n_vectors}")

# Simulating redundant features
print("\n--- Feature Redundancy Example ---")
# Create 50 'real' features
real_features = np.random.randn(1000, 50)  # 1000 samples, 50 features

# Create 30 'redundant' features as linear combinations
redundant_weights = np.random.randn(50, 30)
redundant_features = real_features @ redundant_weights

# Combined feature matrix
all_features = np.hstack([real_features, redundant_features])
print(f"Total features: {all_features.shape[1]}")
print(f"Rank of feature matrix: {np.linalg.matrix_rank(all_features)}")
print(f"Effective dimensionality: 50 (the rest are redundant)")
```

Span and linear independence have direct, practical applications throughout machine learning.
1. Feature Selection and Multicollinearity
When features are linearly dependent (or nearly so), regression coefficients become unstable. This is called multicollinearity.
Detection: Check if feature matrix has full column rank, or examine condition number.
2. Dimensionality Reduction (PCA)
PCA finds a new set of independent directions (principal components) that:

- are mutually orthogonal, and therefore linearly independent,
- are ordered by how much of the data's variance they capture, and
- let you keep only the leading components while discarding redundant directions.
The effective dimension is how many components explain most variance.
A 1000×100 feature matrix might have rank 50. This means only 50 features are truly independent—the other 50 are linear combinations. Understanding this prevents overfitting and guides feature engineering.
3. Neural Network Expressivity
A linear layer with weight matrix $W \in \mathbb{R}^{m \times n}$: every output $W\mathbf{x}$ lies in the column space of $W$, so if $\text{rank}(W) < m$ the layer can only reach a proper subspace of its $m$-dimensional output space.
4. Embedding Quality
Good embeddings balance: enough dimensions to keep the distinctions you care about separable, against few enough dimensions that vectors are forced to share structure, which is the dependence that encodes relationships between similar items.
5. Model Identifiability
A model is identifiable if different parameter settings produce different outputs. Linear dependence in the feature/design matrix can make parameters non-identifiable (infinitely many equally good solutions).
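A minimal sketch of this failure mode (the design matrix and weight vectors below are invented purely for illustration): when the design matrix has a linearly dependent column, two different parameter vectors can produce identical predictions, so the data cannot tell them apart.

```python
import numpy as np

# Design matrix with a dependent column: the third column is col1 + col2
X = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])

# Two different parameter vectors...
w_a = np.array([1.0, 2.0, 0.0])
w_b = np.array([0.0, 1.0, 1.0])   # shifts weight onto the redundant column

# ...give exactly the same predictions, so they are not identifiable from the data
print(np.allclose(X @ w_a, X @ w_b))                              # True
print(f"Rank of X: {np.linalg.matrix_rank(X)} (columns: {X.shape[1]})")
```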
```python
import numpy as np

# Example 1: Detecting multicollinearity
np.random.seed(42)

# Create feature matrix with multicollinearity
n_samples = 100
x1 = np.random.randn(n_samples)
x2 = np.random.randn(n_samples)
x3 = 2*x1 + 3*x2 + 0.01*np.random.randn(n_samples)  # Nearly dependent

X = np.column_stack([x1, x2, x3])
print("Multicollinearity detection:")
print(f"Feature matrix rank: {np.linalg.matrix_rank(X)}")
print(f"Number of features: {X.shape[1]}")

# Condition number (large = ill-conditioned = multicollinearity)
cond_num = np.linalg.cond(X)
print(f"Condition number: {cond_num:.2f} (high indicates near-dependence)")

# Example 2: Understanding linear layer expressivity
input_dim = 100
hidden_dim = 50

# Weight matrix with potentially reduced rank
W_full_rank = np.random.randn(hidden_dim, input_dim)
W_low_rank = np.random.randn(hidden_dim, 10) @ np.random.randn(10, input_dim)  # Rank <= 10

print(f"\nLinear layer expressivity:")
print(f"Full rank W: rank = {np.linalg.matrix_rank(W_full_rank)}")
print(f"Low rank W: rank = {np.linalg.matrix_rank(W_low_rank)}")

# The low-rank layer can only produce outputs in a 10-dimensional subspace
# even though the output space is 50-dimensional

# Example 3: PCA and effective dimensionality
from sklearn.decomposition import PCA

# Generate data with intrinsic dimension 5 in 20D space
true_dim = 5
ambient_dim = 20
n_samples = 500

# Data lies in a 5D subspace
low_dim_data = np.random.randn(n_samples, true_dim)
projection = np.random.randn(true_dim, ambient_dim)
data = low_dim_data @ projection + 0.1 * np.random.randn(n_samples, ambient_dim)

pca = PCA()
pca.fit(data)

print(f"\nPCA variance explained:")
cumulative_var = np.cumsum(pca.explained_variance_ratio_)
for i, var in enumerate(cumulative_var[:10]):
    print(f"  {i+1} components: {var:.4f} variance")
print(f"Effective dimension (>99% variance): {np.searchsorted(cumulative_var, 0.99) + 1}")
```

Several fundamental properties and theorems govern span and linear independence.
Key properties: the span of any set of vectors is always a subspace; any set containing the zero vector is linearly dependent; a single vector is independent if and only if it is non-zero; and every subset of a linearly independent set is itself independent.
Key theorems:
Theorem (Dimension Bound): In $\mathbb{R}^n$, any set of more than $n$ vectors must be linearly dependent.
Theorem (Unique Representation): If $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ is linearly independent and $\mathbf{w}$ is in their span, then the coefficients expressing $\mathbf{w}$ as a linear combination are unique.
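A quick numerical check of this theorem (the vectors and coefficients below are arbitrary choices): with a square matrix whose columns are independent, `np.linalg.solve` recovers the one and only coefficient vector.

```python
import numpy as np

# An independent set in R^3 and a target vector built from known coefficients
v1, v2, v3 = np.array([1., 0., 1.]), np.array([0., 1., 1.]), np.array([1., 1., 0.])
A = np.column_stack([v1, v2, v3])
w = 2*v1 - 1*v2 + 3*v3

# Because the columns are independent (A is invertible), the coefficients are unique
c = np.linalg.solve(A, w)
print(c)                      # [ 2. -1.  3.]
print(np.allclose(A @ c, w))  # True
```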
Theorem (Span Absorption): If $\mathbf{w}$ is in the span of $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$, then: $$\text{span}(\mathbf{v}_1, \ldots, \mathbf{v}_k, \mathbf{w}) = \text{span}(\mathbf{v}_1, \ldots, \mathbf{v}_k)$$
Adding vectors already in the span doesn't expand it.
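This is easy to verify numerically; in the sketch below (arbitrarily chosen vectors), appending a vector built from the others leaves the rank of the stacked matrix, and hence the dimension of the span, unchanged.

```python
import numpy as np

v1 = np.array([1., 0., 2.])
v2 = np.array([0., 1., 1.])
w = 3*v1 - 2*v2            # w is in span(v1, v2) by construction

rank_before = np.linalg.matrix_rank(np.column_stack([v1, v2]))
rank_after = np.linalg.matrix_rank(np.column_stack([v1, v2, w]))
print(rank_before, rank_after)   # 2 2 -- the span did not grow
```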
Theorem (Extension): Any independent set in a finite-dimensional vector space can be extended to a basis by adding vectors.
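One way to see the extension theorem in action is the greedy sketch below (not a canonical algorithm, just an illustration): append standard basis vectors, keeping only those that raise the rank, until the independent set becomes a basis.

```python
import numpy as np

def extend_to_basis(independent_vectors, dim):
    """Greedily extend an independent set to a basis of R^dim using standard basis vectors."""
    basis = list(independent_vectors)
    for i in range(dim):
        candidate = np.eye(dim)[i]
        trial = np.column_stack(basis + [candidate])
        # Keep the candidate only if it adds a new direction (increases the rank)
        if np.linalg.matrix_rank(trial) > np.linalg.matrix_rank(np.column_stack(basis)):
            basis.append(candidate)
    return basis

start = [np.array([1., 1., 0.]), np.array([0., 1., 1.])]   # independent, but not spanning R^3
basis = extend_to_basis(start, 3)
print(f"Extended set: {len(basis)} vectors, rank = {np.linalg.matrix_rank(np.column_stack(basis))}")
```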
If S is an independent set and B is a spanning set, then |S| ≤ |B|. This implies all bases have the same size (defining the dimension), and that you can't have more independent vectors than a spanning set has.
Span and linear independence are the two pillars supporting all of linear algebra. Together, they define what vectors can represent and how efficiently.
What's next:
Span and independence come together in the concept of vector spaces and subspaces. A basis is a set that is both spanning and independent—the minimal complete description of a space. The number of vectors in any basis defines the dimension. These ideas give vector spaces their fundamental structure.
You now understand span (what vectors can reach) and linear independence (when vectors are non-redundant). These concepts are essential for understanding bases, dimension, rank, and the solvability of linear systems—all critical for machine learning.