In statistics and machine learning, understanding how features relate to one another is fundamental for building robust models. The covariance matrix is a powerful tool that captures the joint variability between pairs of features in a dataset.
Given a collection of k feature vectors, each containing n observations, the covariance matrix is a k × k symmetric matrix where the element at position (i, j) represents the covariance between feature i and feature j.
Mathematical Foundation:
For two features X and Y, each with n observations, the sample covariance is computed as:
$$\text{Cov}(X, Y) = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$$
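Where:
• $X_i$ and $Y_i$ are the i-th observations of X and Y
• $\bar{X}$ and $\bar{Y}$ are the sample means of X and Y
• n is the number of observations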
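The formula above can be sketched directly in plain Python. This is a minimal illustration, not a reference solution; the function name `sample_covariance` and its input checks are assumptions introduced here.

```python
def sample_covariance(x, y):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of observations")
    n = len(x)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the means, scaled by n - 1.
    total = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return total / (n - 1)


print(sample_covariance([1, 2, 3], [4, 5, 6]))  # 1.0
```

Passing the same sequence twice, as in `sample_covariance(x, x)`, gives the sample variance of that feature.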
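Key Properties of the Covariance Matrix:
• It is a k × k matrix, with one row and one column per feature
• It is symmetric: Cov(X, Y) = Cov(Y, X)
• Its diagonal elements are the variances of the individual features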
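Your Task: Write a Python function that computes the covariance matrix for a given collection of feature vectors. Judging by the examples below, the function should:
• Accept a list of k feature vectors, each a list of n observations
• Use the sample covariance, dividing by n - 1
• Return the k × k covariance matrix as a list of lists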
Example 1:
Input: vectors = [[1, 2, 3], [4, 5, 6]]
Output: [[1.0, 1.0], [1.0, 1.0]]
Explanation: We have 2 features, each with 3 observations.
Step 1: Calculate Means
• Mean of Feature 1: (1 + 2 + 3) / 3 = 2.0
• Mean of Feature 2: (4 + 5 + 6) / 3 = 5.0
Step 2: Calculate Covariances
Cov(Feature1, Feature1), the variance of Feature 1:
= [(1-2)² + (2-2)² + (3-2)²] / (3-1)
= [1 + 0 + 1] / 2
= 1.0
Cov(Feature1, Feature2):
= [(1-2)(4-5) + (2-2)(5-5) + (3-2)(6-5)] / (3-1)
= [(-1)(-1) + (0)(0) + (1)(1)] / 2
= [1 + 0 + 1] / 2
= 1.0
Cov(Feature2, Feature2), the variance of Feature 2:
= [(4-5)² + (5-5)² + (6-5)²] / (3-1)
= [1 + 0 + 1] / 2
= 1.0
Due to symmetry, Cov(Feature2, Feature1) = Cov(Feature1, Feature2) = 1.0
The resulting covariance matrix is [[1.0, 1.0], [1.0, 1.0]].
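The two steps of this walkthrough (compute the means, then the pairwise covariances) can be sketched as a single function. This is one possible approach in plain Python, not necessarily the intended solution; the name `covariance_matrix` is an assumption, and the sketch assumes every feature has the same number of observations, with at least two.

```python
def covariance_matrix(vectors):
    """Return the k x k sample covariance matrix for k feature vectors."""
    k = len(vectors)
    n = len(vectors[0])

    # Step 1: center each feature by subtracting its mean.
    means = [sum(v) / n for v in vectors]
    centered = [[x - m for x in v] for v, m in zip(vectors, means)]

    # Step 2: fill the upper triangle and mirror it, using symmetry.
    cov = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i, k):
            value = sum(a * b for a, b in zip(centered[i], centered[j])) / (n - 1)
            cov[i][j] = value
            cov[j][i] = value
    return cov


print(covariance_matrix([[1, 2, 3], [4, 5, 6]]))  # [[1.0, 1.0], [1.0, 1.0]]
```

Mirroring the upper triangle computes each off-diagonal covariance only once, which is safe because Cov(Feature2, Feature1) = Cov(Feature1, Feature2).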
Example 2:
Input: vectors = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
Output: [[1.6667, 1.6667, 1.6667], [1.6667, 1.6667, 1.6667], [1.6667, 1.6667, 1.6667]]
Explanation: We have 3 features, each with 4 observations.
Step 1: Calculate Means
• Mean of Feature 1: (1 + 2 + 3 + 4) / 4 = 2.5
• Mean of Feature 2: (5 + 6 + 7 + 8) / 4 = 6.5
• Mean of Feature 3: (9 + 10 + 11 + 12) / 4 = 10.5
Step 2: Observe the Pattern
Each feature increases by 1 from one observation to the next, so all three have the same deviations from their means: (−1.5, −0.5, 0.5, 1.5). Because the deviations are identical, every pairwise covariance equals the variance they share.
Sample Variance (diagonal elements):
= [(−1.5)² + (−0.5)² + (0.5)² + (1.5)²] / (4-1)
= [2.25 + 0.25 + 0.25 + 2.25] / 3
= 5 / 3
≈ 1.6667
Since all features have identical deviation patterns, every covariance equals 1.6667.
The 3×3 covariance matrix has all elements equal to 1.6667.
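The shortcut used in this example can be checked numerically. The sketch below uses illustrative variable names: it confirms that the three features share the same deviations and then builds the matrix, rounded to four decimal places to match the stated output.

```python
vectors = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
n = len(vectors[0])

# Deviations of each observation from its feature's mean.
deviations = [[x - sum(v) / n for x in v] for v in vectors]
print(deviations[0])  # [-1.5, -0.5, 0.5, 1.5]

# All three features share the same deviation pattern.
same_pattern = deviations[0] == deviations[1] == deviations[2]

# Every entry is a sum of deviation products divided by n - 1.
matrix = [
    [round(sum(a * b for a, b in zip(di, dj)) / (n - 1), 4) for dj in deviations]
    for di in deviations
]
print(same_pattern, matrix[0])  # True [1.6667, 1.6667, 1.6667]
```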
Example 3:
Input: vectors = [[2, 4, 6, 8, 10]]
Output: [[10.0]]
Explanation: We have only 1 feature with 5 observations. The covariance matrix is simply a 1×1 matrix containing the variance of that feature.
Step 1: Calculate Mean
Mean = (2 + 4 + 6 + 8 + 10) / 5 = 6.0
Step 2: Calculate Variance
Variance = [(2-6)² + (4-6)² + (6-6)² + (8-6)² + (10-6)²] / (5-1)
= [16 + 4 + 0 + 4 + 16] / 4
= 40 / 4
= 10.0
The covariance matrix is [[10.0]].
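As a sanity check on the single-feature case, the hand computation can be compared against `statistics.variance` from the Python standard library, which also computes the sample variance (dividing by n - 1). The variable names below are illustrative.

```python
import math
import statistics

feature = [2, 4, 6, 8, 10]
n = len(feature)
mean = sum(feature) / n

# Sample variance: squared deviations from the mean, divided by n - 1.
variance = sum((x - mean) ** 2 for x in feature) / (n - 1)
matrix = [[variance]]
print(matrix)  # [[10.0]]

# Cross-check against the standard library's sample variance.
agrees = math.isclose(variance, statistics.variance(feature))
print(agrees)  # True
```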
Constraints