In probability theory and information theory, quantifying the difference between two probability distributions is a fundamental problem with widespread applications across machine learning, statistics, and signal processing. One of the most important measures for this purpose is the Kullback-Leibler (KL) Divergence.
The KL divergence, also known as relative entropy, measures how one probability distribution diverges from a reference distribution. It provides an asymmetric measure of the "information lost" when using one distribution to approximate another. For multivariate Gaussian distributions, this divergence can be computed analytically using a closed-form formula.
Problem Statement: Given two multivariate Gaussian (Normal) distributions P and Q, each characterized by their mean vectors (μₚ and μ_q) and covariance matrices (Σₚ and Σ_q), compute the KL divergence D_KL(P || Q).
Mathematical Formulation: For two d-dimensional multivariate Gaussians, the KL divergence is given by:
$$D_{KL}(P \| Q) = \frac{1}{2} \left[ \log\frac{|\Sigma_q|}{|\Sigma_p|} - d + \text{tr}(\Sigma_q^{-1}\Sigma_p) + (\mu_q - \mu_p)^T \Sigma_q^{-1} (\mu_q - \mu_p) \right]$$
Where: • d is the dimensionality of the distributions • μₚ, μ_q are the mean vectors and Σₚ, Σ_q the covariance matrices of P and Q • tr(·) denotes the matrix trace and |·| the determinant
Key Properties: • Non-negativity: D_KL(P || Q) ≥ 0, with equality if and only if P = Q • Asymmetry: in general, D_KL(P || Q) ≠ D_KL(Q || P)
Your Task: Implement a function that computes the KL divergence between two multivariate Gaussian distributions given their mean vectors and covariance matrices. The result should be rounded to 4 decimal places.
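The formula above translates directly into NumPy. A minimal sketch follows; the function name and signature are illustrative, not prescribed by the problem, and `slogdet` is used instead of `det` for numerical stability on the log-determinant term:

```python
import numpy as np

def kl_divergence_gaussian(mu_p, Cov_p, mu_q, Cov_q):
    """Compute D_KL(P || Q) for two multivariate Gaussians, rounded to 4 places."""
    mu_p = np.asarray(mu_p, dtype=float)
    mu_q = np.asarray(mu_q, dtype=float)
    Cov_p = np.asarray(Cov_p, dtype=float)
    Cov_q = np.asarray(Cov_q, dtype=float)

    d = mu_p.shape[0]
    Cov_q_inv = np.linalg.inv(Cov_q)
    diff = mu_q - mu_p

    # log(|Σ_q| / |Σ_p|) computed as a difference of log-determinants
    log_det_ratio = np.linalg.slogdet(Cov_q)[1] - np.linalg.slogdet(Cov_p)[1]
    trace_term = np.trace(Cov_q_inv @ Cov_p)           # tr(Σ_q⁻¹ Σ_p)
    mahalanobis = diff @ Cov_q_inv @ diff              # (μ_q-μ_p)ᵀ Σ_q⁻¹ (μ_q-μ_p)

    return round(0.5 * (log_det_ratio - d + trace_term + mahalanobis), 4)
```

On Example 1 below (identity covariances, means [0, 0] and [1, 1]), this returns 1.0.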
mu_p = [0.0, 0.0]
Cov_p = [[1.0, 0.0], [0.0, 1.0]]
mu_q = [1.0, 1.0]
Cov_q = [[1.0, 0.0], [0.0, 1.0]]
Expected output: 1.0
Both distributions have the same identity covariance matrix, but different means.
Step-by-step calculation:
Dimensionality: d = 2
Log determinant ratio: Since both covariance matrices are identity matrices: • |Σ_q| = |Σ_p| = 1.0 • log(|Σ_q|/|Σ_p|) = log(1) = 0
Trace term: tr(Σ_q⁻¹ Σ_p) = tr(I · I) = tr(I) = 2
Mahalanobis distance: The mean difference is [1.0, 1.0] • (μ_q - μ_p)ᵀ Σ_q⁻¹ (μ_q - μ_p) = [1, 1] · I · [1, 1]ᵀ = 1² + 1² = 2
Final computation: • D_KL = 0.5 × (0 - 2 + 2 + 2) = 0.5 × 2 = 1.0
The divergence of 1.0 comes entirely from the separation of the means: half the squared Mahalanobis distance between the two distribution centers.
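The step-by-step arithmetic above can be checked term by term with NumPy (a quick sketch, computing each term directly rather than through a wrapper function):

```python
import numpy as np

# Example 1 inputs: shared identity covariance, means at the origin and at (1, 1)
mu_p, mu_q = np.zeros(2), np.ones(2)
Cov = np.eye(2)
d = 2

diff = mu_q - mu_p
log_det_ratio = 0.0                              # log(|I| / |I|) = log(1) = 0
trace_term = np.trace(np.linalg.inv(Cov) @ Cov)  # tr(I) = 2
mahalanobis = diff @ np.linalg.inv(Cov) @ diff   # 1² + 1² = 2

kl = 0.5 * (log_det_ratio - d + trace_term + mahalanobis)
print(kl)  # 1.0
```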
mu_p = [2.0, 3.0]
Cov_p = [[2.0, 0.5], [0.5, 1.5]]
mu_q = [2.0, 3.0]
Cov_q = [[2.0, 0.5], [0.5, 1.5]]
Expected output: 0.0
Both distributions are identical: they have the same mean vectors and the same covariance matrices.
Verification using the formula:
Log determinant ratio: log(|Σ_q|/|Σ_p|) = log(1) = 0 (identical matrices)
Trace term: tr(Σ_q⁻¹ Σ_p) = tr(I) = 2 (since Σ_q⁻¹ Σ_p = I when Σ_q = Σ_p)
Mahalanobis distance: (μ_q - μ_p) = [0, 0], so the squared distance = 0
Final computation: • D_KL = 0.5 × (0 - 2 + 2 + 0) = 0.5 × 0 = 0.0
This confirms the fundamental property: KL divergence is zero if and only if the distributions are identical. When P and Q are the same distribution, no information is lost by using Q to represent P.
mu_p = [1.0, 2.0, 3.0]
Cov_p = [[2.0, 0.3, 0.1], [0.3, 1.5, 0.2], [0.1, 0.2, 1.0]]
mu_q = [0.0, 0.0, 0.0]
Cov_q = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
Expected output: 7.2304
This example demonstrates a 3-dimensional case with more complex covariance structures.
Distribution P: Has mean [1, 2, 3] and a non-diagonal covariance matrix with correlations between dimensions.
Distribution Q: Is a standard multivariate normal (mean at origin, identity covariance).
Analysis:
Dimensionality: d = 3
Determinant calculation: • |Σ_p| = 2.827 (computed from the 3×3 determinant) • |Σ_q| = 1.0 (identity matrix) • Log ratio contributes: log(1/2.827) ≈ -1.0392
Trace contribution: tr(Σ_q⁻¹ Σ_p) = tr(Σ_p) = 2 + 1.5 + 1 = 4.5 (since Σ_q⁻¹ = I)
Mahalanobis distance: μ_p is far from μ_q, and with identity Σ_q: • Distance² = 1² + 2² + 3² = 14
Combined result: D_KL = 0.5 × (-1.0392 - 3 + 4.5 + 14) ≈ 7.2304
The high divergence reflects both the mean displacement and the different covariance structures.
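The third example's intermediate values can be reproduced with NumPy; this sketch confirms both the 3×3 determinant and the final divergence:

```python
import numpy as np

# Example 3 inputs
mu_p = np.array([1.0, 2.0, 3.0])
Cov_p = np.array([[2.0, 0.3, 0.1], [0.3, 1.5, 0.2], [0.1, 0.2, 1.0]])
mu_q = np.zeros(3)
Cov_q = np.eye(3)
d = 3

Cov_q_inv = np.linalg.inv(Cov_q)                 # identity here
diff = mu_q - mu_p

log_det_ratio = np.log(np.linalg.det(Cov_q) / np.linalg.det(Cov_p))
trace_term = np.trace(Cov_q_inv @ Cov_p)         # tr(Σ_p) = 2 + 1.5 + 1 = 4.5
mahalanobis = diff @ Cov_q_inv @ diff            # 1² + 2² + 3² = 14

kl = 0.5 * (log_det_ratio - d + trace_term + mahalanobis)
print(round(float(np.linalg.det(Cov_p)), 3))     # 2.827
print(round(float(kl), 4))                       # 7.2304
```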
Constraints