In probabilistic machine learning, inferring hidden (latent) variables from observed data is a central challenge. Given observed data x, we often want to compute the posterior distribution p(z|x), which tells us what hidden variables z are likely given what we've observed. However, computing this posterior exactly is typically intractable due to the normalization constant (evidence) in Bayes' theorem.
Variational Inference offers an elegant solution: instead of computing the exact posterior, we approximate it with a simpler, tractable distribution q(z) and optimize this approximation to be as close as possible to the true posterior. The Evidence Lower Bound (ELBO) serves as our optimization objective—a quantity that we maximize to find the best approximation.
The ELBO provides a lower bound on the log-evidence (marginal likelihood) log p(x) and is defined as:
$$\text{ELBO} = \mathbb{E}_{q(z)}[\log p(x|z)] + \mathbb{E}_{q(z)}[\log p(z)] - \mathbb{E}_{q(z)}[\log q(z)]$$
This can be read as three interpretable components:
1. Expected log-likelihood E_q[log p(x|z)]: how well samples from q explain the observed data.
2. Expected log-prior E_q[log p(z)]: how consistent samples from q are with the prior.
3. Entropy H[q] = -E_q[log q(z)]: how much uncertainty the approximation retains.
For this problem, all distributions are Gaussian (Normal):
The Gaussian probability density function is:
$$\mathcal{N}(x|\mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
The entropy of a Gaussian is:
$$H[q] = \frac{1}{2}\log(2\pi e \sigma_q^2)$$
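These two formulas translate directly into code. The following is a minimal sketch (the function names are illustrative, not part of the problem's API):

```python
import math

def gaussian_log_pdf(x, mu, sigma):
    """log N(x | mu, sigma^2), the log of the density above."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def gaussian_entropy(sigma):
    """H[q] = 0.5 * log(2 * pi * e * sigma^2)."""
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)

# Log-density of a standard Gaussian at its mean, and its entropy:
print(gaussian_log_pdf(0.0, 0.0, 1.0))  # log(1/sqrt(2*pi)) ≈ -0.9189
print(gaussian_entropy(1.0))            # ≈ 1.4189
```

Working in log space avoids underflow when densities are small, which is why every quantity in the ELBO is a log-probability.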
Since expectations over q(z) don't always have closed-form solutions in more complex models, we estimate them using Monte Carlo sampling:
$$\mathbb{E}_{q(z)}[f(z)] \approx \frac{1}{N}\sum_{i=1}^{N} f(z_i), \quad z_i \sim q(z)$$
Implement a function that computes the ELBO given: the observed data x, the variational parameters q_mean and q_std, the prior parameters prior_mean and prior_std, the likelihood standard deviation likelihood_std, and the number of Monte Carlo samples n_samples.
Return the estimated ELBO value rounded to 2 decimal places.
Important: Use a fixed random seed of 42 for reproducibility when sampling from q(z).
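Putting the three components together, one possible implementation looks like the sketch below. It assumes NumPy's legacy global seeding (`np.random.seed(42)`) as the sampler; if the grader uses a different RNG, the Monte Carlo estimate may differ slightly in the last decimal. The function name `compute_elbo` is illustrative.

```python
import numpy as np

def compute_elbo(x, q_mean, q_std, prior_mean, prior_std,
                 likelihood_std, n_samples=10000):
    """Monte Carlo estimate of the ELBO with Gaussian q, prior, and likelihood."""
    np.random.seed(42)  # fixed seed for reproducibility, as required
    z = np.random.normal(q_mean, q_std, size=n_samples)  # z_i ~ q(z)

    def log_normal(v, mu, sigma):
        # log N(v | mu, sigma^2), vectorized over v
        return -0.5 * np.log(2 * np.pi * sigma ** 2) - (v - mu) ** 2 / (2 * sigma ** 2)

    # E_q[log p(x|z)]: sum log-likelihoods over all observations, average over samples
    expected_log_lik = np.mean(sum(log_normal(xi, z, likelihood_std) for xi in x))

    # E_q[log p(z)]: average log-prior of the samples
    expected_log_prior = np.mean(log_normal(z, prior_mean, prior_std))

    # H[q]: closed form for a Gaussian, no sampling needed
    entropy = 0.5 * np.log(2 * np.pi * np.e * q_std ** 2)

    return round(expected_log_lik + expected_log_prior + entropy, 2)
```

Note that the entropy term is computed analytically: for a Gaussian q it is exact and free, so only the first two terms need Monte Carlo estimation.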
x = [1.0]
q_mean = 0.5
q_std = 0.7
prior_mean = 0.0
prior_std = 1.0
likelihood_std = 1.0
n_samples = 10000

Expected output: -1.52

The ELBO is computed as the sum of three components:
1. Expected Log-Likelihood E_q[log p(x|z)]: For each sample z drawn from q(z) = N(0.5, 0.7²), we compute the log-probability of observing x=1.0 under the likelihood p(x|z) = N(z, 1.0²). Averaging over 10,000 samples gives approximately -1.29.
2. Expected Log-Prior E_q[log p(z)]: For each sample z from q, we compute log p(z) where p(z) = N(0.0, 1.0²). The average is approximately -1.29.
3. Entropy H[q]: For a Gaussian, H[q] = 0.5 × log(2πe × σ²) = 0.5 × log(2π × 2.718 × 0.49) ≈ 1.06.
ELBO = -1.29 + (-1.29) + 1.06 ≈ -1.52
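The Monte Carlo numbers above can be cross-checked in closed form: for z ~ N(μ_q, σ_q²), E[(c − z)²] = (c − μ_q)² + σ_q². A quick sketch of that check:

```python
import math

mu_q, var_q = 0.5, 0.7 ** 2  # q(z) = N(0.5, 0.49)

# E_q[log p(x=1 | z)] with p(x|z) = N(z, 1):
# E[(1 - z)^2] = (1 - mu_q)^2 + var_q = 0.25 + 0.49 = 0.74
e_log_lik = -0.5 * math.log(2 * math.pi) - 0.74 / 2      # ≈ -1.29

# E_q[log p(z)] with p(z) = N(0, 1): E[z^2] = mu_q^2 + var_q = 0.74
e_log_prior = -0.5 * math.log(2 * math.pi) - 0.74 / 2    # ≈ -1.29

entropy = 0.5 * math.log(2 * math.pi * math.e * var_q)   # ≈ 1.06

print(round(e_log_lik + e_log_prior + entropy, 2))       # -1.52
```

The closed-form total (-1.5156) agrees with the Monte Carlo estimate to two decimal places.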
x = [0.0]
q_mean = 0.0
q_std = 1.0
prior_mean = 0.0
prior_std = 1.0
likelihood_std = 1.0
n_samples = 10000

Expected output: -1.43

In this case, the approximate posterior q(z) exactly matches the prior p(z), both being N(0, 1).
1. Expected Log-Likelihood: For x=0.0 and z samples from N(0, 1), the expected log-likelihood under N(z, 1²) is approximately -1.42.
2. Expected Log-Prior: Since q = p, the samples from q have expected log-prior of approximately -1.42.
3. Entropy: H[N(0, 1)] = 0.5 × log(2πe) ≈ 1.42.
ELBO = -1.42 + (-1.42) + 1.42 ≈ -1.43 (the analytic sum is -1.42; the small difference comes from Monte Carlo sampling noise)
This represents a baseline scenario where the variational approximation matches the prior.
x = [1.0, 2.0, 3.0]
q_mean = 2.0
q_std = 0.5
prior_mean = 0.0
prior_std = 2.0
likelihood_std = 1.0
n_samples = 10000

Expected output: -5.55

With multiple observations, the ELBO accounts for all data points:
1. Expected Log-Likelihood: Now we sum log-likelihoods for x₁=1.0, x₂=2.0, and x₃=3.0 at each sample z. With q_mean=2.0, the approximation is centered at the data mean, giving an expected log-likelihood of approximately -4.13.
2. Expected Log-Prior: With a wider prior (σ=2.0), the penalty for q_mean=2.0 being away from prior_mean=0.0 is reduced. The expected log-prior is approximately -2.14.
3. Entropy: H[N(2, 0.5²)] = 0.5 × log(2πe × 0.25) ≈ 0.73 (lower entropy due to the more concentrated approximation).
ELBO ≈ -4.132 + (-2.143) + 0.726 ≈ -5.55
The ELBO balances fitting the data well (high likelihood) while staying consistent with the prior and maintaining some uncertainty (entropy).
Constraints