Denoising Diffusion Probabilistic Models (DDPMs) represent one of the most significant breakthroughs in generative AI, powering systems like Stable Diffusion, DALL-E, and Midjourney. At the heart of training these models lies a critical metric: the reconstruction error, which quantifies how accurately the model can reverse the noise corruption process.
The forward diffusion process systematically corrupts data by adding Gaussian noise over a sequence of timesteps. Starting with a clean sample x₀, noise is progressively added according to a variance schedule (typically denoted as β) until the data becomes indistinguishable from pure noise.
Linear Beta Schedule:
The variance schedule is defined as a linear interpolation from beta_start to beta_end across num_timesteps values:
$$\beta_t = \beta_{\text{start}} + \frac{t-1}{T-1} \times (\beta_{\text{end}} - \beta_{\text{start}})$$
where t ranges from 1 to T (the total number of timesteps).
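In code, the schedule is an evenly spaced interpolation. A minimal sketch (the helper name `linear_beta_schedule` is illustrative, not part of any required API):

```python
import numpy as np

def linear_beta_schedule(beta_start: float, beta_end: float, num_timesteps: int) -> np.ndarray:
    """Evenly spaced variances from beta_start to beta_end, inclusive.

    Index 0 of the returned array corresponds to t = 1 in the formula above.
    """
    return np.linspace(beta_start, beta_end, num_timesteps)

betas = linear_beta_schedule(0.1, 0.2, 5)
# betas -> [0.1, 0.125, 0.15, 0.175, 0.2]
```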
Alpha Values: From the beta schedule, we compute alpha values:
$$\alpha_t = 1 - \beta_t$$
Cumulative Alpha Product (Alpha Bar): The cumulative product of alphas up to timestep t:
$$\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$$
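Both quantities fall out of the schedule directly. A sketch using NumPy (variable names are illustrative):

```python
import numpy as np

betas = np.linspace(0.1, 0.2, 5)       # linear variance schedule
alphas = 1.0 - betas                   # alpha_t = 1 - beta_t
alpha_bars = np.cumprod(alphas)        # alpha_bar_t = product of alpha_1 .. alpha_t

# At t = 1 the product has a single factor, so alpha_bar_1 == alpha_1
```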
Using the reparameterization trick, a noisy sample at timestep t can be directly computed from the original sample:
$$x_t = \sqrt{\bar{\alpha}_t} \cdot x_0 + \sqrt{1 - \bar{\alpha}_t} \cdot \epsilon$$
where ε is the actual noise added to the sample.
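This closed form means xₜ can be sampled in one step, without iterating through timesteps. A sketch (the name `q_sample` follows common DDPM codebases but is an assumption here):

```python
import numpy as np

def q_sample(x0, alpha_bar_t, noise):
    """Noisy sample x_t from x_0 in closed form (reparameterization trick)."""
    x0, noise = np.asarray(x0, float), np.asarray(noise, float)
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * noise

x_t = q_sample([1.0], 0.9, [1.0])   # values from the worked example below
```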
Given a noisy sample xₜ and the model's predicted noise ε̂, we can reconstruct an estimate of the original sample:
$$\hat{x}_0 = \frac{x_t - \sqrt{1 - \bar{\alpha}_t} \cdot \hat{\epsilon}}{\sqrt{\bar{\alpha}_t}}$$
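This is just the forward-process formula solved for x₀, with the model's estimate ε̂ standing in for the true noise. A sketch (`reconstruct_x0` is an illustrative name):

```python
import numpy as np

def reconstruct_x0(x_t, alpha_bar_t, predicted_noise):
    """Estimate x_0 by inverting the closed-form forward process."""
    x_t = np.asarray(x_t, float)
    eps_hat = np.asarray(predicted_noise, float)
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_hat) / np.sqrt(alpha_bar_t)
```

If the predicted noise equals the actual noise, this inversion recovers x₀ exactly.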
The reconstruction error is the Mean Squared Error (MSE) between the true original sample and the reconstructed estimate:
$$\mathcal{L}_{\text{recon}} = \text{MSE}(x_0, \hat{x}_0) = \frac{1}{n} \sum_{i=1}^{n} (x_{0,i} - \hat{x}_{0,i})^2$$
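The error itself is a one-liner (a sketch; `mse` is an illustrative helper name):

```python
import numpy as np

def mse(a, b):
    """Mean of elementwise squared differences."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(np.mean((a - b) ** 2))
```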
Your Task: Implement a function that computes the reconstruction error for a diffusion model. The function should:
• Generate a linear beta schedule from beta_start to beta_end with num_timesteps values
• Compute alpha = 1 - beta and the cumulative product α̅ₜ
• Compute the noisy sample xₜ from x₀ and the given noise
• Reconstruct x̂₀ from xₜ and the model's predicted noise
• Return the MSE between x₀ and x̂₀

Example 1:
x_0 = [1.0]
t = 1
beta_start = 0.1
beta_end = 0.2
num_timesteps = 5
noise = [1.0]
predicted_noise = [0.9]

Expected Output: 0.0011

Step 1: Generate beta schedule. With beta_start=0.1, beta_end=0.2, and num_timesteps=5, the linear schedule at t=1 gives β₁ = 0.1.
Step 2: Compute alpha and alpha_bar α₁ = 1 - β₁ = 1 - 0.1 = 0.9 α̅₁ = α₁ = 0.9 (only first timestep)
Step 3: Compute noisy sample x_t x₁ = √0.9 × 1.0 + √0.1 × 1.0 ≈ 0.9487 + 0.3162 ≈ 1.2649
Step 4: Reconstruct x̂₀ using predicted noise x̂₀ = (1.2649 - √0.1 × 0.9) / √0.9 ≈ (1.2649 - 0.2846) / 0.9487 ≈ 1.0333
Step 5: Compute MSE MSE = (1.0 - 1.0333)² ≈ 0.001111 ≈ 0.0011
The small noise prediction error (0.1) leads to a small reconstruction error.
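The five steps above can be assembled into a single function. A possible end-to-end implementation (a sketch; the function name and signature are assumptions about the expected interface):

```python
import numpy as np

def reconstruction_error(x0, t, beta_start, beta_end, num_timesteps, noise, predicted_noise):
    """Reconstruction MSE for a diffusion model, following Steps 1-5 above."""
    x0 = np.asarray(x0, float)
    noise = np.asarray(noise, float)
    eps_hat = np.asarray(predicted_noise, float)

    betas = np.linspace(beta_start, beta_end, num_timesteps)     # Step 1: schedule
    alpha_bar_t = np.cumprod(1.0 - betas)[t - 1]                 # Step 2: t is 1-indexed
    x_t = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * noise          # Step 3
    x0_hat = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_hat) / np.sqrt(alpha_bar_t)  # Step 4
    return float(np.mean((x0 - x0_hat) ** 2))                    # Step 5

err = reconstruction_error([1.0], 1, 0.1, 0.2, 5, [1.0], [0.9])
# err is exactly (0.1/3)^2 = 1/900, which rounds to 0.0011
```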
Example 2:
x_0 = [1.0, 2.0]
t = 2
beta_start = 0.01
beta_end = 0.1
num_timesteps = 10
noise = [0.5, -0.5]
predicted_noise = [0.5, -0.5]

Expected Output: 0.0

When the predicted noise exactly matches the actual noise added during the forward process, the reconstruction is perfect.
Since predicted_noise = noise = [0.5, -0.5], the model perfectly predicts the corruption, allowing exact recovery of x₀.
Result: MSE = 0.0 (perfect reconstruction)
This demonstrates the ideal scenario where the denoising model has learned the exact noise distribution.
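A quick numerical check of this ideal case (a sketch using Example 2's inputs; variable names are illustrative):

```python
import numpy as np

betas = np.linspace(0.01, 0.1, 10)
alpha_bar = np.cumprod(1.0 - betas)[1]          # t = 2, 1-indexed
x0 = np.array([1.0, 2.0])
eps = np.array([0.5, -0.5])

x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
x0_hat = (x_t - np.sqrt(1.0 - alpha_bar) * eps) / np.sqrt(alpha_bar)  # predicted == actual

# The forward map is inverted exactly, so the MSE is 0 up to floating-point rounding
```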
Example 3:
x_0 = [0.0, 1.0, -1.0]
t = 5
beta_start = 0.0001
beta_end = 0.02
num_timesteps = 100
noise = [1.0, -1.0, 0.5]
predicted_noise = [0.8, -0.9, 0.6]

Expected Output: 0.0001

This example uses typical diffusion model hyperparameters with a very gradual noise schedule.
Noise prediction errors:
• Dimension 1: |1.0 - 0.8| = 0.2
• Dimension 2: |-1.0 - (-0.9)| = 0.1
• Dimension 3: |0.5 - 0.6| = 0.1
At early timesteps (t=5 out of 100), the cumulative alpha (α̅ₜ) is still close to 1, meaning minimal noise has been added. The small prediction errors translate to a very small reconstruction error of 0.0001.
This illustrates how reconstruction error relates to both the noise schedule and prediction accuracy.
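This relationship can be made exact: substituting xₜ into the reconstruction formula gives x̂₀ − x₀ = √((1−α̅ₜ)/α̅ₜ)·(ε − ε̂), so the reconstruction MSE equals the noise-prediction MSE scaled by (1−α̅ₜ)/α̅ₜ, which shrinks toward 0 as α̅ₜ approaches 1. A numerical check of this identity with Example 3's inputs (illustrative sketch):

```python
import numpy as np

betas = np.linspace(0.0001, 0.02, 100)
alpha_bar = np.cumprod(1.0 - betas)[4]          # t = 5, 1-indexed; still close to 1
x0 = np.array([0.0, 1.0, -1.0])
eps = np.array([1.0, -1.0, 0.5])
eps_hat = np.array([0.8, -0.9, 0.6])

# Direct pipeline: noise the sample, then reconstruct with the imperfect prediction
x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
x0_hat = (x_t - np.sqrt(1.0 - alpha_bar) * eps_hat) / np.sqrt(alpha_bar)
direct = np.mean((x0 - x0_hat) ** 2)

# Closed form: noise-prediction MSE amplified by (1 - alpha_bar) / alpha_bar
scaled = (1.0 - alpha_bar) / alpha_bar * np.mean((eps - eps_hat) ** 2)
```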
Constraints