Sample Mean Convergence to Gaussian Distribution (Medium)

One of the most profound and powerful results in probability theory is the Central Limit Theorem: the distribution of sample means converges to a Gaussian (normal) distribution as the sample size grows, regardless of the shape of the underlying population distribution, provided the draws are independent and identically distributed with finite variance. This result underpins much of modern statistical inference, hypothesis testing, and machine learning.

The Principle: When you repeatedly draw random samples of size n from any population with a finite mean μ and variance σ², compute the sample mean for each draw, and examine the distribution of these sample means:

The distribution of sample means approaches a normal distribution as the sample size increases
The mean of the sample means equals the population mean μ
The standard deviation of the sample means (called the standard error) equals σ / √n
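The second and third properties above can be checked numerically. The following is a minimal sketch using only Python's standard library; the helper name `sample_means`, the seed, and the choice of 5,000 runs are illustrative assumptions, not part of the original exercise.

```python
import math
import random
import statistics

def sample_means(draw, n, runs, seed=0):
    """Return `runs` sample means, each computed from `n` calls to draw(rng)."""
    rng = random.Random(seed)  # local generator, so the global state is untouched
    return [statistics.fmean([draw(rng) for _ in range(n)]) for _ in range(runs)]

# Uniform(0, 1) population: mu = 0.5, sigma = 1/sqrt(12)
mu, sigma, n = 0.5, 1 / math.sqrt(12), 30
means = sample_means(lambda rng: rng.random(), n=n, runs=5000)

observed_mean = statistics.fmean(means)   # should be close to mu
observed_se = statistics.pstdev(means)    # should be close to sigma / sqrt(n)
expected_se = sigma / math.sqrt(n)

print(f"mean of sample means: {observed_mean:.4f} (mu = {mu})")
print(f"std of sample means:  {observed_se:.4f} (sigma/sqrt(n) = {expected_se:.4f})")
```

The observed values fluctuate from seed to seed, so they should be compared against μ and σ / √n with a tolerance rather than for exact equality.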

Standardization to Z-Scores: To demonstrate this convergence rigorously, we standardize the sample means using the Z-score transformation:

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

where X̄ is the sample mean, μ is the population mean, σ is the population standard deviation, and n is the sample size. If the convergence holds, the standardized values should approximately follow a standard normal distribution N(0, 1), with a mean of approximately 0 and a standard deviation of approximately 1.
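As a worked instance of the formula, here is a small sketch of the standardization step. The function name `z_score` and the example value 1.1 are hypothetical, chosen only for illustration.

```python
import math

def z_score(sample_mean, mu, sigma, n):
    """Standardize a sample mean using the known population mean and std dev."""
    standard_error = sigma / math.sqrt(n)  # sigma / sqrt(n)
    return (sample_mean - mu) / standard_error

# A sample of n = 30 values from a population with mu = 1.0, sigma = 1.0
# whose sample mean happens to be 1.1:
z = z_score(1.1, mu=1.0, sigma=1.0, n=30)
print(round(z, 3))  # 0.548, i.e. 0.1 * sqrt(30)
```

A sample mean equal to μ maps to Z = 0, and a sample mean one standard error above μ maps to Z = 1.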

Supported Distributions: Your implementation should handle sampling from the following distributions:

Distribution	Parameters	Population Mean (μ)	Population Std Dev (σ)
Uniform(0, 1)	min=0, max=1	0.5	1/√12 ≈ 0.2887
Exponential(λ=1)	scale=1	1.0	1.0
Bernoulli(p=0.3)	p=0.3	0.3	√(0.3×0.7) ≈ 0.4583
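The table values follow from the standard moment formulas for each family. The sketch below recomputes them; the function names and dictionary keys are illustrative assumptions rather than names required by the exercise.

```python
import math

def uniform_params(a, b):
    """Mean and std dev of Uniform(a, b): (a + b) / 2 and (b - a) / sqrt(12)."""
    return (a + b) / 2, (b - a) / math.sqrt(12)

def exponential_params(scale):
    """Mean and std dev of an exponential with the given scale: both equal scale."""
    return scale, scale

def bernoulli_params(p):
    """Mean and std dev of Bernoulli(p): p and sqrt(p * (1 - p))."""
    return p, math.sqrt(p * (1 - p))

# Keys are hypothetical labels for the three supported distributions.
POPULATION_PARAMS = {
    "uniform": uniform_params(0, 1),
    "exponential": exponential_params(1.0),
    "bernoulli": bernoulli_params(0.3),
}

for name, (mu, sigma) in POPULATION_PARAMS.items():
    print(f"{name:12s} mu = {mu:.4f}  sigma = {sigma:.4f}")
```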

Your Task: Implement a function that:

Sets the random seed for reproducibility
Draws runs samples (runs is the number of repetitions), each of size n, from the specified distribution
Computes the sample mean of each of these samples
Standardizes these sample means to Z-scores using the known population parameters
Returns a dictionary containing the mean and standard deviation of the Z-scores, each rounded to 3 decimal places
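The steps above could be put together roughly as follows. This is a sketch under stated assumptions, not the reference solution: the function name `simulate_z_scores`, its signature, the distribution labels, and the dictionary keys `"mean"` and `"std"` are all hypothetical. It uses Python's `random` module rather than NumPy, so its output will not reproduce numbers generated with a different random number generator, and it reports the population-style standard deviation (divisor `runs`), which is also an assumption.

```python
import math
import random
import statistics

def simulate_z_scores(distribution, n, runs, seed=42):
    """Draw `runs` samples of size `n`, standardize their means, summarize the Z-scores."""
    rng = random.Random(seed)  # step 1: seed for reproducibility

    # Pick a zero-argument sampler plus the known population parameters.
    if distribution == "uniform":
        draw, mu, sigma = rng.random, 0.5, 1 / math.sqrt(12)
    elif distribution == "exponential":
        draw, mu, sigma = (lambda: rng.expovariate(1.0)), 1.0, 1.0
    elif distribution == "bernoulli":
        p = 0.3
        draw = lambda: 1.0 if rng.random() < p else 0.0
        mu, sigma = p, math.sqrt(p * (1 - p))
    else:
        raise ValueError(f"unsupported distribution: {distribution!r}")

    standard_error = sigma / math.sqrt(n)
    z_scores = []
    for _ in range(runs):
        sample_mean = statistics.fmean([draw() for _ in range(n)])  # steps 2 and 3
        z_scores.append((sample_mean - mu) / standard_error)        # step 4

    # Step 5: summary statistics rounded to 3 decimal places.
    return {
        "mean": round(statistics.fmean(z_scores), 3),
        "std": round(statistics.pstdev(z_scores), 3),
    }

for name in ("uniform", "exponential", "bernoulli"):
    print(name, simulate_z_scores(name, n=30, runs=10_000))
```

Using a local `random.Random(seed)` instance instead of seeding the global generator keeps repeated calls with the same seed reproducible regardless of other code that draws random numbers.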

This exercise demonstrates one of the most important results in statistics and explains why many statistical methods can rely on normal approximations: sample means are approximately normal even when the underlying data are not.

We draw 10,000 independent samples, each containing 30 values from an Exponential(scale=1) distribution. For each sample, we compute the mean, then standardize all 10,000 sample means using Z = (X̄ - 1.0) / (1.0 / √30). The exponential distribution is notably right-skewed, yet the distribution of standardized sample means approaches N(0, 1). The resulting mean of ~0.006 (close to 0) and standard deviation of ~1.009 (close to 1) are consistent with convergence to a standard normal distribution.

Using a Uniform(0, 1) distribution with population mean 0.5 and population standard deviation 1/√12, we generate 10,000 samples of size 30 each. After computing sample means and standardizing with Z = (X̄ - 0.5) / ((1/√12) / √30), the resulting distribution shows mean ≈ 0.009 and std ≈ 1.0. The uniform distribution, despite being flat and bounded, produces sample means that converge beautifully to normality.

The Bernoulli(p=0.3) distribution produces only 0s and 1s—a discrete, highly non-normal distribution. With population mean p = 0.3 and population standard deviation √(p(1-p)) ≈ 0.4583, we draw 10,000 samples of size 30. After standardization, the Z-scores have mean ≈ 0.003 and std ≈ 1.004. This demonstrates that even discrete distributions obey the convergence: given sufficient sample size, the sample mean distribution becomes approximately Gaussian.
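A mean near 0 and a standard deviation near 1 are what standardization with the true μ and σ produces for any population, so they do not by themselves show that the shape is Gaussian. One additional check, sketched below, is the fraction of Z-scores with |Z| ≤ 1.96, which is about 0.95 under N(0, 1). The helper `z_scores`, the `CASES` table, and the seed are illustrative assumptions.

```python
import math
import random

def z_scores(draw, mu, sigma, n, runs, seed=0):
    """Standardized means of `runs` samples of size `n` drawn via draw(rng)."""
    rng = random.Random(seed)
    standard_error = sigma / math.sqrt(n)
    return [
        (sum(draw(rng) for _ in range(n)) / n - mu) / standard_error
        for _ in range(runs)
    ]

# label -> (sampler, population mean, population std dev)
CASES = {
    "uniform": (lambda rng: rng.random(), 0.5, 1 / math.sqrt(12)),
    "exponential": (lambda rng: rng.expovariate(1.0), 1.0, 1.0),
    "bernoulli": (lambda rng: 1.0 if rng.random() < 0.3 else 0.0,
                  0.3, math.sqrt(0.3 * 0.7)),
}

coverage = {}
for name, (draw, mu, sigma) in CASES.items():
    zs = z_scores(draw, mu, sigma, n=30, runs=10_000)
    # Fraction of standardized means inside the central 95% band of N(0, 1).
    coverage[name] = sum(abs(z) <= 1.96 for z in zs) / len(zs)
    print(f"{name:12s} fraction with |Z| <= 1.96: {coverage[name]:.3f}")
```

With n = 30 the Bernoulli sample mean can take only 31 distinct values, so its coverage can sit slightly away from 0.95; the gap should shrink as n grows.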
