Population Mean Confidence Estimator (Medium) — Practice with Code Visualizer

One of the most powerful concepts in inferential statistics is the confidence interval: a range of plausible values for a population parameter, reported together with a stated confidence level. When we cannot measure every individual in a population, we rely on sample data to make inferences about population characteristics.

The confidence interval for a population mean provides a statistically rigorous way to express uncertainty about our estimate. Rather than stating a single point estimate, we acknowledge the inherent sampling variability by reporting a range built with a procedure that captures the true population mean at a stated long-run rate.

The T-Distribution Approach: When the population standard deviation is unknown (which is almost always the case in practice), we use the Student's t-distribution instead of the normal distribution. The t-distribution has heavier tails, which accounts for the additional uncertainty introduced by estimating the population standard deviation from sample data. As sample size increases, the t-distribution converges to the standard normal distribution.
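
The heavier tails and the convergence to the normal distribution can be checked numerically. The sketch below is only an illustration: `t_pdf` is a hypothetical helper that evaluates the standard Student's t density formula, and the evaluation point x = 3 and the chosen degrees of freedom are arbitrary.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t with df degrees of freedom, evaluated at x."""
    # Work in log space so large df values do not overflow the gamma function.
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


# Compare the density far out in the tail (x = 3) with the standard normal.
normal_tail = NormalDist().pdf(3.0)
for df in (2, 5, 30, 1000):
    print(f"df={df:>4}: t density at 3 = {t_pdf(3.0, df):.5f}")
print(f"normal : density at 3 = {normal_tail:.5f}")
```

For small df the t density at x = 3 is several times the normal density, and the gap shrinks as df grows.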

Mathematical Framework: For a sample of size n with sample mean x̄ and sample standard deviation s, the confidence interval is constructed as follows:

Standard Error of the Mean (SEM): $$SE = \frac{s}{\sqrt{n}}$$
Critical Value: Find t*, the 1 - α/2 quantile of the t-distribution with df = n - 1 degrees of freedom, where α = 1 - confidence level (for 95% confidence, α = 0.05 and t* is the 0.975 quantile)
Margin of Error: $$ME = t^* \times SE$$
Confidence Interval: $$\left[ \bar{x} - ME, \bar{x} + ME \right]$$
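
The four steps above translate almost line for line into code. This is a minimal sketch that assumes the critical value has already been looked up in a t-table and is passed in as `t_star`; the function name `mean_interval` and its return order are illustrative choices, not part of the problem statement.

```python
import math
import statistics


def mean_interval(sample: list[float], t_star: float) -> tuple[float, float, float, float]:
    """Return (SE, ME, lower, upper) for a given critical value t_star."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)   # sample standard deviation (n - 1 denominator)
    se = s / math.sqrt(n)          # standard error of the mean
    me = t_star * se               # margin of error
    return se, me, x_bar - me, x_bar + me


# Eight observations with the tabulated 95% critical value for df = 7.
se, me, lower, upper = mean_interval([10, 12, 11, 13, 14, 10, 12, 11], t_star=2.365)
print(round(se, 3), round(me, 3), round(lower, 3), round(upper, 3))
# prints: 0.498 1.177 10.448 12.802
```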

Interpretation: A 95% confidence interval means that if we were to repeat this sampling process many times and construct confidence intervals for each sample, approximately 95% of those intervals would contain the true population mean. It does not mean there is a 95% probability that the true mean lies within any particular interval.
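
The repeated-sampling interpretation can be illustrated with a small simulation. The sketch below makes several assumptions of its own: the population is normal with a hypothetical mean of 50 and standard deviation of 10, each sample has 8 observations, and the 95% critical value for df = 7 is hard-coded as 2.365 from a t-table.

```python
import math
import random
import statistics

N = 8                  # observations per sample, so df = 7
T_STAR_DF7_95 = 2.365  # two-tailed 95% critical value for df = 7 (t-table)


def coverage_rate(trials: int = 4000, mu: float = 50.0,
                  sigma: float = 10.0, seed: int = 0) -> float:
    """Fraction of simulated 95% intervals that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(N)]
        x_bar = statistics.fmean(sample)
        me = T_STAR_DF7_95 * statistics.stdev(sample) / math.sqrt(N)
        if x_bar - me <= mu <= x_bar + me:
            hits += 1
    return hits / trials


print(f"Share of intervals containing the true mean: {coverage_rate():.3f}")
```

The printed share should land close to 0.95. Each individual interval either contains the true mean or does not; the 95% describes how often the procedure succeeds across samples.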

Your Task: Implement a function that takes sample data and a confidence level, then returns a dictionary containing the sample mean, standard error, margin of error, lower and upper bounds of the confidence interval, and the confidence level. Round all numerical results to 3 decimal places.
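
One possible solution is sketched below using only the standard library. The dictionary keys, the function names, and the choice to pass the confidence level as a proportion (0.95 rather than 95) are assumptions, since the task does not fix them. The critical value is found numerically (Simpson's rule for the t CDF, then bisection); where SciPy is available, a library quantile function would normally replace `t_critical`.

```python
import math
import statistics


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x: float, df: int, steps: int = 1000) -> float:
    """CDF for x >= 0: 0.5 plus Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(confidence: float, df: int) -> float:
    """Two-tailed critical value: the 1 - alpha/2 quantile, found by bisection."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:   # widen the bracket until it holds t*
        lo, hi = hi, hi * 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def mean_confidence_interval(data: list[float], confidence: float = 0.95) -> dict[str, float]:
    """Confidence interval for the mean; key names are illustrative."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    x_bar = statistics.fmean(data)
    se = statistics.stdev(data) / math.sqrt(n)
    me = t_critical(confidence, n - 1) * se
    return {
        "mean": round(x_bar, 3),
        "standard_error": round(se, 3),
        "margin_of_error": round(me, 3),
        "lower_bound": round(x_bar - me, 3),
        "upper_bound": round(x_bar + me, 3),
        "confidence_level": confidence,
    }


print(mean_confidence_interval([10, 12, 11, 13, 14, 10, 12, 11], 0.95))
```

Rounding is applied only when the dictionary is built, so the bounds are computed from unrounded intermediate values.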

For the sample [10, 12, 11, 13, 14, 10, 12, 11] at a 95% confidence level, we have n = 8 observations and first compute the sample mean: (10+12+11+13+14+10+12+11)/8 = 11.625.

The sample standard deviation, computed with the n - 1 denominator, is s ≈ 1.408.

The standard error is SE = 1.408/√8 ≈ 0.498.

With df = 7 degrees of freedom and a 95% confidence level, the two-tailed t-critical value is approximately 2.365.

The margin of error is ME = 2.365 × 0.498 ≈ 1.177.

Finally, the confidence interval is [11.625 - 1.177, 11.625 + 1.177] = [10.448, 12.802].

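We can state with 95% confidence that the true population mean lies within this interval, in the sense that the procedure used to build it captures the true mean in about 95% of repeated samples.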

For this smaller sample of n = 5 observations with a higher 99% confidence level:

• Sample mean = (5.5+6.2+5.8+6.0+5.9)/5 = 5.88
• Sample standard deviation s ≈ 0.259
• Standard error SE = 0.259/√5 ≈ 0.116
• With df = 4 and 99% confidence, t-critical ≈ 4.604 (higher than for 95% confidence)
• Margin of error ME = 4.604 × 0.116 ≈ 0.533
• Confidence interval = [5.88 - 0.533, 5.88 + 0.533] = [5.347, 6.413]

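Note how the 99% confidence interval is wider than a 95% interval from the same data would be: with df = 4, the critical value rises from about 2.776 at 95% to 4.604 at 99%, so greater confidence requires a wider range.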
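
The trade-off between confidence and width can be made concrete by holding the five-observation sample fixed and varying only the confidence level. In this sketch the critical values for df = 4 are hard-coded from a standard t-table, and `interval_width` is a hypothetical helper.

```python
import math
import statistics

# Two-tailed critical values for df = 4, taken from a standard t-table.
T_TABLE_DF4 = {0.90: 2.132, 0.95: 2.776, 0.99: 4.604}

sample = [5.5, 6.2, 5.8, 6.0, 5.9]
se = statistics.stdev(sample) / math.sqrt(len(sample))


def interval_width(confidence: float) -> float:
    """Full width of the interval (twice the margin of error)."""
    return round(2 * T_TABLE_DF4[confidence] * se, 3)


widths = {level: interval_width(level) for level in T_TABLE_DF4}
for level, width in widths.items():
    print(f"{level:.0%} interval width: {width}")
```

Because the standard error is the same in all three cases, the width grows only through the critical value.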

With a larger sample of n = 10 observations but lower 90% confidence:

• Sample mean = 21.7
• Sample standard deviation s ≈ 1.889
• Standard error SE = 1.889/√10 ≈ 0.597
• With df = 9 and 90% confidence, t-critical ≈ 1.833 (lower than for 95% confidence)
• Margin of error ME = 1.833 × 0.597 ≈ 1.095
• Confidence interval = [21.7 - 1.095, 21.7 + 1.095] = [20.605, 22.795]

A 90% confidence interval is narrower than a 95% interval computed from the same sample, but the procedure behind it captures the true population mean less often.