In data science and statistical analysis, understanding a dataset's central tendency, dispersion, and distribution comes before any advanced modeling or decision-making. Descriptive statistics summarize raw data into a small set of meaningful, interpretable numbers.
Given a dataset represented as a list of numerical values, your task is to compute a comprehensive set of summary statistics that capture the essential characteristics of the data.
Statistical Measures to Compute:
Mean (Arithmetic Average): The sum of all values divided by the count of values. It represents the central point of the data. $$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$$
Median: The middle value when data is sorted. For even-sized datasets, it is the average of the two central values.
Mode: The value that appears most frequently in the dataset. If multiple values share the highest frequency, return the smallest one.
Population Variance: Measures the average squared deviation from the mean, indicating data spread. $$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$$
Standard Deviation: The square root of variance, providing a measure of spread in the same units as the original data. $$\sigma = \sqrt{\sigma^2}$$
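Percentiles (25th, 50th, 75th): Quartile values that divide the sorted data into four equal parts. When a percentile falls between two data points, the worked examples use linear interpolation between the adjacent sorted values.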
Interquartile Range (IQR): The difference between the 75th and 25th percentiles, measuring the spread of the middle 50% of data. $$IQR = Q_3 - Q_1$$
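Linear interpolation between adjacent sorted values, the convention that reproduces the quartiles in the worked examples, can be written out as follows. This is a sketch under that assumption: $x_{(1)} \le \dots \le x_{(N)}$ denotes the sorted data, $p$ is the percentile (0 to 100), and $h$ is an auxiliary symbol introduced here for the fractional rank. $$h = (N - 1)\frac{p}{100} + 1, \qquad P_p = x_{(\lfloor h \rfloor)} + (h - \lfloor h \rfloor)\left(x_{(\lfloor h \rfloor + 1)} - x_{(\lfloor h \rfloor)}\right)$$ When $h$ is an integer the second term vanishes and $P_p = x_{(h)}$. For the data 1 through 10 with $p = 25$: $h = 9 \cdot 0.25 + 1 = 3.25$, so $P_{25} = 3 + 0.25\,(4 - 3) = 3.25$.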
Your Task: Write a Python function that accepts a list of numerical values and returns a dictionary with the keys mean, median, mode, variance, standard_deviation, 25th_percentile, 50th_percentile, 75th_percentile, and interquartile_range. All floating-point results should be rounded to 4 decimal places.
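One possible shape for such a function is sketched below. It is not a reference solution: the names `descriptive_statistics` and `percentile` are hypothetical, percentiles are assumed to use linear interpolation between adjacent sorted values, and an empty list is assumed to be an error.

```python
import math
from collections import Counter


def percentile(sorted_values, p):
    """Return the p-th percentile (0-100) by linear interpolation.

    Assumes sorted_values is non-empty and sorted in ascending order.
    """
    position = (len(sorted_values) - 1) * p / 100  # fractional zero-based index
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def descriptive_statistics(data):
    """Summarize a non-empty list of numbers (hypothetical function name)."""
    if not data:
        raise ValueError("data must contain at least one value")

    values = sorted(data)
    n = len(values)
    mean = sum(values) / n

    # Population variance divides by N, not N - 1.
    variance = sum((x - mean) ** 2 for x in values) / n

    # Smallest value among those sharing the highest frequency.
    counts = Counter(values)
    highest = max(counts.values())
    mode = min(value for value, count in counts.items() if count == highest)
    if isinstance(mode, float):
        mode = round(mode, 4)

    # Under linear interpolation the 50th percentile equals the median.
    q1, q2, q3 = (percentile(values, p) for p in (25, 50, 75))

    return {
        "mean": round(mean, 4),
        "median": round(q2, 4),
        "mode": mode,
        "variance": round(variance, 4),
        "standard_deviation": round(math.sqrt(variance), 4),
        "25th_percentile": round(q1, 4),
        "50th_percentile": round(q2, 4),
        "75th_percentile": round(q3, 4),
        "interquartile_range": round(q3 - q1, 4),
    }
```

The standard deviation is taken from the unrounded variance, so rounding happens only once per reported value.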
Input: data = [1, 2, 2, 3, 4, 4, 4, 5]
Output: {"mean": 3.125, "median": 3.5, "mode": 4, "variance": 1.6094, "standard_deviation": 1.2686, "25th_percentile": 2.0, "50th_percentile": 3.5, "75th_percentile": 4.0, "interquartile_range": 2.0}
Mean Calculation: Sum = 1 + 2 + 2 + 3 + 4 + 4 + 4 + 5 = 25. Mean = 25 / 8 = 3.125.
Median Calculation: Sorted data: [1, 2, 2, 3, 4, 4, 4, 5]. For 8 elements (even count), the median is the average of the 4th and 5th values. Median = (3 + 4) / 2 = 3.5.
Mode Calculation: Value frequencies: {1: 1, 2: 2, 3: 1, 4: 3, 5: 1}. Mode = 4 (appears 3 times, the highest frequency).
Variance & Standard Deviation: Variance = Σ(xi - μ)² / N = 12.875 / 8 = 1.609375 ≈ 1.6094. Standard Deviation = √1.609375 ≈ 1.2686.
Percentiles: 25th percentile (Q1) = 2.0; 50th percentile (Q2) = 3.5 (same as the median); 75th percentile (Q3) = 4.0. IQR = Q3 - Q1 = 4.0 - 2.0 = 2.0.
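The arithmetic for this dataset can be cross-checked against Python's standard `statistics` module. This is only a verification sketch, assuming Python 3.8 or later (for `fmean`); `pvariance` and `pstdev` are the population versions, which divide by N as in the formulas above.

```python
import statistics

data = [1, 2, 2, 3, 4, 4, 4, 5]

mean = statistics.fmean(data)          # arithmetic mean as a float
median = statistics.median(data)       # averages the two middle values for even N
variance = statistics.pvariance(data)  # population variance (divides by N)
std_dev = statistics.pstdev(data)      # square root of the population variance

print(round(mean, 4), median, round(variance, 4), round(std_dev, 4))
# 3.125 3.5 1.6094 1.2686
```

Note that `statistics.variance` and `statistics.stdev` compute the sample versions (dividing by N - 1) and would not match these values.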
Input: data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Output: {"mean": 5.5, "median": 5.5, "mode": 1, "variance": 8.25, "standard_deviation": 2.8723, "25th_percentile": 3.25, "50th_percentile": 5.5, "75th_percentile": 7.75, "interquartile_range": 4.5}
Mean Calculation: Sum = 1 + 2 + 3 + ... + 10 = 55. Mean = 55 / 10 = 5.5.
Median Calculation: For 10 elements (even count), the median is the average of the 5th and 6th values. Median = (5 + 6) / 2 = 5.5.
Mode Calculation: All values appear exactly once, so every value ties for the highest frequency. The tie-breaking rule returns the smallest value. Mode = 1.
Variance & Standard Deviation: Variance = 82.5 / 10 = 8.25. Standard Deviation = √8.25 ≈ 2.8723.
Percentiles: Using linear interpolation for the percentile calculation: 25th percentile (Q1) = 3.25; 50th percentile (Q2) = 5.5; 75th percentile (Q3) = 7.75. IQR = 7.75 - 3.25 = 4.5.
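The interpolated quartiles for this dataset can be reproduced with `statistics.quantiles` from the standard library (Python 3.8 or later). The sketch assumes that `method="inclusive"` is the matching convention; it agrees with the values above, whereas the default `method="exclusive"` yields different quartiles for the same data.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# n=4 asks for the three cut points that split the data into quarters.
# method="inclusive" treats the data as the whole population and
# interpolates linearly between adjacent sorted values.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(q1, q2, q3, iqr)  # 3.25 5.5 7.75 4.5
```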
Input: data = [-5, -3, -1, 0, 1, 3, 5]
Output: {"mean": 0.0, "median": 0.0, "mode": -5, "variance": 10.0, "standard_deviation": 3.1623, "25th_percentile": -2.0, "50th_percentile": 0.0, "75th_percentile": 2.0, "interquartile_range": 4.0}
Mean Calculation: Sum = -5 + (-3) + (-1) + 0 + 1 + 3 + 5 = 0. Mean = 0 / 7 = 0.0.
Median Calculation: For 7 elements (odd count), the median is the 4th value (the middle element). Median = 0.0.
Mode Calculation: All values have an equal frequency of 1, so the smallest value is returned. Mode = -5.
Variance & Standard Deviation: This is a symmetric distribution centered at zero. Variance = 70 / 7 = 10.0. Standard Deviation = √10.0 ≈ 3.1623.
Percentiles: 25th percentile (Q1) = -2.0; 50th percentile (Q2) = 0.0; 75th percentile (Q3) = 2.0. IQR = 2.0 - (-2.0) = 4.0.
Note: This symmetric distribution has equal spread on both sides of the mean.
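The "smallest value wins" rule for ties in the mode is easy to get wrong with library helpers, so a short sketch may help. It assumes Python 3.8 or later, where `statistics.multimode` returns every most frequent value; `smallest_mode` is a hypothetical helper name, and the second list is made-up data chosen to show the difference.

```python
import statistics


def smallest_mode(data):
    """Return the smallest of the most frequent values (hypothetical helper)."""
    return min(statistics.multimode(data))


print(smallest_mode([-5, -3, -1, 0, 1, 3, 5]))  # -5 (every value ties)
print(smallest_mode([3, 1, 3, 1, 2]))           # 1  (3 and 1 tie; smallest wins)
print(statistics.mode([3, 1, 3, 1, 2]))         # 3  (first mode encountered)
```

On unsorted input, `statistics.mode` alone returns the first mode it encounters rather than the smallest, which is why the sketch takes `min` over `multimode`.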
Constraints