In machine learning, probabilistic classification combines statistical modeling with Bayesian inference to make predictions based on observed data. One elegant approach models each class using a Gaussian (Normal) distribution, assuming that features within a class follow a bell-shaped probability distribution characterized by their mean and variance.
The Probabilistic Gaussian Classifier operates on a fundamental assumption: given a class label, the features are conditionally independent and each feature follows a Gaussian distribution. While this "naive" independence assumption rarely holds perfectly in real-world data, the classifier performs remarkably well in practice across diverse domains.
Learning Phase: During training, the classifier learns three key parameters for each class c:
• the mean μ_{c,j} of each feature j over the training samples belonging to class c,
• the variance σ²_{c,j} of each feature j over those same samples,
• the prior P(c), the fraction of training samples that belong to class c.
Prediction Phase: For a new test sample with features x = [x_1, x_2, ..., x_d], the classifier computes the posterior probability for each class using Bayes' theorem:
$$P(c|x) \propto P(c) \cdot \prod_{j=1}^{d} P(x_j|c)$$
where the likelihood P(x_j|c) is computed using the Gaussian probability density function:
$$P(x_j|c) = \frac{1}{\sqrt{2\pi\sigma^2_{c,j}}} \exp\left(-\frac{(x_j - \mu_{c,j})^2}{2\sigma^2_{c,j}}\right)$$
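This density can be evaluated directly; a minimal sketch in plain Python (the helper name `gaussian_pdf` is my own, not part of the problem statement):

```python
import math

def gaussian_pdf(x, mu, var):
    """Gaussian probability density of x under N(mu, var)."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# The density peaks at the mean, where it equals 1 / sqrt(2*pi*var);
# for a unit-variance Gaussian that peak is about 0.399.
print(gaussian_pdf(0.0, 0.0, 1.0))
```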
The predicted class is the one with the highest posterior probability.
Implementation Details:
• Use the population variance (divide by N, not N−1); the worked examples below follow this convention, e.g. the per-feature variance of [1, 2, 3] is 2/3 ≈ 0.667.
• Multiplying many small likelihoods can underflow floating-point numbers, so it is common to compare log posteriors instead: log P(c) + Σ_j log P(x_j|c).
Your Task: Implement a function that trains a Probabilistic Gaussian Classifier on the provided training data and returns predicted class labels for new test samples.
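One way to approach this task is the following sketch in pure Python (the function name `gaussian_classifier` and the small 1e-9 variance floor are my own choices, not requirements from the problem statement):

```python
import math
from collections import defaultdict

def gaussian_classifier(X_train, y_train, X_test):
    # Group training samples by class label
    groups = defaultdict(list)
    for x, y in zip(X_train, y_train):
        groups[y].append(x)

    n = len(X_train)
    params = {}
    for c, rows in groups.items():
        d = len(rows[0])
        means = [sum(r[j] for r in rows) / len(rows) for j in range(d)]
        # Population variance (N in the denominator), matching the worked examples
        variances = [sum((r[j] - means[j]) ** 2 for r in rows) / len(rows)
                     for j in range(d)]
        prior = len(rows) / n
        params[c] = (means, variances, prior)

    preds = []
    for x in X_test:
        best_c, best_lp = None, float("-inf")
        for c, (means, variances, prior) in params.items():
            # Log posterior: log P(c) + sum_j log P(x_j | c)
            lp = math.log(prior)
            for j in range(len(x)):
                var = variances[j] + 1e-9  # tiny floor guards zero variance (an assumption)
                lp += -0.5 * math.log(2 * math.pi * var) \
                      - (x[j] - means[j]) ** 2 / (2 * var)
            if lp > best_lp:
                best_c, best_lp = c, lp
        preds.append(best_c)
    return preds
```

Working in log space turns the product of per-feature likelihoods into a sum, which avoids numerical underflow when the feature count grows.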
X_train = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0], [6.0, 7.0], [7.0, 8.0], [8.0, 9.0]]
y_train = [0, 0, 0, 1, 1, 1]
X_test = [[2.5, 3.5], [6.5, 7.5]]
Output: [0 1]
The classifier learns Gaussian parameters for each class:
Class 0: Mean = [2.0, 3.0], Variance = [0.667, 0.667], Prior = 0.5
Class 1: Mean = [7.0, 8.0], Variance = [0.667, 0.667], Prior = 0.5
For test point [2.5, 3.5]:
• Distance to Class 0 mean: √((2.5-2)² + (3.5-3)²) ≈ 0.71
• Distance to Class 1 mean: √((2.5-7)² + (3.5-8)²) ≈ 6.36
• The Gaussian likelihood is much higher for Class 0 → Predicted: 0
For test point [6.5, 7.5]:
• Distance to Class 0 mean: √((6.5-2)² + (7.5-3)²) ≈ 6.36
• Distance to Class 1 mean: √((6.5-7)² + (7.5-8)²) ≈ 0.71
• The Gaussian likelihood is much higher for Class 1 → Predicted: 1
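The Class 0 parameters quoted above can be checked numerically; a minimal sketch in plain Python (variable names are my own):

```python
rows0 = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]  # training samples of class 0

# Per-feature mean and population variance (N in the denominator)
mean0 = [sum(r[j] for r in rows0) / len(rows0) for j in range(2)]
var0 = [sum((r[j] - mean0[j]) ** 2 for r in rows0) / len(rows0) for j in range(2)]

print(mean0)                         # [2.0, 3.0]
print([round(v, 3) for v in var0])   # [0.667, 0.667]
```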
X_train = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [6.0, 6.0], [10.0, 10.0], [11.0, 11.0]]
y_train = [0, 0, 1, 1, 2, 2]
X_test = [[0.5, 0.5], [5.5, 5.5], [10.5, 10.5]]
Output: [0 1 2]
This demonstrates multi-class classification with three distinct clusters:
Class 0: Mean = [0.5, 0.5], samples centered near the origin
Class 1: Mean = [5.5, 5.5], samples in the middle region
Class 2: Mean = [10.5, 10.5], samples at higher values
Each test point falls closest to its respective class cluster:
• [0.5, 0.5] → Nearest to Class 0's distribution → Predicted: 0
• [5.5, 5.5] → Nearest to Class 1's distribution → Predicted: 1
• [10.5, 10.5] → Nearest to Class 2's distribution → Predicted: 2
X_train = [[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]]
y_train = [0, 0, 0, 1, 1, 1]
X_test = [[2.5], [10.5]]
Output: [0 1]
This example shows the classifier working with univariate data (single feature):
Class 0: Mean = 2.0, Variance = 0.667
Class 1: Mean = 11.0, Variance = 0.667
For test point [2.5]:
• Mahalanobis distance to Class 0: |2.5 - 2.0| / √0.667 ≈ 0.61
• Mahalanobis distance to Class 1: |2.5 - 11.0| / √0.667 ≈ 10.41
• Predicted: 0 (much closer to Class 0's distribution)
For test point [10.5]:
• Mahalanobis distance to Class 0: |10.5 - 2.0| / √0.667 ≈ 10.41
• Mahalanobis distance to Class 1: |10.5 - 11.0| / √0.667 ≈ 0.61
• Predicted: 1 (much closer to Class 1's distribution)
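The one-dimensional distances above can be reproduced directly; a small check (the helper name `mahalanobis_1d` is my own):

```python
import math

def mahalanobis_1d(x, mu, var):
    """In one dimension the Mahalanobis distance reduces to |x - mu| / sigma."""
    return abs(x - mu) / math.sqrt(var)

d0 = mahalanobis_1d(2.5, 2.0, 2 / 3)   # ≈ 0.61, test point sits inside Class 0
d1 = mahalanobis_1d(2.5, 11.0, 2 / 3)  # ≈ 10.41, far out in Class 1's tail
print(round(d0, 2), round(d1, 2))
```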
Constraints