
Soft Clustering with GMM

Gaussian Mixture Models (GMM)

Hard vs Soft Clustering

Hard Clustering (K-Means, DBSCAN)

Each point belongs to exactly ONE cluster.

Point: [5.1, 3.5] → Cluster 0 (100%)

Soft Clustering (GMM)

Each point has a PROBABILITY of belonging to each cluster.

Point: [5.1, 3.5] → 70% Cluster 0, 30% Cluster 1
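
The contrast can be seen side by side in a short sketch. It assumes scikit-learn and NumPy are installed, and the two overlapping blobs are synthetic data invented for illustration, so the printed numbers will not match the 70%/30% figure above.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

# Synthetic data (an assumption for illustration): two overlapping 2-D blobs.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[5.0, 3.4], scale=0.4, size=(100, 2)),
    rng.normal(loc=[6.3, 2.9], scale=0.4, size=(100, 2)),
])

point = np.array([[5.1, 3.5]])

# Hard clustering: exactly one label per point.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
hard_label = kmeans.predict(point)[0]

# Soft clustering: one probability per cluster, summing to 1.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
soft_probs = gmm.predict_proba(point)[0]

print("K-Means label:", hard_label)
print("GMM probabilities:", soft_probs.round(3))
```

K-Means returns a single integer, while `predict_proba` returns one value per component. Calling `gmm.predict(point)` instead would collapse the probabilities to the most likely component.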

The Model

Assume each cluster is a Gaussian distribution (bell curve):

  • Cluster A: Mean=μ_A, Covariance=Σ_A
  • Cluster B: Mean=μ_B, Covariance=Σ_B
  • ...

Each data point is assumed to be sampled from one of these Gaussians!
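
This generative story can be sketched with the standard library alone. The example is one-dimensional for brevity, and the mixing weights (how often each cluster is chosen) and all parameter values are hypothetical choices made for illustration, not values from the text.

```python
import random

# Hypothetical 1-D mixture: (mixing weight, mean, standard deviation) per cluster.
components = [
    (0.6, 0.0, 1.0),   # Cluster A
    (0.4, 5.0, 1.5),   # Cluster B
]

def sample_point(rng):
    """Pick a cluster according to its weight, then draw from that Gaussian."""
    weights = [w for w, _, _ in components]
    k = rng.choices(range(len(components)), weights=weights)[0]
    _, mean, std = components[k]
    return k, rng.gauss(mean, std)

rng = random.Random(42)
samples = [sample_point(rng) for _ in range(1000)]
share_a = sum(1 for k, _ in samples if k == 0) / len(samples)
print(f"Fraction drawn from Cluster A: {share_a:.2f}")
```

Fitting a GMM runs this story in reverse: only the sampled values are observed, and the cluster each one came from has to be inferred.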

Graphically:

Two overlapping bell curves.
A point near the overlap has a substantial probability of belonging to either cluster (the probabilities still sum to 1).
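
The overlap case can be made concrete with Bayes' rule: each cluster's probability is its weighted density at the point, divided by the sum over all clusters. The sketch below is one-dimensional, and the two equal-weight clusters centered at 0 and 4 are hypothetical values picked for illustration.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def membership_probs(x, components):
    """Posterior probability of each cluster for x (Bayes' rule)."""
    weighted = [w * gaussian_pdf(x, m, s) for w, m, s in components]
    total = sum(weighted)
    return [v / total for v in weighted]

# Hypothetical clusters: (mixing weight, mean, standard deviation).
components = [(0.5, 0.0, 1.0), (0.5, 4.0, 1.0)]

for x in (0.0, 2.0, 4.0):
    probs = membership_probs(x, components)
    print(x, [round(p, 3) for p in probs])
```

At the midpoint x = 2.0 the two clusters are equally likely (0.5 each), while at x = 0.0 almost all of the probability goes to the first cluster.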

EM Algorithm (Expectation-Maximization)

  1. Initialization: Randomly place K Gaussians
  2. E-step (Expectation): For each point, calculate the probability that it belongs to each Gaussian
  3. M-step (Maximization): Update the Gaussian parameters (μ, Σ), weighting each point by those probabilities
  4. Repeat until the parameters (or the log-likelihood) stop changing

Why it works:

  • E-step: "How likely is it that this point came from each cluster?"
  • M-step: "Refit each cluster to the points, weighted by those probabilities"
  • Iterate until stable
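
The loop above can be sketched in plain Python for one-dimensional data. This is a minimal illustration under stated assumptions, not a production implementation: the function name `em_gmm_1d`, the mixing weights, the starting variance heuristic, the variance floor, and the synthetic data are all choices made for this example.

```python
import math
import random

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_gmm_1d(data, k, n_iter=200, tol=1e-8, seed=0):
    rng = random.Random(seed)
    n = len(data)
    # 1. Initialization: random data points as means, equal mixing weights.
    means = rng.sample(data, k)
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [grand_var / k ** 2] * k  # starting width: a heuristic assumed here
    weights = [1.0 / k] * k
    history = []  # log-likelihood measured at each E-step

    for _ in range(n_iter):
        # 2. E-step: probability that each point belongs to each Gaussian.
        resp = []
        log_lik = 0.0
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            log_lik += math.log(total)
            resp.append([p / total for p in joint])
        history.append(log_lik)

        # 3. M-step: refit each Gaussian, weighting points by those probabilities.
        for j in range(k):
            n_j = sum(r[j] for r in resp)
            weights[j] = n_j / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / n_j
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / n_j
            variances[j] = max(var_j, 1e-6)  # floor keeps a Gaussian from collapsing

        # 4. Stop once the log-likelihood has stabilized.
        if len(history) > 1 and abs(history[-1] - history[-2]) < tol:
            break
    return weights, means, variances, history

# Synthetic data (an assumption): two groups centered near 0 and 6.
data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
data += [data_rng.gauss(6.0, 1.0) for _ in range(200)]

weights, means, variances, history = em_gmm_1d(data, k=2)
print("means:", [round(m, 2) for m in means])
print("weights:", [round(w, 2) for w in weights])
print("iterations:", len(history))
```

The log-likelihood in `history` should never decrease from one iteration to the next, which is the sense in which the iteration settles. It settles at a local optimum, though, so a different `seed` can give a different answer.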

Advantages & Disadvantages

Pros:

  • ✓ Probabilistic (know confidence)
  • ✓ Can handle overlapping clusters
  • ✓ More flexible than K-Means
  • ✓ Theoretical foundation

Cons:

  • ✗ Assumes Gaussian shape (may not hold)
  • ✗ Sensitive to number of components (K)
  • ✗ Slower than K-Means
  • ✗ Can get stuck in local optima

Choosing Number of Clusters

Use AIC (Akaike Information Criterion) or BIC (Bayesian Information Criterion):

  • Train GMM with K=1,2,3,... up to 10
  • Calculate AIC/BIC for each
  • Lower is better
  • Pick K with lowest BIC
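
The selection procedure above might look like this with scikit-learn, whose `GaussianMixture` exposes `aic()` and `bic()` methods. The three-blob dataset is synthetic and the `n_init=3` setting is an assumption made for this sketch. On real data the BIC curve may be flatter and the choice less clear-cut.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic data (an assumption for illustration): three well-separated 2-D blobs.
rng = np.random.default_rng(0)
centers = [(-5.0, 0.0), (0.0, 5.0), (5.0, 0.0)]
X = np.vstack([rng.normal(loc=c, scale=0.7, size=(150, 2)) for c in centers])

# Train one GMM per candidate K and record both criteria.
scores = {}
for k in range(1, 11):
    gmm = GaussianMixture(n_components=k, n_init=3, random_state=0).fit(X)
    scores[k] = {"aic": gmm.aic(X), "bic": gmm.bic(X)}

# Lower is better: pick the K with the lowest BIC.
best_k = min(scores, key=lambda k: scores[k]["bic"])

for k, s in scores.items():
    print(f"K={k:2d}  AIC={s['aic']:.1f}  BIC={s['bic']:.1f}")
print("K with lowest BIC:", best_k)
```

BIC penalizes extra parameters more heavily than AIC, so it tends to favor a smaller K when the two disagree.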

GMM vs K-Means vs DBSCAN

Aspect            K-Means     GMM             DBSCAN
Soft clusters?    No          Yes             No
Assumes shape     Spherical   Gaussian        Any
Speed             Fast        Medium          Slow
K needed?         Yes         Yes             No
Interpretability  High        Medium          Low
Output            Labels      Probabilities   Labels+Noise