Stratified K-Fold Cross-Validation

The Problem with Random Splits

Imagine your dataset has 99% Class 0 and 1% Class 1.

  • Random K-Fold might accidentally put all Class 1 samples in Fold 3.
  • Folds 1, 2, 4, and 5 then train on only Class 0 data → they never see the minority class!
  • The scores still look fine (always predicting Class 0 is nearly 100% accurate on those folds), so the results are misleading.
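A minimal sketch of the failure mode, using scikit-learn's `KFold` (assumed available) on a hypothetical 1,000-sample dataset with the 99%/1% imbalance from the text. Without shuffling, the folds are contiguous chunks, so every minority sample lands in the last fold:

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical imbalanced labels: 990 samples of Class 0, 10 of Class 1.
y = np.array([0] * 990 + [1] * 10)

# Plain (unshuffled) K-Fold cuts the data into contiguous chunks,
# so the minority class concentrates in a single fold.
kf = KFold(n_splits=5, shuffle=False)
for i, (train_idx, test_idx) in enumerate(kf.split(y)):
    n_minority = int(y[test_idx].sum())
    print(f"Fold {i}: {n_minority} Class-1 samples in the test split")
```

Here four of the five folds contain zero Class 1 samples; shuffling helps on average, but still gives no guarantee for rare classes.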

The Solution: Stratified K-Fold

Stratified K-Fold ensures every fold has (approximately) the same class distribution as the original dataset.

How it works:

  1. Group the samples by class label.
  2. For each class, distribute its samples across the K folds (e.g. round-robin).
  3. Each fold then contains ~99% Class 0 and ~1% Class 1, matching the full dataset.

Result: Every fold is a representative miniature of the whole dataset!
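The fix is a one-line change in scikit-learn: swap `KFold` for `StratifiedKFold` (a sketch under the same hypothetical 990/10 imbalance as above):

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Same hypothetical imbalance: 990 samples of Class 0, 10 of Class 1.
X = np.zeros((1000, 1))          # features don't affect the split itself
y = np.array([0] * 990 + [1] * 10)

# StratifiedKFold preserves the class ratio inside every fold.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for i, (train_idx, test_idx) in enumerate(skf.split(X, y)):
    n_minority = int(y[test_idx].sum())
    print(f"Fold {i}: {n_minority} Class-1 samples out of {len(test_idx)}")
```

Every fold of 200 samples now receives exactly 2 of the 10 Class 1 samples — a representative miniature of the whole dataset.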
