Stratified K-Fold Cross-Validation
The Problem with Random Splits
Imagine your dataset has 99% Class 0 and 1% Class 1.
- Random K-Fold can, by chance, place all of the Class 1 samples into a single fold (say, Fold 3).
- The models trained on Folds 1, 2, 4, and 5 then see only Class 0 data and never learn the minority class.
- The resulting cross-validation scores are misleading.
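A small sketch makes the failure concrete. The dataset below is hypothetical (95 majority samples, 5 minority samples, stored sorted by class), and the split is a naive contiguous K-Fold without shuffling:

```python
# Hypothetical imbalanced dataset: 95 samples of Class 0, 5 of Class 1,
# stored in sorted order (all minority samples at the end).
labels = [0] * 95 + [1] * 5

# Naive K-Fold: split into 5 contiguous folds of 20 samples each.
k = 5
fold_size = len(labels) // k
folds = [labels[i * fold_size:(i + 1) * fold_size] for i in range(k)]

for i, fold in enumerate(folds):
    print(f"Fold {i}: Class 1 count = {fold.count(1)}")
# Folds 0-3 contain no Class 1 samples at all; fold 4 holds all five.
```

Any model validated on folds 0 through 3 is never tested on the minority class, so its score says nothing about minority-class performance.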
The Solution: Stratified K-Fold
Stratified K-Fold ensures every fold has the same class distribution as the original dataset.
How it works:
- Group the samples by class label.
- Deal each class's samples across the K folds in turn (round-robin), so every fold receives its proportional share of each class.
- With a 99%/1% dataset, each fold ends up with ~99% Class 0 and ~1% Class 1.
Result: Every fold is a representative miniature of the whole dataset!
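The steps above can be sketched in pure Python. The `stratified_folds` helper and the dataset are illustrative, not from the original page:

```python
from collections import defaultdict

def stratified_folds(labels, k):
    """Assign each sample index to one of k folds, preserving the class
    distribution: indices of each class are dealt round-robin to the folds."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds

# Same hypothetical imbalanced dataset: 95 of Class 0, 5 of Class 1.
labels = [0] * 95 + [1] * 5
for i, fold in enumerate(stratified_folds(labels, 5)):
    fold_labels = [labels[idx] for idx in fold]
    print(f"Fold {i}: {fold_labels.count(0)} of Class 0, "
          f"{fold_labels.count(1)} of Class 1")
# Every fold gets 19 Class 0 samples and exactly 1 Class 1 sample.
```

In practice you would use scikit-learn's `sklearn.model_selection.StratifiedKFold`, which implements the same idea and also supports shuffling within each class before assignment.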