

Beyond Bagging & Boosting

28 min Advanced

Advanced Ensemble Methods

Voting Classifier

Combine multiple different algorithms:

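from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier  # requires the xgboost package

# Three different algorithms, each paired with a short name
models = [
    ('lr', LogisticRegression()),
    ('rf', RandomForestClassifier()),
    ('xgb', XGBClassifier())
]

# voting='hard' takes the majority label; voting='soft' averages probabilities
ensemble = VotingClassifier(estimators=models, voting='hard')
ensemble.fit(X_train, y_train)  # X_train, y_train: your training features and labels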

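Hard Voting: Majority vote over predicted labels (predict 1 if at least 2 of 3 models predict 1)
Soft Voting: Average the predicted probabilities, then pick the most likely class (if the models output 0.8, 0.6, 0.9, the average is about 0.77)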

Pros: Simple; different models capture different patterns
Cons: Requires training many models
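
Hard and soft voting can disagree on the same sample. The sketch below works through one such case by hand with NumPy; the three probabilities are made-up illustrative numbers, not output from real models, and the 0.5 threshold is an assumption for a binary problem.

```python
import numpy as np

# Hypothetical predicted probabilities of class 1 from three models
# for a single sample (illustrative numbers only).
probs = np.array([0.45, 0.45, 0.95])

# Hard voting: each model casts a 0/1 vote at a 0.5 threshold,
# and the majority label wins.
votes = (probs >= 0.5).astype(int)
hard_prediction = int(votes.sum() > len(votes) / 2)

# Soft voting: average the probabilities, then threshold the mean.
mean_prob = probs.mean()
soft_prediction = int(mean_prob >= 0.5)

print("votes:", votes, "-> hard prediction:", hard_prediction)
print("mean probability:", round(float(mean_prob), 3), "-> soft prediction:", soft_prediction)
```

Two models lean slightly toward class 0, so hard voting picks 0, while the one very confident model pulls the average above 0.5, so soft voting picks 1. Soft voting tends to be most useful when the models' probabilities are reasonably well calibrated.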

Stacking (Meta-Learning)

Level 0: Train multiple diverse models on the training data
Level 1: Use out-of-fold predictions from Level 0 (generated with cross-validation) as input features for a meta-learner
Final: The meta-learner makes the final prediction

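from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Level 0: diverse base models
base_learners = [
    ('lr', LogisticRegression()),
    ('rf', RandomForestClassifier()),
    ('svm', SVC(probability=True))  # probability=True enables predict_proba
]

# Level 1: meta-learner trained on the base models' predictions
meta_learner = LogisticRegression()

stacking = StackingClassifier(
    estimators=base_learners,
    final_estimator=meta_learner,
    cv=5  # 5-fold cross-validation generates the meta-learner's training inputs
)
stacking.fit(X_train, y_train)  # X_train, y_train: your training features and labels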

Why Stacking Often Wins:

Level 0 models learn patterns from the raw features
Meta-learner learns which models to trust
Example: RF may be more accurate on some kinds of samples and LR on others → Meta-learner learns how to weight and combine their predictions
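
To make "the meta-learner learns which models to trust" concrete, here is a minimal hand-rolled sketch of the two levels using cross_val_predict. It only roughly mirrors the idea behind StackingClassifier and is not its actual implementation; the synthetic dataset, the choice of two base models, and all variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

# Synthetic binary data (an assumption for illustration only).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

base_learners = {
    "lr": LogisticRegression(max_iter=1000),
    "rf": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Level 0: out-of-fold probability of class 1 from each base model.
# Every row is predicted by a model that did not see that row in training.
meta_features = np.column_stack([
    cross_val_predict(model, X, y, cv=5, method="predict_proba")[:, 1]
    for model in base_learners.values()
])

# Level 1: the meta-learner is fit on the base models' predictions.
meta_learner = LogisticRegression()
meta_learner.fit(meta_features, y)

# One coefficient per base model: a rough view of how much each is trusted.
for name, weight in zip(base_learners, meta_learner.coef_[0]):
    print(f"{name}: {weight:.2f}")
```

Using out-of-fold predictions matters here: if the base models predicted on rows they were trained on, the meta-learner would see overly optimistic inputs and tend to over-trust the most overfit model.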

Blending

Like stacking, but simpler:

Split training data: 60% train, 40% validation
Train diverse models on 60%
Get predictions on 40% validation set
Use validation predictions as meta-features
Train meta-learner on those meta-features, using the validation labels as targets

Pros: Faster (no cross-validation needed to build the meta-features)
Cons: Base learners train on less data, and the meta-learner sees only the 40% validation split
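
The blending steps above can be sketched as follows. This is one possible arrangement under stated assumptions: the synthetic data, the extra held-out test split, the two base models, and the helper name to_meta_features are all hypothetical choices for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary data (an assumption for illustration only).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Keep a separate test set so the blend is scored on unseen data.
X_dev, X_test, y_dev, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Step 1: split the remaining data 60% train / 40% validation.
X_train, X_val, y_train, y_val = train_test_split(
    X_dev, y_dev, test_size=0.4, random_state=0, stratify=y_dev
)

# Step 2: train diverse base models on the 60% split.
base_models = [
    LogisticRegression(max_iter=1000),
    RandomForestClassifier(n_estimators=50, random_state=0),
]
for model in base_models:
    model.fit(X_train, y_train)


def to_meta_features(models, features):
    """Stack each model's class-1 probability into one column per model."""
    return np.column_stack([m.predict_proba(features)[:, 1] for m in models])


# Steps 3-4: predictions on the 40% validation split become meta-features.
val_meta = to_meta_features(base_models, X_val)

# Step 5: train the meta-learner on those meta-features.
meta_learner = LogisticRegression()
meta_learner.fit(val_meta, y_val)

# At prediction time, new data passes through the base models first.
test_accuracy = meta_learner.score(to_meta_features(base_models, X_test), y_test)
print(f"blended test accuracy: {test_accuracy:.3f}")
```

Because each base model is fit only once, this runs faster than cross-validated stacking, at the cost of the data trade-off noted in the cons.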

When to Use Each:

Method	Speed	Performance	Use Case
Voting	Fast	Good	Quick ensemble, different models available
Stacking	Slow	Very Good	Production, high accuracy needed
Blending	Medium	Good	Competition, limited time
