
Beyond Bagging & Boosting

Advanced Ensemble Methods

Voting Classifier

Combine multiple different algorithms:

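from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier  # requires the separate xgboost package

# Each entry is a (name, estimator) pair
models = [
    ('lr', LogisticRegression()),
    ('rf', RandomForestClassifier()),
    ('xgb', XGBClassifier())
]

# voting='hard' counts class votes; voting='soft' averages probabilities
ensemble = VotingClassifier(estimators=models, voting='hard')
ensemble.fit(X_train, y_train)  # X_train, y_train: your training data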

Hard Voting: Majority vote (predict 1 if at least 2 of the 3 models predict 1)

Soft Voting: Average the predicted probabilities (if the models output 0.8, 0.6, 0.9, the average is 0.77). Set voting='soft' to use it.
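To make the difference concrete, here is a minimal pure-Python sketch of both rules for a single sample. The helper names `hard_vote` and `soft_vote`, the 0.5 threshold, and the second set of probabilities are illustrative assumptions, not part of scikit-learn.

```python
def hard_vote(probs, threshold=0.5):
    """Each model casts a 0/1 vote; the majority class wins."""
    votes = [int(p >= threshold) for p in probs]
    return int(sum(votes) > len(votes) / 2)


def soft_vote(probs, threshold=0.5):
    """Average the class-1 probabilities, then threshold the mean."""
    mean_prob = sum(probs) / len(probs)
    return mean_prob, int(mean_prob >= threshold)


# Class-1 probabilities from three models for one sample
agree = [0.8, 0.6, 0.9]     # the example from the text
split = [0.45, 0.45, 0.95]  # hypothetical: two weak "no" votes, one confident "yes"

print(hard_vote(agree), soft_vote(agree))
print(hard_vote(split), soft_vote(split))
```

In the second case the two rules disagree: hard voting predicts 0 because two of three models fall below the threshold, while soft voting predicts 1 because the one confident model pulls the average to about 0.62.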

Pros: Simple, different models capture different patterns

Cons: Requires training many models

Stacking (Meta-Learning)

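  1. Level 0: Train multiple diverse models on training data
  2. Level 1: Use predictions from Level 0 (generated out-of-fold via cross-validation) as input to a meta-learner
  3. Final: Meta-learner makes the final prediction

from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Level 0: diverse base models
base_learners = [
    ('lr', LogisticRegression()),
    ('rf', RandomForestClassifier()),
    ('svm', SVC(probability=True))
]

# Level 1: model trained on the base models' predictions
meta_learner = LogisticRegression()

stacking = StackingClassifier(
    estimators=base_learners,
    final_estimator=meta_learner,
    cv=5  # 5-fold cross-validation produces the Level 0 predictions
)
stacking.fit(X_train, y_train)  # X_train, y_train: your training data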
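The three levels can also be written out by hand, which shows where cross-validation fits in. The following is a rough sketch rather than what `StackingClassifier` does internally line for line; the synthetic dataset, the two base models, and all variable names are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split

# Synthetic data standing in for a real dataset (assumption)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

base_models = [
    LogisticRegression(max_iter=1000),
    RandomForestClassifier(n_estimators=50, random_state=0),
]

# Level 0: out-of-fold class-1 probabilities, one column per base model.
# Each row is predicted by a model that never saw that row during fitting.
oof_features = np.column_stack([
    cross_val_predict(m, X_train, y_train, cv=5, method="predict_proba")[:, 1]
    for m in base_models
])

# Level 1: the meta-learner is trained on those predictions
meta_learner = LogisticRegression().fit(oof_features, y_train)

# Final: refit the base models on all training data, then stack their
# test-set probabilities and let the meta-learner decide
test_features = np.column_stack([
    m.fit(X_train, y_train).predict_proba(X_test)[:, 1] for m in base_models
])
stacked_pred = meta_learner.predict(test_features)
print("stacked accuracy:", (stacked_pred == y_test).mean())
```

Using out-of-fold predictions matters because a base model's predictions on rows it was trained on are overly optimistic, and a meta-learner fit to them would learn to over-trust that model.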

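Why Stacking Often Wins:

  • Level 0 models learn patterns directly from the raw features
  • The meta-learner learns how much to trust each model's predictions
  • Example: if RF is more accurate on some kinds of samples and LR on others, the meta-learner learns how to weight and combine their predictions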
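One way to see what the meta-learner has learned is to inspect its coefficients after fitting. This is a small sketch on synthetic data, assuming a binary problem and a logistic-regression meta-learner, where each base model contributes one probability column; the dataset and the name `model_weights` are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification data (assumption)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

stacking = StackingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ],
    final_estimator=LogisticRegression(),
    cv=5,
)
stacking.fit(X, y)

# final_estimator_ is the fitted meta-learner; in the binary case it
# receives one class-1 probability column per base model
weights = stacking.final_estimator_.coef_[0]
names = [name for name, _ in stacking.estimators]
model_weights = dict(zip(names, weights))
print(model_weights)
```

A larger coefficient suggests the meta-learner leans more on that model, though correlated base models can make the individual weights hard to interpret.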

Blending

Like stacking, but simpler:

  1. Split training data: 60% train, 40% validation
  2. Train diverse models on 60%
  3. Get predictions on 40% validation set
  4. Use validation predictions as meta-features
  5. Train the meta-learner on those meta-features and the validation labels

Pros: Faster (no CV needed in meta-training)

Cons: Uses less data for base learner training
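scikit-learn has no dedicated blending estimator, so the five steps are usually written by hand. Below is a minimal sketch under stated assumptions: synthetic data, two base models, a logistic-regression meta-learner, and a hypothetical helper `meta_features`. A separate test set is held out first so the final prediction is made on unseen data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data standing in for a real dataset (assumption)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Step 1: split the training data 60% / 40%
X_base, X_val, y_base, y_val = train_test_split(
    X_train, y_train, test_size=0.4, random_state=0, stratify=y_train
)

# Step 2: train diverse models on the 60% portion
base_models = [
    LogisticRegression(max_iter=1000),
    RandomForestClassifier(n_estimators=50, random_state=0),
]
for model in base_models:
    model.fit(X_base, y_base)


def meta_features(models, X_part):
    """Stack each model's class-1 probability into one column per model."""
    return np.column_stack([m.predict_proba(X_part)[:, 1] for m in models])


# Steps 3-4: predictions on the 40% validation set become meta-features
val_features = meta_features(base_models, X_val)

# Step 5: train the meta-learner on the validation predictions
meta_learner = LogisticRegression().fit(val_features, y_val)

# Prediction on new data: base models first, then the meta-learner
blend_pred = meta_learner.predict(meta_features(base_models, X_test))
print("blended accuracy:", (blend_pred == y_test).mean())
```

Each base model is fit exactly once here, which is where the speed advantage over stacking comes from; the cost is that the base models never see the 40% validation portion.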

When to Use Each:

Method    Speed   Performance  Use Case
Voting    Fast    Good         Quick ensemble, different models available
Stacking  Slow    Very Good    Production, high accuracy needed
Blending  Medium  Good         Competition, limited time