Boosting Philosophy

Gradient Boosting & AdaBoost

Boosting vs Bagging (Recap)

Bagging (Random Forest)

  • Train N trees independently, each on a random subset (bootstrap sample) of the data
  • Predictions averaged
  • Reduces variance

Boosting (AdaBoost, Gradient Boosting)

  • Train trees sequentially (each corrects the errors of the previous ones)
  • Later trees focus on hard-to-predict samples
  • Reduces bias

AdaBoost (Adaptive Boosting)

How it works:

  1. Train weak learner (shallow tree) on all data
  2. Calculate error and increase weight on misclassified samples
  3. Train next learner on reweighted data (hard samples matter more)
  4. Repeat N times
  5. Combine: weighted vote of all N learners
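
The five steps above can be sketched in a few lines. This is a minimal illustration of discrete AdaBoost, not a reference implementation: it assumes labels in {-1, +1}, uses scikit-learn decision stumps as the weak learner, and the helper names `adaboost_fit` and `adaboost_predict` as well as the toy dataset are made up for this example.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def adaboost_fit(X, y, n_rounds=50):
    """Fit discrete AdaBoost with decision stumps; y must be in {-1, +1}."""
    weights = np.full(len(y), 1.0 / len(y))   # start with uniform sample weights
    learners, alphas = [], []
    for _ in range(n_rounds):
        # Steps 1 and 3: train a weak learner on the (re)weighted data
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=weights)
        pred = stump.predict(X)

        # Step 2: weighted error rate, then this learner's vote strength
        err = np.clip(weights[pred != y].sum(), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)

        # Misclassified samples (y * pred == -1) gain weight, the rest lose it
        weights = weights * np.exp(-alpha * y * pred)
        weights = weights / weights.sum()

        learners.append(stump)
        alphas.append(alpha)
    return learners, alphas


def adaboost_predict(X, learners, alphas):
    # Step 5: weighted vote of all learners
    votes = sum(a * m.predict(X) for a, m in zip(alphas, learners))
    return np.where(votes >= 0, 1, -1)


# Toy data (hypothetical): a diagonal boundary that no single stump can match
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)

learners, alphas = adaboost_fit(X, y)
stump_acc = np.mean(learners[0].predict(X) == y)
acc = np.mean(adaboost_predict(X, learners, alphas) == y)
print(f"first stump accuracy: {stump_acc:.2f}, boosted accuracy: {acc:.2f}")
```

Each stump can only split on one feature, so the combined vote is what lets the ensemble approximate the diagonal boundary.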

Why it works:

Early learners catch obvious patterns. Later learners focus on edge cases. Final model = consensus of experts, each specializing in different patterns.

Gradient Boosting

More general version of AdaBoost:

  1. Train initial weak learner
  2. Calculate residuals (errors)
  3. Train next learner to predict residuals
  4. Update predictions: pred = pred + learning_rate × residuals_pred
  5. Repeat
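
The loop above can be written out directly for regression with squared-error loss. The sketch below makes two simplifying assumptions: the "initial weak learner" is just the mean of the targets, and scikit-learn's `DecisionTreeRegressor` serves as the weak learner. The function names and the sine-wave toy data are hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def gradient_boost_fit(X, y, n_rounds=100, learning_rate=0.1, max_depth=2):
    """Gradient boosting for regression with squared-error loss."""
    # Step 1 (simplified): start from a constant prediction, the target mean
    init = float(np.mean(y))
    pred = np.full(len(y), init)
    trees = []
    for _ in range(n_rounds):
        residuals = y - pred                              # Step 2
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)                            # Step 3
        pred = pred + learning_rate * tree.predict(X)     # Step 4
        trees.append(tree)                                # Step 5: repeat
    return init, trees


def gradient_boost_predict(X, init, trees, learning_rate=0.1):
    # Replay the same additive updates on new data
    pred = np.full(X.shape[0], init)
    for tree in trees:
        pred = pred + learning_rate * tree.predict(X)
    return pred


# Toy data (hypothetical): a noisy sine wave
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=300)

init, trees = gradient_boost_fit(X, y)
mse_start = np.mean((y - init) ** 2)
mse_end = np.mean((y - gradient_boost_predict(X, init, trees)) ** 2)
print(f"MSE with mean only: {mse_start:.3f}, after boosting: {mse_end:.3f}")
```

The same `learning_rate` must be used for fitting and prediction, since each tree was trained against residuals left by the shrunken updates before it.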

Key difference from AdaBoost:

  • AdaBoost: Reweight samples
  • Gradient Boosting: Fit residuals
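
Written as update rules, the contrast looks roughly like this. The notation is an assumption of this sketch, not taken from the lesson: labels $y_i \in \{-1, +1\}$, $h_m$ is the $m$-th weak learner, $w_i$ are sample weights, $F_m$ is the ensemble after $m$ rounds, $L$ is the loss, and $\nu$ is the learning rate.

```latex
% AdaBoost: reweight samples after each round
\mathrm{err}_m = \frac{\sum_i w_i \,\mathbf{1}[h_m(x_i) \neq y_i]}{\sum_i w_i},
\qquad
\alpha_m = \tfrac{1}{2} \ln\frac{1 - \mathrm{err}_m}{\mathrm{err}_m},
\qquad
w_i \leftarrow w_i \exp\bigl(-\alpha_m \, y_i \, h_m(x_i)\bigr)

% Gradient boosting: fit the next learner h_m to the residuals r_{im}
r_{im} = -\left[\frac{\partial L(y_i, F(x_i))}{\partial F(x_i)}\right]_{F = F_{m-1}},
\qquad
F_m(x) = F_{m-1}(x) + \nu \, h_m(x)

% With squared error L = \tfrac{1}{2}(y - F)^2 the residual is the plain error
r_{im} = y_i - F_{m-1}(x_i)
```

In AdaBoost the targets stay fixed and the weights change; in gradient boosting the weights stay fixed and the targets change.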

In practice, Gradient Boosting outperforms AdaBoost in most cases, partly because it can optimize any differentiable loss function.

Hyperparameters

| Param | Effect | Typical Range |
|---|---|---|
| n_estimators | Number of trees | 50-500 |
| learning_rate | Step size | 0.01-0.3 (smaller = more stable) |
| max_depth | Tree depth | 3-8 (shallow trees!) |
| subsample | Row sampling | 0.5-1.0 |
| colsample | Feature sampling | 0.5-1.0 |
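
As a rough illustration, here is how these settings might be passed to scikit-learn's `GradientBoostingClassifier`. The synthetic dataset and the specific values are arbitrary choices for this sketch. One assumption to flag: scikit-learn has no parameter named `colsample`; its closest counterpart is `max_features`, which samples features at each split rather than once per tree.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic data, used only to make the example runnable
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = GradientBoostingClassifier(
    n_estimators=200,     # number of trees
    learning_rate=0.1,    # step size applied to each tree's contribution
    max_depth=3,          # shallow trees
    subsample=0.8,        # each tree sees 80% of the rows
    max_features=0.8,     # assumption: stands in for "colsample"
    random_state=0,
)
model.fit(X_train, y_train)

test_acc = model.score(X_test, y_test)
print(f"test accuracy: {test_acc:.3f}")
```

A smaller `learning_rate` usually needs a larger `n_estimators` to reach the same fit, so the two are typically tuned together.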

Comparison: Boosting vs Bagging

| Aspect | Bagging (RF) | Boosting (GB) |
|---|---|---|
| Training | Parallel | Sequential |
| Speed | Fast | Slower |
| Overfitting | Lower risk | Higher risk |
| Bias | Higher | Lower |
| Variance | Lower | Similar |
| Best for | Stable baseline | High accuracy needed |
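
One way to check this comparison on your own data is to cross-validate both model types side by side. The sketch below uses scikit-learn's bundled breast cancer dataset purely as a stand-in; the hyperparameter values are arbitrary, and results on a single small dataset should not be read as a general ranking.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

models = {
    # Bagging: trees are independent, so they can be built in parallel
    "random_forest": RandomForestClassifier(
        n_estimators=200, n_jobs=-1, random_state=0
    ),
    # Boosting: each tree depends on the previous ones, so fitting is sequential
    "gradient_boosting": GradientBoostingClassifier(
        n_estimators=200, learning_rate=0.1, max_depth=3, random_state=0
    ),
}

scores = {}
for name, model in models.items():
    # 5-fold cross-validated accuracy for each model
    scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {scores[name]:.3f}")
```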

Popular Boosting Libraries

  • scikit-learn: GradientBoostingClassifier, AdaBoostClassifier
  • XGBoost: Faster, handles missing data (Lesson 5, Module 6)
  • LightGBM: Even faster, memory efficient
  • CatBoost: Handles categorical features automatically
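
For a quick start with the two scikit-learn classes listed above, a sketch along these lines should work. The dataset (scikit-learn's bundled breast cancer data) and the parameter values are assumptions chosen only to keep the example self-contained.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# AdaBoost: by default the weak learner is a depth-1 decision tree (a stump)
ada = AdaBoostClassifier(n_estimators=100, random_state=0)
ada.fit(X_train, y_train)
ada_acc = ada.score(X_test, y_test)

# Gradient boosting: shallow trees fitted one after another
gb = GradientBoostingClassifier(
    n_estimators=100, learning_rate=0.1, max_depth=3, random_state=0
)
gb.fit(X_train, y_train)
gb_acc = gb.score(X_test, y_test)

print(f"AdaBoost test accuracy:          {ada_acc:.3f}")
print(f"Gradient Boosting test accuracy: {gb_acc:.3f}")
```

XGBoost, LightGBM, and CatBoost each ship a classifier with the same `fit`/`predict` style (`XGBClassifier`, `LGBMClassifier`, `CatBoostClassifier`), so swapping one in is usually a small change once the package is installed.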