Hyperparameter Tuning (Grid & Random Search)
What are Hyperparameters?
Hyperparameters vs Parameters
Parameters
Learned during training.
- Linear Regression weights (w, b)
- Neural Network weights
- Decision Tree split thresholds
Hyperparameters
Set before training. You choose them.
- Learning rate
- Number of trees in Random Forest
- Max depth of decision tree
- K in K-Nearest Neighbors
- Regularization strength (λ)
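The distinction shows up directly in code. Here is a minimal sketch, assuming scikit-learn and a small synthetic dataset (both are illustrative choices, not part of the lesson): alpha is a hyperparameter passed to the constructor, while coef_ and intercept_ are parameters that only exist after fit.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data (assumed for illustration): y = 3*x0 - 2*x1 + 1 plus small noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.0 + rng.normal(scale=0.1, size=200)

# Hyperparameter: chosen by you before training (regularization strength)
model = Ridge(alpha=1.0)
has_params_before_fit = hasattr(model, "coef_")  # nothing has been learned yet

# Parameters: learned from the data during fit
model.fit(X, y)

print("hyperparameter alpha:", model.get_params()["alpha"])
print("weights learned before fit?", has_params_before_fit)
print("learned weights w:", model.coef_)
print("learned bias b:", model.intercept_)
```

Changing alpha and refitting gives different learned weights, which is why the hyperparameter has to be tuned from outside the training loop.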
Manual vs Automated Tuning
Manual (Bad)
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(n_estimators=10)  # Guess 10
# Train, evaluate... maybe it's not optimal
model = RandomForestClassifier(n_estimators=50)  # Try 50
# Train, evaluate... still not optimal
Slow and tedious, and often suboptimal because only a handful of values ever get tried.
Automated: GridSearchCV (Exhaustive)
Try all combinations:
param_grid = {
    'n_estimators': [10, 50, 100],
    'max_depth': [5, 10, 20],
    'min_samples_split': [2, 5, 10],
}
# Total: 3 * 3 * 3 = 27 combinations tested
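Pros: Guaranteed to find the best combo in the grid (by cross-validated score); values outside the grid are never tried.
Cons: Slow for large grids. The count grows exponentially: 10 hyperparameters with 3 values each = 3^10 = 59,049 combos.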
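To see the grid search run end to end, here is a minimal sketch using scikit-learn. The synthetic dataset, the choice of RandomForestClassifier, the reduced grid, and cv=3 are all assumptions made to keep the run short; the lesson's own grid is only counted, not fitted.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, ParameterGrid

# Small synthetic dataset (assumed for illustration)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# The lesson's grid: count its combinations without fitting anything
lesson_grid = {
    "n_estimators": [10, 50, 100],
    "max_depth": [5, 10, 20],
    "min_samples_split": [2, 5, 10],
}
n_lesson_combos = len(ParameterGrid(lesson_grid))  # 3 * 3 * 3

# A reduced grid (assumed values) so the sketch finishes quickly
param_grid = {
    "n_estimators": [10, 25],
    "max_depth": [3, 5],
    "min_samples_split": [2, 5],
}
n_combos = len(ParameterGrid(param_grid))  # 2 * 2 * 2

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,  # 3 folds here to keep runtime low
    scoring="accuracy",
)
search.fit(X, y)

print("combos in lesson grid:", n_lesson_combos)
print("combos tried here:", n_combos)
print("model fits during search:", n_combos * 3)
print("best params:", search.best_params_)
print("best mean CV accuracy:", round(search.best_score_, 3))
```

The number of model fits is combos times folds, so every extra hyperparameter or candidate value multiplies the cost.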
Automated: RandomizedSearchCV (Sampling)
Try random combinations:
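from sklearn.model_selection import RandomizedSearchCV

param_dist = {
    'n_estimators': range(10, 200),  # Candidate values 10..199
    'max_depth': range(5, 50),       # Candidate values 5..49
}
# n_iter=20: try 20 randomly sampled combinations; cv=5: score each with 5-fold CV
search = RandomizedSearchCV(model, param_dist, n_iter=20, cv=5)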
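Pros: Faster than Grid Search. The cost is fixed by n_iter, no matter how large the search space is.
Cons: Might miss the optimal combo, since only a random sample is tried.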
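A runnable version of the random search might look like the sketch below. The synthetic dataset, the narrower ranges, n_iter=5, and cv=3 are assumptions chosen for speed, and scipy.stats.randint is used here in place of range to sample integers from a distribution.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Small synthetic dataset (assumed for illustration)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# randint(low, high) samples integers from low up to high - 1
param_dist = {
    "n_estimators": randint(10, 50),
    "max_depth": randint(2, 10),
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_dist,
    n_iter=5,        # number of random combinations to try
    cv=3,            # 3 folds here to keep runtime low
    random_state=0,  # makes the sampled combinations reproducible
)
search.fit(X, y)

print("combinations tried:", len(search.cv_results_["params"]))
print("best params:", search.best_params_)
print("best mean CV score:", round(search.best_score_, 3))
```

Raising n_iter trades runtime for a better chance of landing near the best region of the search space.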
K-Fold During Tuning
GridSearchCV and RandomizedSearchCV automatically use K-Fold CV to evaluate each combo (5 folds by default in scikit-learn):
- Split the training data into 5 folds
- For each hyperparameter combo:
  - Train on folds 1-4, validate on fold 5
  - Train on folds 1, 2, 3, 5, validate on fold 4
  - ... (repeat until each fold has been the validation fold once)
  - Average the 5 scores
- Pick the combo with the best average score
- Retrain on the entire training set using that combo
- Evaluate once on the test set (never used during tuning!)
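The steps above can be written out by hand to show what the search classes do internally. This is a simplified sketch, not scikit-learn's actual implementation: it assumes a synthetic dataset, a DecisionTreeClassifier with three candidate max_depth values, and plain KFold (for classifiers, scikit-learn's search classes default to a stratified variant).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Small synthetic dataset (assumed for illustration)
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Hold out a test set that the tuning loop never sees
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

kfold = KFold(n_splits=5, shuffle=True, random_state=0)
candidates = [2, 4, 8]  # max_depth values to compare (assumed)
mean_scores = {}

for depth in candidates:
    fold_scores = []
    # Each of the 5 folds takes one turn as the validation fold
    for train_idx, val_idx in kfold.split(X_train):
        model = DecisionTreeClassifier(max_depth=depth, random_state=0)
        model.fit(X_train[train_idx], y_train[train_idx])
        fold_scores.append(model.score(X_train[val_idx], y_train[val_idx]))
    mean_scores[depth] = float(np.mean(fold_scores))

# Pick the candidate with the best average validation score
best_depth = max(mean_scores, key=mean_scores.get)

# Retrain on the entire training set, then evaluate once on the test set
final_model = DecisionTreeClassifier(max_depth=best_depth, random_state=0)
final_model.fit(X_train, y_train)
test_accuracy = final_model.score(X_test, y_test)

print("mean CV accuracy per max_depth:", mean_scores)
print("best max_depth:", best_depth)
print("test accuracy:", round(test_accuracy, 3))
```

Because the test set plays no part in choosing best_depth, the final accuracy is an honest estimate of performance on unseen data.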