Hyperparameter Tuning (Grid & Random Search)

What are Hyperparameters?

Hyperparameter Tuning

Hyperparameters vs Parameters

Parameters

Learned during training.

  • Linear Regression weights (w, b)
  • Neural Network weights
  • Decision Tree split thresholds

Hyperparameters

Set before training. You choose them.

  • Learning rate
  • Number of trees in Random Forest
  • Max depth of decision tree
  • K in K-Nearest Neighbors
  • Regularization strength (λ)
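To make the distinction concrete, here is a minimal sketch using scikit-learn's Ridge regression on synthetic data. The dataset and the value alpha=1.0 are assumptions chosen only for illustration; alpha plays the role of the regularization strength (λ) and is set by you, while the weights and bias are learned by fit().

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

# Synthetic data, purely for illustration.
X, y = make_regression(n_samples=100, n_features=3, noise=5.0, random_state=0)

# Hyperparameter: alpha (regularization strength) is chosen before training.
model = Ridge(alpha=1.0)
hyperparams = model.get_params()
has_weights_before_fit = hasattr(model, "coef_")  # False: nothing learned yet

# Parameters: the weights w and bias b are learned by fit().
model.fit(X, y)
w, b = model.coef_, model.intercept_

print("hyperparameter alpha:", hyperparams["alpha"])
print("weights exist before fit:", has_weights_before_fit)
print("learned weights w:", w)
print("learned bias b:", b)
```

Changing alpha and refitting gives different learned weights, which is exactly why hyperparameters need tuning.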

Manual vs Automated Tuning

Manual (Bad)

from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(n_estimators=10)  # Guess 10
# Train, test... probably not optimal

model = RandomForestClassifier(n_estimators=50)  # Try 50
# Train, test... still no way to know if it's optimal

Slow, and it often settles on a suboptimal value.

Automated: GridSearchCV (Exhaustive)

Try all combinations:

param_grid = {
    'n_estimators': [10, 50, 100],
    'max_depth': [5, 10, 20],
    'min_samples_split': [2, 5, 10]
}
# Total: 3 * 3 * 3 = 27 combinations tested

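Pros: Guaranteed to find the best combo in the grid (as judged by cross-validation score)

Cons: Slow for large grids. The count grows exponentially: 10 hyperparameters with 3 values each is 3^10 = 59,049 combos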
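As a rough end-to-end sketch, the grid above can be passed to GridSearchCV like this. The synthetic dataset, the accuracy metric, and the random_state values are assumptions made so the example is self-contained; swap in your own data and metric.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Small synthetic dataset, purely for illustration.
X, y = make_classification(n_samples=150, n_features=8, random_state=0)

param_grid = {
    "n_estimators": [10, 50, 100],
    "max_depth": [5, 10, 20],
    "min_samples_split": [2, 5, 10],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,                # 5-fold cross-validation for every combination
    scoring="accuracy",  # assumed metric for this sketch
)
search.fit(X, y)

n_combos = len(search.cv_results_["params"])
print("combinations tried:", n_combos)            # 3 * 3 * 3 = 27
print("model fits during search:", n_combos * 5)  # one fit per combo per fold
print("best params:", search.best_params_)
print("best mean CV accuracy:", round(search.best_score_, 3))
```

Note that 27 combinations with 5 folds means 135 model fits (plus one final refit), which is where the cost of grid search comes from.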

Automated: RandomizedSearchCV (Sampling)

Try random combinations:

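from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

model = RandomForestClassifier()
param_dist = {
    'n_estimators': range(10, 200),  # Candidate values 10..199
    'max_depth': range(5, 50),       # Candidate values 5..49
}
# n_iter=20: sample 20 random combinations, each scored with 5-fold CV
search = RandomizedSearchCV(model, param_dist, n_iter=20, cv=5)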

Pros: Faster than Grid Search, since the cost is fixed by n_iter rather than by the size of the search space

Cons: Might miss the optimal combo
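A runnable sketch of the random search follows. It assumes a synthetic dataset and uses scipy.stats.randint as the sampling distribution, which is one option for integer hyperparameters; the random_state values are added only to make the run reproducible.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Small synthetic dataset, purely for illustration.
X, y = make_classification(n_samples=120, n_features=8, random_state=0)

# randint(low, high) draws integers from [low, high), like range(low, high).
param_dist = {
    "n_estimators": randint(10, 200),
    "max_depth": randint(5, 50),
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_dist,
    n_iter=20,       # number of random combinations, not values per parameter
    cv=5,
    random_state=0,  # makes the sampled combinations reproducible
)
search.fit(X, y)

sampled = search.cv_results_["params"]
print("combinations sampled:", len(sampled))
print("best params:", search.best_params_)
print("best mean CV score:", round(search.best_score_, 3))
```

The search space here has 190 * 45 = 8,550 possible combinations, but only 20 are evaluated, so the cost stays the same no matter how wide the ranges are.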

K-Fold During Tuning

GridSearchCV (and RandomizedSearchCV) automatically uses K-Fold CV to evaluate each combo. With cv=5:

  1. Split the training data into 5 folds
  2. For each hyperparameter combo:
    • Train on folds 1-4, validate on fold 5
    • Train on folds 1,2,3,5, validate on fold 4
    • ... (repeat until each fold has been the validation fold once, 5 rounds in total)
    • Average the 5 scores
  3. Pick the combo with the best average score
  4. Retrain that combo on the entire training set (done by default, refit=True)
  5. Evaluate on the test set (never used during tuning!)
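The steps above can be sketched by hand with cross_val_score, which is roughly what the search classes automate. This is an illustrative sketch, not the library's internal implementation: the dataset, the decision tree model, and the small grid are assumptions chosen to keep the run fast.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import ParameterGrid, cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Small synthetic dataset, purely for illustration.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Hold out a test set that the tuning loop never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

grid = ParameterGrid({"max_depth": [2, 5, 10], "min_samples_split": [2, 10]})

best_score, best_params = -1.0, None
for params in grid:
    model = DecisionTreeClassifier(random_state=0, **params)
    # cv=5: five train/validate rounds; each fold is the validation fold once.
    fold_scores = cross_val_score(model, X_train, y_train, cv=5)
    mean_score = fold_scores.mean()
    if mean_score > best_score:
        best_score, best_params = mean_score, params

# Retrain the winning combo on the whole training set,
# then score it on the held-out test set exactly once.
final_model = DecisionTreeClassifier(random_state=0, **best_params)
final_model.fit(X_train, y_train)
test_accuracy = final_model.score(X_test, y_test)

print("combos evaluated:", len(grid))
print("best params:", best_params)
print("best mean CV accuracy:", round(best_score, 3))
print("test accuracy:", round(test_accuracy, 3))
```

Because the test set plays no part in choosing best_params, the final test accuracy is an honest estimate of how the tuned model performs on unseen data.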