Hyperparameter tuning

Supervised Learning with scikit-learn

George Boorman

Core Curriculum Manager

Hyperparameter tuning

  • Ridge/lasso regression: Choosing alpha

  • KNN: Choosing n_neighbors

  • Hyperparameters: Parameters we specify before fitting the model

    • Like alpha and n_neighbors

Choosing the correct hyperparameters

  1. Try lots of different hyperparameter values

  2. Fit all of them separately

  3. See how well they perform

  4. Choose the best performing values

 

  • This is called hyperparameter tuning

  • It is essential to use cross-validation to avoid overfitting to the test set

  • We can still split the data and perform cross-validation on the training set

  • We withhold the test set for final evaluation
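
In code, this workflow might look like the following minimal sketch, assuming a feature array X and a target array y have already been loaded:

from sklearn.model_selection import train_test_split, KFold

# Withhold a test set for final evaluation only
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Cross-validation for tuning is then performed on the training set only
kf = KFold(n_splits=5, shuffle=True, random_state=42)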


Grid search cross-validation

[Figure: a grid of candidate hyperparameter values: n_neighbors from 2 to 11 in increments of 3, and the metric, euclidean or manhattan]
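
As a dictionary, this grid might be written like the following sketch:

import numpy as np

# n_neighbors takes the values 2, 5, 8, 11
param_grid = {"n_neighbors": np.arange(2, 12, 3),
              "metric": ["euclidean", "manhattan"]}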


[Figure: the k-fold cross-validation score for each combination of hyperparameters in the grid]


[Figure: the combination of 5 neighbors and the euclidean metric is highlighted as the best, with a score of 0.8748]
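
A sketch of how the pictured search could be run in code, assuming X_train and y_train come from an earlier split on a classification dataset:

from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import GridSearchCV, KFold
import numpy as np

kf = KFold(n_splits=5, shuffle=True, random_state=42)
param_grid = {"n_neighbors": np.arange(2, 12, 3),
              "metric": ["euclidean", "manhattan"]}
knn_cv = GridSearchCV(KNeighborsClassifier(), param_grid, cv=kf)
knn_cv.fit(X_train, y_train)

# Mean k-fold score for every combination in the grid
print(knn_cv.cv_results_["mean_test_score"])
# Best combination and its score
print(knn_cv.best_params_, knn_cv.best_score_)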


GridSearchCV in scikit-learn

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold

kf = KFold(n_splits=5, shuffle=True, random_state=42)
# Ten evenly spaced alpha values between 0.0001 and 1, and two solver options
param_grid = {"alpha": np.linspace(0.0001, 1, 10),
              "solver": ["sag", "lsqr"]}
ridge = Ridge()
ridge_cv = GridSearchCV(ridge, param_grid, cv=kf)
ridge_cv.fit(X_train, y_train)
print(ridge_cv.best_params_, ridge_cv.best_score_)
{'alpha': 0.0001, 'solver': 'sag'}
0.7529912278705785
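
To inspect every combination that was evaluated, not just the winner, the fitted object's cv_results_ attribute can be viewed as a DataFrame (a sketch, assuming pandas is installed):

import pandas as pd

# One row per hyperparameter combination, with its mean k-fold score
results = pd.DataFrame(ridge_cv.cv_results_)
print(results[["param_alpha", "param_solver", "mean_test_score"]])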

Limitations and an alternative approach

  • Number of fits = number of folds × number of hyperparameter combinations, and combinations multiply across hyperparameters (see the sketch below)
  • 3-fold cross-validation, 1 hyperparameter, 10 values = 3 × 10 = 30 fits
  • 10-fold cross-validation, 3 hyperparameters, 10 values each = 10 × 1,000 = 10,000 fits
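
The count comes from multiplying the folds by the product of the grid sizes, as in this sketch:

from math import prod

# Total fits = folds x product of the number of values per hyperparameter
folds = 10
values_per_hyperparameter = [10, 10, 10]  # 3 hyperparameters, 10 values each
print(folds * prod(values_per_hyperparameter))  # 10000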

RandomizedSearchCV

from sklearn.model_selection import RandomizedSearchCV

kf = KFold(n_splits=5, shuffle=True, random_state=42)
param_grid = {"alpha": np.linspace(0.0001, 1, 10),
              "solver": ["sag", "lsqr"]}
ridge = Ridge()
# n_iter sets how many hyperparameter combinations are sampled and evaluated
ridge_cv = RandomizedSearchCV(ridge, param_grid, cv=kf, n_iter=2)
ridge_cv.fit(X_train, y_train)
print(ridge_cv.best_params_, ridge_cv.best_score_)
{'solver': 'sag', 'alpha': 0.0001}
0.7529912278705785
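
Rather than a fixed list of values, RandomizedSearchCV can also sample hyperparameters from continuous distributions, as in this sketch using scipy.stats with the same kf, ridge, and training data as above:

from scipy.stats import uniform

# Sample alpha uniformly from [0.0001, 1.0001)
param_dist = {"alpha": uniform(loc=0.0001, scale=1),
              "solver": ["sag", "lsqr"]}
ridge_cv = RandomizedSearchCV(ridge, param_dist, cv=kf, n_iter=10,
                              random_state=42)
ridge_cv.fit(X_train, y_train)
print(ridge_cv.best_params_, ridge_cv.best_score_)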

Evaluating on the test set

test_score = ridge_cv.score(X_test, y_test)

print(test_score)
0.7564731534089224
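
By default the search object refits the best estimator on the whole training set (refit=True), so it can also be used directly for prediction:

# The fitted search behaves like the refitted best model
y_pred = ridge_cv.predict(X_test)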

Let's practice!
