RandomizedSearchCV

Model Validation in Python

Kasey Jones

Data Scientist

Grid searching hyperparameters

When tuning several hyperparameters at once, every combination of their candidate values forms one point on a grid. The full set of these combinations is called the hyperparameter space, and a grid search evaluates every point in it.
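To make the grid concrete, here is a minimal sketch that enumerates every combination with the standard library. The value lists are illustrative assumptions, chosen only to show how quickly the count grows.

```python
from itertools import product

# Illustrative candidate values for three hyperparameters (assumed for this sketch)
param_grid = {
    "max_depth": [4, 6, 8, None],
    "max_features": list(range(2, 11)),
    "min_samples_split": list(range(2, 11)),
}

# Build one dictionary per point on the grid
names = list(param_grid)
combinations = [dict(zip(names, values))
                for values in product(*param_grid.values())]

# The grid size is the product of the list lengths: 4 * 9 * 9
print(len(combinations))
print(combinations[0])
```

A grid search would fit and validate a model for each of these dictionaries.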

Grid searching continued

Benefits:

  • Tests every possible combination

Drawbacks:

  • Each additional hyperparameter multiplies the number of combinations, so training time grows exponentially
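The growth can be sketched with a little arithmetic. The helper `n_fits` below is hypothetical and assumes every hyperparameter has the same number of candidate values and that each combination is fit once per cross-validation fold.

```python
def n_fits(values_per_param, n_params, cv=5):
    """Model fits needed by a full grid search under the stated assumptions."""
    return values_per_param ** n_params * cv

# With 10 candidate values each, every extra hyperparameter
# multiplies the workload by 10
for n_params in range(1, 6):
    print(n_params, "hyperparameters ->", n_fits(10, n_params), "fits")
```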

Better methods

Random search

from sklearn.model_selection import RandomizedSearchCV

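# Placeholders: a real estimator and parameter distributions are required before fitting
random_search = RandomizedSearchCV(estimator=..., param_distributions=...)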

Parameter Distribution:

param_dist = {"max_depth": [4, 6, 8, None],
              "max_features": range(2, 11),
              "min_samples_split": range(2, 11)}
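The idea behind random search can be sketched in plain Python: instead of visiting every combination, draw a fixed number of them at random. The helper `sample_params` is hypothetical and is not scikit-learn's implementation; for example, scikit-learn avoids repeating a combination when every entry is a list, while this sketch may repeat one.

```python
import random

param_dist = {"max_depth": [4, 6, 8, None],
              "max_features": range(2, 11),
              "min_samples_split": range(2, 11)}

def sample_params(param_dist, n_iter, rng):
    """Draw n_iter random combinations, one value per hyperparameter."""
    return [{name: rng.choice(list(values))
             for name, values in param_dist.items()}
            for _ in range(n_iter)]

rng = random.Random(1111)  # seeded so the draws are repeatable
candidates = sample_params(param_dist, n_iter=5, rng=rng)
for candidate in candidates:
    print(candidate)
```

Only these five candidates would be trained and validated, rather than all 324 points on the grid.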

Random search parameters

Parameters:

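  • estimator: the model to tune
  • param_distributions: dictionary mapping hyperparameter names to candidate values (lists) or distributions to sample from
  • n_iter: number of hyperparameter combinations to sample and evaluate
  • scoring: metric used to evaluate each combination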
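As a rough illustration of how `param_distributions` and `n_iter` interact, the sketch below uses scikit-learn's `ParameterSampler` to preview the combinations a search could draw. Swapping the ranges for `scipy.stats.randint` distributions is an assumption made here to show that entries need not be lists.

```python
from scipy.stats import randint
from sklearn.model_selection import ParameterSampler

# Lists are sampled uniformly; distribution objects are sampled via their rvs method
param_dist = {"max_depth": [4, 6, 8, None],
              "max_features": randint(2, 11),        # integers 2..10
              "min_samples_split": randint(2, 11)}   # integers 2..10

# n_iter controls how many combinations are drawn
sampled = list(ParameterSampler(param_dist, n_iter=5, random_state=1111))
for params in sampled:
    print(params)
```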

Setting RandomizedSearchCV parameters

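from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import make_scorer, mean_absolute_error

# Candidate values for each hyperparameter
param_dist = {"max_depth": [4, 6, 8, None],
              "max_features": range(2, 11),
              "min_samples_split": range(2, 11)}

# Base model; a fixed random_state keeps runs reproducible
rfr = RandomForestRegressor(n_estimators=20, random_state=1111)
# MAE is an error, so tell the scorer that lower values are better
scorer = make_scorer(mean_absolute_error, greater_is_better=False)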
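One detail worth checking when building a scorer: scikit-learn's search tools treat larger scores as better, so an error metric such as mean absolute error is usually wrapped with `greater_is_better=False`, which flips its sign. The sketch below compares both settings on toy data; the `DummyRegressor` and the arrays are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.metrics import make_scorer, mean_absolute_error

# Toy data: the dummy model always predicts the mean of y
X = np.arange(10).reshape(-1, 1)
y = np.arange(10, dtype=float)
model = DummyRegressor(strategy="mean").fit(X, y)

mae = mean_absolute_error(y, model.predict(X))
default_scorer = make_scorer(mean_absolute_error)
error_scorer = make_scorer(mean_absolute_error, greater_is_better=False)

print("MAE:", mae)
print("default scorer:", default_scorer(model, X, y))    # same sign as the MAE
print("error-aware scorer:", error_scorer(model, X, y))  # negated MAE
```

With the negated score, maximizing the scorer is the same as minimizing the error.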

RandomizedSearchCV implemented

Setting up the random search:

random_search =\
    RandomizedSearchCV(estimator=rfr,
                       param_distributions=param_dist,
                       n_iter=40,
                       cv=5)

  • We cannot do hyperparameter tuning without understanding model validation
  • Model validation allows us to compare multiple models and parameter sets

Complete the random search:

random_search.fit(X, y)
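For an end-to-end picture, here is a minimal sketch on synthetic data. The dataset from `make_regression`, the smaller `n_estimators`, `n_iter`, and `cv` values, and passing the scorer through `scoring` are all assumptions chosen so the example runs quickly; they are not the settings from the slides.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import make_scorer, mean_absolute_error
from sklearn.model_selection import RandomizedSearchCV

# Synthetic regression data with 10 features (assumed for illustration)
X, y = make_regression(n_samples=200, n_features=10, noise=10.0,
                       random_state=1111)

param_dist = {"max_depth": [4, 6, 8, None],
              "max_features": range(2, 11),
              "min_samples_split": range(2, 11)}

rfr = RandomForestRegressor(n_estimators=10, random_state=1111)
# Lower MAE is better, so the scorer returns the negated error
scorer = make_scorer(mean_absolute_error, greater_is_better=False)

random_search = RandomizedSearchCV(estimator=rfr,
                                   param_distributions=param_dist,
                                   n_iter=5,
                                   cv=3,
                                   scoring=scorer,
                                   random_state=1111)
random_search.fit(X, y)

# Inspect the winning combination and its cross-validated MAE
print(random_search.best_params_)
print("Best CV MAE:", -random_search.best_score_)
print("Combinations tried:", len(random_search.cv_results_["params"]))
```

After fitting, `best_estimator_` holds a model refit on all of the data with the best combination found.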

Let's explore some examples!
