RandomizedSearchCV

Model Validation in Python

Kasey Jones

Data Scientist

Grid searching hyperparameters

When selecting values for multiple hyperparameters, the possible combinations form a grid. This grid is called the hyperparameter space.
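
As a minimal sketch of what such a grid looks like, the snippet below enumerates every pairing of two hyperparameters. The parameter names and candidate values are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical two-parameter space, chosen only for illustration.
max_depth_values = [4, 6, 8]
max_features_values = [2, 4]

# Every pairing of one value per hyperparameter is one cell of the grid.
grid = list(product(max_depth_values, max_features_values))
for max_depth, max_features in grid:
    print(f"max_depth={max_depth}, max_features={max_features}")

print(len(grid), "combinations")
```

A grid search would train and validate one model per cell.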

Grid searching continued

Benefits:

  • Tests every possible combination

Drawbacks:

  • Each additional hyperparameter multiplies the number of combinations, so training time grows exponentially
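
The growth can be checked with simple arithmetic. This sketch assumes, purely for illustration, that every hyperparameter has 5 candidate values and that each combination is evaluated with 5-fold cross-validation; `grid_size` is a hypothetical helper, not a library function.

```python
def grid_size(n_params, values_per_param):
    """Number of combinations when each hyperparameter has the same number of values."""
    return values_per_param ** n_params

values_per_param = 5  # assumed number of candidate values per hyperparameter
cv_folds = 5          # assumed number of cross-validation folds

for n_params in range(1, 6):
    n_combinations = grid_size(n_params, values_per_param)
    n_fits = n_combinations * cv_folds
    print(f"{n_params} hyperparameters: {n_combinations} combinations, {n_fits} model fits")
```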

Better methods

Random search

from sklearn.model_selection import RandomizedSearchCV

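# Signature sketch: estimator and param_distributions are required arguments
random_search = RandomizedSearchCV(estimator, param_distributions)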

Parameter Distribution:

param_dist = {"max_depth": [4, 6, 8, None],
              "max_features": range(2, 11),
              "min_samples_split": range(2, 11)}
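
To get a feel for what a random search draws from this dictionary, the sketch below uses scikit-learn's `ParameterSampler` to sample a handful of settings. The choice of 5 samples and the fixed `random_state` are assumptions made only so the example is small and reproducible.

```python
from sklearn.model_selection import ParameterSampler

param_dist = {"max_depth": [4, 6, 8, None],
              "max_features": range(2, 11),
              "min_samples_split": range(2, 11)}

# Draw 5 candidate settings at random; random_state is fixed only for reproducibility.
samples = list(ParameterSampler(param_dist, n_iter=5, random_state=1111))
for params in samples:
    print(params)
```

Each printed dictionary is one point in the hyperparameter space; a random search evaluates only these sampled points rather than the full grid.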

Random search parameters

Parameters:

  • estimator: the model to use
  • param_distributions: dictionary containing hyperparameters and possible values
  • n_iter: number of parameter settings to sample and evaluate
  • scoring: scoring method to use

Setting RandomizedSearchCV parameters

param_dist = {"max_depth": [4, 6, 8, None],
              "max_features": range(2, 11),
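              "min_samples_split": range(2, 11)}

from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import make_scorer, mean_absolute_error

rfr = RandomForestRegressor(n_estimators=20, random_state=1111)
# MAE is an error, so lower is better: greater_is_better=False makes the
# scorer return negative MAE, which a search can then maximize.
scorer = make_scorer(mean_absolute_error, greater_is_better=False)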

RandomizedSearchCV implemented

Setting up the random search:

random_search =\
    RandomizedSearchCV(estimator=rfr,
                       param_distributions=param_dist,
                       n_iter=40,
                       cv=5)

  • We cannot do hyperparameter tuning without understanding model validation
  • Model validation allows us to compare multiple models and parameter sets

Complete the random search:

random_search.fit(X, y)
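
Putting the pieces together, here is a minimal end-to-end sketch. The slides do not show `X` and `y`, so a synthetic regression dataset stands in for them. The reduced `n_iter=10`, the built-in `"neg_mean_absolute_error"` scoring string, and the `random_state` on the search are likewise assumptions made to keep the example quick and reproducible.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in for X and y (assumption: the real data is not shown here).
X, y = make_regression(n_samples=200, n_features=10, noise=10.0,
                       random_state=1111)

param_dist = {"max_depth": [4, 6, 8, None],
              "max_features": range(2, 11),
              "min_samples_split": range(2, 11)}

rfr = RandomForestRegressor(n_estimators=20, random_state=1111)

random_search = RandomizedSearchCV(estimator=rfr,
                                   param_distributions=param_dist,
                                   n_iter=10,  # fewer draws than the slides, for speed
                                   scoring="neg_mean_absolute_error",
                                   cv=5,
                                   random_state=1111)
random_search.fit(X, y)

# best_score_ is the negative MAE, so flip the sign to read it as an error.
print("Best parameters:", random_search.best_params_)
print("Best cross-validated MAE:", -random_search.best_score_)
```

After fitting, `best_params_` holds the sampled setting with the best mean cross-validation score, and `cv_results_` contains the scores for every sampled setting.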

Let's explore some examples!
