Introduction to hyperparameter tuning

Model Validation in Python

Kasey Jones

Data Scientist

Model parameters

Parameters are:

  • Learned or estimated from the data
  • The result of fitting a model
  • Used when making future predictions
  • Not manually set

Linear regression parameters

Parameters are created by fitting a model:

from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(X, y)
print(lr.coef_, lr.intercept_)
[[0.798, 0.452]] [1.786]
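
The snippet above assumes X and y already exist. A self-contained sketch of the same idea, using hypothetical noise-free synthetic data (the coefficients 0.8, 0.45 and intercept 1.8 are made up for illustration), might look like this:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical synthetic data (an assumption, not the course dataset):
# y = 0.8 * x1 + 0.45 * x2 + 1.8, with no noise added
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X @ np.array([0.8, 0.45]) + 1.8

lr = LinearRegression()
lr.fit(X, y)

# coef_ and intercept_ are estimated from the data, not set by hand
print(lr.coef_, lr.intercept_)
```

Because the data contain no noise, the fitted parameters should match the values used to generate y up to floating-point error.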

Linear regression parameters

Parameters do not exist before the model is fit:

lr = LinearRegression()
print(lr.coef_, lr.intercept_)
AttributeError: 'LinearRegression' object has no attribute 'coef_'
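
One way to check for fitted parameters without triggering the error is hasattr. This is a minimal sketch on a tiny made-up dataset; the variable names fitted_before and fitted_after are illustrative only:

```python
from sklearn.linear_model import LinearRegression

lr = LinearRegression()
# No parameters exist yet, so the attribute lookup fails
fitted_before = hasattr(lr, "coef_")

# Tiny made-up dataset: y = 2x + 1
lr.fit([[0.0], [1.0], [2.0]], [1.0, 3.0, 5.0])
# Fitting creates coef_ and intercept_
fitted_after = hasattr(lr, "coef_")

print(fitted_before, fitted_after)
```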

Model hyperparameters

Hyperparameters are:

  • Manually set before training occurs
  • Used to specify how the training is supposed to happen

Random forest hyperparameters

Hyperparameter      Description                                              Possible values (default)
n_estimators        Number of decision trees in the forest                   2+ (10; 100 since scikit-learn 0.22)
max_depth           Maximum depth of the decision trees                      2+ (None)
max_features        Number of features to consider when making a split       See documentation
min_samples_split   The minimum number of samples required to make a split   2+ (2)
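
To see how these hyperparameters shape training, the sketch below fits a small forest and inspects the resulting trees. The dataset comes from make_regression and the specific values (20 trees, depth 3) are arbitrary choices for illustration, not recommendations:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Hypothetical synthetic regression data
X, y = make_regression(n_samples=200, n_features=5, noise=5.0, random_state=0)

# Hyperparameters are set here, before any training happens
rfr = RandomForestRegressor(
    n_estimators=20,
    max_depth=3,
    max_features=2,
    min_samples_split=4,
    random_state=0,
)
rfr.fit(X, y)

# The fitted forest respects the limits given above
n_trees = len(rfr.estimators_)
deepest = max(tree.get_depth() for tree in rfr.estimators_)
print(n_trees, deepest)
```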

What is hyperparameter tuning?

Hyperparameter tuning:

  • Select the hyperparameters to tune
  • Create ranges of possible values to select from
  • Run a single model type at different value sets
  • Specify a single accuracy metric to compare the runs
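
The steps above can be sketched as a simple loop. This is only an illustration under stated assumptions: the data are synthetic, only max_depth is varied, and mean absolute error on a held-out split stands in for the single accuracy metric:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Hypothetical synthetic data and a single train/validation split
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Range of possible values for one selected hyperparameter
depths = [2, 4, 8]

# Run the same model type once per value and record one metric
scores = {}
for d in depths:
    rfr = RandomForestRegressor(n_estimators=25, max_depth=d, random_state=0)
    rfr.fit(X_train, y_train)
    scores[d] = mean_absolute_error(y_val, rfr.predict(X_val))

# Lower mean absolute error is better
best_depth = min(scores, key=scores.get)
print(scores, best_depth)
```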

Specifying ranges

from sklearn.ensemble import RandomForestRegressor

# Candidate values for each hyperparameter
depth = [4, 6, 8, 10, 12]
samples = [2, 4, 6, 8]
features = [2, 4, 6, 8, 10]

# Specify hyperparameters by picking one value from each range
rfr = RandomForestRegressor(
    n_estimators=100,
    max_depth=depth[0],
    min_samples_split=samples[3],
    max_features=features[1])
rfr.get_params()
{'bootstrap': True,
 'criterion': 'mse',
 ...
}
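
Instead of indexing into each range by hand, one possible approach is to draw a value from each list at random. The sketch below uses Python's random.choice; the seed is an arbitrary choice so the draw is repeatable:

```python
import random

from sklearn.ensemble import RandomForestRegressor

depth = [4, 6, 8, 10, 12]
samples = [2, 4, 6, 8]
features = [2, 4, 6, 8, 10]

# Arbitrary seed (an assumption) so the same values are drawn each run
random.seed(1)

# Draw one value from each range
rfr = RandomForestRegressor(
    n_estimators=100,
    max_depth=random.choice(depth),
    min_samples_split=random.choice(samples),
    max_features=random.choice(features),
)
chosen = rfr.get_params()
print(chosen["max_depth"], chosen["min_samples_split"], chosen["max_features"])
```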

Too many hyperparameters!

rfr.get_params()
{'bootstrap': True,
 'criterion': 'mse',
 'max_depth': 4,
 'max_features': 4,
 'max_leaf_nodes': None,
 'min_impurity_decrease': 0.0,
 'min_impurity_split': None,
 'min_samples_leaf': 1,
 'min_samples_split': 8,
 ...
 }
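
Since get_params() returns every hyperparameter, it can help to print only the ones being tuned. A minimal sketch, where the tuple named tuned is a hypothetical helper listing the hyperparameters of interest:

```python
from sklearn.ensemble import RandomForestRegressor

rfr = RandomForestRegressor(
    n_estimators=100, max_depth=4, min_samples_split=8, max_features=4)

# Hypothetical list of the hyperparameters we care about
tuned = ("n_estimators", "max_depth", "min_samples_split", "max_features")

all_params = rfr.get_params()
subset = {name: all_params[name] for name in tuned}

# The full dictionary is much longer than the subset being tuned
print(len(all_params), subset)
```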

General guidelines

  • Start with the basics
  • Read through the documentation
  • Test practical ranges

Let's practice!
