Model Validation in Python
Kasey Jones
Data Scientist
Benefits:
Drawbacks:
from sklearn.model_selection import RandomizedSearchCV
random_search = RandomizedSearchCV()
Parameter Distribution:
param_dist = {"max_depth": [4, 6, 8, None],
              "max_features": range(2, 11),
              "min_samples_split": range(2, 11)}
Parameters:
estimator: the model to use
param_distributions: dictionary containing hyperparameters and possible values
n_iter: number of iterations
scoring: scoring method to use
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import make_scorer, mean_absolute_error
rfr = RandomForestRegressor(n_estimators=20, random_state=1111)
scorer = make_scorer(mean_absolute_error)
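A note on the scorer's sign, as a minimal sketch rather than part of the original slides: make_scorer assumes by default that a higher value is better, so a search using the scorer above would favour a larger error. Passing greater_is_better=False negates the metric so the search minimises MAE. The synthetic data from make_regression and the names default_scorer and loss_scorer are assumptions made for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import make_scorer, mean_absolute_error

# Synthetic stand-in data (assumption; the slides do not show a dataset).
X, y = make_regression(n_samples=100, n_features=10, noise=10.0,
                       random_state=1111)
rfr = RandomForestRegressor(n_estimators=20, random_state=1111).fit(X, y)

# Plain MAE for reference.
mae = mean_absolute_error(y, rfr.predict(X))

# Default: the scorer returns +MAE and treats larger values as better.
default_scorer = make_scorer(mean_absolute_error)
# Loss-style: the scorer returns -MAE, so maximising it minimises the error.
loss_scorer = make_scorer(mean_absolute_error, greater_is_better=False)

default_value = default_scorer(rfr, X, y)
loss_value = loss_scorer(rfr, X, y)
print(mae, default_value, loss_value)
```

The built-in scoring string "neg_mean_absolute_error" is an equivalent shortcut for the loss-style scorer.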
Setting up the random search:
random_search =\
    RandomizedSearchCV(estimator=rfr,
                       param_distributions=param_dist,
                       n_iter=40,
                       cv=5)
Complete the random search:
random_search.fit(X, y)
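The steps above can be combined into one runnable sketch. Several choices here are assumptions, not taken from the slides: X and y come from make_regression, n_iter and cv are reduced so the example runs quickly, and the scorer is passed with greater_is_better=False so the search minimises MAE.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import make_scorer, mean_absolute_error
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in data (assumption); 10 features so that
# max_features values up to 10 are valid.
X, y = make_regression(n_samples=200, n_features=10, noise=10.0,
                       random_state=1111)

param_dist = {"max_depth": [4, 6, 8, None],
              "max_features": range(2, 11),
              "min_samples_split": range(2, 11)}

rfr = RandomForestRegressor(n_estimators=20, random_state=1111)
# Negated MAE, so that "higher is better" holds for the search.
scorer = make_scorer(mean_absolute_error, greater_is_better=False)

random_search = RandomizedSearchCV(estimator=rfr,
                                   param_distributions=param_dist,
                                   n_iter=5,   # reduced from 40 for speed
                                   cv=3,       # reduced from 5 for speed
                                   scoring=scorer,
                                   random_state=1111)
random_search.fit(X, y)

# Best sampled combination and its cross-validated MAE (sign flipped back).
print(random_search.best_params_)
print(-random_search.best_score_)
```

After fitting, random_search.cv_results_ holds the score of every sampled combination, and random_search.best_estimator_ is the model refit on all of X and y with the best parameters.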