Model Validation in Python
Kasey Jones
Data Scientist
Parameters are:
Parameters are created by fitting a model:
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(X, y)
print(lr.coef_, lr.intercept_)
[[0.798, 0.452]] [1.786]
Parameters do not exist before the model is fit:
lr = LinearRegression()
print(lr.coef_, lr.intercept_)
AttributeError: 'LinearRegression' object has no attribute 'coef_'
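The before/after behavior above can be sketched end to end. This is a minimal example, assuming a small synthetic dataset: the `X` and `y` arrays below are made up for illustration and are not the data behind the output shown earlier.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical, noise-free data: y = 2*x1 + 3*x2 + 1
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 2 * X[:, 0] + 3 * X[:, 1] + 1

lr = LinearRegression()

# Before fitting, the learned attributes do not exist yet
fitted_before = hasattr(lr, "coef_")

# Fitting estimates the parameters from the data
lr.fit(X, y)
fitted_after = hasattr(lr, "coef_")

print(fitted_before, fitted_after)
print(lr.coef_, lr.intercept_)
```

Checking with `hasattr` avoids the `AttributeError` while still showing that `coef_` and `intercept_` only appear once `fit` has run.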
Hyperparameters:
Hyperparameter | Description | Possible Values (default) |
---|---|---|
n_estimators | Number of decision trees in the forest | 2+ (10; 100 as of scikit-learn 0.22) |
max_depth | Maximum depth of the decision trees | 2+ (None) |
max_features | Number of features to consider when making a split | See documentation |
min_samples_split | The minimum number of samples required to make a split | 2+ (2) |
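Unlike parameters, the hyperparameters in the table are chosen by the user and exist before any data is seen. A minimal sketch, assuming a random forest regressor and arbitrary illustrative values:

```python
from sklearn.ensemble import RandomForestRegressor

# Hyperparameters are passed to the constructor; no fitting is required
rfr = RandomForestRegressor(
    n_estimators=50,
    max_depth=6,
    max_features=2,
    min_samples_split=4)

# They can be changed later on the same object
rfr.set_params(max_depth=8)

# Read back only the four hyperparameters from the table
names = ("n_estimators", "max_depth", "max_features", "min_samples_split")
params = rfr.get_params()
chosen = {name: params[name] for name in names}
print(chosen)
```

The values used here (50, 6, 2, 4, then 8) are placeholders, not recommendations.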
Hyperparameter tuning:
from sklearn.ensemble import RandomForestRegressor

# Candidate values for each hyperparameter
depth = [4, 6, 8, 10, 12]
samples = [2, 4, 6, 8]
features = [2, 4, 6, 8, 10]

# Specify hyperparameters
rfr = RandomForestRegressor(
    n_estimators=100,
    max_depth=depth[0],
    min_samples_split=samples[3],
    max_features=features[1])
rfr.get_params()
{'bootstrap': True,
'criterion': 'mse',
'max_depth': 4,
'max_features': 4,
'max_leaf_nodes': None,
'min_impurity_decrease': 0.0,
'min_impurity_split': None,
'min_samples_leaf': 1,
'min_samples_split': 8,
...
}
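One way to turn the candidate lists into an actual tuning step is to fit one model per candidate value and compare them on held-out data. The sketch below is only an illustration under assumptions: the data comes from `make_regression`, the 75/25 split and mean absolute error are arbitrary choices, and only `max_depth` is varied while the other hyperparameters stay fixed.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Hypothetical data standing in for a real training set
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0)

depth = [4, 6, 8, 10, 12]

# Fit one forest per candidate depth and record its validation error
scores = {}
for d in depth:
    rfr = RandomForestRegressor(
        n_estimators=25,
        max_depth=d,
        min_samples_split=8,
        max_features=4,
        random_state=0)
    rfr.fit(X_train, y_train)
    scores[d] = mean_absolute_error(y_val, rfr.predict(X_val))

# Lower error is better
best_depth = min(scores, key=scores.get)
print(scores)
print("best max_depth:", best_depth)
```

The same loop extends to `min_samples_split` and `max_features`, though the number of model fits grows quickly as more lists are combined.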