Hyperparameter Tuning in Python
Alex Scriven
Data Scientist
Bayes Rule:
A statistical method of using new evidence to iteratively update our beliefs about some outcome
Bayes Rule has the form:
$$ P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)} $$
LHS: the probability of A, given that B (some new evidence) has occurred.
RHS: how we calculate this.
This may seem confusing, so let's demonstrate with a common example: a medical diagnosis.
A medical example:
What is the probability that any person has the disease?
$$ P(D) = 0.05 $$
This is simply our prior as we have no evidence.
What is the probability that a predisposed person has the disease, given that $P(Pre \mid D) = 0.2$ and $P(Pre) = 0.1$?
$$ P(D \mid Pre) = \frac{P(Pre \mid D) \, P(D)}{P(Pre)} $$
$$ P(D \mid Pre) = \frac{0.2 \times 0.05}{0.1} = 0.1 $$
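The arithmetic above can be checked directly (using the same probabilities as the example):

```python
# Worked check of the medical example, using the values given above.
p_disease = 0.05       # P(D): prior probability of having the disease
p_pre_given_d = 0.2    # P(Pre | D): probability of predisposition, given disease
p_pre = 0.1            # P(Pre): overall probability of predisposition

# Bayes Rule: P(D | Pre) = P(Pre | D) * P(D) / P(Pre)
p_d_given_pre = p_pre_given_d * p_disease / p_pre
print(round(p_d_given_pre, 4))  # 0.1
```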
We can apply this logic to hyperparameter tuning:
Bayesian hyperparameter tuning is relatively new but quite popular for larger, more complex tuning tasks, as it works well at finding optimal hyperparameter combinations in these situations.
Introducing the Hyperopt package.
To undertake Bayesian hyperparameter tuning we need to: set the domain (our grid, with a twist), choose the optimization algorithm (we keep the default, TPE), and define an objective function to minimize.
Many options to set the grid:
Hyperopt does not use point values on the grid; instead, each hyperparameter is described by a probability distribution over its possible values.
We will do a simple uniform distribution but there are many more if you check the documentation.
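To build intuition for what these distributions produce, here is a minimal stdlib-only stand-in (not Hyperopt itself) mimicking the documented behavior of a quantized-uniform draw, `round(uniform(low, high) / q) * q`:

```python
import random

def quniform_draw(low, high, q):
    """Simplified stand-in for hp.quniform: a uniform draw rounded
    to a multiple of q. Use hyperopt's hp.* functions in practice."""
    return round(random.uniform(low, high) / q) * q

random.seed(0)
draws = [quniform_draw(2, 10, 2) for _ in range(5)]
print(draws)  # every draw is an even value between 2 and 10
```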
Set up the grid:
from hyperopt import hp

space = {
    'max_depth': hp.quniform('max_depth', 2, 10, 2),
    'min_samples_leaf': hp.quniform('min_samples_leaf', 2, 8, 2),
    'learning_rate': hp.uniform('learning_rate', 0.01, 1),
}
The objective function runs the algorithm:
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

def objective(params):
    # Cast quniform draws back to ints before passing them to sklearn
    params = {'max_depth': int(params['max_depth']),
              'min_samples_leaf': int(params['min_samples_leaf']),
              'learning_rate': params['learning_rate']}
    gbm_clf = GradientBoostingClassifier(n_estimators=500, **params)
    best_score = cross_val_score(gbm_clf, X_train, y_train,
                                 scoring='accuracy', cv=10, n_jobs=4).mean()
    loss = 1 - best_score
    write_results(best_score, params, iteration)  # custom logging helper
    return loss
Run the algorithm:
import numpy as np
from hyperopt import fmin, tpe

best_result = fmin(
    fn=objective,
    space=space,
    max_evals=500,
    rstate=np.random.default_rng(42),
    algo=tpe.suggest)
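Conceptually, fmin repeatedly samples candidates from the space, evaluates the objective, and keeps the best (with TPE guiding where to sample next). A stdlib-only sketch of that loop, using plain random sampling instead of TPE and a toy objective whose optimum (max_depth=6, learning_rate=0.1) is hypothetical:

```python
import random

def objective(params):
    # Toy stand-in for the real cross-validated loss: we assume
    # (hypothetically) it is minimized at max_depth=6, learning_rate=0.1.
    return (params['max_depth'] - 6) ** 2 + (params['learning_rate'] - 0.1) ** 2

def sample_space():
    # One random draw from a space shaped like the grid defined earlier
    return {'max_depth': random.choice([2, 4, 6, 8, 10]),
            'learning_rate': random.uniform(0.01, 1)}

random.seed(42)
best_params, best_loss = None, float('inf')
for _ in range(500):  # max_evals=500, mirroring the fmin call
    candidate = sample_space()
    loss = objective(candidate)
    if loss < best_loss:
        best_params, best_loss = candidate, loss

print(best_params)
```

Unlike this sketch, TPE uses the history of evaluated points to propose promising candidates rather than sampling blindly.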