Hyperparameter Tuning in Python
Alex Scriven
Data Scientist
So far everything we have done has been uninformed search:
Uninformed search: Where each iteration of hyperparameter tuning does not learn from the previous iterations.
This is what allows us to parallelize our work, but it doesn't sound very efficient, does it?
The process so far: pick a set of hyperparameter values, build and assess the models, repeat, with nothing learned from earlier results feeding into the next set of values.
An alternate way: let the results of earlier models inform which hyperparameter values we try next.
A basic informed search methodology:
Start out with a rough, random approach and iteratively refine your search.
The process is:
1. Random search
2. Find promising areas
3. Grid search in the smaller area
4. Continue until an optimal score is obtained
You could substitute (3) with further random searches before the grid search
Coarse to fine tuning has some advantages:
No need to waste time on search spaces that are not giving good results!
Note: this search isn't informed by each individual model, but by batches of models.
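Putting the process together, here is a minimal sketch of one coarse-to-fine cycle using scikit-learn on a toy dataset. The estimator, ranges, and search objects are illustrative assumptions, not the course's own code.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

X, y = make_classification(n_samples=500, random_state=42)

# 1) Coarse step: random search over wide hyperparameter ranges
coarse = RandomizedSearchCV(
    GradientBoostingClassifier(),
    param_distributions={
        "max_depth": list(range(1, 32)),
        "learning_rate": np.linspace(0.01, 2, 100).tolist(),
    },
    n_iter=20, cv=3, random_state=42,
)
coarse.fit(X, y)

# 2) Find a promising area, then 3) grid search more finely around it
best_depth = coarse.best_params_["max_depth"]
fine = GridSearchCV(
    GradientBoostingClassifier(),
    param_grid={
        "max_depth": [max(1, best_depth - 2), best_depth, best_depth + 2],
        "learning_rate": [0.05, 0.1, 0.5],
    },
    cv=3,
)
fine.fit(X, y)
print(fine.best_params_, round(fine.best_score_, 3))
```

In practice you would repeat the narrow-then-search step until the score stops improving.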
Let's take an example with the following hyperparameter ranges:
max_depth_list: between 1 and 65
min_sample_list: between 3 and 17
learn_rate_list: 150 values between 0.01 and 2

How many possible models do we have?
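One way these ranges might be written as Python lists (a sketch; the exact endpoints are assumptions chosen to match the 134,400 count and the learn_rate values in the table below):

```python
import numpy as np

max_depth_list = list(range(1, 65))          # 64 integer values, 1 to 64
min_sample_list = list(range(3, 17))         # 14 integer values, 3 to 16
learn_rate_list = np.linspace(0.01, 2, 150)  # 150 evenly spaced values
```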
from itertools import product

combinations_list = [list(x) for x in product(max_depth_list, min_sample_list, learn_rate_list)]
print(len(combinations_list))
134400
Let's do a random search on just 500 combinations.
Here we plot our accuracy scores:
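A sketch of what that coarse pass could look like, continuing from the lists above; the dataset, model, and train/test split are illustrative assumptions.

```python
import random
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Randomly pick 500 of the 134,400 combinations
random.seed(42)
sampled = random.sample(combinations_list, 500)

# Fit one model per sampled combination and record its test accuracy
accuracies = []
for max_depth, min_samples_leaf, learn_rate in sampled:
    model = GradientBoostingClassifier(
        max_depth=max_depth,
        min_samples_leaf=min_samples_leaf,
        learning_rate=learn_rate,
        random_state=42,
    )
    model.fit(X_train, y_train)
    accuracies.append(model.score(X_test, y_test))

# Plot the distribution of accuracy scores across the 500 models
plt.hist(accuracies, bins=20)
plt.xlabel("accuracy")
plt.ylabel("number of models")
plt.show()
```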
Which models were the good ones?
Top results:
| max_depth | min_samples_leaf | learn_rate | accuracy |
|---|---|---|---|
| 10 | 7 | 0.01 | 96 |
| 19 | 7 | 0.023355705 | 96 |
| 30 | 6 | 1.038389262 | 93 |
| 27 | 7 | 1.11852349 | 91 |
| 16 | 7 | 0.597651007 | 91 |
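Assuming the sampled combinations and scores from the sketch above were kept, a table like this can be produced with pandas (the column names are assumptions):

```python
import pandas as pd

# Collect the sampled combinations and their scores into one DataFrame
results_df = pd.DataFrame(sampled, columns=["max_depth", "min_samples_leaf", "learn_rate"])
results_df["accuracy"] = accuracies

# The best-scoring models from the coarse pass
print(results_df.sort_values("accuracy", ascending=False).head())
```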
Let's visualize the max_depth values vs the accuracy score:
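A sketch of that plot, using the assumed results_df from above:

```python
import matplotlib.pyplot as plt

# Scatter max_depth against accuracy to see which depths score well
plt.scatter(results_df["max_depth"], results_df["accuracy"], alpha=0.5)
plt.xlabel("max_depth")
plt.ylabel("accuracy")
plt.show()
```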
Similar plots for the other two hyperparameters suggest:
min_samples_leaf: better below 8
learn_rate: worse above 1.3
What we know from iteration one:
max_depth: between 8 and 30
learn_rate: less than 1.3
min_samples_leaf: perhaps less than 8

Where to next? Another random or grid search with what we know!
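For instance, the narrowed search space for the next iteration might look like this (a sketch; the exact grid values are illustrative assumptions):

```python
import numpy as np
from itertools import product

# Narrowed ranges informed by the coarse pass
max_depth_fine = list(range(8, 31))           # 8 to 30
min_samples_fine = list(range(3, 8))          # below 8
learn_rate_fine = np.linspace(0.01, 1.3, 50)  # capped at 1.3

fine_combinations = [list(x) for x in product(max_depth_fine, min_samples_fine, learn_rate_fine)]
print(len(fine_combinations))  # a far smaller, more promising space than 134,400
```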
Note: This was only bivariate analysis. You can explore looking at multiple hyperparameters (3, 4 or more!) on a single graph, but that's beyond the scope of this course.