Introducing Random Search

Hyperparameter Tuning in Python

Alex Scriven

Data Scientist

What you already know

 

Very similar to grid search:

  • Define an estimator, which hyperparameters to tune, and the range of values for each hyperparameter.
  • We still set a cross-validation scheme and a scoring function.

 

BUT instead of building every model, we randomly select which grid squares to try.
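For orientation, here is a minimal sketch of the same idea using scikit-learn's RandomizedSearchCV, which takes the same ingredients as a grid search plus a budget of randomly chosen combinations. The estimator, ranges, and numbers below are placeholder assumptions for illustration, not values from this lesson.

# Minimal sketch: random search with scikit-learn (assumed estimator & ranges)
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

param_distributions = {
    'learning_rate': np.linspace(0.001, 2, 150),
    'min_samples_leaf': list(range(1, 51))}

random_search = RandomizedSearchCV(
    estimator=GradientBoostingClassifier(),
    param_distributions=param_distributions,
    n_iter=100,          # how many random combinations to try
    scoring='accuracy',  # scoring function
    cv=5,                # cross-validation scheme
    random_state=42)
# random_search.fit(X_train, y_train) would then build the 100 models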


Why does this work?

Bergstra & Bengio (2012):

This paper shows empirically and theoretically that randomly chosen trials are more efficient for hyper-parameter optimization than trials on a grid.

Two main reasons:

  1. Not every hyperparameter is equally important
  2. A little trick of probability

A probability trick

A grid search:

[Figure: a 10x10 grid of different hyperparameter combinations (models)]

How many models must we run to have a 95% chance of getting one of the green squares?

Our best models:

[Figure: the same 10x10 grid with the best-performing models marked as green squares]


A probability trick

 

If we randomly select hyperparameter combinations uniformly, let's consider the chance of MISSING the good region on every single trial, to show how unlikely that is. In the 10x10 grid above, the best (green) squares make up 5% of the grid, so each trial has a 0.05 chance of success.

  • Trial 1 = 0.05 chance of success and (1 - 0.05) chance of missing
  • Trial 2 = (1 - 0.05) x (1 - 0.05) chance of missing both times
  • Trial 3 = (1 - 0.05) x (1 - 0.05) x (1 - 0.05) chance of missing again
  • In fact, with n trials we have a (1 - 0.05)^n chance that every single trial misses the desired spot.


A probability trick

 

So how many trials do we need to have a high (95%) chance of landing in that region?

  • We have a (1 - 0.05)^n chance of missing everything.
  • So the chance of getting in there at least once is (1 - miss everything), or 1 - (1 - 0.05)^n.
  • Solving 1 - (1 - 0.05)^n >= 0.95 gives n >= 59.
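As a quick check of that arithmetic, a few lines of Python solve for n using the 0.05 per-trial success chance and the 0.95 target from above:

import math

p_success = 0.05   # chance one random trial lands in the 'good area'
target = 0.95      # desired chance of hitting it at least once

# Solve 1 - (1 - p)^n >= target  =>  n >= log(1 - target) / log(1 - p)
n_trials = math.ceil(math.log(1 - target) / math.log(1 - p_success))
print(n_trials)    # 59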

A probability trick

 

What does that all mean?

  • You are unlikely to keep completely missing the 'good area' for long when randomly picking new spots.
  • A grid search may spend a lot of time in a 'bad area' because it covers the grid exhaustively.

Some important notes

 

Remember:

  1. The best score you can find is still only as good as the grid (the ranges of values) you set!

  2. To fairly compare this to grid search, you need to give both the same modeling 'budget' (for example, 100 random samples versus a 10x10 grid of 100 models).


Creating a random sample of hyperparameters

We can create our own random sample of hyperparameter combinations:

import numpy as np
from itertools import product

# Set some hyperparameter lists
learn_rate_list = np.linspace(0.001, 2, 150)
min_samples_leaf_list = list(range(1, 51))

# Create the list of all combinations (a 150 x 50 grid)
combinations_list = [list(x) for x in
                     product(learn_rate_list, min_samples_leaf_list)]

# Randomly select 100 combinations from our larger set
random_combinations_index = np.random.choice(
    range(0, len(combinations_list)), 100, replace=False)
combinations_random_chosen = [combinations_list[x] for x in
                              random_combinations_index]
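From here you would build one model per sampled combination. A minimal sketch, assuming a GradientBoostingClassifier and pre-existing X_train/y_train and X_test/y_test splits (all names assumed for illustration):

# Sketch of using the sample (estimator and data splits are assumed)
from sklearn.ensemble import GradientBoostingClassifier

results = []
for learn_rate, min_samples_leaf in combinations_random_chosen:
    model = GradientBoostingClassifier(
        learning_rate=learn_rate,
        min_samples_leaf=min_samples_leaf)
    model.fit(X_train, y_train)
    results.append([learn_rate, min_samples_leaf,
                    model.score(X_test, y_test)])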

Visualizing a Random Search

We can also visualize the random search coverage by plotting the hyperparameter choices on the X and Y axes.

[Figure: random search coverage graph — scatter plot of the chosen hyperparameter combinations]

Notice how the scatter spreads widely across the space but does not cover it exhaustively?
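A plot like that can be produced directly from the combinations_random_chosen list created above; a minimal matplotlib sketch (axis labels are just descriptive choices):

# Plot the randomly chosen combinations
import matplotlib.pyplot as plt

x_vals = [x for x, y in combinations_random_chosen]
y_vals = [y for x, y in combinations_random_chosen]

plt.scatter(x_vals, y_vals)
plt.xlabel('learn_rate')
plt.ylabel('min_samples_leaf')
plt.title('Random search coverage')
plt.show()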


Let's practice!
