Parameters vs hyperparameters

Hyperparameter Tuning in R

Dr. Shirin Elsinghorst

Senior Data Scientist

About me

www.shirin-glander.de

"Hyper"parameters vs model parameters

Let's look at an example dataset:

head(breast_cancer_data)

# A tibble: 6 x 11
  diagnosis concavity_mean symmetry_mean fractal_dimension perimeter_se smoothness_se
  <chr>              <dbl>         <dbl>             <dbl>        <dbl>         <dbl>
1 M                 0.300          0.242            0.0787         8.59       0.00640
2 M                 0.0869         0.181            0.0567         3.40       0.00522
3 M                 0.197          0.207            0.0600         4.58       0.00615
4 M                 0.241          0.260            0.0974         3.44       0.00911

Next, we build a simple linear model.

Let's start simple: Model parameters in a linear model

# Create linear model
linear_model <- lm(perimeter_worst ~ fractal_dimension_mean, data = breast_cancer_data)

# Get coefficients
summary(linear_model)$coefficients

                       Estimate Std. Error t value Pr(>|t|)    
(Intercept)              167.60      25.91   6.469  3.9e-09 ***
fractal_dimension_mean  -926.39     392.86  -2.358   0.0204 *

Model parameters are fit (i.e. found) during training.
They are the result of model fitting or training.
In a linear model, the parameters we want to find are the coefficients.

linear_model$coefficients

(Intercept) fractal_dimension_mean 
   167.5972              -926.3866

We can think of them as the slope and the y-intercept of our model.
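
To make "found during training" concrete, here is a minimal sketch that computes the slope and intercept by hand and compares them with what lm() returns. It assumes the built-in mtcars dataset as a stand-in for breast_cancer_data, so the variable names mpg and wt are not from the original example.

```r
# mtcars is a stand-in for breast_cancer_data (an assumption for this sketch)
fit <- lm(mpg ~ wt, data = mtcars)

# Closed-form least-squares estimates, computed directly from the data
slope <- cov(mtcars$wt, mtcars$mpg) / var(mtcars$wt)
intercept <- mean(mtcars$mpg) - slope * mean(mtcars$wt)

# Both routes arrive at the same model parameters
all.equal(unname(coef(fit)), c(intercept, slope))
```

Nothing here was chosen by us: both values follow entirely from the data, which is what makes them model parameters.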

Coefficients in a linear model

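library(ggplot2)

# Scatter plot of the two variables used in the model
ggp <- ggplot(data = breast_cancer_data, 
              aes(x = fractal_dimension_mean, y = perimeter_worst)) +
        geom_point(color = "grey")

# Overlay the fitted line using the model's intercept and slope
ggp + geom_abline(slope = linear_model$coefficients[2], 
                  intercept = linear_model$coefficients[1])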

Model parameters vs hyperparameters in a linear model

Remember: model parameters are fit (i.e. found) during training; they are the result of model fitting or training.
Hyperparameters are set before training.
They specify HOW the training is supposed to happen.

args(lm)
help(lm)
?lm

linear_model <- lm(perimeter_worst ~ fractal_dimension_mean,
                   data = breast_cancer_data,
                   method = "qr")
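
As a small check of what setting method changes, the sketch below reads the default from the signature of lm() and fits the same model with and without the explicit argument. It again assumes mtcars as a stand-in dataset, not the data from the slides.

```r
# Default value of the method argument, read from lm()'s signature
formals(lm)$method

# mtcars is a stand-in for breast_cancer_data (an assumption for this sketch)
fit_default <- lm(mpg ~ wt, data = mtcars)
fit_qr <- lm(mpg ~ wt, data = mtcars, method = "qr")

# Same data and same fitting method give the same coefficients
all.equal(coef(fit_default), coef(fit_qr))
```

Because "qr" is already the default, passing it explicitly does not change the fit; it only makes visible an option that is decided before fitting starts.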

Parameters vs hyperparameters in machine learning

In our linear model:

Coefficients were found during fitting.

method was an option to set before fitting.

In machine learning we might have:

Weights and biases of neural nets, which are optimized during training => model parameters.
Options that can be tweaked before training, like the learning rate and weight decay of a neural net or the number of trees in a Random Forest model => hyperparameters.
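
The split between the two can be sketched with a hand-written gradient descent loop for a one-variable linear model. This is an illustration under stated assumptions, not how lm() or any neural net library trains: mtcars stands in for real data, and the names learning_rate, n_epochs, weight and bias are chosen only for this example.

```r
# Standardize the predictor so a single learning rate behaves well
# (mtcars is a stand-in dataset, an assumption for this sketch)
x <- as.numeric(scale(mtcars$wt))
y <- mtcars$mpg

# Hyperparameters: chosen by us BEFORE training
learning_rate <- 0.1
n_epochs <- 200

# Model parameters: start at zero and are updated DURING training
weight <- 0
bias <- 0

for (epoch in seq_len(n_epochs)) {
  error <- (weight * x + bias) - y
  # Step against the gradient of the mean squared error
  weight <- weight - learning_rate * 2 * mean(error * x)
  bias <- bias - learning_rate * 2 * mean(error)
}

# The learned parameters end up close to the least-squares fit
c(bias = bias, weight = weight)
coef(lm(y ~ x))
```

Changing learning_rate or n_epochs changes how (and whether) training converges, while weight and bias are whatever the training run produces.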

Why tune hyperparameters?

Fantasy football players ~ Hyperparameters
Football players' positions ~ Hyperparameter values
Finding the best combination of players and positions ~ Finding the best combination of hyperparameters

Let's practice!
