Lasso Regression

Dimensionality Reduction in R

Matt Pickard

Owner, Pickard Predictives, LLC

Lasso regression overview

Supervised feature selection
L1 regularization
Penalizes the regression coefficients
Shrinks coefficients
Coefficients of less important features shrink to zero
Naturally performs feature selection

linear_reg(engine = "glmnet", penalty = 0.001, mixture = 1)
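
To see why an L1 penalty can set coefficients exactly to zero, a minimal base-R sketch of soft-thresholding may help. It assumes the special case of orthonormal predictors, where the lasso solution is the least-squares coefficient moved toward zero by the penalty. The function name soft_threshold and the coefficient values are made up for illustration, and glmnet's actual solver and penalty scaling differ.

```r
# Soft-thresholding: the lasso solution when predictors are orthonormal
# (an illustrative assumption, not how glmnet is implemented)
soft_threshold <- function(beta_ols, lambda) {
  sign(beta_ols) * pmax(abs(beta_ols) - lambda, 0)
}

# Hypothetical least-squares coefficients
beta_ols <- c(0.80, -0.30, 0.05, -0.02)

# Each coefficient moves toward zero by lambda;
# any coefficient smaller than lambda in absolute value becomes exactly zero
beta_lasso <- soft_threshold(beta_ols, lambda = 0.1)
print(beta_lasso)
```

The two small coefficients are removed entirely, which is the feature-selection behavior described above.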

Standardize data

Standardize data first, so penalty applies equally across features
Use scale() for target variable
- returns matrix, so convert to vector with as.vector()
Use step_normalize() for predictor variables

Example

# Scale target variable
df <- df %>% mutate(target = as.vector(scale(target))) 
... 
# Scale predictor variables
recipe() %>% step_normalize(all_numeric_predictors())
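
The reason for as.vector() can be checked with a small base-R sketch. The target values here are hypothetical; the point is only that scale() returns a one-column matrix, which as.vector() flattens back to a plain numeric vector.

```r
# Hypothetical target values
target <- c(10, 20, 30, 40, 50)

# scale() centers and scales, but returns a one-column matrix with attributes
scaled <- scale(target)
print(is.matrix(scaled))

# as.vector() drops the matrix structure and attributes
scaled_vec <- as.vector(scaled)
print(is.matrix(scaled_vec))

# The standardized values have mean 0 and standard deviation 1
print(c(mean = mean(scaled_vec), sd = sd(scaled_vec)))
```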

Choosing a penalty value

Penalty is a hyperparameter to optimize
Search for the best penalty value
Use tune() in tidymodels

linear_reg(engine = "glmnet", penalty = tune(), mixture = 1)

Preparing the data

Scale the target variable

house_sales_subset_df <- house_sales_subset_df %>% 
  mutate(price = as.vector(scale(price)))

Create the training and testing sets

split <- initial_split(house_sales_subset_df, prop = 0.8)
train <- split %>% training()
test <- split %>% testing()

Create a recipe

lasso_recipe <- 
  recipe(price ~ ., data = train) %>% 
  step_normalize(all_numeric_predictors())

Create the workflow

Create the model spec

lasso_model <- linear_reg(penalty = 0.01, mixture = 1, engine = "glmnet")

Create the workflow

lasso_workflow <- workflow(preprocessor = lasso_recipe, spec = lasso_model)

Fit the workflow

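# Note: estimate > 0 keeps only positive coefficients;
# use estimate != 0 to list every feature the lasso retained
tidy(lasso_workflow %>% fit(train)) %>% filter(estimate > 0)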

# A tibble: 9 × 3
  term          estimate penalty
  <chr>            <dbl>   <dbl>
1 bathrooms      0.0477     0.01
2 sqft_living    0.434      0.01
3 floors         0.0262     0.01
4 waterfront     0.133      0.01
5 view           0.0510     0.01
6 condition      0.0319     0.01
...              ...        ...
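
The choice of filter matters when some retained coefficients are negative. The base-R sketch below uses a made-up coefficient table (the terms age and zipcode and all estimates are hypothetical, not results from the house sales data) to compare the two filters.

```r
# Hypothetical coefficient table in the shape returned by tidy()
coefs <- data.frame(
  term     = c("(Intercept)", "sqft_living", "waterfront", "age", "zipcode"),
  estimate = c(0.01, 0.43, 0.13, -0.08, 0)
)

# estimate > 0 drops the negative coefficient for age
positive_only <- subset(coefs, estimate > 0)
print(positive_only)

# estimate != 0 keeps every retained feature; the intercept is excluded explicitly
selected <- subset(coefs, estimate != 0 & term != "(Intercept)")
print(selected)
```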

Create a tunable model workflow

Create a tunable model spec

lasso_model <- linear_reg(penalty = tune(), mixture = 1, engine = "glmnet")
lasso_workflow <- workflow(preprocessor = lasso_recipe, spec = lasso_model)

Create cross-validation folds from the training set

train_cv <- vfold_cv(train, v = 5)
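
As a rough illustration of what 5-fold cross-validation does with the training rows, here is a base-R sketch. It is not how vfold_cv() is implemented; the row count and the variable names are assumptions chosen for the example.

```r
set.seed(1)

n <- 20  # hypothetical number of training rows
k <- 5   # number of folds, matching v = 5

# Randomly assign each row to one of k equally sized folds
fold_id <- sample(rep(seq_len(k), length.out = n))

# Each fold is held out once for assessment while the rest is used for fitting
for (fold in seq_len(k)) {
  assessment_rows <- which(fold_id == fold)
  analysis_rows <- which(fold_id != fold)
  cat("Fold", fold, ": fit on", length(analysis_rows),
      "rows, assess on", length(assessment_rows), "rows\n")
}
```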

Create grid of penalty values

penalty_grid <- grid_regular(penalty(range = c(-3, -1)), levels = 20)

The range is given on the log10 scale, so a penalty range of 0.001 to 0.1 is specified as range = c(-3, -1)
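
The log10 convention can be checked in base R. The sketch below builds 20 values evenly spaced in the exponent between -3 and -1, which should correspond to the penalties in the grid; the variable names are illustrative.

```r
# Exponents for the penalty range, on the log10 scale
log10_range <- c(-3, -1)
n_levels <- 20

# Evenly spaced exponents, converted back to penalty values
penalties <- 10^seq(log10_range[1], log10_range[2], length.out = n_levels)

print(range(penalties))  # smallest and largest penalty
print(head(penalties, 3))
```

Because the spacing is even in the exponent, consecutive penalties differ by a constant ratio rather than a constant amount.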

Fit a grid of models

Create grid of fitted models

lasso_grid <- tune_grid(
  lasso_workflow,
  resamples = train_cv,
  grid = penalty_grid)

Plot model performances

autoplot(lasso_grid, metric = "rmse")

Penalty performance plot

[Figure: penalty performance plot]

Finalize the model

Retrieve the penalty value for the best model

best_rmse <- lasso_grid %>% select_best(metric = "rmse")

Refit the best model

final_lasso <- 
  finalize_workflow(lasso_workflow, best_rmse) %>% 
  fit(train)

Display the best model's coefficients

# estimate != 0 keeps every retained feature, including negative coefficients
tidy(final_lasso) %>% filter(estimate != 0)
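
For reference, the tuning steps can be run end to end on a built-in dataset. This is a sketch under stated assumptions: it uses mtcars with mpg as the target in place of the house sales data, the seed and the number of grid levels are arbitrary, and the tidymodels and glmnet packages must be installed. With so few rows, some resampling metrics may produce warnings.

```r
library(tidymodels)  # attaches parsnip, recipes, rsample, tune, workflows, dials, dplyr

set.seed(42)

# Scale the target (mtcars stands in for the house sales data)
cars_df <- mtcars %>% mutate(mpg = as.vector(scale(mpg)))

# Split the data; only the training set is used for tuning
split <- initial_split(cars_df, prop = 0.8)
train <- training(split)

# Recipe and tunable lasso spec, combined into a workflow
cars_recipe <- recipe(mpg ~ ., data = train) %>%
  step_normalize(all_numeric_predictors())
cars_model <- linear_reg(penalty = tune(), mixture = 1, engine = "glmnet")
cars_workflow <- workflow(preprocessor = cars_recipe, spec = cars_model)

# Cross-validation folds and a log10-spaced penalty grid
train_cv <- vfold_cv(train, v = 5)
penalty_grid <- grid_regular(penalty(range = c(-3, -1)), levels = 10)

# Fit one model per penalty per fold, then pick the penalty with the lowest RMSE
cars_grid <- tune_grid(cars_workflow, resamples = train_cv, grid = penalty_grid)
best_rmse <- select_best(cars_grid, metric = "rmse")

# Refit on the full training set with the chosen penalty
final_fit <- finalize_workflow(cars_workflow, best_rmse) %>% fit(train)
kept <- tidy(final_fit) %>% filter(estimate != 0)
print(kept)
```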

Let's practice!
