Tidy your models with broom

Machine Learning in the tidyverse

Dmitriy (Dima) Gorenshteyn

Lead Data Scientist, Memorial Sloan Kettering Cancer Center

List column workflow
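
As a rough illustration of this workflow, the sketch below nests a data frame into a list column, fits one model per row, and then tidies each model. The course data is not shown in this section, so the built-in mtcars data set, the grouping by cyl, and the names models and coefs are stand-in assumptions.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

# 1. Nest: one row per cyl value, remaining columns stored in the list column `data`
# 2. Map: fit one linear model per row and store it in the list column `model`
models <- mtcars %>%
  nest(data = -cyl) %>%
  mutate(model = map(data, ~ lm(mpg ~ wt, data = .x)))

# 3. Tidy each model, then unnest the results into an ordinary data frame
coefs <- models %>%
  mutate(coef = map(model, tidy)) %>%
  select(cyl, coef) %>%
  unnest(coef)

coefs
```

Each group contributes one row per model term, so the coefficients of all models end up in a single data frame that can be filtered or plotted like any other.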

Broom toolkit

  • tidy(): returns the statistical findings of the model (such as coefficients)

  • glance(): returns a concise one-row summary of the model

  • augment(): adds prediction columns to the data being modeled
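
A minimal sketch of the three functions side by side, focusing on the shape of what each returns. The model here is a stand-in fitted on the built-in mtcars data (an assumption, since the data behind algeria_model is not part of this section).

```r
library(broom)

# Stand-in model (assumption): simple linear regression on built-in mtcars
fit <- lm(mpg ~ wt, data = mtcars)

coef_table <- tidy(fit)     # one row per model term
fit_stats  <- glance(fit)   # exactly one row for the whole model
obs_table  <- augment(fit)  # one row per observation, with .fitted and .resid added

nrow(coef_table)  # 2: (Intercept) and wt
nrow(fit_stats)   # 1
nrow(obs_table)   # 32, one per row of mtcars
```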

Summary of algeria_model
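
For comparison, here is a sketch of what base R's summary() gives for a linear model. It uses a stand-in model on mtcars (an assumption, not the course's algeria_model) and shows that the result is not a data frame, so each piece has to be pulled out separately.

```r
# Stand-in model (assumption): the data behind algeria_model is not shown here
fit <- lm(mpg ~ wt, data = mtcars)

s <- summary(fit)
class(s)            # "summary.lm"
is.data.frame(s)    # FALSE

# Pieces must be extracted one by one, and each has a different shape
s$coefficients      # numeric matrix: estimates, std. errors, t values, p values
s$r.squared         # a single number
```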

tidy()

library(broom)

tidy(algeria_model)
         term      estimate   std.error statistic      p.value
1 (Intercept) -1196.5647772 39.93891866 -29.95987 1.319126e-33
2        year     0.6348625  0.02011472  31.56209 1.108517e-34
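
Because tidy() returns a data frame, its output can be filtered like any other table. The sketch below pulls out the slope term with a confidence interval, again on a stand-in mtcars model (an assumption).

```r
library(broom)
library(dplyr)

# Stand-in model (assumption): not the course's algeria_model
fit <- lm(mpg ~ wt, data = mtcars)

# conf.int = TRUE adds conf.low and conf.high columns (95% by default)
slope <- tidy(fit, conf.int = TRUE) %>%
  filter(term == "wt") %>%
  select(term, estimate, conf.low, conf.high, p.value)

slope
```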

glance()

glance(algeria_model)
  r.squared adj.r.squared    sigma statistic      p.value df
  0.9522064     0.9512505 2.176948  996.1653 1.108517e-34  2
     logLik      AIC     BIC deviance df.residual
  -113.2171 232.4342 238.288 236.9552          50
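
Since glance() always returns a single row, the rows from several models can be stacked to compare fits. This is a sketch with two stand-in models on mtcars. The list name fits and the model labels are illustrative assumptions.

```r
library(broom)
library(dplyr)

# Two stand-in models (assumption): the second adds one predictor
fits <- list(
  wt_only   = lm(mpg ~ wt, data = mtcars),
  wt_and_hp = lm(mpg ~ wt + hp, data = mtcars)
)

# One glance() row per model, labelled by the list names
comparison <- bind_rows(lapply(fits, glance), .id = "model") %>%
  select(model, r.squared, adj.r.squared, AIC)

comparison
```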

augment()

augment(algeria_model)
   life_expectancy year  .fitted   .se.fit     .resid       .hat   .sigma 
1            47.50 1960 47.76581 0.5951714 -0.2658128 0.07474601 2.198695 
2            48.02 1961 48.40068 0.5779264 -0.3806753 0.07047725 2.198326 
3            48.55 1962 49.03554 0.5608726 -0.4855379 0.06637924 2.197878 
4            49.07 1963 49.67040 0.5440279 -0.6004004 0.06245198 2.197265 
5            49.58 1964 50.30526 0.5274124 -0.7252630 0.05869547 2.196455 
6            50.09 1965 50.94013 0.5110485 -0.8501255 0.05510971 2.195498 
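
The added columns relate to each other in a simple way: .resid is the observed value minus .fitted. The sketch below checks this on a stand-in mtcars model (an assumption) and also uses the newdata argument to predict for values that were not in the training data. The new_cars data frame is hypothetical.

```r
library(broom)

# Stand-in model (assumption): not the course's algeria_model
fit <- lm(mpg ~ wt, data = mtcars)
aug <- augment(fit)

# Largest difference between (observed - fitted) and .resid; should be ~0
max_diff <- max(abs((aug$mpg - aug$.fitted) - aug$.resid))
max_diff

# Predictions for hypothetical new observations
new_cars <- data.frame(wt = c(2.5, 3.5))
preds <- augment(fit, newdata = new_cars)
preds
```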

Plotting augmented data

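library(dplyr)    # provides the %>% pipe
library(ggplot2)

# Observed life expectancy as points, the model's fitted values as a red line
augment(algeria_model) %>%
  ggplot(mapping = aes(x = year)) +
  geom_point(mapping = aes(y = life_expectancy)) +
  geom_line(mapping = aes(y = .fitted), color = "red")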

Let's use broom!
