Lending a helper hand

Programming with dplyr

Dr. Chester Ismay

Educator, Data Scientist, and R/Python Consultant

starts_with()

  • Selects columns whose names start with a given prefix when used inside select()
  • Defined in the tidyselect package, but re-exported by dplyr, so it is available once library(dplyr) is run
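
As a minimal sketch of the idea, the snippet below builds a small made-up tibble (the name toy and its values are assumptions for illustration, not the course's world_bank_data) and keeps only the columns whose names begin with "perc".

```r
library(dplyr)

# Hypothetical toy data, not the course's world_bank_data
toy <- tibble(
  country = c("A", "B"),
  perc_rural_pop = c(45.6, 35.6),
  perc_electric_access = c(100, 83.8),
  fertility_rate = c(1.47, 3.79)
)

# Keep only the columns whose names begin with "perc"
perc_cols <- toy %>%
  select(starts_with("perc"))

names(perc_cols)
```

The matching columns come back in the order they appear in the data, and country and fertility_rate are dropped because their names do not start with "perc".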

starts_with() example

world_bank_data %>% 
  select(starts_with("perc"))
# A tibble: 300 x 4
   perc_electric_access perc_college_complete perc_cvd_crd_70 perc_rural_pop
                  <dbl>                 <dbl>           <dbl>          <dbl>
 1                100                    7.26            15.8          45.6 
 2                100                   20.4             26.2          35.6 
 3                100                   18.0             28.1          30.8 
 4                100                    7.57            15.5          45.0 
 5                 83.8                  3.92            33.3          66.0 
# ... with 295 more rows

Where and when are these results from!?

world_bank_data %>% 
  select(country, year, starts_with("perc"))
# A tibble: 300 x 6
   country       year perc_electric_access perc_college_complete perc_cvd_crd_70 perc_rural_pop
   <chr>        <dbl>                <dbl>                 <dbl>           <dbl>          <dbl>
 1 Portugal      2000                100                    7.26            15.8          45.6 
 2 Armenia       2001                100                   20.4             26.2          35.6 
 3 Bulgaria      2001                100                   18.0             28.1          30.8 
 4 Portugal      2001                100                    7.57            15.5          45.0 
 5 Pakistan      2005                 83.8                  3.92            33.3          66.0 
# ... with 295 more rows

ends_with()

world_bank_data %>% 
  select(country, year, ends_with("rate"))
# A tibble: 300 x 5
   country       year infant_mortality_rate fertility_rate unemployment_rate
   <chr>        <dbl>                 <dbl>          <dbl>             <dbl>
 1 Portugal      2000                   5.5           1.47             3.81 
 2 Armenia       2001                  25.3           1.2             10.9  
 3 Bulgaria      2001                  17.1           1.2             19.9  
 4 Portugal      2001                   5.2           1.46             3.83 
 5 Pakistan      2005                  80             3.79             0.610
# ... with 295 more rows
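
ends_with() works like starts_with(), but matches on the end of the column name. Both helpers ignore case by default through their ignore.case argument. The sketch below uses a made-up tibble (toy and its mixed-case column name are assumptions for illustration) to show the difference between the default and a case-sensitive match.

```r
library(dplyr)

# Hypothetical toy data, not the course's world_bank_data
toy <- tibble(
  country = c("A", "B"),
  year = c(2000, 2005),
  fertility_rate = c(1.47, 3.79),
  Unemployment_Rate = c(3.81, 0.61),
  perc_rural_pop = c(45.6, 66.0)
)

# Default (ignore.case = TRUE): "Unemployment_Rate" also matches "rate"
loose <- toy %>%
  select(country, year, ends_with("rate"))

# Case-sensitive: only names ending in lowercase "rate" match
strict <- toy %>%
  select(country, year, ends_with("rate", ignore.case = FALSE))

names(loose)
names(strict)
```

Naming country and year explicitly alongside the helper keeps the identifying columns in the result, just as in the slide above.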

Let's practice!
