The map family of functions

Machine Learning in the Tidyverse

Dmitriy (Dima) Gorenshteyn

Lead Data Scientist, Memorial Sloan Kettering Cancer Center

List Column Workflow

The map function

Mean Population by Country

mean(nested$data[[1]]$population)
[1] 23129438

map(.x = nested$data, .f = ~mean(.x$population))
[[1]]
[1] 23129438

[[2]]
[1] 30783053

[[3]]
[1] 16074837

[[4]]
[1] 7746272
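
The `~mean(.x$population)` formula above is purrr shorthand for an anonymous function in which `.x` stands for the current list element. A minimal sketch of that equivalence, assuming a hypothetical `toy_data` list that stands in for `nested$data`:

```r
library(purrr)

# Hypothetical stand-in for nested$data: a list of two small data frames
toy_data <- list(
  data.frame(population = c(10, 20, 30)),
  data.frame(population = c(100, 200))
)

# Formula shorthand: .x refers to the current element of the list
means_formula <- map(.x = toy_data, .f = ~mean(.x$population))

# The same computation written as an explicit anonymous function
means_function <- map(toy_data, function(df) mean(df$population))

# Both calls return a list with one mean per data frame
identical(means_formula, means_function)
```

Either spelling returns a list with one element per input, which is why the printed result above uses `[[1]]`, `[[2]]`, and so on.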

2: Work with List Columns - map() and mutate()

pop_df <- nested %>% 
  mutate(pop_mean = map(data, ~mean(.x$population)))

pop_df
# A tibble: 77 x 3
   country    data              pop_mean 
   <fct>      <list>            <list>   
 1 Algeria    <tibble [52 × 6]> <dbl [1]>
 2 Argentina  <tibble [52 × 6]> <dbl [1]>
 3 Australia  <tibble [52 × 6]> <dbl [1]>
 4 Austria    <tibble [52 × 6]> <dbl [1]>
 5 Bangladesh <tibble [52 × 6]> <dbl [1]>
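
To try this step without the course data, the sketch below builds a small nested tibble and applies the same `map()` inside `mutate()` pattern. The `toy`, `toy_nested`, and `toy_pop` objects are hypothetical and only illustrate the shape of the result.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical toy data: two countries, two years each
toy <- tibble(
  country    = c("A", "A", "B", "B"),
  year       = c(2000, 2001, 2000, 2001),
  population = c(10, 30, 100, 200)
)

# Nest everything except country into a list column called `data`
toy_nested <- toy %>%
  group_by(country) %>%
  nest() %>%
  ungroup()

# map() returns a list, so pop_mean is itself a list column
toy_pop <- toy_nested %>%
  mutate(pop_mean = map(data, ~mean(.x$population)))

toy_pop
```

As in the output above, `pop_mean` is a list column whose elements are length-one doubles.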

3: Simplify List Columns - unnest()

pop_df %>% 
  unnest(pop_mean)
# A tibble: 77 x 3
   country    data               pop_mean
   <fct>      <list>                <dbl>
 1 Algeria    <tibble [52 × 6]>  23129438
 2 Argentina  <tibble [52 × 6]>  30783053
 3 Australia  <tibble [52 × 6]>  16074837
 4 Austria    <tibble [52 × 6]>   7746272
 5 Bangladesh <tibble [52 × 6]>  97649407
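
A small sketch of what `unnest()` does to a list column of length-one doubles, using a hypothetical `pop_list` tibble rather than the course data:

```r
library(dplyr)
library(tidyr)

# Hypothetical tibble whose pop_mean column is a list of length-one doubles
pop_list <- tibble(
  country  = c("A", "B"),
  pop_mean = list(20, 150)
)

# unnest() flattens the list column into an ordinary double column
pop_flat <- pop_list %>%
  unnest(pop_mean)

pop_flat
```

After unnesting, `pop_mean` is a plain numeric vector, so it can be used directly in arithmetic, filtering, or plotting.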

Work With + Simplify List Columns with map_*()

Function     Returns
map()        list
map_dbl()    double
map_lgl()    logical
map_chr()    character
map_int()    integer
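
The variants listed above can be compared side by side on one small input. This is only an illustrative sketch; the list `x` and the result names are made up for the example.

```r
library(purrr)

# Hypothetical input: a list of two numeric vectors
x <- list(c(1, 2, 3), c(4, 5))

as_list <- map(x, length)                          # list of two integers
as_int  <- map_int(x, length)                      # integer vector
as_dbl  <- map_dbl(x, mean)                        # double vector
as_lgl  <- map_lgl(x, ~length(.x) > 2)             # logical vector
as_chr  <- map_chr(x, ~paste(.x, collapse = "-"))  # character vector
```

Each typed variant expects the function to return a single value of the matching type per element, and returns an atomic vector instead of a list.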

Work With + Simplify List Columns with map_dbl()

nested %>% 
  mutate(pop_mean = map_dbl(data, ~mean(.x$population)))
# A tibble: 77 x 3
   country    data               pop_mean
   <fct>      <list>                <dbl>
 1 Algeria    <tibble [52 × 6]>  23129438
 2 Argentina  <tibble [52 × 6]>  30783053
 3 Australia  <tibble [52 × 6]>  16074837
 4 Austria    <tibble [52 × 6]>   7746272
 5 Bangladesh <tibble [52 × 6]>  97649407
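
One way to check that `map_dbl()` combines the work and simplify steps is to compare it with `map()` followed by `unnest()` on the same input. The sketch below does this on a hypothetical `toy_nested` tibble, not the course data.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical nested tibble: one row per country, data in a list column
toy_nested <- tibble(
  country    = c("A", "A", "B", "B"),
  population = c(10, 30, 100, 200)
) %>%
  group_by(country) %>%
  nest() %>%
  ungroup()

# Two steps: map() builds a list column, unnest() flattens it
two_step <- toy_nested %>%
  mutate(pop_mean = map(data, ~mean(.x$population))) %>%
  unnest(pop_mean)

# One step: map_dbl() returns a double vector directly
one_step <- toy_nested %>%
  mutate(pop_mean = map_dbl(data, ~mean(.x$population)))

all.equal(two_step$pop_mean, one_step$pop_mean)
```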

Build Models with map()

nested %>%
  mutate(model = map(data, ~lm(formula = population ~ fertility,
                               data = .x)))
# A tibble: 77 x 3
   country    data              model   
   <fct>      <list>            <list>  
 1 Algeria    <tibble [52 × 6]> <S3: lm>
 2 Argentina  <tibble [52 × 6]> <S3: lm>
 3 Australia  <tibble [52 × 6]> <S3: lm>
 4 Austria    <tibble [52 × 6]> <S3: lm>
 5 Bangladesh <tibble [52 × 6]> <S3: lm>
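
Once each row holds a fitted model, the same map functions can pull values back out of it. The sketch below fits one `lm()` per country on hypothetical, exactly linear toy data and extracts each slope with `map_dbl()`; the `slope` column and the `toy_models` name are illustrative additions, not part of the original slides.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical toy data: population is an exact linear function of fertility
toy <- tibble(
  country    = rep(c("A", "B"), each = 4),
  fertility  = c(1, 2, 3, 4, 1, 2, 3, 4),
  population = c(12, 14, 16, 18, 50, 45, 40, 35)
)

toy_models <- toy %>%
  group_by(country) %>%
  nest() %>%
  ungroup() %>%
  mutate(
    # One linear model per country, stored in a list column
    model = map(data, ~lm(formula = population ~ fertility, data = .x)),
    # Extract the fertility coefficient from each model as a double
    slope = map_dbl(model, ~coef(.x)[["fertility"]])
  )

toy_models
```

Because the toy data is exactly linear, the slopes come out at approximately 2 for country A and -5 for country B.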

Let's map something!
