List columns

Intermediate Functional Programming with purrr

Colin Fay

Data Scientist at ThinkR

What is a list column?

A data.frame with a list for a column:

library(tidyverse)
df <- tibble(
  classic = c("a", "b", "c"),
  list = list(
    c("a", "b", "c"),
    c("a", "b", "c", "d"),
    c("a", "b", "c", "d", "e")))
df
# A tibble: 3 x 2
  classic list     
  <chr>   <list>   
1 a       <chr [3]>
2 b       <chr [4]>
3 c       <chr [5]>
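
Each element of a list column is an ordinary R object, so it can be pulled out or summarised row by row. The following is a minimal sketch that rebuilds the df above; the column name n_elements and the object df_len are introduced here for illustration only and are not part of the original slides.

```r
library(tibble)
library(dplyr)
library(purrr)

df <- tibble(
  classic = c("a", "b", "c"),
  list = list(
    c("a", "b", "c"),
    c("a", "b", "c", "d"),
    c("a", "b", "c", "d", "e")))

# [[ ]] extracts a single element of the list column: here a character vector
second_element <- df$list[[2]]

# map_int() returns one integer per row: the length of each element
# (n_elements is a hypothetical column name used for this sketch)
df_len <- df %>% mutate(n_elements = map_int(list, length))
df_len
```

Using map_int() rather than map() keeps the new column a plain integer vector instead of another list column.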

Why list columns?

library(tidyverse)
library(rvest)
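# partial() pre-fills arguments: a_node() selects every <a> node
a_node <- partial(html_nodes, css = "a")
# href() extracts the "href" attribute of each node
href <- partial(html_attr, name = "href")
# compose() applies right to left: read_html, then a_node, then href
get_links <- compose(href, a_node, read_html)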
urls_df <- tibble(
  urls = c("https://thinkr.fr", "https://colinfay.me",
           "https://en.wikipedia.org", "https://www.datacamp.com"))
urls_df %>% mutate(links = map(urls, get_links))
# A tibble: 4 x 2
  urls                       links      
  <chr>                      <list>     
1 https://thinkr.fr          <chr [106]>
2 https://colinfay.me        <chr [33]> 
3 https://en.wikipedia.org   <chr [93]> 
4 https://www.datacamp.com   <chr [1]>

Unnesting a nested data.frame

urls_df %>%
  mutate(links = map(urls, get_links)) %>%
  unnest(cols = c(links))
# A tibble: 233 x 2
   urls           links                                          
   <chr>          <chr>                                          
 1 https://think… https://thinkr.fr/                             
 2 https://think… https://thinkr.fr/                             
 3 https://think… https://thinkr.fr/formation-au-logiciel-r/     
 4 https://think… https://thinkr.fr/formation-au-logiciel-r/intr…
 5 https://think… https://thinkr.fr/formation-au-logiciel-r/stat…
 6 https://think… https://thinkr.fr/formation-au-logiciel-r/prog…
 7 https://think… https://thinkr.fr/formation-au-logiciel-r/r-et…
 8 https://think… https://thinkr.fr/formation-au-logiciel-r/r-po…
 9 https://think… https://thinkr.fr/formation-au-logiciel-r/inte…
10 https://think… https://thinkr.fr/formation-au-logiciel-r/form…
# ... with 223 more rows
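
The scraping example needs network access, so the same unnest() pattern can also be tried offline. This is a minimal sketch on a small hand-built tibble; the object names df and df_long are assumptions made for this example.

```r
library(tibble)
library(tidyr)

df <- tibble(
  classic = c("a", "b", "c"),
  list = list(
    c("a", "b", "c"),
    c("a", "b", "c", "d"),
    c("a", "b", "c", "d", "e")))

# unnest() gives one row per element of the list column and
# repeats the value of `classic` for every element of its row
df_long <- df %>% unnest(cols = c(list))
df_long
```

The three rows hold 3, 4 and 5 elements, so the unnested tibble has 12 rows and `list` becomes a character column.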

nest() a standard data.frame

library(dplyr)
library(tidyr)
iris_n <- iris %>% 
  group_by(Species) %>% 
  nest()  
iris_n
# A tibble: 3 x 2
  Species    data             
  <fct>      <list>           
1 setosa     <tibble [50 × 4]>
2 versicolor <tibble [50 × 4]>
3 virginica  <tibble [50 × 4]>

A new list to map on

iris_n %>%
  mutate(lm = map(data, ~ lm(Sepal.Length ~ Sepal.Width, data = .x)))
# A tibble: 3 x 3
  Species    data              lm      
  <fct>      <list>            <list>  
1 setosa     <tibble [50 × 4]> <S3: lm>
2 versicolor <tibble [50 × 4]> <S3: lm>
3 virginica  <tibble [50 × 4]> <S3: lm>
...

nest() and unnest()

summary_lm <- compose(summary, lm)
iris %>% 
  group_by(Species) %>% 
  nest() %>%
  mutate(data = map(data, ~ summary_lm(Sepal.Length ~ Sepal.Width, 
                                       data = .x)), 
         data = map(data, "r.squared")) %>%
  unnest(cols = c(data))
# A tibble: 3 x 2
  Species     data
  <fct>      <dbl>
1 setosa     0.551
2 versicolor 0.277
3 virginica  0.209
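
When each model yields a single number, a typed mapper can stand in for the map() plus unnest() pair. The following is a sketch of that alternative, not the approach used on the slide: it calls summary(lm(...)) directly instead of the composed summary_lm(), and the names iris_r2 and r_squared are introduced here for illustration.

```r
library(dplyr)
library(tidyr)
library(purrr)

iris_r2 <- iris %>%
  group_by(Species) %>%
  nest() %>%
  # map_dbl() returns a double vector directly, so no unnest() is needed
  mutate(r_squared = map_dbl(
    data,
    ~ summary(lm(Sepal.Length ~ Sepal.Width, data = .x))$r.squared)) %>%
  # drop the nested tibbles once the summary value has been extracted
  select(-data) %>%
  ungroup()
iris_r2
```

map_dbl() also fails loudly if any element does not produce exactly one number, which map() followed by unnest() would not catch as early.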

Let's practice!
