Categorical Data in the Tidyverse
Emily Robinson
Data Scientist
# A tibble: 1,040 x 27
  RespondentID travel_amount   do_recline
         <dbl> <chr>           <chr>
1  3436139758. Once a year or… NA
2  3434278696. Once a year or… About half t…
3  3434275578. Once a year or… Usually
4  3434268208. Once a year or… Always
# ... with 24 more variables: height <chr>,
# children_sub_18 <chr>,
# middle_arm_rest_three <chr>,
# middle_arm_rest_two <chr>,
# window_shade_control <chr>,
# rude_move_seats <chr>, rude_talk <chr>,
# times_get_up <chr>,
# recliner_obligation <chr>,
# rude_recline <chr>,
# eliminate_recline <chr>,
# rude_switch_seats_friend <chr>,
wide_data
# A tibble: 2 x 3
  favorite_fruit favorite_vegetable disliked_dessert
  <chr>          <chr>              <chr>
1 apple          carrot             cookie
2 orange         cauliflower        cake
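To follow along, a tibble matching the printed output can be rebuilt by hand. This is a minimal sketch, assuming the tibble package is installed; the values are copied from the output above, and the slides do not show how `wide_data` was originally created.

```r
library(tibble)

# Recreate the two-row example tibble (values copied from the printed output)
wide_data <- tribble(
  ~favorite_fruit, ~favorite_vegetable, ~disliked_dessert,
  "apple",         "carrot",            "cookie",
  "orange",        "cauliflower",       "cake"
)

wide_data
```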
wide_data %>%
  mutate(across(where(is.character), as.factor))
# A tibble: 2 x 3
  favorite_fruit favorite_vegetable disliked_dessert
  <fct>          <fct>              <fct>
1 apple          carrot             cookie
2 orange         cauliflower        cake
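One way to check the conversion is to count the levels of each new factor with base R's `nlevels()`. The sketch below assumes dplyr 1.0 or later and rebuilds the example data from the printed output; `level_counts` is an illustrative name, not something from the slides.

```r
library(dplyr)

# Rebuild the example tibble (values copied from the printed output)
wide_data <- dplyr::tibble(
  favorite_fruit     = c("apple", "orange"),
  favorite_vegetable = c("carrot", "cauliflower"),
  disliked_dessert   = c("cookie", "cake")
)

# Convert character columns to factors, then count the levels in each column
level_counts <- wide_data %>%
  mutate(across(where(is.character), as.factor)) %>%
  summarise(across(everything(), nlevels))

level_counts
```

Each column here holds two distinct values, so every factor ends up with two levels.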
wide_data %>%
  pivot_longer(everything(), names_to = "column", values_to = "value")
# A tibble: 6 x 2
  column             value
  <chr>              <chr>
1 favorite_fruit     apple
2 favorite_vegetable carrot
3 disliked_dessert   cookie
4 favorite_fruit     orange
5 favorite_vegetable cauliflower
6 disliked_dessert   cake
wide_data %>%
  select(contains("favorite"))
# A tibble: 2 x 2
  favorite_fruit favorite_vegetable
  <chr>          <chr>
1 apple          carrot
2 orange         cauliflower