Data Manipulation in Julia
Katerina Zahradova
Instructor
# Using unstack to pivot # unstack(DataFrame, :col_as_rows, :col_as_cols, :values, combine = f_to_aggregate)
unstack(penguins, :species, :sex, :body_mass_g, combine = median)
3×3 DataFrame
Row species MALE FEMALE
String15 Float64 Float64
_____________________________________
1 Adelie 4000.0 3400.0
2 Chinstrap 3950.0 3550.0
3 Gentoo 5500.0 4700.0
# Pivot on more columns as rows
unstack(penguins, [:species, :island], :sex, :body_mass_g, combine = median)
5×4 DataFrame
Row species island MALE FEMALE
String15 String15 Float64? Float64?
_____________________________________________________
1 Adelie Torgersen 4000.0 3400.0
2 Adelie Biscoe 4000.0 3375.0
3 Adelie Dream 3987.5 3400.0
4 Chinstrap Dream 3950.0 3550.0
5 Gentoo Biscoe 5500.0 4700.0
# Using multiple columns as columns
unstack(penguins, :sex, [:species, :island], :body_mass_g, combine = sum)
MethodError: no method matching
unstack(::DataFrame, ::Symbol, ::Vector{Symbol}, ::Symbol; combine = sum)
# Missing values for certain combinations
unstack(penguins_missing, :species, :sex, :body_mass_g, combine = median)
3×3 DataFrame
Row species MALE FEMALE
String15 Float64 Float64?
_____________________________________
1 Adelie 4000.0 3400.0
2 Chinstrap 3950.0 3550.0
3 Gentoo 5500.0 missing
# Missing values for certain combinations
unstack(penguins_missing, :species, :sex, :body_mass_g, combine = median, fill = -1)
3×3 DataFrame
Row species MALE FEMALE
String15 Float64 Float64
_____________________________________
1 Adelie 4000.0 3400.0
2 Chinstrap 3950.0 3550.0
3 Gentoo 5500.0 -1
# Saving the pivot table
pivot_penguins = unstack(penguins_missing, :species, :sex, :body_mass_g, combine = median)
# Select only female penguins
select(pivot_penguins, :species, :FEMALE)
3×2 DataFrame
Row species FEMALE
String15 Float64?
_______________________________
1 Adelie 3400.0
...
Data Manipulation in Julia