Data Manipulation with data.table in R
Matt Dowle, Arun Srinivasan
Instructors, DataCamp
Recall that you can select multiple columns using .()
# Recap: Select trip_id and duration columns
ans <- batrips[, .(trip_id, dur = duration)]
head(ans, 2)
trip_id dur
139545 435
139546 432
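As a minimal sketch of the selection above: inside a data.table's j, .() is an alias for list(), so both forms return the same columns. The batrips table built here is hypothetical toy data with assumed values, not the course dataset.

```r
library(data.table)

# Hypothetical toy stand-in for batrips (values are made up for illustration)
batrips <- data.table(
  trip_id       = c(1L, 2L, 3L, 4L),
  duration      = c(300L, 600L, 900L, 3000L),
  start_station = c("Japantown", "Townsend at 7th", "Japantown", "Market at 4th")
)

# Select trip_id and duration, renaming duration to dur
ans <- batrips[, .(trip_id, dur = duration)]

# Same selection written with list() instead of .()
ans_list <- batrips[, list(trip_id, dur = duration)]

names(ans)                # "trip_id" "dur"
all.equal(ans, ans_list)  # TRUE
```

Naming an element inside .() (here dur = duration) renames the column in the result; unnamed elements keep their original column names.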
You can compute on multiple columns and return a data.table the same way
# Get mean and median of duration
batrips[, .(mn_dur = mean(duration),
med_dur = median(duration))]
mn_dur med_dur
1131.967 511
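To check the shape of the result, the same computation can be sketched on hypothetical toy data (the values below are assumptions chosen so the arithmetic is easy to verify by hand). Each summary in .() becomes one column of a one-row data.table.

```r
library(data.table)

# Hypothetical toy stand-in for batrips (values are made up for illustration)
batrips <- data.table(
  trip_id  = c(1L, 2L, 3L, 4L),
  duration = c(300L, 600L, 900L, 3000L)
)

# Compute two summaries of duration in a single j expression
stats <- batrips[, .(mn_dur  = mean(duration),
                     med_dur = median(duration))]

is.data.table(stats)  # TRUE: the result is itself a data.table
nrow(stats)           # 1
stats$mn_dur          # 1200  (4800 / 4)
stats$med_dur         # 750   (midpoint of 600 and 900)
```

As in the course output, a single long trip pulls the mean well above the median.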
Together with i, you can compute on columns in j only for those rows that satisfy a condition
batrips[start_station == "Japantown", .(mn_dur = mean(duration),
med_dur = median(duration))]
mn_dur med_dur
2464.331 782
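A small sketch of how i and j combine, again on hypothetical toy data with assumed values: the row filter in i is applied first, and j is then evaluated only on the rows that remain. Subsetting first and summarising second gives the same answer in two steps.

```r
library(data.table)

# Hypothetical toy stand-in for batrips (values are made up for illustration)
batrips <- data.table(
  trip_id       = 1:5,
  duration      = c(300L, 600L, 900L, 1200L, 3000L),
  start_station = c("Japantown", "Townsend at 7th", "Japantown",
                    "Market at 4th", "Japantown")
)

# One step: filter rows in i, summarise in j
jp <- batrips[start_station == "Japantown",
              .(mn_dur  = mean(duration),
                med_dur = median(duration))]

# Two steps: subset the rows first, then summarise the subset
jp_rows  <- batrips[start_station == "Japantown"]
two_step <- jp_rows[, .(mn_dur  = mean(duration),
                        med_dur = median(duration))]

jp$mn_dur                # 1400  (300 + 900 + 3000, divided by 3)
jp$med_dur               # 900
all.equal(jp, two_step)  # TRUE
```

The one-step form avoids creating the intermediate jp_rows table.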