Grouped aggregations

Data Manipulation with data.table in R

Matt Dowle, Arun Srinivasan

Instructors, DataCamp

Combining ":=" with by

ncol(batrips)
11
batrips[, n_zip_code := .N, by = zip_code]
ncol(batrips)
12
batrips[, n_zip_code := .N, by = zip_code][]
trip_id duration ... zip_code  n_zip_code
 139545      435 ...    94612        1228
 139546      432 ...    94107       36061
 139547     1523 ...    94112        2168
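
The slide above relies on two behaviours that are easy to miss: `:=` with `by` keeps every row (it does not collapse to one row per group), and it updates the table in place. A minimal sketch on a made-up toy table, where `trips` and its values are hypothetical stand-ins for `batrips`, might look like this:

```r
library(data.table)

# Hypothetical toy data standing in for batrips (not the course dataset)
trips <- data.table(
  trip_id  = 1:6,
  zip_code = c("94107", "94107", "94612", "94107", "94112", "94612")
)

# Aggregating with .() collapses the result to one row per group
counts <- trips[, .(n_zip_code = .N), by = zip_code]

# := with by keeps all rows and adds the group size as a new column,
# modifying trips by reference (no assignment with <- is needed)
trips[, n_zip_code := .N, by = zip_code]

# A trailing [] prints the updated table
trips[]
```

Here `counts` has one row per zip code, while `trips` still has its original six rows plus the new `n_zip_code` column.
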
batrips[n_zip_code > 1000]
bike_id subscription_type zip_code n_zip_code
    473        Subscriber    94612       1228
    395        Subscriber    94107      36061
    331        Subscriber    94112       2168
    335          Customer    94109       6980
    580          Customer                1541
    ...               ...      ...        ...    
    677        Subscriber    94107      36061
    604        Subscriber    94133      15687
    480          Customer    94109       6980
    277          Customer    94109       6980
     56        Subscriber    94105      19899
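
Once the group count is stored as a column, it can be used in `i` like any other column. As a rough sketch on made-up data (the `trips` table, its `bike_id` values, and the threshold of 1 are assumptions for illustration, not the course data):

```r
library(data.table)

# Hypothetical toy data; the values are invented for illustration
trips <- data.table(
  bike_id  = c(473, 395, 331, 335, 56, 12),
  zip_code = c("94612", "94107", "94112", "94107", "94107", "94612")
)

# Store the number of trips per zip code on every row
trips[, n_zip_code := .N, by = zip_code]

# Keep only rows whose zip code occurs more than once
frequent <- trips[n_zip_code > 1]
frequent
```

The zip code that appears only once ("94112") is dropped, and every remaining row carries its group's count.
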

batrips[, n_zip_code := .N, by = zip_code]

zip_1000 <- batrips[n_zip_code > 1000][, n_zip_code := NULL]
# Same as the two steps above, chained into one expression
zip_1000 <- batrips[, n_zip_code := .N, 
                    by = zip_code][n_zip_code > 1000][, n_zip_code := NULL]
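
One detail worth checking in the chained form: the row filter returns a new data.table, so the final `:= NULL` removes the helper column from the filtered result only. A small sketch under the same toy-data assumption (`trips`, `zip_2`, and the threshold of 2 are hypothetical):

```r
library(data.table)

# Hypothetical toy data standing in for batrips
trips <- data.table(
  trip_id  = 1:6,
  zip_code = c("94107", "94107", "94612", "94107", "94112", "94612")
)

# Add the group count, filter on it, then drop it from the filtered result
zip_2 <- trips[, n_zip_code := .N,
               by = zip_code][n_zip_code >= 2][, n_zip_code := NULL]

names(zip_2)  # helper column is gone from the filtered copy
names(trips)  # helper column is still present in the original table
```

If the helper column should not stay on the original table either, it can be removed afterwards with `trips[, n_zip_code := NULL]`.
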
Let's practice!
