Computing on columns the data.table way

Data Manipulation with data.table in R

Matt Dowle, Arun Srinivasan

Instructors, DataCamp

Computing on columns

Because columns can be referred to as variables inside the square brackets, you can compute on them directly in j.

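# Compute mean of duration column using the data.table way
ans <- batrips[, mean(duration)]
1131.967
# Compute mean of duration column using the data.frame way
# (on a data.table, batrips[, "duration"] returns a one-column data.table,
#  so extract the column as a vector with [[ ]] before calling mean())
ans <- mean(batrips[["duration"]])
1131.967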
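As a minimal sketch of the same idea, the example below uses a small made-up table named trips (a hypothetical stand-in, not the real batrips data) to show that j accepts any expression built from column names, not only a single function call.

```r
library(data.table)

# Hypothetical toy data; the real batrips table has many more rows and columns
trips <- data.table(
  start_station = c("Japantown", "Japantown", "Townsend", "Townsend"),
  duration      = c(600, 1200, 300, 900)
)

# Column names behave like variables inside j
mean_duration  <- trips[, mean(duration)]                 # 750
duration_range <- trips[, max(duration) - min(duration)]  # 900

mean_duration
duration_range
```

Because j here evaluates to a plain vector, each result is an ordinary numeric value rather than a data.table.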

Computing on rows and columns

Combining i and j is straightforward: i filters the rows first, and j is then computed only on the rows that remain.

# Compute mean of duration column for "Japantown" start station
batrips[start_station == "Japantown", mean(duration)]
2464.331
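To make the order of evaluation concrete, here is a small sketch on a made-up table named trips (hypothetical data, not batrips). It compares the combined i and j call with an equivalent two-step version that filters first and computes afterwards.

```r
library(data.table)

# Hypothetical toy data standing in for batrips
trips <- data.table(
  start_station = c("Japantown", "Japantown", "Townsend", "Townsend"),
  duration      = c(600, 1200, 300, 900)
)

# One step: i keeps the Japantown rows, j averages their duration
one_step <- trips[start_station == "Japantown", mean(duration)]

# Equivalent two-step version, shown only for comparison
japantown <- trips[start_station == "Japantown"]
two_step  <- japantown[, mean(duration)]

identical(one_step, two_step)  # TRUE; both are 900
```

The one-step form avoids creating the intermediate japantown table, which is the main reason to prefer it.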

Special symbol .N in j

  • .N can be used in j as well
  • Particularly useful to get the number of rows after filtering in i
# How many trips started from "Japantown"?
batrips[start_station == "Japantown", .N]
902
# Compare this to the data.frame way
nrow(batrips[batrips$start_station == "Japantown", ])
902
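The sketch below illustrates .N in j on a made-up table named trips (hypothetical data, not batrips), including the cases where i is omitted or matches no rows. The station name "Nowhere" is invented for the example.

```r
library(data.table)

# Hypothetical toy data standing in for batrips
trips <- data.table(
  start_station = c("Japantown", "Japantown", "Townsend", "Townsend"),
  duration      = c(600, 1200, 300, 900)
)

# Rows remaining after the filter in i
n_japantown <- trips[start_station == "Japantown", .N]  # 2

# With no i, .N is the total row count, the same as nrow(trips)
n_all <- trips[, .N]                                    # 4

# A filter that matches nothing gives 0
n_none <- trips[start_station == "Nowhere", .N]         # 0

# Any logical condition works in i
n_long <- trips[duration > 500, .N]                     # 3
```

Compared with the nrow() version, .N avoids repeating the table name inside the brackets.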

Let's practice!
