Helpers for filtering

Data Manipulation with data.table in R

Matt Dowle and Arun Srinivasan

Instructors, DataCamp

%like%

  • %like% tests each element of a character or factor vector against a regular-expression pattern
    • Usage: col %like% pattern
# Subset all rows where start_station starts with San Francisco
batrips[start_station %like% "^San Francisco"]

# Instead of 
batrips[grepl("^San Francisco", start_station)]
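
To see that the two forms above select the same rows, and that %like% also accepts a factor, here is a minimal sketch. The `trips` table and its values are made up for illustration and only stand in for batrips.

```r
library(data.table)

# Hypothetical toy data standing in for batrips
trips <- data.table(
  start_station = c("San Francisco City Hall", "Japantown", "San Francisco Caltrain"),
  duration = c(1500, 2500, 3000)
)

# Both calls keep the rows whose start_station begins with "San Francisco"
like_rows  <- trips[start_station %like% "^San Francisco"]
grepl_rows <- trips[grepl("^San Francisco", start_station)]

# %like% can also be applied to a factor vector
station_factor <- factor(trips$start_station)
factor_hits <- station_factor %like% "^San Francisco"
```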

%between%

  • %between% allows you to search for values in the closed interval [val1, val2]
    • Usage: numeric_col %between% c(val1, val2)
# Subset all rows where duration is between 2000 and 3000
batrips[duration %between% c(2000, 3000)]

# Instead of 
batrips[duration >= 2000 & duration <= 3000]
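
Because the interval is closed, both endpoints are included. The sketch below checks this on a small made-up vector (`durations` is hypothetical, not taken from batrips) and contrasts it with `between()` called with `incbounds = FALSE`, which excludes the endpoints.

```r
library(data.table)

# Hypothetical durations chosen to sit on and around the bounds
durations <- c(1999, 2000, 2500, 3000, 3001)

# Closed interval: 2000 and 3000 themselves are matched
closed_hits <- durations %between% c(2000, 3000)

# Same result as the explicit comparison
manual_hits <- durations >= 2000 & durations <= 3000

# Open interval via the function form: endpoints are excluded
open_hits <- between(durations, 2000, 3000, incbounds = FALSE)
```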

%chin%

  • %chin% is similar to %in%, but it works only on character vectors, where it is typically faster
    • Usage: character_col %chin% c("val1", "val2", "val3")
# Subset all rows where start_station is 
# "Japantown", "Mezes Park" or "MLK Library"
batrips[start_station %chin% c("Japantown", "Mezes Park", "MLK Library")]

# Typically faster than
batrips[start_station %in% c("Japantown", "Mezes Park", "MLK Library")]
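
On character vectors %chin% and %in% return the same logical vector, so switching between them changes only speed, not results. A minimal sketch on made-up station names (the `stations` vector is hypothetical):

```r
library(data.table)

# Hypothetical character vector standing in for start_station
stations <- c("Japantown", "Mezes Park", "Townsend at 7th", "MLK Library")
wanted   <- c("Japantown", "Mezes Park", "MLK Library")

chin_hits <- stations %chin% wanted
in_hits   <- stations %in% wanted

# The two operators agree element by element on character input
same_result <- identical(chin_hits, in_hits)
```

Any speed difference depends on the data size and the R version, so it is worth timing both on your own data before relying on it.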

Let's practice!
