Fast data writing with fwrite()

Data Manipulation with data.table in R

Matt Dowle, Arun Srinivasan

Instructors, DataCamp

fwrite

fwrite() can write list columns, separating the items within each cell with a secondary separator (| by default)

library(data.table)
dt <- data.table(id = c("x", "y", "z"), val = list(1:2, 3:4, 5:6))
fwrite(dt, "fwrite.csv")
fread("fwrite.csv")
id  val
 x  1|2
 y  3|4
 z  5|6
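
As the output above shows, the list column comes back from fread() as plain text such as "1|2". A minimal sketch of one way to rebuild the list column, assuming the default secondary separator | and integer items; the names list_file, back and restored are illustrative only:

```r
library(data.table)

dt <- data.table(id = c("x", "y", "z"), val = list(1:2, 3:4, 5:6))

# Write to a temporary file so the example leaves nothing behind
list_file <- tempfile(fileext = ".csv")
fwrite(dt, list_file)

# Fix the field separator explicitly so "|" is never mistaken for it
back <- fread(list_file, sep = ",")

# Split each cell on the secondary separator and convert the pieces to integers
restored <- lapply(strsplit(back$val, "|", fixed = TRUE), as.integer)
str(restored)
```

The split uses fixed = TRUE because | is a special character in regular expressions.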

date and datetime columns (ISO)

  • fwrite() provides three additional ways of writing date and datetime columns: ISO, squash and epoch
  • ISO (ISO 8601) is the default, which encourages the use of ISO standards

Dates and times

now <- Sys.time()
dt  <- data.table(date = as.IDate(now), 
                  time = as.ITime(now), 
                  datetime = now)
dt
      date     time              datetime
2018-12-17 19:54:51   2018-12-17 14:54:51

date and datetime columns (ISO)

# "ISO" is default
fwrite(dt, "datetime.csv", dateTimeAs = "ISO")

fread("datetime.csv")
      date       time                      datetime
2018-12-17   19:55:39   2018-12-17T19:55:39.735036Z
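
To see exactly what the ISO option puts in the file, the raw text can be inspected with readLines() instead of being parsed again by fread(). This is a small sketch that uses a fixed UTC timestamp (an assumption made for reproducibility, not the value from the slide); stamp, dt_iso, iso_file and iso_lines are illustrative names:

```r
library(data.table)

# A fixed UTC timestamp keeps the written text the same on every run
stamp <- as.POSIXct("2018-12-17 19:55:39", tz = "UTC")
dt_iso <- data.table(date = as.IDate(stamp),
                     time = as.ITime(stamp),
                     datetime = stamp)

iso_file <- tempfile(fileext = ".csv")
fwrite(dt_iso, iso_file)  # dateTimeAs defaults to "ISO"

# Show the file contents verbatim: a header line, then one data line
iso_lines <- readLines(iso_file)
print(iso_lines)
```

The datetime field is written in UTC, with a T between the date and the time and a trailing Z.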

date and datetime columns (Squash)

  • squash drops the separators, for example writing yyyy-mm-dd hh:mm:ss as yyyymmddhhmmss
  • Read in as integer, so year, month, etc. can be extracted with integer division and modulo arithmetic, e.g., 20160912 %/% 10000 = 2016
  • Also handles millisecond (ms) resolution
  • A squashed POSIXct value (17 digits at ms resolution) is automatically read in as integer64 by fread()

date and datetime columns (Squash)

fwrite(dt, "datetime.csv", dateTimeAs = "squash")

fread("datetime.csv")

       date   time          datetime
1: 20181217 195539 20181217195539735

20181217 %/% 10000
[1] 2018
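
The same arithmetic extends to the other date components. A short sketch in base R, using the squashed date from the output above; the variable names are illustrative:

```r
squashed_date <- 20181217L

# Integer division drops the trailing digits; modulo keeps only the trailing digits
year  <- squashed_date %/% 10000L          # leading four digits
month <- (squashed_date %/% 100L) %% 100L  # middle two digits
day   <- squashed_date %% 100L             # last two digits

c(year = year, month = month, day = day)
```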

date and datetime columns (Epoch)

  • epoch writes the number of days (for dates) or seconds (for times and datetimes) since the relevant epoch
  • The relevant epoch is 1970-01-01 for dates, 00:00:00 for times, and 1970-01-01T00:00:00Z for datetimes

date and datetime columns (Epoch)

fwrite(dt, "datetime.csv", dateTimeAs = "epoch")
fread("datetime.csv")
 date  time   datetime
17882 71871 1545076672
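
Epoch values read back as plain numbers, so they need converting before they are readable again. A minimal sketch using base R only, applied to the three values printed above and assuming UTC throughout; the variable names are illustrative:

```r
epoch_date     <- 17882       # days since 1970-01-01
epoch_time     <- 71871       # seconds since 00:00:00
epoch_datetime <- 1545076672  # seconds since 1970-01-01T00:00:00Z

# Days since the epoch become a Date
date_back <- as.Date(epoch_date, origin = "1970-01-01")

# Seconds since midnight become hh:mm:ss via integer division and modulo
time_back <- sprintf("%02d:%02d:%02d",
                     epoch_time %/% 3600,
                     (epoch_time %% 3600) %/% 60,
                     epoch_time %% 60)

# Seconds since the epoch become a POSIXct in UTC
datetime_back <- as.POSIXct(epoch_datetime, origin = "1970-01-01", tz = "UTC")

format(date_back)
time_back
format(datetime_back, "%Y-%m-%d %H:%M:%S")
```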

Let's practice!
