Data Manipulation with data.table in R
Matt Dowle, Arun Srinivasan
Instructors, DataCamp
Large integers are read as the integer64 type, provided by the bit64 package
ans <- fread("id,name\n1234567890123,Jane\n5284782381811,John\n")
ans
id name
1234567890123 Jane
5284782381811 John
class(ans$id)
"integer64"
str <- "x1,x2,x3,x4,x5\n1,2,1.5,true,cc\n3,4,2.5,false,ff"
ans <- fread(str, colClasses = c(x5 = "factor"))
str(ans)
Classes ‘data.table’ and 'data.frame': 2 obs. of 5 variables:
$ x1: int 1 3
$ x2: int 2 4
$ x3: num 1.5 2.5
$ x4: logi TRUE FALSE
$ x5: Factor w/ 2 levels "cc","ff": 1 2
ans <- fread(str, colClasses = c("integer", "integer",
"numeric", "logical", "factor"))
str(ans)
Classes ‘data.table’ and 'data.frame': 2 obs. of 5 variables:
$ x1: int 1 3
$ x2: int 2 4
$ x3: num 1.5 2.5
$ x4: logi TRUE FALSE
$ x5: Factor w/ 2 levels "cc","ff": 1 2
str <- "x1,x2,x3,x4,x5,x6\n1,2,1.5,2.5,aa,bb\n3,4,5.5,6.5,cc,dd"
ans <- fread(str, colClasses = list(numeric = 1:4, factor = c("x5", "x6")))
str(ans)
Classes ‘data.table’ and 'data.frame': 2 obs. of 6 variables:
$ x1: num 1 3
$ x2: num 2 4
$ x3: num 1.5 5.5
$ x4: num 2.5 6.5
$ x5: Factor w/ 2 levels "aa","cc": 1 2
$ x6: Factor w/ 2 levels "bb","dd": 1 2
str <- "1,2\n3,4,a\n5,6\n7,8,b"
fread(str)
V1 5 6
7 8 b
Warning message:
In fread(str) :
Detected 2 column names but the data has 3 columns (i.e. invalid file).
Added 1 extra default column name for the first column which is guessed to
be row names or an index.
Use setnames() afterwards if this guess is not correct,
or fix the file write command that created the file to create a valid file.
fread(str, fill = TRUE)
V1 V2 V3
1 2
3 4 a
5 6
7 8 b
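After reading with fill = TRUE, it can be useful to check how many rows were padded. This is a rough sketch under the assumption that padded character fields come back either empty or NA, depending on the data.table version; the names ragged, filled and n_padded are illustrative only.

```r
library(data.table)

# Rows with unequal numbers of fields, as in the example above
ragged <- "1,2\n3,4,a\n5,6\n7,8,b"

# fill = TRUE pads short rows so that every row has the same number of columns
filled <- fread(ragged, fill = TRUE)
dim(filled)

# Count rows whose third column was padded (empty string or NA)
n_padded <- filled[is.na(V3) | V3 == "", .N]
n_padded
```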
Missing values are commonly encoded as "999", "##NA", or "N/A"
str <- "x,y,z\n1,###,3\n2,4,###\n#N/A,7,9"
ans <- fread(str, na.strings = c("###", "#N/A"))
ans
x y z
1 NA 3
2 4 NA
NA 7 9
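One way to confirm that the na.strings values were recognized is to count the NAs in each column after reading. This is a small sketch rather than part of the original slides; the names csv, ans_na and na_counts are illustrative.

```r
library(data.table)

# Same data as in the example above (csv is an illustrative name)
csv <- "x,y,z\n1,###,3\n2,4,###\n#N/A,7,9"

# Treat both "###" and "#N/A" as missing values
ans_na <- fread(csv, na.strings = c("###", "#N/A"))

# Count missing values per column
na_counts <- ans_na[, lapply(.SD, function(col) sum(is.na(col)))]
na_counts
```

If a column that should be numeric comes back as character, a missing-value code was probably left out of na.strings.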