Data Manipulation with data.table in R
Matt Dowle and Arun Srinivasan
Instructors, DataCamp
A data.table is a data.frame with a syntax built around three units: rows, columns and groups
# General form of data.table syntax
DT[i, j, by]
   |  |  |
   |  |  --> grouped by what?
   |  -----> what to do?
   --------> on which rows?
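To make the three slots concrete, here is a minimal sketch on a toy table. The table `DT` and its columns `grp` and `val` are hypothetical and not part of the course material.

```r
library(data.table)

# Hypothetical toy data, used only for illustration
DT <- data.table(id  = 1:6,
                 grp = c("a", "b", "a", "b", "a", "b"),
                 val = c(10, 20, 30, 40, 50, 60))

# i:  on which rows?    rows where val > 10
# j:  what to do?       sum val into a column called total
# by: grouped by what?  grp
res <- DT[val > 10, .(total = sum(val)), by = grp]
res
```

Any of the three slots can be left empty; `DT[val > 10]` only filters rows, and `DT[, sum(val)]` computes over all rows.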
Three ways of creating data tables:
data.table()
as.data.table()
fread()
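data.table() and as.data.table() are demonstrated next; fread() reads a delimited file straight into a data.table. A small sketch, assuming a temporary CSV file written just for this example:

```r
library(data.table)

# Write a tiny CSV to a temporary file (illustrative data only)
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,name", "1,a", "2,b"), tmp)

# fread() detects the separator and column types and returns a data.table
x_file <- fread(tmp)
class(x_file)

# Clean up the temporary file
unlink(tmp)
```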
library(data.table)
x_df <- data.frame(id = 1:2, name = c("a", "b"))
x_df
  id name
1  1    a
2  2    b
x_dt <- data.table(id = 1:2, name = c("a", "b"))
x_dt
   id name
1:  1    a
2:  2    b
y <- list(id = 1:2, name = c("a", "b"))
y
$id
[1] 1 2

$name
[1] "a" "b"
x <- as.data.table(y)
x
   id name
1:  1    a
2:  2    b
Since a data.table is a data.frame ...
x <- data.table(id = 1:2,
                name = c("a", "b"))
x
   id name
1:  1    a
2:  2    b
class(x)
[1] "data.table" "data.frame"
Functions used to query data.frames also work on data.tables
nrow(x)
[1] 2
ncol(x)
[1] 2
dim(x)
[1] 2 2
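Other base R helpers written for data.frames behave the same way. A brief sketch on the same two-row table; is.data.frame(), names() and head() are base R functions, chosen here only as examples:

```r
library(data.table)

x <- data.table(id = 1:2, name = c("a", "b"))

# A data.table passes the data.frame check
is_df <- is.data.frame(x)

# Column names are queried exactly as for a data.frame
col_names <- names(x)

# head() keeps the data.table class on its result
first_row <- head(x, 1)
class(first_row)
```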
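A data.table never automatically converts character columns to factors
# data.frame() converted strings to factors by default only before R 4.0.0;
# stringsAsFactors = TRUE reproduces that behavior on current versions of R
x_df <- data.frame(id = 1:2, name = c("a", "b"), stringsAsFactors = TRUE)
class(x_df$name)
[1] "factor"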
x_dt <- data.table(id = 1:2, name = c("a", "b"))
class(x_dt$name)
[1] "character"
Never sets, needs or uses row names
rownames(x_dt) <- c("R1", "R2")
x_dt
   id name
1:  1    a
2:  2    b
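When a data.frame already carries meaningful row names, they can be kept as an ordinary column during conversion. A minimal sketch, assuming a hypothetical data.frame `df` with row names R1 and R2; `keep.rownames` is an argument of as.data.table():

```r
library(data.table)

# Hypothetical data.frame with explicit row names
df <- data.frame(id = 1:2, name = c("a", "b"),
                 row.names = c("R1", "R2"))

# keep.rownames = TRUE stores the row names in a leading column named "rn"
dt <- as.data.table(df, keep.rownames = TRUE)
names(dt)
dt$rn
```

Storing the labels in a column means they can be filtered, joined and grouped on like any other data.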