Data Manipulation with data.table in R
Matt Dowle, Arun Srinivasan
Instructors, DataCamp
.SD
is a special symbol which stands for Subset of Data: within each group it holds that group's rows, without the grouping columns
library(data.table)
x <- data.table(id = c(1, 1, 2, 2, 1, 1),
                val1 = 1:6, val2 = letters[6:1])
id val1 val2
1 1 f
1 2 e
2 3 d
2 4 c
1 5 b
1 6 a
x[, print(.SD), by = id]
val1 val2
1 f
2 e
5 b
6 a
val1 val2
3 d
4 c
Empty data.table (0 rows) of 1 col: id
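To check what .SD holds in each group without relying on printed output, a small sketch along these lines can help. It rebuilds the same x as above; the names sd_info, cols and n_rows are chosen here purely for illustration.

```r
library(data.table)

# Same toy table as on the slide
x <- data.table(id = c(1, 1, 2, 2, 1, 1),
                val1 = 1:6, val2 = letters[6:1])

# For each id, record which columns .SD contains and how many rows it has.
# The grouping column (id) is not part of .SD.
sd_info <- x[, .(cols = paste(names(.SD), collapse = ", "),
                 n_rows = nrow(.SD)),
             by = id]
print(sd_info)
```

Both groups should report the columns val1 and val2 only, with 4 rows for id 1 and 2 rows for id 2, which matches the printed groups above.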
x[, .SD[1], by = id]
id val1 val2
1 1 f
2 3 d
x[, .SD[.N], by = id]
id val1 val2
1 6 a
2 4 c
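Since .SD is itself a data.table, the two row selections above can plausibly be combined into one call. The following is a minimal sketch on the same x; the name first_last is an illustrative choice, not something from the slides.

```r
library(data.table)

# Same toy table as on the slide
x <- data.table(id = c(1, 1, 2, 2, 1, 1),
                val1 = 1:6, val2 = letters[6:1])

# Row 1 and row .N (the last row) of each group, in a single call
first_last <- x[, .SD[c(1, .N)], by = id]
print(first_last)
```

Each id contributes two rows here: its first row followed by its last row.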
.SDcols
holds the columns that should be included in .SD
batrips[, .SD[1], by = start_station]
start_station            trip_id  duration  start_date
San Francisco City Hall  139545   435       2014-01-01 00:14:00
Embarcadero at Sansome   139547   1523      2014-01-01 00:17:00
# .SDcols controls the columns .SD contains
batrips[, .SD[1], by = start_station, .SDcols = c("trip_id", "duration")]
start_station            trip_id  duration
San Francisco City Hall  139545   435
Embarcadero at Sansome   139547   1523
# A leading minus excludes the named columns from .SD
batrips[, .SD[1], by = start_station, .SDcols = -c("trip_id", "duration")]
start_station            start_date
San Francisco City Hall  2014-01-01 00:14:00
Embarcadero at Sansome   2014-01-01 00:17:00
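The batrips data is not reproduced in these slides, so the sketch below uses a hypothetical stand-in table called trips with the same column names. All of its values, including the station names, are made up for illustration. It shows both forms of .SDcols side by side.

```r
library(data.table)

# Hypothetical stand-in for batrips; all values are made up for illustration
trips <- data.table(
  start_station = c("Station A", "Station B", "Station A", "Station B"),
  trip_id       = c(101L, 102L, 103L, 104L),
  duration      = c(435L, 1523L, 600L, 250L),
  start_date    = as.POSIXct(c("2014-01-01 00:14:00", "2014-01-01 00:17:00",
                               "2014-01-01 00:20:00", "2014-01-01 00:25:00"),
                             tz = "UTC")
)

# Keep only trip_id and duration in .SD
kept <- trips[, .SD[1], by = start_station,
              .SDcols = c("trip_id", "duration")]

# Drop trip_id and duration from .SD, leaving start_date
dropped <- trips[, .SD[1], by = start_station,
                 .SDcols = -c("trip_id", "duration")]

print(names(kept))
print(names(dropped))
```

In both results the grouping column start_station comes first, followed by whichever columns .SDcols left in .SD.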