Computations by groups

Data Manipulation with data.table in R

Matt Dowle, Arun Srinivasan

Instructors, DataCamp

The by argument

The by argument evaluates the computation in j separately for each unique value (or combination of values) of the grouping columns specified in by

# How many trips happened from each start_station?
ans <- batrips[, .N, by = "start_station"]
head(ans, 3)
          start_station        N
San Francisco City Hall     2145
 Embarcadero at Sansome    12879
      Steuart at Market    11579
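
As a minimal sketch of how grouping behaves, the example below uses a small made-up table, trips, in place of batrips (the table and its values are assumptions for illustration, not the course data). It shows that .N counts rows per group, that groups are returned in order of first appearance, and that j can hold other per-group computations as well.

```r
library(data.table)

# Hypothetical toy data standing in for batrips (not the course dataset)
trips <- data.table(
  start_station = c("A", "B", "A", "C", "B", "A"),
  duration      = c(300, 450, 200, 600, 150, 500)
)

# .N is the number of rows in each group;
# groups are returned in the order they first appear in the data
counts <- trips[, .N, by = "start_station"]
print(counts)

# Any j expression is evaluated once per group, not only .N
avg <- trips[, .(mean_duration = mean(duration)), by = "start_station"]
print(avg)
```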

The by argument

The by argument accepts either a character vector of column names or a list of variables/expressions

# Same as batrips[, .N, by = "start_station"]
ans <- batrips[, .N, by = .(start_station)]
head(ans, 3)
          start_station        N
San Francisco City Hall     2145
 Embarcadero at Sansome    12879
      Steuart at Market    11579
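
The equivalence of the two forms can be checked directly. The sketch below again assumes a made-up table, trips, with hypothetical start_station and end_station values, and compares the character-vector form against the .() form when grouping by two columns.

```r
library(data.table)

# Hypothetical toy data (not the course dataset)
trips <- data.table(
  start_station = c("A", "B", "A", "C"),
  end_station   = c("B", "A", "B", "A")
)

# Character vector of column names
by_chr <- trips[, .N, by = c("start_station", "end_station")]

# List of unquoted column names
by_list <- trips[, .N, by = .(start_station, end_station)]

# Both forms produce the same grouped result
same <- isTRUE(all.equal(by_chr, by_list))
print(same)
```

The character form is convenient when column names are held in a variable; the list form is required for renaming or for computed grouping expressions.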

The by argument

The list form of by allows renaming grouping columns on the fly

ans <- batrips[, .(no_trips = .N), by = .(start = start_station)]
head(ans, 3)
                  start   no_trips
San Francisco City Hall       2145
 Embarcadero at Sansome      12879
      Steuart at Market      11579

Expressions in by

The list() or .() expression in by allows grouping variables to be computed on the fly, without first adding them as columns

# Get number of trips for each start_station for each month
ans <- batrips[ , .N, by = .(start_station, mon = month(start_date))]
head(ans, 3)
          start_station mon    N
San Francisco City Hall   1  193
 Embarcadero at Sansome   1  985
      Steuart at Market   1  813
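
A self-contained sketch of computed grouping variables follows. The table trips, its dates, and the 3600-second threshold are assumptions chosen for illustration; start_date is stored here as a Date, which may differ from the type used in batrips. month() is the helper exported by data.table.

```r
library(data.table)

# Hypothetical toy data (not the course dataset)
trips <- data.table(
  start_station = c("A", "A", "B", "A", "B"),
  start_date    = as.Date(c("2014-01-05", "2014-01-20", "2014-01-07",
                            "2014-02-02", "2014-02-11")),
  duration      = c(300, 4000, 450, 200, 3700)
)

# Grouping variable computed inside by and named mon in the result
by_month <- trips[, .N, by = .(start_station, mon = month(start_date))]
print(by_month)

# Any expression works, e.g. a logical flag (the threshold is arbitrary)
by_long <- trips[, .N, by = .(long_trip = duration > 3600)]
print(by_long)
```

Neither mon nor long_trip is added to trips; both exist only in the grouped result.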

Let's practice!
