Creating new columns

Data Manipulation in Julia

Katerina Zahradova

Instructor

Columns vs. rows

Column

  • A single value computed from the whole column, the same for all rows
  • E.g., mean, median, sum, ...

Rows

  • Values depend on data in each individual row

=> using ByRow()
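
The distinction above can be sketched on a toy table. This is a minimal illustration, assuming DataFrames.jl and the Statistics standard library are available; the DataFrame `df` and its column `x` are hypothetical and not part of the penguins data.

```julia
using DataFrames
using Statistics

# Hypothetical toy data, not the penguins dataset
df = DataFrame(x = [1.0, 2.0, 3.0])

# Column-wise: mean receives the whole column and returns one number,
# which transform repeats in every row
col_result = transform(df, :x => mean => :x_mean)

# Row-wise: ByRow applies the function to each element separately
row_result = transform(df, :x => ByRow(v -> v * 2) => :x_doubled)
```

Without ByRow, the function sees the entire column at once, which is what aggregations such as mean need; with ByRow, it sees one row's value at a time.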


Flipper length to inches

# Convert mm to inches
transform(penguins, :flipper_length_mm => ByRow(x -> x/25.4) => :flipper_length_inch)
333x8 DataFrame
Row species   island    ...  body_mass_g  sex      flipper_length_inch
    String15  String15  ...  Int64        String7  Float64 
___________________________________________________________________
1   Adelie    Torgersen ...  3750         MALE     7.12598
2   Adelie    Torgersen ...  3800         FEMALE   7.32283
...

Culmen depth and length ratio

# Select columns and calculate their ratio
select(penguins,:culmen_depth_mm, :culmen_length_mm, 
   [:culmen_depth_mm, :culmen_length_mm] => ByRow((x, y) -> x/y) => :culmen_ratio)
333x3 DataFrame
Row  culmen_depth_mm  culmen_length_mm  culmen_ratio
     Float64          Float64           Float64
_____________________________________________________
1    18.7             39.1              0.478261
2    17.4             39.5              0.440506
...

New column from a vector

# Vector id_vec that we want to add


# Using [] and :
penguins[:, :id_colon] = id_vec

# Using [] and !
penguins[!, :id_exclamation] = id_vec

# Using .
penguins.id_dot = id_vec
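
The three assignment forms can be tried on a small table. This is only a sketch, assuming DataFrames.jl is loaded; the table `toy` and the vector `toy_ids` are hypothetical stand-ins for penguins and id_vec, whose actual values are not shown here.

```julia
using DataFrames

# Hypothetical toy table standing in for penguins
toy = DataFrame(species = ["Adelie", "Gentoo", "Chinstrap"])

# Hypothetical id vector; its length must equal nrow(toy)
toy_ids = [101, 102, 103]

# All three forms add a new column holding the same values
toy[:, :id_colon] = toy_ids
toy[!, :id_exclamation] = toy_ids
toy.id_dot = toy_ids
```

Each form appends the new column to the right of the existing ones, in the order the assignments run.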


Copy or not

penguins[:, :id_colon] = id_vec        # copies values to the DataFrame
penguins[!, :id_exclamation] = id_vec  # references id_vec
penguins.id_dot = id_vec               # references id_vec

# Change first element
id_vec[1] = 27
select(penguins, :species, r"id")
333x4 DataFrame
Row  species   id_colon  id_exclamation  id_dot
     String15  Int64     Int64           Int64
_____________________________________________________
1    Adelie    25        27              27
...
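
Whether a column shares memory with the original vector can be checked directly. The following is a minimal sketch, assuming DataFrames.jl is loaded; `toy`, `toy_ids`, and the `id_safe` column are hypothetical names used only for illustration.

```julia
using DataFrames

# Hypothetical toy table and id vector
toy = DataFrame(species = ["Adelie", "Gentoo"])
toy_ids = [25, 26]

toy[:, :id_colon] = toy_ids        # stores a copy of the values
toy[!, :id_exclamation] = toy_ids  # stores the vector itself
toy[!, :id_safe] = copy(toy_ids)   # explicit copy, so no shared memory

# === tests whether both sides are the same object in memory
colon_is_shared = toy[!, :id_colon] === toy_ids
exclamation_is_shared = toy[!, :id_exclamation] === toy_ids

# Mutating the vector only affects the column that shares its memory
toy_ids[1] = 27
```

Passing copy(toy_ids) is one way to keep the ! or . syntax while making sure later changes to the vector do not leak into the DataFrame.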

Let's practice!
