Dropping and moving columns

Data Manipulation in Julia

Katerina Zahradova

Instructor

How to select most columns

# Selecting all columns from chocolates except the review_date column
select(chocolates, :company, :bean_origin, :REF, :cocoa,
        :company_location, :ratings, :bean_type, :bean_location)
ArgumentError: column name "ratings" not found in the data frame; 
    existing most similar names are: "rating"
...

Not() operator

# Selecting all columns from chocolates except the review_date column
select(chocolates, Not(:review_date))
1795×9 DataFrame 
Row company  bean_origin REF   cocoa   company_location rating  bean_type bean_location
    String   String      Int64 Float64 String31         Float64 String31? String31?
______________________________________________________________________________________
1   A. Morin Agua Grande 1876  63.0    France           3.75              Sao Tome
...
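As a minimal sketch of the same idea on a frame small enough to check by eye: `toy` is a hypothetical three-row stand-in for chocolates, not the course data.

```julia
using DataFrames

# Hypothetical stand-in for the chocolates data frame
toy = DataFrame(company = ["A", "B", "C"],
                review_date = [2016, 2015, 2014],
                rating = [3.75, 2.75, 3.0])

# Keep everything except review_date; toy itself is left unchanged
kept = select(toy, Not(:review_date))

# Not() also accepts a vector of names to drop several columns at once
only_company = select(toy, Not([:review_date, :rating]))
```

Because select() returns a new data frame, toy still has all three columns afterwards.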

select() and Not() pitfalls

# Dropping the same column twice
select!(chocolates, Not(:review_date))

# Several lines of code later ...

select(chocolates, Not(:review_date))
ArgumentError: column name :review_date not found in the data frame
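The pitfall can be reproduced on a small frame. This is a sketch only: `toy` is a hypothetical stand-in, and the final guard is one possible workaround rather than the approach the course prescribes.

```julia
using DataFrames

# Hypothetical stand-in for the chocolates data frame
toy = DataFrame(company = ["A", "B"], review_date = [2016, 2015])

# select! mutates toy in place, so review_date is gone afterwards
select!(toy, Not(:review_date))

# A second attempt fails because the column no longer exists
second_try_failed = try
    select(toy, Not(:review_date))
    false
catch err
    err isa ArgumentError
end

# One possible guard: check for the column before dropping it
if "review_date" in names(toy)
    select!(toy, Not(:review_date))
end
```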

Safely dropping a column that might not be there

# Using Cols
select(chocolates, Not(Cols(==("review_date"))))
1795×9 DataFrame 
Row company  bean_origin REF   cocoa   company_location rating  bean_type bean_location
    String   String      Int64 Float64 String31         Float64 String31? String31?
...
# Using regex (^ and $ make it match only the exact name)
select(chocolates, Not(r"^review_date$"))
1795×9 DataFrame 
Row company  bean_origin REF   cocoa   company_location rating  bean_type bean_location
    String   String      Int64 Float64 String31         Float64 String31? String31?
...
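A small sketch of why both selectors are safe, and why a regex is usually better anchored. The frames `toy` and `wide` and the column `review_date_source` are made up for illustration.

```julia
using DataFrames

# Hypothetical frame that has no review_date column
toy = DataFrame(company = ["A", "B"], rating = [3.75, 2.75])

# Predicate selector: it matches nothing here, so nothing is dropped and no error is raised
a = select(toy, Not(Cols(==("review_date"))))

# Regex selector: ^ and $ restrict the match to the exact name
b = select(toy, Not(r"^review_date$"))

# Without anchors a regex matches substrings, so related columns are dropped too
wide = DataFrame(review_date = [2016], review_date_source = ["web"], rating = [3.5])
c = select(wide, Not(r"review_date"))
```

In the last call both review_date and review_date_source disappear, which may or may not be what is wanted.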

Reordering columns

# Pre-selected chocolates (first row shown)
chocolates
1795×5 DataFrame
Row  company   review_date  cocoa    rating   bean_location 
     String    Int64        Float64  Float64  String31?
__________________________________________________________
1    A. Morin  2016         63.0     3.75     Sao Tome

Move it to the left

# Moving cocoa to the left
select(chocolates, :cocoa, :)
1795×5 DataFrame
Row  cocoa    company   review_date  rating   bean_location 
     Float64  String    Int64        Float64  String31?
__________________________________________________________
1    63.0     A. Morin  2016         3.75     Sao Tome
...

Move them to the left

# Moving cocoa and rating to the left
select(chocolates, :cocoa, :rating, :)
1795×5 DataFrame
Row  cocoa    rating   company   review_date  bean_location 
     Float64  Float64  String    Int64        String31?
__________________________________________________________
1    63.0     3.75     A. Morin  2016         Sao Tome
...
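The trailing `:` means "all columns"; columns already listed before it are not repeated. A minimal sketch on a hypothetical one-row frame `toy`:

```julia
using DataFrames

# Hypothetical stand-in with the same column order as the pre-selected chocolates
toy = DataFrame(company = ["A. Morin"], review_date = [2016],
                cocoa = [63.0], rating = [3.75])

# cocoa and rating come first, then every remaining column in its original order
left = select(toy, :cocoa, :rating, :)

# Column names can also be given as strings
left_str = select(toy, "cocoa", "rating", :)
```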

Move it right

# Moving company to the right
select(chocolates, Not(:company), :company)
1795×5 DataFrame
Row  review_date  cocoa    rating   bean_location  company
     Int64        Float64  Float64  String31?      String
__________________________________________________________
1    2016         63.0     3.75     Sao Tome       A. Morin
...
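The same pattern extends to several columns: exclude them with Not() first, then list them in the order they should appear at the end. A sketch on a hypothetical frame `toy`:

```julia
using DataFrames

# Hypothetical stand-in for the pre-selected chocolates
toy = DataFrame(company = ["A. Morin"], review_date = [2016],
                cocoa = [63.0], rating = [3.75])

# Move a single column to the right
right_one = select(toy, Not(:company), :company)

# Move two columns to the right; their final order follows the argument order
right_two = select(toy, Not([:company, :cocoa]), :company, :cocoa)
```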

Move them all around

# Combine the two moves
select(chocolates, :cocoa, :rating, Not(:company), :company)
1795×5 DataFrame
Row cocoa    rating   review_date  bean_location  company 
    Float64  Float64  Int64        String31?      String
__________________________________________________________
1   63.0     3.75     2016         Sao Tome       A. Morin
...

Move and drop at the same time

# Reorder and drop review_date
select(chocolates, :cocoa, :rating, Not([:company, :review_date]), :company)
1795×4 DataFrame
Row cocoa    rating   bean_location  company 
    Float64  Float64  String31?      String
__________________________________________________________
1   63.0     3.75     Sao Tome       A. Morin
...
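To see how the pieces interact, here is a sketch on a hypothetical one-row frame `toy` with the same five columns. The columns inside Not() are left out of the middle block; company comes back because it is listed again at the end, while review_date is never listed again and so is dropped.

```julia
using DataFrames

# Hypothetical stand-in for the pre-selected chocolates
toy = DataFrame(company = ["A. Morin"], review_date = [2016], cocoa = [63.0],
                rating = [3.75], bean_location = ["Sao Tome"])

# Reorder only: the middle block keeps the remaining columns in their original order
moved = select(toy, :cocoa, :rating, Not(:company), :company)

# Reorder and drop: review_date is excluded and not added back
moved_dropped = select(toy, :cocoa, :rating, Not([:company, :review_date]), :company)
```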

Cheat sheet

Dropping columns:

# Drop col1 and col 2 (keep the names in the vector all strings or all Symbols)
select(df, Not(["col1", "col 2"]))

Dropping columns safely:

# Drop col1 and col 2 that might not exist
# (one Not() around both names; separate Not() arguments would add each column back)
select(df, Not(Cols(in(["col1", "col 2"]))))

Moving columns to the left:

# Move col1 and col 2 to the left
select(df, :col1, "col 2", :)

Moving columns to the right:

# Move col1 and col 2 to the right
select(df, Not(["col1", "col 2"]), :col1, "col 2")
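If safe dropping comes up often, the pattern could be wrapped in a small helper. This is only a sketch: `drop_if_present` is a hypothetical name, not a DataFrames.jl function, and it assumes the predicate form of Cols shown earlier in these slides.

```julia
using DataFrames

# Hypothetical helper (not part of DataFrames.jl):
# drop every column whose name is listed in cols, ignoring names that do not exist
drop_if_present(df, cols) = select(df, Not(Cols(in(string.(cols)))))

# Made-up frame with one column name that contains a space
df = DataFrame("col1" => [1], "col 2" => [2], "col3" => [3])

# "nope" is not a column, so it is silently ignored
trimmed = drop_if_present(df, ["col1", "col 2", "nope"])
```

Converting the names with string.() lets the helper accept Symbols as well as strings.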

Let's practice!
