Data Manipulation in Julia
Katerina Zahradova
Instructor
",", " ", "\t", ...
# Loading file with a space as a delimiter
penguins = DataFrame(CSV.File("penguins.csv",delim=" "))
3.14
3,14
# Loading file with comma as decimal mark
penguins = DataFrame(CSV.File(
"penguins.csv",
decimal=',', delim=" "))
# Loading 3 rows starting at line 10 of the file (lines 10 to 12)
penguins_part = DataFrame(CSV.File("penguins.csv", skipto=10, limit=3))
3×7 DataFrame
Row species island culmen_length_mm culmen_depth_mm ...
String7 String15 Float64 Float64 ...
______________________________________________________________
1 Adelie Torgersen 38.6 21.2 ...
2 Adelie Torgersen 34.6 21.1 ...
3 Adelie Torgersen 36.6 17.8 ...
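A minimal sketch of how skipto and limit count lines, assuming CSV.jl and DataFrames.jl are installed. The in-memory text raw is a made-up sample, not the course file.

```julia
using CSV, DataFrames

# Hypothetical five-line sample: line 1 is the header, lines 2 to 5 are data
raw = """
species,culmen_length_mm
Adelie,39.1
Adelie,39.5
Adelie,40.3
Adelie,36.7
"""

# skipto=3 starts reading data at file line 3; limit=2 keeps two rows
part = DataFrame(CSV.File(IOBuffer(raw); skipto=3, limit=2))
println(part)
```

Because the header occupies line 1, skipto=3 skips only the first data row.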
# Specifying header as a line
penguins_header = DataFrame(CSV.File("penguins.csv", header = 1))
333×7 DataFrame
Row species island culmen_length_mm culmen_depth_mm ...
String7 String15 Float64 Float64 ...
______________________________________________________________
1 Adelie Torgersen 39.1 18.7 ...
2 Adelie Torgersen 39.5 17.4 ...
3 Adelie Torgersen 40.3 18.0 ...
...
# Multiline header
penguins_header = DataFrame(CSV.File("penguins.csv", header = [1, 2]))
332×7 DataFrame
Row species_Adelie island_Torgersen culmen_length_mm_39.1 ...
String7 String15 Float64 ...
________________________________________________________________
1 Adelie Torgersen 39.5 ...
2 Adelie Torgersen 40.3 ...
3 Adelie Torgersen 36.7 ...
...
# Replacing header
penguins_header = DataFrame(CSV.File("penguins.csv",
    header = [:species, :area, :culmen_l_mm, :culmen_d_mm,
              :flipper_l_mm, :weight_g, :sex],
    skipto = 2))  # skip the original header line so it is not read as data
333×7 DataFrame
Row species area culmen_l_mm culmen_d_mm ...
String7 String15 Float64 Float64 ...
____________________________________________________
1 Adelie Torgersen 39.1 18.7 ...
2 Adelie Torgersen 39.5 17.4 ...
3 Adelie Torgersen 40.3 18.0 ...
...
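A small sketch of replacing a header that already exists in the file, assuming CSV.jl and DataFrames.jl are installed. The text raw and the name :area are illustrative only. As far as CSV.jl behaves, names passed as a vector do not consume a line of the file, so skipto=2 is used here to keep the old header line out of the data.

```julia
using CSV, DataFrames

# Hypothetical sample with its own header on line 1
raw = "species,island\nAdelie,Torgersen\nGentoo,Biscoe\n"

# Supply new column names and start reading data at line 2
renamed = DataFrame(CSV.File(IOBuffer(raw); header=[:species, :area], skipto=2))
println(names(renamed))
println(nrow(renamed))
```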
# Save DataFrame
CSV.write("temp/transformed_penguins.csv", penguins, delim = " ", decimal = ',')
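A hedged round-trip sketch, assuming CSV.jl and DataFrames.jl are installed: write a small DataFrame with a space delimiter and a comma decimal mark, then read it back with the same settings. The DataFrame df and the temporary path are made up for illustration.

```julia
using CSV, DataFrames

# Hypothetical data, not the course dataset
df = DataFrame(species = ["Adelie", "Gentoo"], body_mass_g = [3750.5, 5000.25])

# Write to a throwaway directory; the table is the second positional argument
path = joinpath(mktempdir(), "penguins_out.csv")
CSV.write(path, df; delim = " ", decimal = ',')

# Read back with matching delim and decimal so the floats parse as numbers
back = DataFrame(CSV.File(path; delim = " ", decimal = ','))
println(back)
```

Reading with different settings than were used for writing would typically leave the float column parsed as strings, so the two calls should agree.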
delim=: a Char or String separating values in columns; e.g., the , in species,island,...
decimal=: a Char indicating how decimal places are separated in floats; e.g., the . in 3.14
skipto=: an Int specifying the line number in the file where loading starts; beware, the header line counts toward this number!
limit=: an Int specifying the number of rows to load
header=: an Int for the row number of the header, a Vector{Int} for a multi-line header, or a Vector{String} or Vector{Symbol} to replace the header
CSV.File(path) loads the file at path
CSV.write(path, df) writes df as a CSV file at path
Documentation for CSV.File() and CSV.write()