Introduction to Data Visualization with Julia
Gustavo Vieira Suñe
Data Analyst
insurance DataFrame
Age | Sex | BMI | Children | Smoker | Region | Charges |
---|---|---|---|---|---|---|
19 | female | 27.90 | 0 | yes | southwest | 16884.90 |
18 | male | 33.77 | 1 | no | southeast | 1725.55 |
28 | male | 33.00 | 3 | no | southeast | 4449.46 |
... | ... | ... | ... | ... | ... | ... |
StatsPlots has a recipe to plot data in DataFrames: the @df notation!
# Group by region and smoker
grouped = groupby(insurance, [:Region, :Smoker])
# Calculate mean charges
grouped_mean_charges = combine(grouped, :Charges => mean)
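The later plotting calls refer to a column named Charges_mean, which combine creates automatically from the source column and the function name. A minimal sketch of that naming, assuming DataFrames and Statistics are installed; the toy table and its values are hypothetical and are not the course's insurance data:

```julia
using DataFrames   # DataFrame, groupby, combine
using Statistics   # mean

# Hypothetical toy data standing in for the insurance DataFrame
toy = DataFrame(
    Region = ["north", "north", "south", "south"],
    Smoker = ["yes", "yes", "no", "no"],
    Charges = [100.0, 300.0, 50.0, 150.0],
)

# Same pattern as above: group, then aggregate one column
toy_means = combine(groupby(toy, [:Region, :Smoker]), :Charges => mean)

# The aggregated column is auto-named <source column>_<function name>
println(names(toy_means))
```

If a different column name is preferred, DataFrames also accepts a third element in the pair, for example :Charges => mean => :MeanCharges.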
grouped_mean_charges.Region extracts an array containing the regions as strings.
# Grouped bar chart
groupedbar(
    # Pass arrays as arguments
    grouped_mean_charges.Region,
    grouped_mean_charges.Charges_mean,
    group=grouped_mean_charges.Smoker,
    color=[:teal :orangered2],
    linewidth=0,
    legend_title="Smoker",
    legend_position=:outertopright)
xlabel!("Region")
ylabel!("Insurance Premium (USD)")
# Plot from DataFrame
@df grouped_mean_charges groupedbar(
    # Pass column names
    :Region,
    :Charges_mean,
    group=:Smoker,
    color=[:teal :orangered2],
    linewidth=0,
    legend_title="Smoker",
    legend_position=:outertopright)
xlabel!("Region")
ylabel!("Insurance Premium (USD)")
From before
# Group by region and smoker
grouped = groupby(insurance, [:Region, :Smoker])
# Calculate mean charges
grouped_mean_charges = combine(grouped, :Charges => mean)
Use chaining instead
using Chain
# Chain groupby and combine
grouped_mean_charges = @chain insurance begin
groupby([:Region, :Smoker])
combine(:Charges => mean)
end
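To make the equivalence concrete, here is a small sketch comparing nested calls with the @chain form, assuming Chain, DataFrames, and Statistics are installed. The toy table is hypothetical and only stands in for the insurance data:

```julia
using Chain        # @chain
using DataFrames   # DataFrame, groupby, combine
using Statistics   # mean

# Hypothetical toy data standing in for the insurance DataFrame
toy = DataFrame(
    Region = ["north", "north", "south", "south"],
    Smoker = ["yes", "yes", "no", "no"],
    Charges = [100.0, 300.0, 50.0, 150.0],
)

# Nested calls: read from the inside out
nested = combine(groupby(toy, [:Region, :Smoker]), :Charges => mean)

# @chain: each line receives the previous result as its first argument
chained = @chain toy begin
    groupby([:Region, :Smoker])
    combine(:Charges => mean)
end

# Both forms produce the same DataFrame
println(nested == chained)
```

The chained form reads top to bottom in the order the steps run, which is the main reason to prefer it once a pipeline has more than two steps.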
# Plotting chain
@chain insurance begin
    # Manipulate data
    groupby([:Region, :Smoker])
    combine(:Charges => mean)
    # Plot data
    @df groupedbar(
        :Region,
        :Charges_mean,
        group=:Smoker,
        color=[:teal :orangered2],
        linewidth=0,
        legend_title="Smoker",
        legend_position=:outertopright)
end
xlabel!("Region")
ylabel!("Insurance Premium (USD)")