Univariate visualizations

Introduction to Data Visualization with Plotly in Python

Alex Scriven

Data Scientist

What are univariate plots?

  • Univariate plots display only one variable

 

Common univariate plots:

  • Bar chart
  • Histogram
  • Box plot
  • Density plot

Histograms

 

Histograms have:

  • Multiple columns (called "bins"), each representing a range of values
    • The height of each column = the count of samples that fall within that bin's range
  • The number of bins can be set manually or chosen automatically
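To make the idea of bins concrete, here is a minimal sketch that counts samples per bin with NumPy rather than Plotly. The values are made up for illustration and are not taken from the penguins dataset; Plotly's own automatic binning may choose different bin edges.

```python
import numpy as np

# Hypothetical body-mass values in grams (made up for illustration)
values = np.array([2900, 3200, 3400, 3450, 3700, 3800, 4200, 4750, 5000, 5900])

# Manual binning: ask for exactly 3 equal-width bins
counts, edges = np.histogram(values, bins=3)

# Automatic binning: let NumPy pick the number of bins
auto_counts, auto_edges = np.histogram(values, bins="auto")

# Each bin covers a range of values; its count is the height of the column
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.0f} to {right:.0f} g: {count} samples")
```

With three bins, the ten samples split into counts of 6, 2, and 2. Whatever the number of bins, the counts always add up to the number of samples.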

 

Histogram example

Our dataset

A dataset collected by scientific researchers on penguins:

  • Contains various body measurements like beak size, weight, etc.
  • Contains different species, genders, and ages of penguins

Penguins dataset

Histograms with plotly.express

 

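import plotly.express as px

# penguins: a pandas DataFrame of the penguins dataset, loaded beforehand
fig = px.histogram(
    data_frame=penguins,
    x="Body Mass (g)",
    nbins=10)
fig.show()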

Penguin histogram

Useful histogram arguments

 

  • orientation: Orient the plot vertically ('v') or horizontally ('h')
  • histfunc: Set the bin aggregation (e.g., 'avg', 'min', 'max')

Check the documentation for more

Box (and whisker) plots

Summarizes a variable using quartile calculations

  • Middle area (the box) represents the interquartile range
    • Top line = third quartile (75th percentile)
    • Middle line = median (50th percentile)
    • Bottom line = first quartile (25th percentile)
  • Top/bottom bars = max/min, excluding outliers

Box plot penguins

  • Dots beyond the top/bottom bars are outliers
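The quantities a box plot draws can be sketched by hand with NumPy. This is an illustration under stated assumptions: the values are made up, and the 1.5 × IQR rule used to flag outliers is a common convention assumed here. Plotly's own quartile calculation may give slightly different numbers.

```python
import numpy as np

# Hypothetical flipper lengths in mm (made up); 260 is an obvious outlier
values = np.array([172, 181, 186, 190, 193, 195, 198, 210, 215, 260])

# The box: first quartile, median, third quartile
q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1  # interquartile range = height of the box

# Assumed convention: points more than 1.5 * IQR beyond the box are outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

inliers = values[(values >= lower_fence) & (values <= upper_fence)]
outliers = values[(values < lower_fence) | (values > upper_fence)]

# The top/bottom bars: max and min once outliers are excluded
whisker_low, whisker_high = inliers.min(), inliers.max()

print(f"Q1={q1}, median={median}, Q3={q3}")
print(f"bars at {whisker_low} and {whisker_high}, outliers: {outliers.tolist()}")
```

Here the top bar sits at 215 rather than 260, because 260 lies above the upper fence and is drawn as a separate dot.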

Box plots with plotly.express

 

import plotly.express as px

# Summarize flipper length with a single box plot
fig = px.box(data_frame=penguins,
             y="Flipper Length (mm)")
fig.show()

Box plot

Useful box plot arguments

 

  • hover_data: A list of column name(s) to display on hover
    • Useful for understanding outliers
  • points: Which points to show ('outliers', 'suspectedoutliers', 'all', or False)

Check the documentation for more

Let's practice!
