Introduction to Data Visualization with Plotly in Python
Alex Scriven
Data Scientist
Bivariate plots are those which display (and can therefore compare) two variables.
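Common bivariate plots include scatterplots, line charts, and correlation plots.
A scatterplot is a plot consisting of an x-axis representing one variable, a y-axis representing a second variable, and one point per observation placed at its (x, y) values.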
Visualizing Flipper Length and Body Mass with plotly.express:
import plotly.express as px
fig = px.scatter(
    data_frame=penguins,
    x="Body Mass (g)",
    y="Flipper Length (mm)")
fig.show()
Useful plotly.express scatterplot arguments:
trendline: Add different types of trend lines
symbol: Set different symbols for different categories
Check the documentation for more!
A line chart is used to plot some variable (y-axis) over time (x-axis).
Let's visualize Microsoft's stock price.
fig = px.line(
    data_frame=msft_stock,
    x='Date',
    y='Open',
    title='MSFT Stock Price (5Y)')
fig.show()
Here is our simple line chart:
For more customization, graph_objects uses go.Scatter() for both scatter and line plots.
Here is the code for our penguins scatterplot using graph_objects:
import plotly.graph_objects as go
fig = go.Figure(go.Scatter(
    x=penguins['Body Mass (g)'],
    y=penguins['Flipper Length (mm)'],
    mode='markers'))
Here is the code for our line chart with graph_objects:
fig = go.Figure(go.Scatter(
    x=msft_stock['Date'],
    y=msft_stock['Open'],  # same column as in the px.line example
    mode='lines'))
When should we use plotly.express or graph_objects? Largely, it is about customization: graph_objects has many more options!
[Figures: graph_objects (left) and express (right)]
A correlation plot is a way to visualize correlations between variables.
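The Pearson Correlation Coefficient summarizes this relationship as a single value between -1 and 1: values near 1 indicate a strong positive linear relationship, values near -1 a strong negative one, and values near 0 little or no linear relationship.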
df contains data on bike sharing rental numbers in Korea with various weather variables.
pandas provides a method to create the data needed:
cr = df.corr(method='pearson')
print(cr)
Our Pearson correlation table:
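A small sketch of what df.corr() returns and where its numbers come from. The frame below is a made-up stand-in for the bike sharing data (the column names and values are assumptions), and pearson() is a helper written for this example that computes the coefficient directly from its definition so it can be compared with the pandas result.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the bike sharing data; names and values are made up.
df = pd.DataFrame({
    "Rented Bikes": [120, 340, 560, 610, 90, 450],
    "Temperature": [2.0, 11.5, 21.0, 25.5, -1.0, 17.0],
    "Humidity": [80, 60, 45, 40, 85, 55],
})

# Pairwise Pearson correlations between all columns.
cr = df.corr(method="pearson")

def pearson(x, y):
    """Pearson r: the covariance of x and y scaled by both spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()  # deviations from the mean
    yc = y - y.mean()
    return (xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum())

# Should agree with the matching cell of the pandas table.
manual = pearson(df["Rented Bikes"], df["Temperature"])
print(cr.round(2))
```

The table is symmetric with ones on the diagonal, since every variable correlates perfectly with itself. If the real df also holds non-numeric columns, recent pandas versions may need df.corr(numeric_only=True).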
Let's build a correlation plot:
import plotly.graph_objects as go
fig = go.Figure(go.Heatmap(
    x=cr.columns, y=cr.columns,
    z=cr.values.tolist(),
    colorscale='rdylgn', zmin=-1, zmax=1))
fig.show()
Voila!