Creating a box plot

Introduction to Data Visualization with Seaborn

Erin Case

Data Scientist

What is a box plot?

  • Shows the distribution of quantitative data
  • See median, spread, skewness, and outliers
  • Facilitates comparisons between groups

Box plot of total bill broken down by day of the week

1 Waskom, M. L. (2021). seaborn: statistical data visualization. https://seaborn.pydata.org/

How to create a box plot

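import matplotlib.pyplot as plt
import seaborn as sns

# Assumes seaborn's example "tips" dataset (downloaded on first use)
tips = sns.load_dataset("tips")

# kind="box" draws one box per category on the x-axis
g = sns.catplot(x="time",
                y="total_bill",
                data=tips,
                kind="box")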

plt.show()

Box plot of total bill broken down by time of day


Change the order of categories

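import matplotlib.pyplot as plt
import seaborn as sns

# Assumes seaborn's example "tips" dataset (downloaded on first use)
tips = sns.load_dataset("tips")

# order lists the categories in the sequence they should appear
g = sns.catplot(x="time",
                y="total_bill",
                data=tips,
                kind="box",
                order=["Dinner",
                       "Lunch"])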

plt.show()

Box plot with dinner shown before lunch


Omitting the outliers using `sym`

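import matplotlib.pyplot as plt
import seaborn as sns

# Assumes seaborn's example "tips" dataset (downloaded on first use)
tips = sns.load_dataset("tips")

# An empty marker string for sym hides the outlier points
g = sns.catplot(x="time",
                y="total_bill",
                data=tips,
                kind="box",
                sym="")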

plt.show()

Box plot with outliers omitted
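
The `sym` argument is forwarded to matplotlib rather than handled by seaborn itself, so it may not be accepted by every seaborn version. A minimal sketch of an alternative, assuming matplotlib's `showfliers` box plot option is passed through in the same way; the `toy_tips` DataFrame and its values are made up for illustration and are not the real tips data:

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up bills for illustration; 100.0 is a deliberate outlier.
toy_tips = pd.DataFrame({
    "time": ["Lunch"] * 6 + ["Dinner"] * 6,
    "total_bill": [10.0, 12.0, 13.0, 14.0, 15.0, 16.0,
                   18.0, 20.0, 21.0, 22.0, 24.0, 100.0],
})

# showfliers=False is forwarded to matplotlib and hides the outlier markers
g = sns.catplot(x="time",
                y="total_bill",
                data=toy_tips,
                kind="box",
                showfliers=False)

plt.show()
```

The outlier is still part of the data and still affects the quartiles; it is only left off the plot.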


Changing the whiskers using `whis`

  • By default, each whisker extends to the farthest data point within 1.5 * the interquartile range (IQR) of the box edge; points beyond that are drawn as outliers
  • Make them extend to 2.0 * IQR: whis=2.0
  • Show the 5th and 95th percentiles: whis=[5, 95]
  • Show min and max values: whis=[0, 100]
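
To make the IQR rule in the list above concrete, here is a small sketch in plain Python. The `bills` values and the `whisker_ends` helper are made up for illustration; seaborn and matplotlib do this calculation internally, and this is not their code.

```python
from statistics import quantiles

# Made-up bill amounts for illustration; not taken from the tips dataset.
bills = [8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 34.0]

# method="inclusive" interpolates linearly between sorted data points
q1, median, q3 = quantiles(bills, n=4, method="inclusive")


def whisker_ends(values, q1, q3, whis=1.5):
    """Return the lowest and highest values within whis * IQR of the box."""
    reach = whis * (q3 - q1)
    inside = [v for v in values if q1 - reach <= v <= q3 + reach]
    return min(inside), max(inside)


for whis in (1.5, 2.0):
    low, high = whisker_ends(bills, q1, q3, whis)
    outliers = [v for v in bills if v < low or v > high]
    print(f"whis={whis}: whiskers at {low} and {high}, outliers {outliers}")
```

With these numbers the box runs from 12.0 to 20.0, so the IQR is 8.0. At `whis=1.5` the upper limit is 32.0 and the value 34.0 is flagged as an outlier; at `whis=2.0` the limit rises to 36.0 and the upper whisker reaches 34.0 instead.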

Changing the whiskers using `whis`

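import matplotlib.pyplot as plt
import seaborn as sns

# Assumes seaborn's example "tips" dataset (downloaded on first use)
tips = sns.load_dataset("tips")

# A pair of percentiles: whiskers run from the 0th to the 100th (min to max)
g = sns.catplot(x="time",
                y="total_bill",
                data=tips,
                kind="box",
                whis=[0, 100])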

plt.show()

Box plot with whiskers set to minimum and maximum


Let's practice!
