Scatter plots

Understanding Data Visualization

Richie Cotton

Data Evangelist at DataCamp

When should you use a scatter plot?

  1. You have two continuous variables.
  2. You want to answer questions about the relationship between the two variables.

Los Angeles County home prices

city           n_beds  price_musd  area_sqft
Long Beach     1       0.3250      846
Beverly Hills  3       2.1950      2930
Santa Monica   2       0.5740      1037
Santa Monica   1       0.5990      576
Beverly Hills  5       3.9500      5600
Long Beach     4       0.2999      1571
Westwood       3       0.6950      1913

Prices vs. area

A scatter plot of Los Angeles home prices versus their area, using linear scales on the x and y axes.

A scatter plot of Los Angeles home prices versus their area, using logarithmic scales on the x and y axes.
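
The two plots described above can be sketched as follows. This is a minimal illustration, not the code behind the slides: it assumes matplotlib is installed, uses only the seven rows shown in the table, and the function name plot_linear_and_log is hypothetical.

```python
# The seven rows from the table above (area in square feet, price in millions of USD).
area_sqft = [846, 2930, 1037, 576, 5600, 1571, 1913]
price_musd = [0.3250, 2.1950, 0.5740, 0.5990, 3.9500, 0.2999, 0.6950]


def plot_linear_and_log(area, price):
    """Draw the same scatter plot twice: linear axes, then logarithmic axes."""
    # matplotlib is a third-party library; it is imported here so the data
    # above can be used even where matplotlib is not installed.
    import matplotlib.pyplot as plt

    fig, (ax_lin, ax_log) = plt.subplots(1, 2, figsize=(10, 4))
    for ax in (ax_lin, ax_log):
        ax.scatter(area, price)
        ax.set_xlabel("area_sqft")
        ax.set_ylabel("price_musd")

    ax_lin.set_title("Linear scales")

    # Switching both axes to log spreads out the points crowded near the origin.
    ax_log.set_xscale("log")
    ax_log.set_yscale("log")
    ax_log.set_title("Logarithmic scales")

    fig.tight_layout()
    return fig
```

Calling plot_linear_and_log(area_sqft, price_musd).savefig("prices_vs_area.png") would write both panels to a file; the file name is only an example.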

Correlation

How close are you to being able to fit a straight line through the points?

A scatter plot of correlations for theoretical pairs of x and y coordinates.
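
One common way to put a number on "how close to a straight line" is the Pearson correlation coefficient, which ranges from -1 to 1. The slides do not say which measure they use, so treat the choice here as an assumption. The sketch uses only the Python standard library and the seven rows from the home prices table; pearson_r is a hypothetical helper name.

```python
import math

# The seven rows from the home prices table.
area_sqft = [846, 2930, 1037, 576, 5600, 1571, 1913]
price_musd = [0.3250, 2.1950, 0.5740, 0.5990, 3.9500, 0.2999, 0.6950]


def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson_r(area_sqft, price_musd)
print(f"Pearson r for the seven rows: {r:.3f}")
```

Values near 1 or -1 mean the points lie close to a straight line; values near 0 mean a straight line describes them poorly.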

Sometimes correlation isn't helpful

Scatter plots of the 13 datasets in the Datasaurus Dozen. Each dataset looks very different from the others, yet their summary statistics, including the correlation, are nearly identical.

Adding trend lines

A scatter plot of Los Angeles home prices versus their area, using logarithmic scales on the x and y axes. A linear trend line has been added, which is a good fit.
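
A straight trend line on logarithmic axes corresponds to an ordinary least-squares fit of log(price) against log(area). The sketch below assumes that is how such a line is obtained (the slides do not state the fitting method) and uses only the seven rows from the table; fit_line and predict_price are hypothetical names.

```python
import math

# The seven rows from the home prices table.
area_sqft = [846, 2930, 1037, 576, 5600, 1571, 1913]
price_musd = [0.3250, 2.1950, 0.5740, 0.5990, 3.9500, 0.2999, 0.6950]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Fit in log-log space, which is where the trend line is straight.
log_area = [math.log10(a) for a in area_sqft]
log_price = [math.log10(p) for p in price_musd]
slope, intercept = fit_line(log_area, log_price)


def predict_price(area):
    """Map an area back through the fitted line to a price in millions of USD."""
    return 10 ** (intercept + slope * math.log10(area))


print(f"log10(price) = {intercept:.3f} + {slope:.3f} * log10(area)")
```

With only seven rows the fitted coefficients are illustrative at best; the full dataset behind the plot would give a different line.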

Adding smooth trend lines

A scatter plot of Los Angeles home prices versus their area, using linear scales on the x and y axes. A linear trend line has been added, which is a poor fit.

A scatter plot of Los Angeles home prices versus their area, using linear scales on the x and y axes. A LOESS trend line has been added, which is a good fit.
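
To give a feel for how a LOESS curve bends to follow the data, here is a simplified sketch of the idea: at each x, fit a weighted straight line to the nearest points (tricube weights) and take its value there. LOESS implementations in statistical packages typically offer more options, such as robustness iterations, which are omitted here. The function name loess and the frac parameter are assumptions made for this example.

```python
import math

# The seven rows from the home prices table.
area_sqft = [846, 2930, 1037, 576, 5600, 1571, 1913]
price_musd = [0.3250, 2.1950, 0.5740, 0.5990, 3.9500, 0.2999, 0.6950]


def loess(xs, ys, frac=0.75):
    """Smooth ys at each x using local linear fits with tricube weights."""
    n = len(xs)
    k = max(2, math.ceil(frac * n))  # number of neighbours in each local fit
    smoothed = []
    for x0 in xs:
        # Keep the k points nearest to x0, as (distance, x, y) tuples.
        nearest = sorted((abs(x - x0), x, y) for x, y in zip(xs, ys))[:k]
        d_max = nearest[-1][0] or 1.0  # guard against division by zero
        # Tricube weights: close points count most, the farthest counts zero.
        w = [(1 - (d / d_max) ** 3) ** 3 for d, _, _ in nearest]
        sw = sum(w)
        mean_x = sum(wi * x for wi, (_, x, _) in zip(w, nearest)) / sw
        mean_y = sum(wi * y for wi, (_, _, y) in zip(w, nearest)) / sw
        sxx = sum(wi * (x - mean_x) ** 2 for wi, (_, x, _) in zip(w, nearest))
        sxy = sum(wi * (x - mean_x) * (y - mean_y)
                  for wi, (_, x, y) in zip(w, nearest))
        slope = sxy / sxx if sxx > 0 else 0.0
        # Evaluate the local weighted line at x0.
        smoothed.append(mean_y + slope * (x0 - mean_x))
    return smoothed


smoothed_price = loess(area_sqft, price_musd)
for a, s in sorted(zip(area_sqft, smoothed_price)):
    print(f"{a:>5} sqft -> smoothed price {s:.3f} musd")
```

A smaller frac makes the curve follow local wiggles more closely; a larger frac makes it behave more like a single straight line.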

Let's practice!
