Regression Plots

Intermediate Data Visualization with Seaborn

Chris Moffitt

Instructor

Bicycle Dataset

  • Aggregated bicycle sharing data in Washington DC
  • Data includes:
    • Rental amounts
    • Weather information
    • Calendar information
  • Can we predict rental amounts?

Plotting with regplot()

sns.regplot(data=df, x='temp', 
            y='total_rentals', marker='+')

Regression Plot
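
A minimal, self-contained sketch of the regplot() call above. The bicycle data is not included here, so the DataFrame below is synthetic: it only borrows the column names temp and total_rentals, and the coefficients and noise level are arbitrary assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the bicycle data (assumption: NOT the real dataset)
rng = np.random.default_rng(0)
temp = rng.uniform(0.1, 0.9, size=200)
total_rentals = 1000 + 6000 * temp + rng.normal(0, 800, size=200)
df = pd.DataFrame({"temp": temp, "total_rentals": total_rentals})

# regplot draws the scatter points and overlays a fitted linear regression line
ax = sns.regplot(data=df, x="temp", y="total_rentals", marker="+")
ax.set_title("Rentals vs. temperature (synthetic data)")
```

In an interactive session, drop the Agg line and call matplotlib.pyplot.show() to display the figure.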

Evaluating regression with residplot()

  • A residual plot is useful for evaluating the fit of a model
  • Seaborn supports this through the residplot() function
sns.residplot(data=df, x='temp', y='total_rentals')

Residual Plot
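
To make "residual" concrete, the sketch below computes residuals by hand with numpy.polyfit and then draws the same quantity with residplot(). The data is synthetic and deliberately curved (an assumption for illustration), so the straight-line fit leaves a visible pattern in the residuals.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic, deliberately curved stand-in data (assumption: NOT the real dataset)
rng = np.random.default_rng(1)
temp = rng.uniform(0.1, 0.9, size=200)
total_rentals = 500 + 14000 * temp - 9000 * temp**2 + rng.normal(0, 500, size=200)
df = pd.DataFrame({"temp": temp, "total_rentals": total_rentals})

# Residual = observed value minus the value predicted by a straight-line fit
slope, intercept = np.polyfit(df["temp"], df["total_rentals"], deg=1)
residuals = df["total_rentals"] - (slope * df["temp"] + intercept)

# residplot fits a linear model of y on x and plots its residuals against x
ax = sns.residplot(data=df, x="temp", y="total_rentals")
```

Residuals scattered randomly around zero suggest a reasonable fit; a curved band, as with this synthetic data, suggests the linear model is missing structure.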

Polynomial regression

  • Seaborn supports polynomial regression using the order parameter
sns.regplot(data=df, x='temp', 
            y='total_rentals', order=2)

Regression Plot
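
One way to see what order=2 buys is to compare the residual sum of squares of a first-order and a second-order fit. The sketch below does this with numpy.polyfit on synthetic curved data (an assumption, not the bicycle dataset) and then draws the second-order fit with regplot().

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic, deliberately curved stand-in data (assumption: NOT the real dataset)
rng = np.random.default_rng(2)
temp = rng.uniform(0.1, 0.9, size=200)
total_rentals = 500 + 14000 * temp - 9000 * temp**2 + rng.normal(0, 500, size=200)
df = pd.DataFrame({"temp": temp, "total_rentals": total_rentals})

# Residual sum of squares for a straight line (1) and a quadratic (2)
rss = {}
for degree in (1, 2):
    coeffs = np.polyfit(df["temp"], df["total_rentals"], deg=degree)
    fitted = np.polyval(coeffs, df["temp"])
    rss[degree] = float(np.sum((df["total_rentals"] - fitted) ** 2))

# order=2 asks regplot for a second-order polynomial fit
ax = sns.regplot(data=df, x="temp", y="total_rentals", order=2)
```

A lower value for rss[2] than rss[1] only shows a closer fit to this sample; a residual plot is still a useful check on whether the extra term is justified.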

residplot with polynomial regression

sns.residplot(data=df, x='temp', 
              y='total_rentals', order=2)

Residual Plot

Categorical values

sns.regplot(data=df, x='mnth', y='total_rentals',
            x_jitter=.1, order=2)

Regression Plot
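
A small sketch of x_jitter on a categorical-like column. The monthly data below is synthetic (an assumption for illustration); only the column names mnth and total_rentals come from the slides. The jitter moves the drawn points sideways so they overlap less, while the regression is still fit on the original month values.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic monthly data: 25 observations for each month 1..12 (assumption)
rng = np.random.default_rng(3)
mnth = np.tile(np.arange(1, 13), 25)
total_rentals = 2000 + 900 * mnth - 65 * mnth**2 + rng.normal(0, 400, size=mnth.size)
df = pd.DataFrame({"mnth": mnth, "total_rentals": total_rentals})

# x_jitter=.1 shifts each drawn point by up to 0.1 along x, for display only
ax = sns.regplot(data=df, x="mnth", y="total_rentals",
                 x_jitter=.1, order=2)

# x positions of the drawn scatter points, now spread around each month
drawn_x = np.asarray(ax.collections[0].get_offsets())[:, 0]
```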

Estimators

  • In some cases, an x_estimator can be useful for highlighting trends
import numpy as np
sns.regplot(data=df, x='mnth', y='total_rentals',
            x_estimator=np.mean, order=2)

Regression Plot
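
With x_estimator=np.mean, each month is drawn as a single point at the mean of its observations (with an interval around it) instead of one point per observation. The sketch below computes those per-month means with a pandas groupby for comparison; the data is synthetic and the coefficients are arbitrary assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic monthly data: 25 observations for each month 1..12 (assumption)
rng = np.random.default_rng(4)
mnth = np.tile(np.arange(1, 13), 25)
total_rentals = 2000 + 900 * mnth - 65 * mnth**2 + rng.normal(0, 400, size=mnth.size)
df = pd.DataFrame({"mnth": mnth, "total_rentals": total_rentals})

# One mean per month: the summary that x_estimator=np.mean puts on the plot
monthly_mean = df.groupby("mnth")["total_rentals"].mean()

# Each unique x value is collapsed to its estimate before being drawn
ax = sns.regplot(data=df, x="mnth", y="total_rentals",
                 x_estimator=np.mean, order=2)
```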

Binning the data

  • x_bins can be used to divide the data into discrete bins
  • The regression line is still fit against all the data
sns.regplot(data=df, x='temp', y='total_rentals',
            x_bins=4)

Regression Plot
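
The two points above can be checked on synthetic data (an assumption; the real dataset is not included here): with x_bins=4 only four summary points are drawn, yet the slope of the plotted line matches an ordinary least-squares fit on every observation.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the bicycle data (assumption: NOT the real dataset)
rng = np.random.default_rng(5)
temp = rng.uniform(0.1, 0.9, size=200)
total_rentals = 1000 + 6000 * temp + rng.normal(0, 800, size=200)
df = pd.DataFrame({"temp": temp, "total_rentals": total_rentals})

# Reference slope from a least-squares line fit on all 200 observations
slope, intercept = np.polyfit(df["temp"], df["total_rentals"], deg=1)

# x_bins=4 groups temp into four bins and draws one estimate per bin
ax = sns.regplot(data=df, x="temp", y="total_rentals",
                 x_bins=4)

# Slope of the regression line as drawn, taken from its two end points
line_x, line_y = ax.lines[-1].get_data()
plotted_slope = (line_y[-1] - line_y[0]) / (line_x[-1] - line_x[0])
```

Comparing plotted_slope with slope shows that binning changes only how the points are displayed, not the fitted model.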

Let's practice!
