Autocorrelation and partial autocorrelation

Visualizing Time Series Data in Python

Thomas Vincent

Head of Data Science, Getty Images

Autocorrelation in time series data

  • Autocorrelation is measured as the correlation between a time series and a delayed copy of itself
  • For example, an autocorrelation of order 3 returns the correlation between a time series at points (t_1, t_2, t_3, ...) and its own values 3 time points later, i.e. (t_4, t_5, t_6, ...)
  • It is used to find repetitive patterns or periodic signals in time series
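The definition above can be sketched in plain Python. This is a minimal illustration, not the statsmodels implementation: the helper name autocorrelation and the toy series are assumptions made for this example, and the function computes the ordinary Pearson correlation between the series and its lagged copy. Library routines such as those in statsmodels normalize with the mean and variance of the whole series, so their values can differ slightly.

```python
import math


def autocorrelation(series, lag):
    """Pearson correlation between a series and a copy of itself shifted by `lag`.

    Hypothetical helper written for illustration only.
    """
    if lag <= 0 or lag >= len(series) - 1:
        raise ValueError("lag must be between 1 and len(series) - 2")
    original = series[:-lag]   # (t_1, t_2, t_3, ...)
    lagged = series[lag:]      # (t_{1+lag}, t_{2+lag}, ...)
    n = len(original)
    mean_o = sum(original) / n
    mean_l = sum(lagged) / n
    cov = sum((a - mean_o) * (b - mean_l) for a, b in zip(original, lagged))
    var_o = sum((a - mean_o) ** 2 for a in original)
    var_l = sum((b - mean_l) ** 2 for b in lagged)
    return cov / math.sqrt(var_o * var_l)


# Toy series (an assumption for this sketch) that repeats every 3 points
values = [1.0, 2.0, 3.0] * 8

lag3 = autocorrelation(values, 3)  # the series lines up with itself again
lag1 = autocorrelation(values, 1)  # the series is out of phase with itself
print(lag3, lag1)
```

Because the toy series repeats every 3 points, its order-3 autocorrelation is essentially 1, while the order-1 autocorrelation is negative. A peak at a given lag is how a repeating pattern shows up.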

Statsmodels

statsmodels is a Python module that provides classes and functions for estimating many different statistical models, conducting statistical tests, and exploring data statistically.


Plotting autocorrelations

import matplotlib.pyplot as plt
from statsmodels.graphics import tsaplots

# Plot the autocorrelation of the co2 column for lags 0 through 40
fig = tsaplots.plot_acf(co2_levels['co2'], lags=40)

plt.show()

Interpreting autocorrelation plots

Autocorrelation in time series


Partial autocorrelation in time series data

  • Unlike autocorrelation, partial autocorrelation removes the effect of the shorter, intermediate lags
  • For example, a partial autocorrelation function of order 3 returns the correlation between our time series (t_1, t_2, t_3, ...) and its own values 3 time points later (t_4, t_5, t_6, ...), but only after removing all effects attributable to lags 1 and 2
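The difference between the two measures can be sketched in plain Python. This is a hedged illustration, not the method statsmodels uses by default: the helper names sample_acf and sample_pacf, the simulated series, and the choice of the Durbin-Levinson recursion are all assumptions made for this example. The series is built so that each value depends directly only on the previous one.

```python
import random


def sample_acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag (hypothetical helper)."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


def sample_pacf(series, max_lag):
    """Partial autocorrelations for lags 0..max_lag via Durbin-Levinson recursion."""
    r = sample_acf(series, max_lag)
    pacf = [1.0]
    prev = []  # coefficients of the order k-1 fit: prev[j] is the weight on lag j+1
    for k in range(1, max_lag + 1):
        # Part of the lag-k correlation not already explained by lags 1..k-1
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Simulated series (an assumption for this sketch): x_t = 0.8 * x_{t-1} + noise
random.seed(0)
x = [0.0]
for _ in range(4999):
    x.append(0.8 * x[-1] + random.gauss(0, 1))

acf_values = sample_acf(x, 3)
pacf_values = sample_pacf(x, 3)
print("acf :", [round(v, 2) for v in acf_values])
print("pacf:", [round(v, 2) for v in pacf_values])
```

In this simulated series the autocorrelation at lag 3 stays clearly positive, because the dependence on the previous value carries forward through lags 1 and 2. The partial autocorrelation at lag 3 is close to zero, since nothing is left once those shorter lags are accounted for.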

Plotting partial autocorrelations

import matplotlib.pyplot as plt
from statsmodels.graphics import tsaplots

# Plot the partial autocorrelation of the co2 column for lags 0 through 40
fig = tsaplots.plot_pacf(co2_levels['co2'], lags=40)

plt.show()

Interpreting partial autocorrelation plots

Partial autocorrelation in time series


Let's practice!
