Cointegration Models

Time Series Analysis in Python

Rob Reider

Adjunct Professor, NYU-Courant
Consultant, Quantopian

What is Cointegration?

  • Two series, $\large P_t$ and $\large Q_t$, can each be random walks
  • But the linear combination $\large P_t - c \ Q_t$ may not be a random walk!
  • If that's true
    • $\large P_t - c \ Q_t$ is forecastable
    • $\large P_t$ and $\large Q_t$ are said to be cointegrated
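
The idea above can be illustrated with simulated data. The following is a minimal sketch, not real market data: the coefficient `c = 2.0`, the AR(1) parameter `0.8`, and the helper `lag1_autocorr` are all assumptions chosen for illustration. It builds $Q_t$ as a random walk and then constructs $P_t$ so that $P_t - c \ Q_t$ is a stationary series.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
c = 2.0  # hypothetical cointegrating coefficient, chosen for illustration

# Q is a pure random walk: the cumulative sum of white noise.
Q = np.cumsum(rng.normal(size=n))

# The spread follows a stationary AR(1) process, so it keeps reverting to zero.
spread = np.zeros(n)
shocks = rng.normal(size=n)
for t in range(1, n):
    spread[t] = 0.8 * spread[t - 1] + shocks[t]

# P wanders with Q but never drifts far from c * Q.
P = c * Q + spread


def lag1_autocorr(x):
    """Sample correlation between a series and its one-period lag."""
    return np.corrcoef(x[:-1], x[1:])[0, 1]


print("lag-1 autocorrelation of P:        ", lag1_autocorr(P))
print("lag-1 autocorrelation of P - c * Q:", lag1_autocorr(P - c * Q))
```

In a run like this, the lag-1 autocorrelation of `P` should sit close to 1, as expected for a random walk, while the autocorrelation of `P - c * Q` should be noticeably lower, reflecting its mean reversion.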

Analogy: Dog on a Leash

  • $\large P_t = $ Owner
  • $\large Q_t = $ Dog
  • Both series look like a random walk
  • Difference, or distance between them, looks mean reverting
    • If dog falls too far behind, it gets pulled forward
    • If dog gets too far ahead, it gets pulled back
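
The "pull" of the leash can be made concrete with a small simulation. This is only a sketch under assumed values: the leash strength `pull = 0.2` and the variable names `owner`, `dog`, and `gap` are hypothetical. Each period, the gap between dog and owner shrinks by a fraction `pull` before a new random step is added, and a regression of the change in the gap on the previous gap recovers that pull as a negative slope.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
pull = 0.2  # hypothetical leash strength: fraction of the gap closed each period

# The owner follows a random walk.
owner = np.cumsum(rng.normal(size=n))

# The gap between dog and owner is pulled back toward zero every period.
gap = np.zeros(n)
steps = rng.normal(size=n)
for t in range(1, n):
    gap[t] = gap[t - 1] - pull * gap[t - 1] + steps[t]

# The dog's position is the owner's position plus the gap.
dog = owner + gap

# Regress the change in the gap on the previous gap.
# A negative slope means a large gap tends to be followed by a move back toward zero.
slope, intercept = np.polyfit(gap[:-1], np.diff(gap), 1)
print("estimated pull on the gap:", slope)
```

The estimated slope should come out near `-pull`. A slope of zero would correspond to a broken leash, where the gap itself is a random walk.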

Example: Heating Oil and Natural Gas

  • Heating Oil and Natural Gas both look like random walks...

Example: Heating Oil and Natural Gas

  • But the spread (difference) is mean reverting

What Types of Series are Cointegrated?

  • Economic substitutes
    • Heating Oil and Natural Gas
    • Platinum and Palladium
    • Corn and Wheat
    • Corn and Sugar
    • ...
    • Bitcoin and Ethereum?
  • How about competitors?
    • Coke and Pepsi?
    • Apple and Blackberry? No! The leash broke and the dog ran away

Two Steps to Test for Cointegration

  • Regress $\large P_t$ on $\large Q_t$ and get slope $\large c$
  • Run Augmented Dickey-Fuller test on $\large P_t - c \ Q_t$ to test for random walk
  • Alternatively, can use coint function in statsmodels that combines both steps
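from statsmodels.tsa.stattools import coint
# coint returns the test statistic, the p-value, and the critical values
score, pvalue, crit_values = coint(P, Q)
# a small p-value rejects the null hypothesis of no cointegration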

Let's practice!
