Time Series Analysis in Python
Rob Reider
Adjunct Professor, NYU-Courant
Consultant, Quantopian
$y_t = \alpha + \beta x_t + \epsilon_t$
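For reference, the ordinary least squares estimates of the intercept and slope in this model have a closed form. This is the standard textbook result, stated here as a sketch; the sample means $\bar{x}$, $\bar{y}$ and the sample size $T$ are notation assumed for illustration and do not appear in the slides.

$$\hat{\beta} = \frac{\sum_{t=1}^{T} (x_t - \bar{x})(y_t - \bar{y})}{\sum_{t=1}^{T} (x_t - \bar{x})^2}, \qquad \hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x}$$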
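# statsmodels (add a constant column to x to fit the intercept)
import statsmodels.api as sm
sm.OLS(y, x).fit()
# numpy
import numpy as np
np.polyfit(x, y, deg=1)
# pandas (legacy: pd.ols was removed in pandas 0.20)
pd.ols(y, x)
# scipy
from scipy import stats
stats.linregress(x, y)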
Warning: the order of x and y is not consistent across packages.
import statsmodels.api as sm
df['SPX_Ret'] = df['SPX_Prices'].pct_change()
df['R2000_Ret'] = df['R2000_Prices'].pct_change()
df = sm.add_constant(df)
             SPX_Prices  R2000_Prices    SPX_Ret  R2000_Ret
Date
2012-11-01  1427.589966    827.849976        NaN        NaN
2012-11-02  1414.199951    814.369995  -0.009379  -0.016283
df = df.dropna()
results = sm.OLS(df['R2000_Ret'],df[['const','SPX_Ret']]).fit()
print(results.summary())
results.params['const']    # intercept
results.params['SPX_Ret']  # slope