Manipulating Time Series Data in Python
Stefan Jansen
Founder & Lead Data Scientist at Applied Artificial Intelligence
pandas & seaborn have tools to compute & visualize correlation
The correlation coefficient r ranges between -1 and +1
$r = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{s_x s_y}$, where $s_x = \sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2}$ and $s_y$ is defined analogously
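To make the formula concrete, here is a minimal sketch that computes r by hand and compares it with the built-in pandas method. The toy series `x` and `y` and the helper `pearson_r` are hypothetical and not part of the course data. The sketch assumes that s_x and s_y denote the square roots of the summed squared deviations, which is what makes the formula agree with `Series.corr`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data for illustration only (not from assets.csv)
x = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
y = pd.Series([2.0, 4.1, 5.9, 8.2, 9.9])

def pearson_r(x, y):
    """Pearson correlation computed directly from the formula."""
    dx = x - x.mean()                 # deviations from the mean of x
    dy = y - y.mean()                 # deviations from the mean of y
    s_x = np.sqrt((dx ** 2).sum())    # assumption: root of summed squared deviations
    s_y = np.sqrt((dy ** 2).sum())
    return (dx * dy).sum() / (s_x * s_y)

r_manual = pearson_r(x, y)
r_pandas = x.corr(y)                  # pandas uses Pearson correlation by default
print(r_manual, r_pandas)
```

Both values should agree up to floating-point error, and any result lies between -1 and +1.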
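import pandas as pd
import seaborn as sns
data = pd.read_csv('assets.csv', parse_dates=['date'], index_col='date').dropna()  # drop rows with missing values
data.info()  # prints a summary and returns None, so its result is not assigned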
DatetimeIndex: 2469 entries, 2007-05-25 to 2017-05-22
Data columns (total 5 columns):
sp500 2469 non-null float64
nasdaq 2469 non-null float64
bonds 2469 non-null float64
gold 2469 non-null float64
oil 2469 non-null float64
daily_returns = data.pct_change()  # percent change from one row to the next
sns.jointplot(x='sp500', y='nasdaq', data=daily_returns)
correlations = daily_returns.corr()  # pairwise Pearson correlations between columns
correlations
           bonds       oil      gold     sp500    nasdaq
bonds   1.000000 -0.183755  0.003167 -0.300877 -0.306437
oil    -0.183755  1.000000  0.105930  0.335578  0.289590
gold    0.003167  0.105930  1.000000 -0.007786 -0.002544
sp500  -0.300877  0.335578 -0.007786  1.000000  0.959990
nasdaq -0.306437  0.289590 -0.002544  0.959990  1.000000
sns.heatmap(correlations, annot=True)
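The returns-then-correlation workflow can be tried without the assets file. The sketch below uses hypothetical synthetic prices (the column names `stock_a`, `stock_b`, `other` and all parameters are made up for illustration), in which two series share a common daily shock and a third moves independently. It only illustrates `pct_change()` and `corr()`; plotting is left out.

```python
import numpy as np
import pandas as pd

# Hypothetical synthetic prices standing in for assets.csv
rng = np.random.default_rng(42)
n = 250
dates = pd.date_range('2017-01-02', periods=n, freq='B')
common = rng.normal(0, 0.01, size=n)  # shock shared by stock_a and stock_b

prices = pd.DataFrame({
    'stock_a': 100 * np.cumprod(1 + common + rng.normal(0, 0.002, n)),
    'stock_b': 100 * np.cumprod(1 + common + rng.normal(0, 0.002, n)),
    'other':   100 * np.cumprod(1 + rng.normal(0, 0.01, n)),
}, index=dates)

# pct_change() computes p_t / p_(t-1) - 1; the first row is NaN and is dropped
daily_returns = prices.pct_change().dropna()

# Pairwise Pearson correlations: symmetric, with ones on the diagonal
correlations = daily_returns.corr()
print(correlations.round(3))
```

Because `stock_a` and `stock_b` share most of their daily variation, their correlation should come out much higher than either one's correlation with `other`, similar to the sp500 and nasdaq pair above.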