Manipulating Time Series Data in Python
Stefan Jansen
Founder & Lead Data Scientist at Applied Artificial Intelligence
import pandas as pd
data = pd.read_csv('google.csv', parse_dates=['date'], index_col='date')
DatetimeIndex: 1761 entries, 2010-01-04 to 2016-12-30
Data columns (total 1 columns):
price 1761 non-null float64
dtypes: float64(1)
# Integer-based window size
data.rolling(window=30).mean() # fixed # observations
DatetimeIndex: 1761 entries, 2010-01-04 to 2016-12-30
Data columns (total 1 columns):
price 1732 non-null float64
dtypes: float64(1)
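The 29 missing values in the output above (1761 rows, 1732 non-null) follow from the default behaviour of an integer window: no result is reported until the window is full. The following is a minimal sketch of that effect and of how `min_periods` changes it. The `prices` frame is a synthetic stand-in, not the Google data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the price data: 60 business days of synthetic values
idx = pd.date_range('2010-01-04', periods=60, freq='B')
prices = pd.DataFrame({'price': np.arange(60, dtype=float)}, index=idx)

# Default: min_periods equals the window size, so the first 29 rows are NaN
strict = prices.rolling(window=30).mean()

# min_periods=1: a mean is reported as soon as one observation is available
relaxed = prices.rolling(window=30, min_periods=1).mean()

print(strict['price'].isna().sum())   # 29
print(relaxed['price'].isna().sum())  # 0
```

With `min_periods` below 30, the early values are averages over fewer observations, so they are noisier than the rest of the series.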
window=30: number of business days
min_periods: choose a value < 30 to get results for the first days
# Offset-based window size
data.rolling(window='30D').mean() # fixed period length
DatetimeIndex: 1761 entries, 2010-01-04 to 2016-12-30
Data columns (total 1 columns):
price 1761 non-null float64
dtypes: float64(1)
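An offset-based window covers a fixed span of calendar time, so the number of observations it contains varies, and a result is available from the first row onward (hence 1761 non-null values above). The sketch below counts the observations per window on a synthetic business-day index. The names `ones` and `obs_per_window` are illustrative only.

```python
import pandas as pd

# Hypothetical business-day index; a constant series of ones lets sum() count rows
idx = pd.date_range('2010-01-04', periods=60, freq='B')
ones = pd.Series(1.0, index=idx)

# Number of observations that fall inside each trailing 30-calendar-day window
obs_per_window = ones.rolling(window='30D').sum()

print(obs_per_window.iloc[0])   # a single observation on the first day
print(obs_per_window.max())     # never more than 22 business days per 30 calendar days
print(obs_per_window.isna().sum())
```

Because weekends fall inside the 30 calendar days, a `'30D'` window averages over fewer prices than `window=30` does.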
30D: number of calendar days
r90 = data.rolling(window='90D').mean()
data.join(r90.add_suffix('_mean_90')).plot()
data['mean90'] = r90
r360 = data['price'].rolling(window='360D').mean()
data['mean360'] = r360; data.plot()
r = data.price.rolling('90D').agg(['mean', 'std'])
r.plot(subplots = True)
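As a rough illustration of what `.agg(['mean', 'std'])` returns, the sketch below applies it to a synthetic random-walk series. The data and the seed are assumptions made for the example. The result is a DataFrame with one column per statistic, which is why `subplots=True` draws one panel for each.

```python
import numpy as np
import pandas as pd

# Hypothetical price series: a seeded random walk on a business-day index
rng = np.random.default_rng(0)
idx = pd.date_range('2010-01-04', periods=250, freq='B')
price = pd.Series(100 + rng.normal(size=250).cumsum(), index=idx, name='price')

# One rolling object, several statistics computed in a single call
stats = price.rolling('90D').agg(['mean', 'std'])

print(stats.columns.tolist())  # ['mean', 'std']
print(stats.head(3))           # std is NaN on the first row (only one observation)
```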
rolling = data.price.rolling('360D')
q10 = rolling.quantile(0.1).to_frame('q10')
median = rolling.median().to_frame('median')
q90 = rolling.quantile(0.9).to_frame('q90')
pd.concat([q10, median, q90], axis=1).plot()
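The same quantile pattern can be tried without the CSV file. The sketch below builds the three bands on a synthetic series and checks that they are ordered as expected. The data, the seed, and the name `bands` are assumptions for the example, and plotting is left out.

```python
import numpy as np
import pandas as pd

# Hypothetical price series: a seeded random walk on a business-day index
rng = np.random.default_rng(1)
idx = pd.date_range('2010-01-04', periods=500, freq='B')
price = pd.Series(100 + rng.normal(size=500).cumsum(), index=idx, name='price')

# Reuse one rolling object for all three statistics
rolling = price.rolling('360D')
q10 = rolling.quantile(0.1).to_frame('q10')
median = rolling.median().to_frame('median')
q90 = rolling.quantile(0.9).to_frame('q90')

# Align the three single-column frames on the shared date index
bands = pd.concat([q10, median, q90], axis=1)
print(bands.tail())
```

Calling `bands.plot()` would draw the three lines on one chart, with the 10% and 90% quantiles forming a band around the rolling median.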