Machine Learning for Time Series Data in Python
Chris Holdgraf
Fellow, Berkeley Institute for Data Science
from glob import glob
files = glob('data/heartbeat-sounds/proc/files/*.wav')
print(files)
['data/heartbeat-sounds/proc/files/murmur__201101051104.wav',
 ...
 'data/heartbeat-sounds/proc/files/murmur__201101051114.wav']
import librosa as lr
# `load` accepts a path to an audio file
audio, sfreq = lr.load('data/heartbeat-sounds/proc/files/murmur__201101051104.wav')
print(sfreq)
2205
In this case, the sampling frequency is 2205 Hz, meaning the recording contains 2205 samples per second of audio.
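Because the sampling frequency is a count of samples per second, the duration of a recording follows from the number of samples. A minimal sketch, assuming stand-in values for `audio` and `sfreq` rather than the real heartbeat file:

```python
import numpy as np

# Hypothetical stand-ins for the values returned by lr.load
sfreq = 2205
audio = np.zeros(6615)  # a silent signal with 6615 samples

# Number of samples divided by samples-per-second gives seconds
duration_seconds = len(audio) / sfreq
print(duration_seconds)
```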
Create an array of indices, one for each sample, and divide by the sampling frequency to convert sample numbers to seconds
  import numpy as np
  indices = np.arange(0, len(audio))
  time = indices / sfreq
Find the timestamp of the last data point (index N-1). Then use linspace() to generate N evenly spaced points from zero to that time
  final_time = (len(audio) - 1) / sfreq
  time = np.linspace(0, final_time, len(audio))
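As a rough check, the two approaches should give the same time axis when linspace() is asked for one point per sample. The sketch below uses a synthetic signal; `audio` and `sfreq` here are stand-in values, not the heartbeat recording.

```python
import numpy as np

# Stand-in values: a synthetic 2-second signal, not the real heartbeat audio
sfreq = 2205
audio = np.zeros(2 * sfreq)

# Approach 1: sample indices divided by the sampling frequency
time_from_indices = np.arange(len(audio)) / sfreq

# Approach 2: evenly spaced points from 0 to the time of the last sample
final_time = (len(audio) - 1) / sfreq
time_from_linspace = np.linspace(0, final_time, len(audio))

# Both arrays have one entry per sample and agree up to floating-point error
same = np.allclose(time_from_indices, time_from_linspace)
print(len(time_from_indices), len(time_from_linspace), same)
```

Passing the sampling frequency as the third argument to linspace() would instead return only one second's worth of points, so the time array would no longer line up with the audio array.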
import pandas as pd
data = pd.read_csv('path/to/data.csv')
data.columns
Index(['date', 'symbol', 'close', 'volume'], dtype='object')
data.head()
         date symbol       close       volume
0  2010-01-04   AAPL  214.009998  123432400.0
1  2010-01-04    ABT   54.459951   10829000.0
2  2010-01-04    AIG   29.889999    7750900.0
3  2010-01-04   AMAT   14.300000   18615100.0
4  2010-01-04   ARNC   16.650013   11512100.0
dtypes attribute
df['date'].dtypes
0    object
1    object
2    object
dtype: object
to_datetime() function
df['date'] = pd.to_datetime(df['date'])
df['date']
0   2017-01-01
1   2017-01-02
2   2017-01-03
Name: date, dtype: datetime64[ns]
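A small self-contained sketch of the conversion, assuming a toy DataFrame whose dates start out as strings (the values are illustrative only). Once the column holds datetimes, components such as the day of the month become available through the .dt accessor.

```python
import pandas as pd

# Toy data for illustration only; dates start out as plain strings
df = pd.DataFrame({'date': ['2017-01-01', '2017-01-02', '2017-01-03']})
dtype_before = df['date'].dtype  # a string-like dtype, not a datetime

# Parse the strings into timestamps
df['date'] = pd.to_datetime(df['date'])

# Datetime columns expose components such as the day through .dt
days = df['date'].dt.day.tolist()
print(dtype_before, df['date'].dtype, days)
```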