Engineering features

Machine Learning for Finance in Python

Nathan George

Data Science Professor

nonlinear feature correlation

One problem with linear models

# add non-linear interaction term for a linear model
SMAxRSI = amd_df['14-day SMA'] * amd_df['14-day RSI']
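A minimal, self-contained sketch of the interaction term above. The column names follow the slide, but the tiny DataFrame and its values are invented for illustration and stand in for the real `amd_df`.

```python
import pandas as pd

# Illustrative stand-in for amd_df; these numbers are made up, not real AMD data
amd_df = pd.DataFrame({
    '14-day SMA': [10.0, 10.5, 11.0],
    '14-day RSI': [40.0, 50.0, 60.0],
})

# Element-wise product of two features: a non-linear interaction term
# that a linear model can then weight with a single coefficient
amd_df['SMAxRSI'] = amd_df['14-day SMA'] * amd_df['14-day RSI']

print(amd_df)
```

Storing the product as a new column lets it be passed to the model alongside the original features.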

Some models that don't require manually creating interaction features:

Decision-tree-based models

  • Random forests
  • Gradient boosting

Others

  • Neural networks
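A rough sketch of why tree-based models can skip the manual interaction feature. It assumes scikit-learn is installed and uses synthetic data (not the course's stock data) where the target is exactly the product of two features; the model settings are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data (illustrative only): the target is a pure interaction x1 * x2
rng = np.random.default_rng(42)
X = rng.uniform(-1, 1, size=(2000, 2))
y = X[:, 0] * X[:, 1]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Both models see only the raw features, with no hand-made x1 * x2 column
linear = LinearRegression().fit(X_train, y_train)
forest = RandomForestRegressor(n_estimators=100, random_state=42).fit(X_train, y_train)

# R^2 on held-out data
linear_r2 = linear.score(X_test, y_test)
forest_r2 = forest.score(X_test, y_test)
print(f"linear R^2: {linear_r2:.3f}")
print(f"forest R^2: {forest_r2:.3f}")
```

On data like this the linear model has nothing to fit, while the forest can approximate the product from the raw inputs by splitting on both features.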

volume vs price change

price and volume

Volume features

amd_df['Adj_Volume_1d_change'] = amd_df['Adj_Volume'].pct_change()

one_day_change = amd_df['Adj_Volume_1d_change'].values
amd_df['Adj_Volume_1d_change_SMA'] = talib.SMA(one_day_change, timeperiod=10)
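The same two volume features can be sketched without TA-Lib. This version swaps in pandas' `rolling(10).mean()` for `talib.SMA`, which should yield the same simple moving average; the volume series is synthetic and only there to make the example runnable.

```python
import numpy as np
import pandas as pd

# Synthetic adjusted-volume series (illustrative values, not real AMD data)
volume = 1_000_000 * (1 + 0.1 * np.sin(np.arange(15)))
amd_df = pd.DataFrame({'Adj_Volume': volume})

# 1-day percent change in volume; the first row is NaN (no previous day)
amd_df['Adj_Volume_1d_change'] = amd_df['Adj_Volume'].pct_change()

# 10-day simple moving average of the 1-day change,
# using pandas rolling mean in place of talib.SMA
amd_df['Adj_Volume_1d_change_SMA'] = (
    amd_df['Adj_Volume_1d_change'].rolling(10).mean()
)

print(amd_df.tail())
```

The first ten SMA values are NaN: one from the percent change plus nine more before the 10-day window fills, so those rows need to be dropped before modeling.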

Datetime feature engineering

graphic of datetime with arrows to day of week, month, year

Extracting the day of week

print(amd_df.index.dayofweek)
Int64Index([2, 3, 4, 0, 1, 2, 3, 4, 0, 1,
            ...
            1, 2, 3, 4, 0, 1, 2, 3, 4, 0],
           dtype='int64', name='Date', length=4807)
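A small sketch of pulling calendar parts out of a `DatetimeIndex`. The five business days below are generated with `pd.date_range` as a stand-in for the index of `amd_df`; `dayofweek` counts from Monday = 0.

```python
import pandas as pd

# Five consecutive business days starting Tuesday 2018-04-10 (stand-in index)
dates = pd.date_range('2018-04-10', periods=5, freq='B', name='Date')

# Each attribute returns one integer per timestamp
features = pd.DataFrame({
    'dayofweek': dates.dayofweek,  # Monday = 0 ... Friday = 4
    'month': dates.month,
    'year': dates.year,
}, index=dates)

print(features)
```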

Dummies

days_of_week = pd.get_dummies(amd_df.index.dayofweek,
                                prefix='weekday',
                                drop_first=True)
# get_dummies returns a default integer index; restore the dates
days_of_week.index = amd_df.index

print(days_of_week.head())
            weekday_1  weekday_2  weekday_3  weekday_4
Date
2018-04-10          1          0          0          0
2018-04-11          0          1          0          0
2018-04-12          0          0          1          0
2018-04-13          0          0          0          1
2018-04-16          0          0          0          0
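A runnable sketch of the dummy-variable step, ending with the dummies joined back onto the features. The `amd_df` here is a hypothetical five-row frame with an invented price column; `dtype=int` is passed so that recent pandas versions print 1/0 rather than True/False.

```python
import pandas as pd

# Hypothetical stand-in for amd_df with a DatetimeIndex (prices are made up)
dates = pd.date_range('2018-04-10', periods=5, freq='B', name='Date')
amd_df = pd.DataFrame({'Adj_Close': [9.9, 10.0, 10.1, 9.8, 10.2]}, index=dates)

# One 0/1 column per weekday; drop_first removes Monday (0), so a row of
# all zeros means Monday and the columns are not perfectly collinear
days_of_week = pd.get_dummies(amd_df.index.dayofweek,
                              prefix='weekday',
                              drop_first=True,
                              dtype=int)

# get_dummies on an index returns a default integer index; restore the dates
days_of_week.index = amd_df.index

# Join the dummies to the other features column-wise
amd_df = pd.concat([amd_df, days_of_week], axis=1)
print(amd_df)
```

Because only weekdays present in the data get a column, a short sample that lacks some weekday would produce fewer dummy columns than the full dataset.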

correlation plot of new features

Engineer some features!
