Practicing Machine Learning Interview Questions in Python
Lisa Stuart
Data Scientist
Method | Use an ML model | Select best subset | Can overfit |
---|---|---|---|
Filter | No | No | No |
Wrapper | Yes | Yes | Sometimes |
Embedded | Yes | Yes | Yes |
Feature importance | Yes | Yes | Yes |
Feature/Response | Continuous | Categorical |
---|---|---|
Continuous | Pearson's Correlation | LDA |
Categorical | ANOVA | Chi-Square |
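
The pairings in the table above can be tried out on synthetic data. This is a minimal sketch, assuming SciPy and scikit-learn are available; the variable names and the generated data are hypothetical and chosen only so that each test has a clear signal to detect.

```python
import numpy as np
import pandas as pd
from scipy import stats
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
n = 200

# Continuous feature, continuous response: Pearson's correlation
x_cont = rng.normal(size=n)
y_cont = 2.0 * x_cont + rng.normal(scale=0.5, size=n)
r, r_pvalue = stats.pearsonr(x_cont, y_cont)

# Categorical feature, continuous response: one-way ANOVA
group = rng.integers(0, 3, size=n)  # three categories: 0, 1, 2
y_by_group = 1.5 * group + rng.normal(size=n)
f_stat, f_pvalue = stats.f_oneway(*(y_by_group[group == g] for g in range(3)))

# Categorical feature, categorical response: chi-square test of independence
x_cat = rng.integers(0, 2, size=n)
flip = rng.random(n) < 0.1  # flip about 10% of labels to add noise
y_cat = np.where(flip, 1 - x_cat, x_cat)
table = pd.crosstab(x_cat, y_cat)
chi2_stat, chi2_pvalue, dof, expected = stats.chi2_contingency(table)

# Continuous feature, categorical response: LDA
x_lda = (2.0 * y_cat + rng.normal(size=n)).reshape(-1, 1)
lda = LinearDiscriminantAnalysis().fit(x_lda, y_cat)
lda_acc = lda.score(x_lda, y_cat)  # training accuracy

print(f"Pearson r={r:.2f}, ANOVA p={f_pvalue:.3g}, "
      f"chi-square p={chi2_pvalue:.3g}, LDA accuracy={lda_acc:.2f}")
```

A small p-value (or a high LDA accuracy) suggests the feature carries information about the response and may be worth keeping.
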
Function | returns |
---|---|
df.corr() | Pearson's correlation matrix |
sns.heatmap(corr_object) | heatmap plot |
abs() | absolute value |
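
One way these functions might combine into a simple filter method is sketched below. The DataFrame, its column names, and the 0.9 threshold are hypothetical assumptions for illustration. The heatmap call is left as a comment because it requires seaborn and a plotting backend.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 300

# Hypothetical data: "a_copy" nearly duplicates "a"
df = pd.DataFrame({"a": rng.normal(size=n), "b": rng.normal(size=n)})
df["a_copy"] = df["a"] + rng.normal(scale=0.05, size=n)
df["target"] = 3 * df["a"] - 2 * df["b"] + rng.normal(scale=0.5, size=n)

corr = df.corr()  # Pearson's correlation matrix
# import seaborn as sns; sns.heatmap(corr, annot=True)  # optional heatmap plot

# Rank features by absolute correlation with the target
target_corr = abs(corr["target"].drop("target")).sort_values(ascending=False)

# Flag one feature from each highly correlated feature pair
features = abs(corr.drop(index="target", columns="target"))
upper = features.where(np.triu(np.ones(features.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]

print(target_corr)
print("Redundant features:", to_drop)
```

Only the upper triangle of the matrix is inspected so that each pair is counted once and only the later column of a redundant pair is flagged.
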
sklearn.ensemble.RandomForestRegressor
sklearn.ensemble.ExtraTreesRegressor
tree_mod.feature_importances_
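
A minimal sketch of reading `feature_importances_` from the two tree ensembles listed above, assuming scikit-learn is installed. The synthetic data and the names `rf_mod` and `xt_mod` are hypothetical; only the first two columns influence the response.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
# Column 0 dominates, column 1 contributes weakly, columns 2 and 3 are noise
y = 5 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=300)

rf_mod = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
xt_mod = ExtraTreesRegressor(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances, normalized to sum to 1
rf_importances = rf_mod.feature_importances_
xt_importances = xt_mod.feature_importances_

for name, imp in [("RandomForest", rf_importances), ("ExtraTrees", xt_importances)]:
    print(name, np.round(imp, 3))
```

Features with importances near zero are candidates for removal.
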
Function | returns |
---|---|
sklearn.svm.SVR | support vector regression estimator |
sklearn.feature_selection.RFECV | recursive feature elimination with cross-val |
rfe_mod.support_ | boolean array of selected features |
rfe_mod.ranking_ | feature ranking, selected=1 |
sklearn.linear_model.LinearRegression | linear model estimator |
sklearn.linear_model.LarsCV | least angle regression with cross-val |
LarsCV.score | r-squared score |
LarsCV.alpha_ | estimated regularization parameter |
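
The wrapper and embedded tools in the table above might be used as follows. This is a sketch under stated assumptions: the synthetic data, the 5-fold setting, and the names `rfe_mod`, `lm_rfe`, and `lars_mod` are hypothetical. `SVR` is given a linear kernel because `RFECV` needs an estimator that exposes `coef_` or `feature_importances_`.

```python
import numpy as np
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LarsCV, LinearRegression
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# Only the first two columns drive the response
y = 4 * X[:, 0] - 3 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Wrapper: recursive feature elimination with cross-validation
rfe_mod = RFECV(estimator=SVR(kernel="linear"), step=1, cv=5).fit(X, y)
print("Selected:", rfe_mod.support_)   # boolean array of selected features
print("Ranking: ", rfe_mod.ranking_)   # selected features have rank 1

# The same wrapper around a plain linear model
lm_rfe = RFECV(estimator=LinearRegression(), cv=5).fit(X, y)

# Embedded: least angle regression with cross-validated regularization
lars_mod = LarsCV(cv=5).fit(X, y)
r2 = lars_mod.score(X, y)              # r-squared on the training data
print(f"LarsCV r-squared={r2:.3f}, alpha={lars_mod.alpha_:.4g}")
```

`RFECV` may keep more than the two informative features if the extra columns do not hurt the cross-validated score, so `ranking_` is often more informative than `support_` alone.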