Splitting the data

HR Analytics: prevedere l'abbandono dei dipendenti in Python

Hrant Davtyan

Assistant Professor of Data Science American University of Armenia

Target and features

  • target = churn
  • features = everything else
HR Analytics: prevedere l'abbandono dei dipendenti in Python

Train/test split

  • train - the component used to develop the model
  • test - the component used to validate the model
from sklearn.model_selection import train_test_split

target_train, target_test, features_train, features_test =
                           train_test_split(target,features,test_size=0.25)
HR Analytics: prevedere l'abbandono dei dipendenti in Python

Overfitting

an error that occurs when model works well enough for the dataset it was developed on (train) but is not useful outside of it (test)

HR Analytics: prevedere l'abbandono dei dipendenti in Python

Let's practice!

HR Analytics: prevedere l'abbandono dei dipendenti in Python

Preparing Video For Download...