HR Analytics: Predicting Employee Churn in Python
Hrant Davtyan
Assistant Professor of Data Science, American University of Armenia
# Change the type of the "salary" column to categorical
data.salary = data.salary.astype('category')
# Provide the correct order of categories
data.salary = data.salary.cat.reorder_categories(['low', 'medium', 'high'])
# Encode categories with integer values
data.salary = data.salary.cat.codes
| Old values | New values |
|---|---|
| low | 0 |
| medium | 1 |
| high | 2 |
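The mapping above can be reproduced on a small toy DataFrame. This is a minimal sketch, not the course's own code: the `toy` frame, its values, and the `salary_code` column name are assumptions made for illustration, and `pd.CategoricalDtype` is used here as an alternative way to set the category order in a single step.

```python
import pandas as pd

# Toy data; the values are illustrative, not taken from the course dataset
toy = pd.DataFrame({"salary": ["low", "high", "medium", "low"]})

# Declare the categories and their order in one step
salary_type = pd.CategoricalDtype(categories=["low", "medium", "high"], ordered=True)
toy["salary"] = toy["salary"].astype(salary_type)

# cat.codes follows the declared order: low -> 0, medium -> 1, high -> 2
toy["salary_code"] = toy["salary"].cat.codes
codes = toy["salary_code"].tolist()
print(codes)
```

Keeping the codes in a separate column, as done here, leaves the original labels available for inspection; overwriting the column in place is equally valid.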
# Get dummies and save them inside a new DataFrame
departments = pd.get_dummies(data.department)
Example output:
   IT  RandD  accounting  hr  management  marketing  product_mng  sales  support  technical
0   0      0           0   0           0          0            0      0        0          1
departments.head()
   IT  RandD  accounting  hr  management  marketing  product_mng  sales  support  technical
0   0      0           0   0           0          0            0      0        0          1
# Drop one dummy column: its value is implied by the others (avoids the dummy trap)
departments = departments.drop("technical", axis=1)
departments.head()
   IT  RandD  accounting  hr  management  marketing  product_mng  sales  support
0   0      0           0   0           0          0            0      0        0
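The dummy-encoding steps above can be sketched end to end on a toy DataFrame. The `toy` frame and its values are assumptions made for illustration. The final step, replacing the original `department` column with the dummy columns via `join`, is one plausible way to finish the encoding and is not shown in the slides above.

```python
import pandas as pd

# Toy data; the values are illustrative, not taken from the course dataset
toy = pd.DataFrame({
    "department": ["technical", "sales", "IT", "sales"],
    "salary": [0, 1, 2, 0],
})

# One 0/1 column per department (dtype=int keeps integers rather than booleans)
departments = pd.get_dummies(toy["department"], dtype=int)

# Drop one column; a row with all zeros now means "technical"
departments = departments.drop("technical", axis=1)

# Replace the original text column with the dummy columns
encoded = toy.drop("department", axis=1).join(departments)
print(encoded)
```

`pd.get_dummies` also accepts `drop_first=True`, which drops the first category (here `IT`) rather than a column chosen by name.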