Intermediate Predictive Analytics in Python
Nele Verbiest, Ph.D.
Senior Data Scientist @PythonPredictions
Logistic regression: $logit(a_1x_1 + a_2x_2 + ... + a_nx_n + b)$
donor_id | gender | country | segment |
---|---|---|---|
5 | F | India | Gold |
3 | M | USA | Silver |
2 | M | India | Bronze |
8 | F | UK | Silver |
1 | F | USA | Bronze |
Logistic regression: $logit(a_1x_1 + a_2x_2 + ... + a_nx_n + b)$
donor_id | gender | country | segment | gender_F | gender_M |
---|---|---|---|---|---|
5 | F | India | Gold | 1 | 0 |
3 | M | USA | Silver | 0 | 1 |
2 | M | India | Bronze | 0 | 1 |
8 | F | UK | Silver | 1 | 0 |
1 | F | USA | Bronze | 1 | 0 |
donor_id | gender | gender_F | gender_M |
---|---|---|---|
5 | F | 1 | 0 |
3 | M | 0 | 1 |
2 | M | 0 | 1 |
8 | F | 1 | 0 |
1 | F | 1 | 0 |
donor_id | gender | gender_F |
---|---|---|
5 | F | 1 |
3 | M | 0 |
2 | M | 0 |
8 | F | 1 |
1 | F | 1 |
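The gender tables above can be reproduced with a short sketch. It assumes a toy `donors` DataFrame built from the rows shown (the name `donors` and the helper variable `redundant` are illustrative, not from the slides), and it passes `dtype=int` so the dummies come out as 0/1 rather than booleans. The point is that `gender_M` is always `1 - gender_F`, so one column is enough.

```python
import pandas as pd

# Toy data copied from the table above (hypothetical DataFrame name).
donors = pd.DataFrame({
    "donor_id": [5, 3, 2, 8, 1],
    "gender": ["F", "M", "M", "F", "F"],
})

# One dummy column per category: gender_F and gender_M.
gender_dummies = pd.get_dummies(donors["gender"], prefix="gender", dtype=int)

# The two columns always sum to 1, so gender_M adds no information.
redundant = bool(
    (gender_dummies["gender_F"] + gender_dummies["gender_M"] == 1).all()
)

# Keep only gender_F; gender_M is implied as 1 - gender_F.
donors["gender_F"] = gender_dummies["gender_F"]
```

Keeping both columns would make them perfectly collinear, which is why the last table retains only `gender_F`.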
donor_id | country | country_USA | country_India | country_UK |
---|---|---|---|---|
5 | India | 0 | 1 | 0 |
3 | USA | 1 | 0 | 0 |
2 | India | 0 | 1 | 0 |
8 | UK | 0 | 0 | 1 |
1 | USA | 1 | 0 | 0 |
donor_id | country | country_USA | country_India |
---|---|---|---|
5 | India | 0 | 1 |
3 | USA | 1 | 0 |
2 | India | 0 | 1 |
8 | UK | 0 | 0 |
1 | USA | 1 | 0 |
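For a variable with three categories, the same idea can be sketched as follows. The `donors` DataFrame is again an assumed toy copy of the rows above. Note that `drop_first=True` would drop the alphabetically first category (`India` here), so to match the table, where `UK` is the omitted reference category, this sketch drops `country_UK` explicitly.

```python
import pandas as pd

# Toy data copied from the table above (hypothetical DataFrame name).
donors = pd.DataFrame({
    "donor_id": [5, 3, 2, 8, 1],
    "country": ["India", "USA", "India", "UK", "USA"],
})

# Three dummy columns: country_India, country_UK, country_USA.
country_dummies = pd.get_dummies(donors["country"], prefix="country", dtype=int)

# Drop the reference category by name; a UK donor is then encoded as all zeros.
country_dummies = country_dummies.drop(columns="country_UK")

donors = pd.concat([donors, country_dummies], axis=1)
```

In general, a variable with k categories needs only k - 1 dummy columns.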
index | donor_id | segment |
---|---|---|
0 | 32770 | Gold |
1 | 32776 | Silver |
2 | 32777 | Bronze |
3 | 65552 | Bronze |
import pandas as pd

# Create the dummy variable
dummies_segment = pd.get_dummies(basetable["segment"], drop_first=True)
# Add the dummy variable to the basetable
basetable = pd.concat([basetable, dummies_segment], axis=1)
# Delete the original variable from the basetable
del basetable["segment"]
index | donor_id | Gold | Silver |
---|---|---|---|
0 | 32770 | 1 | 0 |
1 | 32776 | 0 | 1 |
2 | 32777 | 0 | 0 |
3 | 65552 | 0 | 0 |
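The three steps above can be run end to end with a minimal sketch. It assumes a `basetable` containing only the four rows shown, and it adds `dtype=int`, which is not in the original snippet: recent pandas versions return boolean dummies by default, so the output would otherwise show `True`/`False` instead of 1/0.

```python
import pandas as pd

# Toy basetable with the four rows shown above (assumed for illustration).
basetable = pd.DataFrame({
    "donor_id": [32770, 32776, 32777, 65552],
    "segment": ["Gold", "Silver", "Bronze", "Bronze"],
})

# Create the dummy variables; drop_first=True drops the alphabetically
# first category (Bronze), which becomes the reference level.
dummies_segment = pd.get_dummies(basetable["segment"], drop_first=True, dtype=int)

# Add the dummy variables to the basetable.
basetable = pd.concat([basetable, dummies_segment], axis=1)

# Delete the original variable from the basetable.
del basetable["segment"]

print(basetable)
```

A donor in the Bronze segment ends up with 0 in both `Gold` and `Silver`.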