Intermediate Predictive Analytics in Python
Nele Verbiest
Senior Data Scientist @PythonPredictions
donor_id | age |
---|---|
5 | - |
3 | 25 |
2 | 36 |
8 | 40 |
1 | 26 |
donor_id | age |
---|---|
5 | 38 |
3 | 25 |
2 | 36 |
8 | 40 |
1 | 26 |
Mean age: 38
donor_id | max_donation |
---|---|
5 | - |
3 | 1 000 000 |
2 | 100 |
8 | 40 |
1 | 120 |
Mean max_donation: 250 065
Median max_donation: 110
donor_id | max_donation |
---|---|
5 | 110 |
3 | 1 000 000 |
2 | 100 |
8 | 40 |
1 | 120 |
Mean max_donation: 250 065
Median max_donation: 110
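The median fill shown above can be sketched in pandas as follows. This is a minimal illustration, not the course's own code: the `donors` DataFrame is a hypothetical reconstruction of the table, and only `max_donation` is taken from the slide.

```python
import pandas as pd

# Hypothetical toy frame rebuilt from the table above; donor 5 has no value
donors = pd.DataFrame({
    "donor_id": [5, 3, 2, 8, 1],
    "max_donation": [float("nan"), 1_000_000, 100, 40, 120],
})

# mean() and median() skip missing values by default
mean_value = donors["max_donation"].mean()      # dominated by the one very large gift
median_value = donors["max_donation"].median()  # barely affected by it

# Use the median as the replacement so a single outlier does not set the fill value
donors["max_donation"] = donors["max_donation"].fillna(median_value)
print(donors)
```

With one extreme donation in the column, the median is the more representative replacement for a typical donor.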
donor_id | sum_donations |
---|---|
5 | 130 |
3 | 10 |
2 | - |
8 | 40 |
1 | 120 |
donor_id | sum_donations |
---|---|
5 | 130 |
3 | 10 |
2 | 0 |
8 | 40 |
1 | 120 |
# Replace missing values by 0
replacement = 0
basetable["donations_last_year"] = basetable["donations_last_year"].fillna(replacement)

# Replace missing values by mean
replacement = basetable["age"].mean()
basetable["age"] = basetable["age"].fillna(replacement)
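When several columns need different strategies, the replacements can be collected in one dictionary and passed to `DataFrame.fillna`. The sketch below assumes a hypothetical `basetable` rebuilt from the three example tables; the `replacements` name is an illustration, and the mean is computed over the toy rows only, so it need not match the value on the slide.

```python
import pandas as pd

# Hypothetical basetable combining the example tables (missing cells as NaN)
basetable = pd.DataFrame({
    "donor_id": [5, 3, 2, 8, 1],
    "age": [float("nan"), 25, 36, 40, 26],
    "max_donation": [float("nan"), 1_000_000, 100, 40, 120],
    "sum_donations": [130, 10, float("nan"), 40, 120],
})

# One replacement value per column, chosen by what a missing value means there
replacements = {
    "age": basetable["age"].mean(),                      # no outliers: mean
    "max_donation": basetable["max_donation"].median(),  # outliers: median
    "sum_donations": 0,                                  # missing means no donations
}

# fillna accepts a dict mapping column names to fill values
basetable = basetable.fillna(value=replacements)
print(basetable)
```

Keeping the choices in one dictionary makes it easy to review which strategy was applied to which column.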
donor_id email
0 32770 [email protected]
1 32776 nan
2 32777 [email protected]
3 65552 nan
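import pandas as pd
# NaN != NaN, so email == email is False only for missing emails
basetable["no_email"] = pd.Series(
    [0 if email == email else 1 for email in basetable["email"]],
    index=basetable.index)  # keep the flags aligned with the basetable rows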
donor_id email no_email
0 32770 [email protected] 0
1 32776 nan 1
2 32777 [email protected] 0
3 65552 nan 1
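The same flag can also be computed without a Python loop. The following is a sketch using `Series.isna`; the email addresses are made-up placeholders (the originals are obscured in the output above), and only the donor ids are copied from it.

```python
import pandas as pd

# Placeholder addresses (hypothetical); donor ids taken from the output above
basetable = pd.DataFrame({
    "donor_id": [32770, 32776, 32777, 65552],
    "email": ["a@example.com", None, "b@example.com", None],
})

# isna() marks missing entries as True; astype(int) converts True/False to 1/0
basetable["no_email"] = basetable["email"].isna().astype(int)
print(basetable["no_email"].tolist())  # [0, 1, 0, 1]
```

Because the result is built from the column itself, it keeps the same index as `basetable` even when rows have been filtered or reordered.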