Intermediate Predictive Analytics in Python
Nele Verbiest
Senior Data Scientist @PythonPredictions
id  date        amount
1   2015-10-16  75
1   2014-02-11  111
2   2012-03-28  93
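As a minimal sketch, the three example rows above can be loaded into a DataFrame and filtered to an aggregation period. The helper name `select_period` is hypothetical, and the sketch assumes the `date` column is parsed to pandas timestamps. It uses a half-open window (start inclusive, end exclusive) so that a gift dated on the first day of the next year is not counted.

```python
import pandas as pd

# The three example rows from the gifts table above.
gifts = pd.DataFrame({
    "id": [1, 1, 2],
    "date": pd.to_datetime(["2015-10-16", "2014-02-11", "2012-03-28"]),
    "amount": [75, 111, 93],
})


def select_period(gifts, start, end):
    """Return the gifts with start <= date < end (hypothetical helper)."""
    start, end = pd.Timestamp(start), pd.Timestamp(end)
    in_period = (gifts["date"] >= start) & (gifts["date"] < end)
    return gifts[in_period]


# Only the 2015-10-16 gift falls in 2015; none of the example rows fall in 2016.
gifts_2015 = select_period(gifts, "2015-01-01", "2016-01-01")
gifts_2016 = select_period(gifts, "2016-01-01", "2017-01-01")
print(gifts_2015)
print(len(gifts_2016))
```

Converting both bounds with `pd.Timestamp` keeps the comparison between like types, which avoids type errors when the column holds `datetime64` values.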
import datetime
import pandas as pd

# Start date (inclusive) and end date (exclusive) of the aggregation period
start_date = datetime.date(2016, 1, 1)
end_date = datetime.date(2017, 1, 1)

# Select gifts made in 2016
gifts_2016 = gifts[(gifts["date"] >= start_date) & (gifts["date"] < end_date)]
# Sum of gifts per donor in 2016
gifts_2016_bydonor = gifts_2016.groupby(["id"])["amount"].sum().reset_index()
gifts_2016_bydonor.columns = ["donor_ID", "sum_2016"]

# Add sum of gifts to the basetable
basetable = pd.merge(basetable, gifts_2016_bydonor, how="left", on="donor_ID")
print(basetable.head())
donor_ID  sum_2016
1         837
2         29
3         682
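One detail the output above does not show: with a left join, donors who made no gifts in the period get a missing value rather than 0. The following sketch illustrates this on made-up data (the gift amounts and the three-donor basetable are assumptions for illustration, not the course data) and fills the gap with 0.

```python
import pandas as pd

# Hypothetical 2016 gifts and basetable, for illustration only.
gifts_2016 = pd.DataFrame({
    "id": [1, 1, 3],
    "amount": [50, 25, 10],
})
basetable = pd.DataFrame({"donor_ID": [1, 2, 3]})

# Sum of gifts per donor; rename the key so it matches the basetable.
sum_2016 = (
    gifts_2016.groupby("id")["amount"].sum()
    .reset_index()
    .rename(columns={"id": "donor_ID", "amount": "sum_2016"})
)

# Left join keeps every donor in the basetable.
basetable = basetable.merge(sum_2016, how="left", on="donor_ID")

# Donor 2 made no gifts, so the merge leaves NaN; treat that as a sum of 0.
basetable["sum_2016"] = basetable["sum_2016"].fillna(0)
print(basetable)
```

Whether 0 is the right fill value depends on the model. For a sum or a count it is usually a natural choice, but that is a modelling decision rather than something the slides prescribe.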
# Number of gifts per donor in 2016
gifts_2016_bydonor = gifts_2016.groupby(["id"]).size().reset_index()
gifts_2016_bydonor.columns = ["donor_ID", "count_2016"]

# Add number of gifts to the basetable
basetable = pd.merge(basetable, gifts_2016_bydonor, how="left", on="donor_ID")
print(basetable.head())
donor_ID  count_2016
1         4
2         9
3         2
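The sum and the count can also be computed in a single pass. The sketch below uses pandas named aggregation as an alternative to the two separate group-by steps shown above. The data is made up for illustration, and the variable name `features` is an assumption.

```python
import pandas as pd

# Hypothetical 2016 gifts and basetable, for illustration only.
gifts_2016 = pd.DataFrame({"id": [1, 1, 3], "amount": [50, 25, 10]})
basetable = pd.DataFrame({"donor_ID": [1, 2, 3]})

# One group-by producing both predictive variables.
features = (
    gifts_2016.groupby("id")
    .agg(sum_2016=("amount", "sum"), count_2016=("amount", "size"))
    .reset_index()
    .rename(columns={"id": "donor_ID"})
)

# Add both variables to the basetable; donors without gifts get 0.
basetable = basetable.merge(features, how="left", on="donor_ID")
basetable[["sum_2016", "count_2016"]] = (
    basetable[["sum_2016", "count_2016"]].fillna(0).astype(int)
)
print(basetable)
```

A single merge also avoids carrying intermediate frames with the same name (`gifts_2016_bydonor`) for different variables.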