Customer Analytics and A/B Testing in Python
Ryan Grossman
Data Scientist, EDO
import pandas as pd
import matplotlib.pyplot as plt

# Find the days-to-subscribe of our loaded USA subs data set
usa_subscriptions['sub_day'] = (
    usa_subscriptions.sub_date - usa_subscriptions.lapse_date
).dt.days
# Keep only users who subscribed within 7 days of their lapse date
usa_subscriptions = usa_subscriptions[usa_subscriptions.sub_day <= 7]
# Find the total subscribers per day
# (a plain 'sum' keeps flat column names, so y='subs' works when plotting)
usa_subscriptions = usa_subscriptions.groupby(
    by=['sub_date'], as_index=False
).agg({'subs': 'sum'})
# Plot USA subscribers per day
usa_subscriptions.plot(x='sub_date', y='subs')
plt.show()
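To make the three steps above concrete, here is a minimal sketch on a tiny hand-made table. The data values and the names `toy`, `within_week`, and `daily` are assumptions for illustration only; the column names follow the slide.

```python
import pandas as pd

# Hypothetical toy data: one row per user, column names as on the slide
toy = pd.DataFrame({
    "lapse_date": pd.to_datetime(
        ["2018-03-01", "2018-03-01", "2018-03-02", "2018-03-02"]
    ),
    "sub_date": pd.to_datetime(
        ["2018-03-03", "2018-03-12", "2018-03-03", "2018-03-05"]
    ),
    "subs": [1, 1, 1, 1],
})

# Days between lapsing and subscribing: 2, 11, 1, 3
toy["sub_day"] = (toy.sub_date - toy.lapse_date).dt.days

# Keep only users who subscribed within 7 days (drops the 11-day row)
within_week = toy[toy.sub_day <= 7]

# Total subscribers per subscription date
daily = within_week.groupby("sub_date", as_index=False).agg({"subs": "sum"})
print(daily)
```

Two of the remaining users subscribed on 2018-03-03 and one on 2018-03-05, so `daily` holds one row per date with the summed `subs`.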
.rolling(): called on the Series of interest
window: Number of data points to average
center: If True, set the average at the center of the window

# Call rolling on the "subs" Series
rolling_subs = usa_subscriptions.subs.rolling(
    # How many data points to average over
    window=7,
    # Specify to average backwards (trailing window)
    center=False
)
# Find the rolling average
usa_subscriptions['rolling_subs'] = rolling_subs.mean()
usa_subscriptions.tail()
sub_date subs rolling_subs
2018-03-14 89 94.714286
2018-03-15 96 95.428571
2018-03-16 102 96.142857
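The effect of `window` and `center` is easiest to see on a short series. The sketch below uses made-up numbers and the hypothetical names `s`, `trailing`, and `centered`; it is an illustration, not part of the subscription data.

```python
import pandas as pd

# Hypothetical series of five daily values
s = pd.Series([10, 20, 30, 40, 50])

# Trailing window: each value averages the current point and the 2 before it
trailing = s.rolling(window=3, center=False).mean()

# Centered window: each value averages the point and its two neighbours
centered = s.rolling(window=3, center=True).mean()

print(pd.DataFrame({"value": s, "trailing": trailing, "centered": centered}))
```

With `center=False` the first two entries are NaN because a full window of three points is not yet available; with `center=True` the NaN values fall at both ends instead.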
.rolling(), like groupby, only specifies a grouping of data points; a summary still has to be computed over each group (e.g. .mean())

# Load a dataset of our highest sku purchases
high_sku_purchases = pd.read_csv(
    'high_sku_purchases.csv',
    # Parse the 'date' column used for plotting as datetimes
    parse_dates=['date']
)
# Plot the count of purchases by day of purchase
high_sku_purchases.plot(x='date', y='purchases')
plt.show()
.ewm(): exponential weighting function
span: Window to apply weights over

# Calculate the exp. avg. over our high sku
# purchase count
exp_mean = high_sku_purchases.purchases.ewm(
span=30)
# Find the weighted mean over this period
high_sku_purchases['exp_mean'] = exp_mean.mean()
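As a rough check on what `span` does, pandas converts it to a decay factor `alpha = 2 / (span + 1)` and, with the default `adjust=True`, weights each older point by another factor of `(1 - alpha)`. The sketch below reproduces the last smoothed value by hand; the three data points and the names `purchases`, `weights`, and `manual_last` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical purchase counts for three consecutive days
purchases = pd.Series([1.0, 2.0, 3.0])

span = 3
alpha = 2 / (span + 1)  # span=3 gives alpha=0.5

# Exponentially weighted mean as computed by pandas (adjust=True by default)
exp_mean = purchases.ewm(span=span).mean()

# Manual version for the last point: the newest value has weight 1,
# the one before it (1 - alpha), the one before that (1 - alpha)**2
weights = (1 - alpha) ** np.arange(len(purchases))[::-1]
manual_last = (weights * purchases).sum() / weights.sum()

print(exp_mean.iloc[-1], manual_last)
```

A larger `span` gives a smaller `alpha`, so older points keep more weight and the curve becomes smoother.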
High Sku Purchase Data
See .rolling() and .ewm() for many more methods of smoothing