Writing Efficient Code with pandas
Leonidas Souliotis
PhD Candidate
Limit results based on an aggregate feature
restaurant_grouped = restaurant.groupby('day')
filter_trans = lambda x : x['total_bill'].mean() > 20
restaurant_filtered = restaurant_grouped.filter(filter_trans)
Time using .filter(): 0.00414085388184 sec
print(restaurant_filtered['tip'].mean())
3.11527607362
print(restaurant['tip'].mean())
2.9982786885245902
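As a minimal sketch of what .filter() does here, the example below uses a small made-up DataFrame in place of the course's restaurant dataset (the rows and values are hypothetical, chosen only for illustration). Rows are kept only when their whole group passes the aggregate test, so the result holds original rows rather than one row per group.

```python
import pandas as pd

# Hypothetical toy data standing in for the restaurant dataset
restaurant = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat"],
    "total_bill": [10.0, 14.0, 22.0, 26.0, 30.0, 18.0],
    "tip": [1.0, 2.0, 3.0, 4.0, 5.0, 3.0],
})

# Keep only the rows whose day has a mean total_bill above 20
restaurant_filtered = restaurant.groupby("day").filter(
    lambda g: g["total_bill"].mean() > 20
)

# Thur (mean bill 12) is dropped; Fri and Sat (mean bill 24) are kept
kept_days = sorted(restaurant_filtered["day"].unique())
print(kept_days)
print(restaurant_filtered["tip"].mean())
```

Note that Sat is kept in full even though one of its bills is below 20, because the condition is evaluated on the group mean, not on each row.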
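import pandas as pd
# Collect the 'tip' column for each day whose mean total_bill exceeds 20
t = [restaurant.loc[restaurant['day'] == i]['tip'] for i in restaurant['day'].unique()
     if restaurant.loc[restaurant['day'] == i]['total_bill'].mean() > 20]
restaurant_filtered = t[0]
for j in t[1:]:
    # Series.append was removed in pandas 2.0; pd.concat is the equivalent
    restaurant_filtered = pd.concat([restaurant_filtered, j], ignore_index=True)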
Time using native Python: 0.00663900375366 sec
print(restaurant_filtered.mean())
3.11527607362
Difference in time: 60.329341317157024%
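The percentage above appears to be the extra time of the native Python version relative to the .filter() version. A short check using the two timings reported on this slide (the variable names are illustrative, not from the course):

```python
# Timings reported on the slide, in seconds
time_filter = 0.00414085388184
time_native = 0.00663900375366

# Relative slowdown of the native Python approach versus .filter()
diff_pct = (time_native - time_filter) / time_filter * 100
print(f"Difference in time: {diff_pct:.2f}%")
```

This gives roughly 60.33%, which agrees with the reported figure. Timings this small vary from run to run, so the exact percentage should be treated as indicative.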