Data Manipulation with pandas
Maggie Matsui
Senior Content Developer at DataCamp
dogs["height_cm"].mean()
49.714285714285715
.median(), .mode()
.min(), .max()
.var(), .std()
.sum()
.quantile()
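A minimal sketch of the methods listed above, applied to a single column. It assumes a dogs DataFrame rebuilt only from the weight_kg values printed in this section; the summary dictionary and its keys are illustrative names, not part of the slides.

```python
import pandas as pd

# Assumption: only the weight_kg column is rebuilt, from the values printed in this section.
dogs = pd.DataFrame({"weight_kg": [24, 24, 24, 17, 29, 2, 74]})
weights = dogs["weight_kg"]

summary = {
    "median": weights.median(),
    # .mode() returns a Series because several values can tie for most frequent.
    "mode": weights.mode().tolist(),
    "min": weights.min(),
    "max": weights.max(),
    # .var() and .std() default to the sample versions (ddof=1).
    "var": weights.var(),
    "std": weights.std(),
    "sum": weights.sum(),
    "q30": weights.quantile(0.3),
}

for name, value in summary.items():
    print(name, value)
```

Note that .mode() is the only method here that does not return a single number, so it is converted to a list before printing.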
Oldest dog:
dogs["date_of_birth"].min()
'2011-12-11'
Youngest dog:
dogs["date_of_birth"].max()
'2018-02-27'
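The outputs above are quoted strings, which suggests the column holds text; .min() and .max() still work because ISO-formatted dates sort correctly as strings. A hedged sketch of converting to real datetimes first, which also allows date arithmetic. The three-row frame is illustrative only: the first and last dates match the outputs above, the middle one is a placeholder.

```python
import pandas as pd

# Illustrative data: first and last dates match the slide output, the middle one is a placeholder.
dogs = pd.DataFrame({"date_of_birth": ["2011-12-11", "2015-04-20", "2018-02-27"]})

# Convert the strings to datetime64 so comparisons and subtraction are date-aware.
births = pd.to_datetime(dogs["date_of_birth"])

oldest = births.min()    # earliest date of birth
youngest = births.max()  # latest date of birth

print("Oldest dog:", oldest.date())
print("Youngest dog:", youngest.date())
print("Days between them:", (youngest - oldest).days)
```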
def pct30(column):
    return column.quantile(0.3)
dogs["weight_kg"].agg(pct30)
22.599999999999998
dogs[["weight_kg", "height_cm"]].agg(pct30)
weight_kg 22.6
height_cm 45.4
dtype: float64
def pct40(column):
    return column.quantile(0.4)
dogs["weight_kg"].agg([pct30, pct40])
pct30 22.6
pct40 24.0
Name: weight_kg, dtype: float64
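As a small extension of the .agg() calls above, built-in aggregations can be passed by name alongside custom functions in the same list. This is a sketch under the assumption that dogs contains only the weight_kg values printed in this section; the result is indexed by each function's name.

```python
import pandas as pd

# Assumption: only the weight_kg column is rebuilt, from the values printed in this section.
dogs = pd.DataFrame({"weight_kg": [24, 24, 24, 17, 29, 2, 74]})

def pct30(column):
    # 30th percentile of the whole column
    return column.quantile(0.3)

def pct40(column):
    # 40th percentile of the whole column
    return column.quantile(0.4)

# Strings name built-in aggregations; functions are labelled by their __name__.
result = dogs["weight_kg"].agg(["min", pct30, pct40, "max"])
print(result)
```

Because the labels come from the function names, named functions such as pct30 give more readable output than anonymous lambdas.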
dogs["weight_kg"]
0 24
1 24
2 24
3 17
4 29
5 2
6 74
Name: weight_kg, dtype: int64
dogs["weight_kg"].cumsum()
0 24
1 48
2 72
3 89
4 118
5 120
6 194
Name: weight_kg, dtype: int64
.cummax()
.cummin()
.cumprod()
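The other cumulative methods follow the same pattern as .cumsum(): each returns a column of the same length holding the running result. A minimal sketch, again assuming a dogs frame rebuilt from the weight_kg values printed above; the running name is illustrative.

```python
import pandas as pd

# Assumption: only the weight_kg column is rebuilt, from the values printed above.
dogs = pd.DataFrame({"weight_kg": [24, 24, 24, 17, 29, 2, 74]})
weights = dogs["weight_kg"]

# Collect the running statistics side by side for comparison.
running = pd.DataFrame({
    "weight_kg": weights,
    "cumsum": weights.cumsum(),    # running total
    "cummax": weights.cummax(),    # largest value seen so far
    "cummin": weights.cummin(),    # smallest value seen so far
    "cumprod": weights.cumprod(),  # running product
})
print(running)
```

The last row of each cumulative column equals the corresponding whole-column statistic, for example the final cumsum value equals .sum().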
sales.head()
   store type  dept        date  weekly_sales  is_holiday  temp_c  fuel_price  unemp
0      1    A     1  2010-02-05      24924.50       False    5.73       0.679  8.106
1      1    A     2  2010-02-05      50605.27       False    5.73       0.679  8.106
2      1    A     3  2010-02-05      13740.12       False    5.73       0.679  8.106
3      1    A     4  2010-02-05      39954.04       False    5.73       0.679  8.106
4      1    A     5  2010-02-05      32229.38       False    5.73       0.679  8.106