Data Manipulation with pandas
Maggie Matsui
Senior Content Developer at DataCamp
import matplotlib.pyplot as plt
dog_pack["height_cm"].hist()
plt.show()
dog_pack["height_cm"].hist(bins=20)
plt.show()
dog_pack["height_cm"].hist(bins=5)
plt.show()
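The effect of the bins argument can be checked numerically. The sketch below is a minimal illustration, not part of the course material: it builds a hypothetical stand-in for dog_pack with made-up heights and uses NumPy's histogram function to compute the equal-width bin counts that a histogram with 5 or 20 bins would draw. The helper name bin_counts is an assumption introduced here.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dog_pack; the heights are made up for illustration.
dog_pack = pd.DataFrame(
    {"height_cm": [18, 20, 25, 31, 35, 43, 49, 56, 58, 59, 61, 77]}
)


def bin_counts(series, bins):
    """Return (counts, edges) for `bins` equal-width bins over the data range."""
    counts, edges = np.histogram(series, bins=bins)
    return counts, edges


counts_5, edges_5 = bin_counts(dog_pack["height_cm"], 5)
counts_20, edges_20 = bin_counts(dog_pack["height_cm"], 20)

# Fewer bins give a coarser picture; more bins give a finer one.
print(counts_5)
print(counts_20)
```

Every observation falls in exactly one bin, so the counts always sum to the number of rows regardless of how many bins are chosen.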
avg_weight_by_breed = dog_pack.groupby("breed")["weight_kg"].mean()
print(avg_weight_by_breed)
breed
Beagle         10.636364
Boxer          30.620000
Chihuahua       1.491667
Chow Chow      22.535714
Dachshund       9.975000
Labrador       31.850000
Poodle         20.400000
St. Bernard    71.576923
Name: weight_kg, dtype: float64
avg_weight_by_breed.plot(kind="bar")
plt.show()
avg_weight_by_breed.plot(kind="bar",
                         title="Mean Weight by Dog Breed")
plt.show()
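A self-contained version of the bar plot steps might look like the sketch below. The tiny dog_pack DataFrame is a hypothetical stand-in with invented breeds and weights, and the Agg backend is chosen only so the code runs without a display; in an interactive session that line can be dropped and plt.show() called instead.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for dog_pack; values are invented for illustration.
dog_pack = pd.DataFrame({
    "breed": ["Beagle", "Beagle", "Poodle", "Poodle", "Labrador"],
    "weight_kg": [10.0, 12.0, 20.0, 22.0, 30.0],
})

# groupby sorts the group keys, so the bars appear in alphabetical breed order.
avg_weight_by_breed = dog_pack.groupby("breed")["weight_kg"].mean()

# plot() returns the matplotlib Axes, which can be customized further.
ax = avg_weight_by_breed.plot(kind="bar", title="Mean Weight by Dog Breed")
ax.set_ylabel("weight_kg")

# One rectangle is drawn per breed; its height is the group mean.
bar_heights = [patch.get_height() for patch in ax.patches]
```

Keeping the returned Axes in a variable is optional, but it makes follow-up tweaks such as axis labels easy to add.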
sully.head()
         date  weight_kg
0  2019-01-31       36.1
1  2019-02-28       35.3
2  2019-03-31       32.0
3  2019-04-30       32.9
4  2019-05-31       32.0
sully.plot(x="date",
           y="weight_kg",
           kind="line")
plt.show()
sully.plot(x="date", y="weight_kg", kind="line", rot=45)
plt.show()
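The line plot can be reproduced from the five rows shown by sully.head(). The sketch below rebuilds only those rows and assumes the date column holds datetime values, which the slides do not state. The Agg backend is likewise an assumption so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd

# The five rows printed by sully.head(); the full dataset is not shown.
sully = pd.DataFrame({
    "date": pd.to_datetime(
        ["2019-01-31", "2019-02-28", "2019-03-31", "2019-04-30", "2019-05-31"]
    ),
    "weight_kg": [36.1, 35.3, 32.0, 32.9, 32.0],
})

# rot=45 asks pandas to rotate the x-axis tick labels by 45 degrees,
# which keeps long date labels from overlapping.
ax = sully.plot(x="date", y="weight_kg", kind="line", rot=45)
ax.set_ylabel("weight_kg")

# The single plotted line carries the weights in row order.
plotted_weights = list(ax.lines[0].get_ydata())
```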
dog_pack.plot(x="height_cm", y="weight_kg", kind="scatter")
plt.show()
dog_pack[dog_pack["sex"]=="F"]["height_cm"].hist()
dog_pack[dog_pack["sex"]=="M"]["height_cm"].hist()
plt.show()
dog_pack[dog_pack["sex"]=="F"]["height_cm"].hist()
dog_pack[dog_pack["sex"]=="M"]["height_cm"].hist()
plt.legend(["F", "M"])
plt.show()
dog_pack[dog_pack["sex"]=="F"]["height_cm"].hist(alpha=0.7)
dog_pack[dog_pack["sex"]=="M"]["height_cm"].hist(alpha=0.7)
plt.legend(["F", "M"])
plt.show()
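The layered histograms can also be written as a loop over the groups. The sketch below is one possible arrangement, using a hypothetical dog_pack with invented heights. It relies on consecutive hist() calls drawing onto the same current axes, and uses the Agg backend only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for dog_pack; values are invented for illustration.
dog_pack = pd.DataFrame({
    "sex": ["F", "F", "F", "F", "M", "M", "M", "M"],
    "height_cm": [20, 35, 43, 56, 25, 49, 58, 77],
})

sexes = ["F", "M"]
for sex in sexes:
    heights = dog_pack.loc[dog_pack["sex"] == sex, "height_cm"]
    # alpha=0.7 makes the bars translucent so overlapping bars stay visible.
    ax = heights.hist(alpha=0.7)

# Legend labels are matched to the histograms in the order they were drawn.
plt.legend(sexes)
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
```

Driving both the loop and the legend from the same list keeps the labels in step with the drawing order.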
print(avocados)
           date          type  year  avg_price         size     nb_sold
0    2015-12-27  conventional  2015       0.95        small  9626901.09
1    2015-12-20  conventional  2015       0.98        small  8710021.76
2    2015-12-13  conventional  2015       0.93        small  9855053.66
...         ...           ...   ...        ...          ...         ...
1011 2018-01-21       organic  2018       1.63  extra_large     1490.02
1012 2018-01-14       organic  2018       1.59  extra_large     1580.01
1013 2018-01-07       organic  2018       1.51  extra_large     1289.07

[1014 rows x 6 columns]