Working with pivot tables

Data Manipulation with pandas

Richie Cotton

Data Evangelist at DataCamp

A bigger dog dataset

print(dog_pack)
          breed  color  height_cm  weight_kg
0         Boxer  Brown      62.64       30.4
1        Poodle  Black      46.41       20.4
2        Beagle  Brown      36.39       12.4
3     Chihuahua    Tan      19.70        1.6
4      Labrador    Tan      54.44       36.1
..          ...    ...        ...        ...
87        Boxer   Gray      58.13       29.9
88  St. Bernard  White      70.13       69.4
89       Poodle   Gray      51.30       20.4
90       Beagle  White      38.81        8.8
91       Beagle  Black      33.40       13.5
Data Manipulation with pandas

Pivoting the dog pack

dogs_height_by_breed_vs_color = dog_pack.pivot_table(
        "height_cm", index="breed", columns="color")
print(dogs_height_by_breed_vs_color)
color            Black    Brown       Gray        Tan      White
breed                                                           
Beagle       34.500000  36.4500  36.313333  35.740000  38.810000
Boxer        57.203333  62.6400  58.280000  62.310000  56.360000
Chihuahua    18.555000      NaN  21.660000  20.096667  17.933333
Chow Chow    51.262500  50.4800        NaN  53.497500  54.413333
Dachshund    21.186667  19.7250        NaN  19.375000  20.660000
Labrador     57.125000      NaN        NaN  55.190000  55.310000
Poodle       48.036000  57.1300  56.645000        NaN  44.740000
St. Bernard  63.920000  65.8825  67.640000  68.334000  67.495000
Data Manipulation with pandas

.loc[] + slicing is a power combo

dogs_height_by_breed_vs_color.loc["Chow Chow":"Poodle"]
color          Black   Brown    Gray      Tan      White
breed                                                   
Chow Chow  51.262500  50.480     NaN  53.4975  54.413333
Dachshund  21.186667  19.725     NaN  19.3750  20.660000
Labrador   57.125000     NaN     NaN  55.1900  55.310000
Poodle     48.036000  57.130  56.645      NaN  44.740000
Data Manipulation with pandas

The axis argument

dogs_height_by_breed_vs_color.mean(axis="index")
color
Black    43.973563
Brown    48.717917
Gray     48.107667
Tan      44.934738
White    44.465208
dtype: float64
Data Manipulation with pandas

Calculating summary stats across columns

dogs_height_by_breed_vs_color.mean(axis="columns")
breed
Beagle         36.362667
Boxer          59.358667
Chihuahua      19.561250
Chow Chow      52.413333
Dachshund      20.236667
Labrador       55.875000
Poodle         51.637750
St. Bernard    66.654300
dtype: float64
Data Manipulation with pandas

Let's practice!

Data Manipulation with pandas

Preparing Video For Download...