Data Manipulation with pandas
Richie Cotton
Data Evangelist at DataCamp
breeds = ["Labrador", "Poodle",
          "Chow Chow", "Schnauzer",
          "Labrador", "Chihuahua",
          "St. Bernard"]
['Labrador',
'Poodle',
'Chow Chow',
'Schnauzer',
'Labrador',
'Chihuahua',
'St. Bernard']
breeds[2:5]
['Chow Chow', 'Schnauzer', 'Labrador']
breeds[:3]
['Labrador', 'Poodle', 'Chow Chow']
breeds[:]
['Labrador', 'Poodle', 'Chow Chow', 'Schnauzer',
 'Labrador', 'Chihuahua', 'St. Bernard']
dogs_srt = dogs.set_index(["breed", "color"]).sort_index()
print(dogs_srt)
                      name  height_cm  weight_kg
breed       color
Chihuahua   Tan     Stella         18          2
Chow Chow   Brown     Lucy         46         22
Labrador    Black      Max         59         29
            Brown    Bella         56         25
Poodle      Black  Charlie         43         23
Schnauzer   Grey    Cooper         49         17
St. Bernard White   Bernie         77         74
dogs_srt.loc["Chow Chow":"Poodle"]
                    name  height_cm  weight_kg
breed     color
Chow Chow Brown     Lucy         46         22
Labrador  Black      Max         59         29
          Brown    Bella         56         25
Poodle    Black  Charlie         43         23
The final value "Poodle" is included.
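The difference from list slicing can be checked side by side. The sketch below rebuilds `dogs` from the values shown in these slides; the construction code is an assumption, since the slides never show how `dogs` is created.

```python
import pandas as pd

# Assumed construction: values copied from the slide tables.
dogs = pd.DataFrame({
    "name": ["Bella", "Charlie", "Lucy", "Cooper", "Max", "Stella", "Bernie"],
    "breed": ["Labrador", "Poodle", "Chow Chow", "Schnauzer",
              "Labrador", "Chihuahua", "St. Bernard"],
    "color": ["Brown", "Black", "Brown", "Grey", "Black", "Tan", "White"],
    "height_cm": [56, 43, 46, 49, 59, 18, 77],
    "weight_kg": [25, 23, 22, 17, 29, 2, 74],
})
dogs_srt = dogs.set_index(["breed", "color"]).sort_index()

# List slicing works by position and excludes the end position (5).
breeds = dogs["breed"].tolist()
list_slice = breeds[2:5]

# .loc slicing works by label and includes the end label ("Poodle").
label_slice = dogs_srt.loc["Chow Chow":"Poodle"]
outer_labels = label_slice.index.get_level_values("breed").unique().tolist()

print(list_slice)
print(outer_labels)
```

Label slices like this one rely on the index being sorted first, which is why `dogs_srt` is built with `.sort_index()`.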
dogs_srt.loc["Tan":"Grey"]
Empty DataFrame
Columns: [name, height_cm, weight_kg]
Index: []
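One way to see why this slice comes back empty: plain strings are matched against the outer level (`breed`), not `color`. The sketch below uses a trimmed-down `dogs` with values copied from the slides; the construction code is an assumption.

```python
import pandas as pd

# Assumed construction: values copied from the slide tables.
dogs = pd.DataFrame({
    "name": ["Bella", "Charlie", "Lucy", "Cooper", "Max", "Stella", "Bernie"],
    "breed": ["Labrador", "Poodle", "Chow Chow", "Schnauzer",
              "Labrador", "Chihuahua", "St. Bernard"],
    "color": ["Brown", "Black", "Brown", "Grey", "Black", "Tan", "White"],
})
dogs_srt = dogs.set_index(["breed", "color"]).sort_index()

# Plain strings are compared against the outer level (breed).
# Neither "Tan" nor "Grey" is a breed, and "Tan" sorts after "Grey",
# so the slice bounds cross and no rows are selected.
by_color = dogs_srt.loc["Tan":"Grey"]
print(by_color.empty)

# Tuples give a value for each level, so the bounds are well defined.
by_tuple = dogs_srt.loc[("Labrador", "Brown"):("Schnauzer", "Grey")]
print(by_tuple["name"].tolist())
```

pandas raises no error here; the only sign of the mistake is the empty result.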
dogs_srt.loc[
    ("Labrador", "Brown"):("Schnauzer", "Grey")]
                    name  height_cm  weight_kg
breed     color
Labrador  Brown    Bella         56         25
Poodle    Black  Charlie         43         23
Schnauzer Grey    Cooper         49         17
dogs_srt.loc[:, "name":"height_cm"]
                      name  height_cm
breed       color
Chihuahua   Tan     Stella         18
Chow Chow   Brown     Lucy         46
Labrador    Black      Max         59
            Brown    Bella         56
Poodle      Black  Charlie         43
Schnauzer   Grey    Cooper         49
St. Bernard White   Bernie         77
dogs_srt.loc[
    ("Labrador", "Brown"):("Schnauzer", "Grey"),
    "name":"height_cm"]
                    name  height_cm
breed     color
Labrador  Brown    Bella         56
Poodle    Black  Charlie         43
Schnauzer Grey    Cooper         49
dogs = dogs.set_index("date_of_birth").sort_index()
print(dogs)
                  name        breed  color  height_cm  weight_kg
date_of_birth
2011-12-11      Cooper    Schnauzer   Grey         49         17
2013-07-01       Bella     Labrador  Brown         56         25
2014-08-25        Lucy    Chow Chow  Brown         46         22
2015-04-20      Stella    Chihuahua    Tan         18          2
2016-09-16     Charlie       Poodle  Black         43         23
2017-01-20         Max     Labrador  Black         59         29
2018-02-27      Bernie  St. Bernard  White         77         74
# Get dogs with date_of_birth between 2014-08-25 and 2016-09-16
dogs.loc["2014-08-25":"2016-09-16"]
                  name      breed  color  height_cm  weight_kg
date_of_birth
2014-08-25        Lucy  Chow Chow  Brown         46         22
2015-04-20      Stella  Chihuahua    Tan         18          2
2016-09-16     Charlie     Poodle  Black         43         23
# Get dogs with date_of_birth between 2014-01-01 and 2016-12-31
dogs.loc["2014":"2016"]
                  name      breed  color  height_cm  weight_kg
date_of_birth
2014-08-25        Lucy  Chow Chow  Brown         46         22
2015-04-20      Stella  Chihuahua    Tan         18          2
2016-09-16     Charlie     Poodle  Black         43         23
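Slicing by partial dates such as `"2014":"2016"` depends on the index holding real datetimes. A minimal sketch follows, assuming `date_of_birth` starts out as strings and is parsed with `pd.to_datetime` (the slides do not show this step); the values are copied from the table above.

```python
import pandas as pd

# Assumed construction: values copied from the slide table.
dogs = pd.DataFrame({
    "name": ["Cooper", "Bella", "Lucy", "Stella", "Charlie", "Max", "Bernie"],
    "date_of_birth": ["2011-12-11", "2013-07-01", "2014-08-25", "2015-04-20",
                      "2016-09-16", "2017-01-20", "2018-02-27"],
})

# Assumption: the column is converted to datetime64 before indexing.
dogs["date_of_birth"] = pd.to_datetime(dogs["date_of_birth"])
dogs = dogs.set_index("date_of_birth").sort_index()

# A partial date string stands for the whole period, so this slice runs
# from the start of 2014 through the end of 2016, both ends included.
born_2014_to_2016 = dogs.loc["2014":"2016"]
print(born_2014_to_2016["name"].tolist())
```

If the index held plain strings instead, the bounds would be compared alphabetically, and `"2016-09-16"` would sort after `"2016"` and fall outside the slice.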
print(dogs.iloc[2:5, 1:4])
       breed  color  height_cm
2  Chow Chow  Brown         46
3  Schnauzer   Grey         49
4   Labrador  Black         59
Full dataset
      name        breed  color  height_cm  weight_kg
0    Bella     Labrador  Brown         56         25
1  Charlie       Poodle  Black         43         23
2     Lucy    Chow Chow  Brown         46         22
3   Cooper    Schnauzer   Grey         49         17
4      Max     Labrador  Black         59         29
5   Stella    Chihuahua    Tan         18          2
6   Bernie  St. Bernard  White         77         74
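Unlike `.loc`, `.iloc` follows the list-slicing convention: positions start at 0 and the end position is excluded. The sketch below compares the two on the same rows and columns. The `dogs` construction is an assumption, with values copied from the table above.

```python
import pandas as pd

# Assumed construction: values copied from the slide table.
dogs = pd.DataFrame({
    "name": ["Bella", "Charlie", "Lucy", "Cooper", "Max", "Stella", "Bernie"],
    "breed": ["Labrador", "Poodle", "Chow Chow", "Schnauzer",
              "Labrador", "Chihuahua", "St. Bernard"],
    "color": ["Brown", "Black", "Brown", "Grey", "Black", "Tan", "White"],
    "height_cm": [56, 43, 46, 49, 59, 18, 77],
    "weight_kg": [25, 23, 22, 17, 29, 2, 74],
})

# Rows at positions 2, 3, 4 and columns at positions 1, 2, 3.
# The end positions (5 and 4) are excluded.
by_position = dogs.iloc[2:5, 1:4]
print(by_position)

# The label-based equivalent names the last row and column explicitly,
# because .loc includes its end labels.
by_label = dogs.loc[2:4, "breed":"height_cm"]
print(by_position.equals(by_label))
```

The two selections only match because `dogs` still has its default integer index, so row labels and row positions happen to coincide.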