Data Manipulation with pandas
Richie Cotton
Data Evangelist at DataCamp
Chapter 3: Slicing and Indexing Data
Chapter 4: Creating and Visualizing Data
Name | Breed | Color | Height (cm) | Weight (kg) | Date of Birth |
---|---|---|---|---|---|
Bella | Labrador | Brown | 56 | 25 | 2013-07-01 |
Charlie | Poodle | Black | 43 | 23 | 2016-09-16 |
Lucy | Chow Chow | Brown | 46 | 22 | 2014-08-25 |
Cooper | Schnauzer | Gray | 49 | 17 | 2011-12-11 |
Max | Labrador | Black | 59 | 29 | 2017-01-20 |
Stella | Chihuahua | Tan | 18 | 2 | 2015-04-20 |
Bernie | St. Bernard | White | 77 | 74 | 2018-02-27 |
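The slides use `dogs` without showing how it is created. A minimal sketch, assuming the table above is typed in by hand rather than loaded from a file (the original data source is not shown), with column names taken from the printed output:

```python
import pandas as pd

# Rows copied from the table above. How the course actually loads
# this data is not shown, so building it from a dict is an assumption.
dogs = pd.DataFrame(
    {
        "name": ["Bella", "Charlie", "Lucy", "Cooper", "Max", "Stella", "Bernie"],
        "breed": ["Labrador", "Poodle", "Chow Chow", "Schnauzer",
                  "Labrador", "Chihuahua", "St. Bernard"],
        "color": ["Brown", "Black", "Brown", "Gray", "Black", "Tan", "White"],
        "height_cm": [56, 43, 46, 49, 59, 18, 77],
        "weight_kg": [25, 23, 22, 17, 29, 2, 74],
        # Dates are kept as plain strings, so this column has object/string dtype.
        "date_of_birth": ["2013-07-01", "2016-09-16", "2014-08-25", "2011-12-11",
                          "2017-01-20", "2015-04-20", "2018-02-27"],
    }
)

print(dogs)
```

Passing a dict of lists gives one column per key and a default `RangeIndex` for the rows.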
print(dogs)
name breed color height_cm weight_kg date_of_birth
0 Bella Labrador Brown 56 25 2013-07-01
1 Charlie Poodle Black 43 23 2016-09-16
2 Lucy Chow Chow Brown 46 22 2014-08-25
3 Cooper Schnauzer Gray 49 17 2011-12-11
4 Max Labrador Black 59 29 2017-01-20
5 Stella Chihuahua Tan 18 2 2015-04-20
6 Bernie St. Bernard White 77 74 2018-02-27
print(dogs.head())
name breed color height_cm weight_kg date_of_birth
0 Bella Labrador Brown 56 25 2013-07-01
1 Charlie Poodle Black 43 23 2016-09-16
2 Lucy Chow Chow Brown 46 22 2014-08-25
3 Cooper Schnauzer Gray 49 17 2011-12-11
4 Max Labrador Black 59 29 2017-01-20
dogs.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7 entries, 0 to 6
Data columns (total 6 columns):
# Column Non-Null Count Dtype
-- ------ -------------- -----
0 name 7 non-null object
1 breed 7 non-null object
2 color 7 non-null object
3 height_cm 7 non-null int64
4 weight_kg 7 non-null int64
5 date_of_birth 7 non-null object
dtypes: int64(2), object(4)
memory usage: 464.0+ bytes
print(dogs.shape)
(7, 6)
print(dogs.describe())
height_cm weight_kg
count 7.000000 7.000000
mean 49.714286 27.428571
std 17.960274 22.292429
min 18.000000 2.000000
25% 44.500000 19.500000
50% 49.000000 23.000000
75% 57.500000 27.000000
max 77.000000 74.000000
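Each row of the `describe()` summary can also be computed on its own. A small sketch, assuming only the two numeric columns from the table are rebuilt here (under the hypothetical name `numeric`) so the block runs by itself:

```python
import pandas as pd

# Numeric columns copied from the table; "numeric" is an illustrative name.
numeric = pd.DataFrame(
    {
        "height_cm": [56, 43, 46, 49, 59, 18, 77],
        "weight_kg": [25, 23, 22, 17, 29, 2, 74],
    }
)

print(numeric.count())          # the "count" row
print(numeric.mean())           # the "mean" row
print(numeric.std())            # the "std" row (sample standard deviation)
print(numeric.quantile(0.25))   # the "25%" row
print(numeric.median())         # the "50%" row
print(numeric.quantile(0.75))   # the "75%" row
```

Calling the methods individually is handy when only one statistic is needed, while `describe()` gives the whole overview at once.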
dogs.values
array([['Bella', 'Labrador', 'Brown', 56, 25, '2013-07-01'],
       ['Charlie', 'Poodle', 'Black', 43, 23, '2016-09-16'],
       ['Lucy', 'Chow Chow', 'Brown', 46, 22, '2014-08-25'],
       ['Cooper', 'Schnauzer', 'Gray', 49, 17, '2011-12-11'],
       ['Max', 'Labrador', 'Black', 59, 29, '2017-01-20'],
       ['Stella', 'Chihuahua', 'Tan', 18, 2, '2015-04-20'],
       ['Bernie', 'St. Bernard', 'White', 77, 74, '2018-02-27']],
      dtype=object)
print(dogs.columns)
Index(['name', 'breed', 'color', 'height_cm', 'weight_kg', 'date_of_birth'],
dtype='object')
dogs.index
RangeIndex(start=0, stop=7, step=1)
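The three attributes above (`.values`, `.columns`, `.index`) are the pieces a DataFrame is built from. A short sketch of taking a DataFrame apart and putting it back together, assuming a three-row subset of the table and the hypothetical name `rebuilt`:

```python
import pandas as pd

# Small subset of the dogs table, enough to show the three components.
dogs = pd.DataFrame(
    {
        "name": ["Bella", "Charlie", "Lucy"],
        "height_cm": [56, 43, 46],
        "weight_kg": [25, 23, 22],
    }
)

values = dogs.to_numpy()   # 2D NumPy array holding the same data as dogs.values
columns = dogs.columns     # Index of column labels
index = dogs.index         # RangeIndex of row labels

# Reassemble a DataFrame from the three components.
rebuilt = pd.DataFrame(values, columns=columns, index=index)

# Mixed column types share one object array, so the numeric columns
# need infer_objects() to get their integer dtype back.
rebuilt = rebuilt.infer_objects()
print(rebuilt)
print(rebuilt.dtypes)
```

The pandas documentation recommends `to_numpy()` over `.values` in new code; both give the same array here.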
There should be one -- and preferably only one -- obvious way to do it.
- The Zen of Python by Tim Peters, Item 13