Introduction to Python
Hugo Bowne-Anderson
Data Scientist at DataCamp
Get to know your data
Little data -> simply look at it
Big data -> ?
import numpy as np
np_city = ... # Implementation left out
np_city
array([[1.64, 71.78],
       [1.37, 63.35],
       [1.6 , 55.09],
       ...,
       [2.04, 74.85],
       [2.04, 68.72],
       [2.01, 73.57]])
np.mean(np_city[:, 0])
1.7472
np.median(np_city[:, 0])
1.75
np.corrcoef(np_city[:, 0], np_city[:, 1])
array([[ 1.     , -0.01802],
       [-0.01803,  1.     ]])
np.std(np_city[:, 0])
0.1992
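The summary statistics above can be tried on a small array where the answers are easy to check by hand. This is only a sketch: `np_toy` and its four rows are made-up illustration data, not the `np_city` data from the slides.

```python
import numpy as np

# Hypothetical toy data: column 0 = height (m), column 1 = weight (kg)
np_toy = np.array([[1.60, 55.0],
                   [1.70, 65.0],
                   [1.80, 75.0],
                   [1.90, 85.0]])

heights = np_toy[:, 0]  # every row, first column
weights = np_toy[:, 1]  # every row, second column

mean_height = np.mean(heights)      # arithmetic average
median_height = np.median(heights)  # middle value of the sorted data
std_height = np.std(heights)        # population standard deviation (ddof=0 by default)
corr = np.corrcoef(heights, weights)  # 2x2 correlation matrix

print(mean_height, median_height, std_height)
print(corr)
```

Because weight rises in a straight line with height in this toy data, the off-diagonal entries of `corr` come out at 1 (up to floating-point rounding), unlike the near-zero correlation in `np_city`.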
sum(), sort(), ...
Enforce single data type: speed!
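A short sketch of what "single data type" means in practice, using made-up values: NumPy converts mixed input to one common type, and functions such as `np.sum()` and `np.sort()` then operate on the whole array at once.

```python
import numpy as np

# Hypothetical mixed-type list: an int, a float and a bool
mixed = [1, 2.5, True]

# NumPy coerces all elements to one common type (float64 here)
np_mixed = np.array(mixed)
print(np_mixed.dtype)

# Aggregations and sorting work on the whole array
total = np.sum(np_mixed)
np_sorted = np.sort(np.array([3, 1, 2]))
print(total, np_sorted)
```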
np.random.normal()
height = np.round(np.random.normal(1.75, 0.20, 5000), 2)
weight = np.round(np.random.normal(60.32, 15, 5000), 2)
np_city = np.column_stack((height, weight))
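The arguments to `np.random.normal()` are the distribution mean, the standard deviation, and the number of samples. The sketch below repeats the generation step with a fixed seed so the result is reproducible; the seed value 42 is an assumption added for illustration and is not part of the slides.

```python
import numpy as np

np.random.seed(42)  # assumption: fixed seed, only so reruns give the same numbers

# 5000 heights (mean 1.75, std 0.20) and 5000 weights (mean 60.32, std 15)
height = np.round(np.random.normal(1.75, 0.20, 5000), 2)
weight = np.round(np.random.normal(60.32, 15, 5000), 2)

# Pair the two 1D arrays as the columns of one 2D array
np_city = np.column_stack((height, weight))

print(np_city.shape)           # (5000, 2)
print(np.mean(np_city[:, 0]))  # close to 1.75
```

Since height and weight are drawn independently here, their correlation is expected to be close to zero.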