Writing Efficient Code with pandas
Leonidas Souliotis
PhD candidate
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col1': [0, 1, 2, 3, 4, 5, 6]}, dtype=np.int8)
print(df)
| | Col1 |
|--------|------|
| 0 | 0 |
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 5 | 5 |
| 6 | 6 |
nd = np.array(range(7))
print(nd)
[0 1 2 3 4 5 6]
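The two snippets above hold the same seven integers, once as a pandas DataFrame column and once as a NumPy array. As a minimal sketch of how the two relate (the names `col_values` and `same_contents` are illustrative and not from the slides), the `.values` attribute exposes the NumPy array behind a pandas column:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col1': [0, 1, 2, 3, 4, 5, 6]}, dtype=np.int8)
nd = np.array(range(7))

# .values returns the NumPy array that backs the pandas column
col_values = df['Col1'].values
print(type(col_values))
print(col_values.dtype)

# The contents match even though the integer dtypes differ
same_contents = np.array_equal(col_values, nd)
print(same_contents)
```

This is why calling `.values` before an aggregation hands the work directly to NumPy.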
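import time

# Sum the five rank columns row-wise on the underlying NumPy array
start_time = time.time()
poker[['R1', 'R2', 'R3', 'R4', 'R5']].values.sum(axis=1)
print("Results from the above operation calculated in %s seconds" % (time.time() - start_time))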
Results from the above operation calculated in 0.00157618522644 seconds
start_time = time.time()
poker[['R1', 'R2', 'R3', 'R4', 'R5']].sum(axis=1)
print("Results from the above operation calculated in %s seconds" % (time.time() - start_time))
Results from the above operation calculated in 0.00268197059631 seconds
Difference in time: 39.0482%
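A single `time.time()` measurement of a sub-millisecond operation is noisy, so the comparison above can be repeated more robustly with `timeit`. The sketch below is illustrative only: the `poker` DataFrame here is a randomly generated stand-in for the course dataset, the row count is arbitrary, and the percentage is computed relative to the NumPy time, which is an assumption because the slides do not state the formula. Timings will vary by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical stand-in for the poker dataset: five rank columns, values 1 to 13
rng = np.random.default_rng(0)
rank_cols = ['R1', 'R2', 'R3', 'R4', 'R5']
poker = pd.DataFrame(rng.integers(1, 14, size=(25_000, 5)), columns=rank_cols)


def numpy_sum():
    # Row-wise sum on the underlying NumPy array
    return poker[rank_cols].values.sum(axis=1)


def pandas_sum():
    # Row-wise sum through the pandas API
    return poker[rank_cols].sum(axis=1)


# Both approaches should give the same row totals
results_match = np.array_equal(numpy_sum(), pandas_sum().values)

# Keep the best of several repeats to reduce timing noise
numpy_time = min(timeit.repeat(numpy_sum, number=10, repeat=5))
pandas_time = min(timeit.repeat(pandas_sum, number=10, repeat=5))

print("NumPy .values sum: {:.6f} sec".format(numpy_time))
print("pandas .sum:       {:.6f} sec".format(pandas_time))
# Assumed definition: extra time taken by pandas, relative to the NumPy time
print("Difference in time: {:.4f}%".format((pandas_time - numpy_time) / numpy_time * 100))
```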