Writing Efficient Code with pandas
Leonidas Souliotis
PhD Candidate
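import numpy as np
# np.sqrt must be called on x; returning the bare function does not compute anything
data_sqrt = poker.apply(lambda x: np.sqrt(x))
data_sqrt.head(4)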
| | S1 | R1 | S2 | R2 | S3 | R3 |
|----|----------|----------|----------|----------|----------|----------|
| 0 | 1.000000 | 3.162278 | 1.000000 | 3.316625 | 3.464102 | 1.000000 |
| 1 | 1.414214 | 3.316625 | 1.414214 | 3.605551 | 1.414214 | 3.162278 |
| 2 | 1.732051 | 3.464102 | 1.732051 | 3.316625 | 1.732051 | 3.605551 |
| 3 | 2.000000 | 3.162278 | 2.000000 | 3.316625 | 2.000000 | 1.000000 |
data_sqrt_2 = np.sqrt(poker)
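The `.apply()` call and the direct `np.sqrt()` call should give the same values. A minimal sketch to check this, assuming a small made-up stand-in for `poker` (the name `poker_demo` and its contents are hypothetical, not the course dataset):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the poker data: suit (S*) and rank (R*) columns
poker_demo = pd.DataFrame({
    "S1": [1, 2, 3, 4],
    "R1": [10, 11, 12, 10],
    "S2": [1, 2, 3, 4],
    "R2": [11, 13, 11, 11],
})

# Column-by-column square root through .apply()
sqrt_apply = poker_demo.apply(lambda col: np.sqrt(col))

# Vectorized square root on the whole DataFrame at once
sqrt_vectorized = np.sqrt(poker_demo)

# Compare the two results element by element
same_values = np.allclose(sqrt_apply, sqrt_vectorized)
print(same_values)
```

Since both routes agree, the vectorized form is usually the simpler choice when a NumPy function already works on the whole DataFrame.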
import time
apply_start_time = time.time()
poker[['R1', 'R2', 'R3', 'R4', 'R5']].apply(lambda x: sum(x), axis=1)
print("Time using .apply(): {} sec".format(time.time() - apply_start_time))
Time using .apply(): 0.636334896088 sec
start_time = time.time()
for ind, value in poker.iterrows():
    # Sum the five rank columns of the current row
    sum([value['R1'], value['R2'], value['R3'], value['R4'], value['R5']])
print("Time using .iterrows(): {} sec".format(time.time() - start_time))
Time using .iterrows(): 3.15526986122 sec
Difference in speed: 395.85051529%
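The slides do not show how the percentage is computed. A short sketch, assuming it is the extra time of the slower method relative to the faster one (the helper name `percent_slower` is hypothetical); with the two timings printed above, this formula reproduces the reported figure:

```python
def percent_slower(slow_sec, fast_sec):
    """Return how much longer slow_sec is than fast_sec, as a percentage."""
    return (slow_sec - fast_sec) / fast_sec * 100


# Timings copied from the printed output above
iterrows_sec = 3.15526986122
apply_sec = 0.636334896088

diff_pct = percent_slower(iterrows_sec, apply_sec)
print("Difference in speed: {:.2f}%".format(diff_pct))
```

Put differently, `.iterrows()` took roughly five times as long as `.apply()` in this run.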
start_time = time.time()
poker[['R1', 'R2', 'R3', 'R4', 'R5']].apply(lambda x: sum(x), axis=0)
print("Time using .apply(): {} sec".format(time.time() - start_time))
Time using .apply(): 0.00490880012 sec
start_time = time.time()
poker[['R1', 'R2', 'R3', 'R4', 'R5']].sum(axis=0)
print("Time using pandas: {} sec".format(time.time() - start_time))
Time using pandas: 0.00279092788 sec
Difference in speed: 160.310951649%
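Single `time.time()` measurements of calls this short are noisy. One possible way to repeat the comparison is with `timeit`, sketched here on randomly generated rank columns (the data, the row count, and the name `poker_demo` are assumptions; absolute timings will differ by machine):

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical stand-in for the five rank columns of the poker data
rng = np.random.default_rng(0)
rank_cols = ["R1", "R2", "R3", "R4", "R5"]
poker_demo = pd.DataFrame(
    rng.integers(1, 14, size=(10_000, 5)), columns=rank_cols
)

# Column sums through .apply() with Python's built-in sum
apply_sums = poker_demo[rank_cols].apply(lambda col: sum(col), axis=0)

# Column sums through the pandas .sum() method
builtin_sums = poker_demo[rank_cols].sum(axis=0)

# Run each variant 20 times and report the total elapsed time
apply_sec = timeit.timeit(
    lambda: poker_demo[rank_cols].apply(lambda col: sum(col), axis=0),
    number=20,
)
builtin_sec = timeit.timeit(
    lambda: poker_demo[rank_cols].sum(axis=0),
    number=20,
)

print("Time using .apply(): {} sec".format(apply_sec))
print("Time using pandas: {} sec".format(builtin_sec))
```

Both variants return the same column totals, so the only difference to weigh is the running time.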