Looping using the .apply() function

Writing Efficient Code with pandas

Leonidas Souliotis

PhD Candidate

The .apply() function

data_sqrt = poker.apply(lambda x: np.sqrt(x))
data_sqrt.head(4)
|     | S1         | R1       | S2       | R2       | S3       | R3     |
|----|----------|----------|----------|----------|----------|----------|
| 0  | 1.000000 | 3.162278 | 1.000000 | 3.316625 | 3.464102 | 1.000000 |
| 1  | 1.414214 | 3.316625 | 1.414214 | 3.605551 | 1.414214 | 3.162278 |
| 2  | 1.732051 | 3.464102 | 1.732051 | 3.316625 | 1.732051 | 3.605551 |
| 3  | 2.000000 | 3.162278 | 2.000000 | 3.316625 | 2.000000 | 1.000000 |
data_sqrt_2 = np.sqrt(poker)
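
A minimal sketch of the comparison above, assuming a small made-up frame (poker_demo, hypothetical) in place of the real poker data. It checks that passing np.sqrt through .apply() and calling np.sqrt on the whole DataFrame give the same values.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the poker data: three hands, two suit/rank pairs.
poker_demo = pd.DataFrame({
    "S1": [1, 2, 3],
    "R1": [10, 11, 12],
    "S2": [1, 2, 3],
    "R2": [11, 13, 11],
})

# .apply() hands each column (a Series) to the function in turn.
sqrt_apply = poker_demo.apply(lambda col: np.sqrt(col))

# NumPy ufuncs accept a whole DataFrame, so .apply() is not required here.
sqrt_direct = np.sqrt(poker_demo)

# Compare the two results element by element.
same_result = bool(np.allclose(sqrt_apply.to_numpy(), sqrt_direct.to_numpy()))
print(sqrt_direct.head(2))
print("Same values:", same_result)
```

When a vectorized NumPy or pandas function already exists for the operation, calling it directly is usually the simpler choice; .apply() is more useful for functions that have no vectorized form.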

The .apply() function for rows

apply_start_time = time.time()
poker[['R1', 'R2', 'R3', 'R4', 'R5']].apply(lambda x: sum(x), axis=1)
print("Time using .apply(): {} sec".format(time.time() - apply_start_time))
Time using .apply(): 0.636334896088 sec
start_time = time.time()
for ind, value in poker.iterrows():
    sum([value['R1'], value['R2'], value['R3'], value['R4'], value['R5']])
print("Time using .iterrows(): {} sec".format(time.time() - start_time))
Time using .iterrows(): 3.15526986122 sec
Difference in speed: 395.85051529%
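
The row-wise comparison can be reproduced on a small scale as sketched below. The frame poker_demo and the helper pct_slower are hypothetical, introduced only for illustration; the helper assumes the reported difference is computed as (slow - fast) / fast * 100, which is consistent with the two timings printed above. Actual timings will vary by machine and pandas version.

```python
import time
import pandas as pd

rank_cols = ["R1", "R2", "R3", "R4", "R5"]

# Hypothetical stand-in for the poker data (three hands, repeated 1000 times).
base = pd.DataFrame({
    "S1": [1, 2, 3], "R1": [10, 11, 12],
    "S2": [1, 2, 3], "R2": [11, 13, 11],
    "S3": [1, 2, 3], "R3": [13, 10, 13],
    "S4": [1, 2, 3], "R4": [12, 12, 10],
    "S5": [1, 2, 3], "R5": [1, 2, 3],
})
poker_demo = pd.concat([base] * 1000, ignore_index=True)


def pct_slower(fast_sec, slow_sec):
    """Relative slowdown in percent: (slow - fast) / fast * 100 (assumed formula)."""
    return (slow_sec - fast_sec) / fast_sec * 100


# Row-wise sum with .apply(axis=1): the lambda receives one row at a time.
t0 = time.perf_counter()
apply_sums = poker_demo[rank_cols].apply(lambda row: sum(row), axis=1)
apply_sec = time.perf_counter() - t0

# The same sum with an explicit .iterrows() loop, using column labels.
t0 = time.perf_counter()
loop_sums = []
for _, row in poker_demo.iterrows():
    loop_sums.append(sum(row[col] for col in rank_cols))
loop_sec = time.perf_counter() - t0

print("Same sums:", apply_sums.tolist() == loop_sums)
print("apply: {:.4f} sec, iterrows: {:.4f} sec".format(apply_sec, loop_sec))
```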

The .apply() function for columns

apply_start_time = time.time()
poker[['R1', 'R2', 'R3', 'R4', 'R5']].apply(lambda x: sum(x), axis=0)
print("Time using .apply(): {} sec".format(time.time() - apply_start_time))
Time using .apply(): 0.00490880012 sec
start_time = time.time()
poker[['R1', 'R2', 'R3', 'R4', 'R5']].sum(axis=0)
print("Time using pandas: {} sec".format(time.time() - start_time))
Time using pandas: 0.00279092788 sec
Difference in speed: 160.310951649%
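
As a rough check of the column-wise case, the sketch below confirms that .apply() with axis=0 and the built-in .sum(axis=0) return the same totals. The frame poker_demo is a hypothetical stand-in for the poker data, and timing is left out because results on a frame this small would not be meaningful.

```python
import pandas as pd

rank_cols = ["R1", "R2", "R3", "R4", "R5"]

# Hypothetical stand-in for the rank columns of the poker data.
poker_demo = pd.DataFrame({
    "R1": [10, 11, 12],
    "R2": [11, 13, 11],
    "R3": [13, 10, 13],
    "R4": [12, 12, 10],
    "R5": [1, 2, 3],
})

# Column-wise sum with .apply(axis=0): the lambda receives one column at a time.
col_apply = poker_demo[rank_cols].apply(lambda col: sum(col), axis=0)

# The same totals using the built-in pandas reduction.
col_builtin = poker_demo[rank_cols].sum(axis=0)

same_totals = bool((col_apply == col_builtin).all())
print(col_builtin)
print("Same totals:", same_totals)
```

Both calls return a Series indexed by column name, so the built-in method can replace the .apply() call without changing downstream code.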

Let's do it!
