Aggregating and summarizing

Intermediate Python for Finance

Kennedy Behrman

Data Engineer, Author, Founder

DataFrame methods

  • .count()
  • .min()
  • .max()
  • .first()
  • .last()
  • .sum()
  • .prod()
  • .mean()
  • .median()
  • .std()
  • .var()
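
A minimal sketch of calling two of these methods. The slides do not show how the sample DataFrame was built, so the constructor call below is an assumed reconstruction of the prices used on the later slides.

```python
import pandas as pd

# Assumed reconstruction of the sample prices shown on the slides.
df = pd.DataFrame(
    {
        "AAD": [300.22, 301.49, 300.00, 302.90],
        "GDDL": [75.32, 79.99, 80.00, 82.92],
        "IMA": [39.90, 44.99, 45.33, 49.00],
    },
    index=pd.to_datetime(["2020-10-03", "2020-10-04", "2020-10-05", "2020-10-07"]),
)

lowest = df.min()    # smallest value in each column
highest = df.max()   # largest value in each column

print(lowest)
print(highest)
```

Each call returns a Series with one entry per column, because the default axis runs down the rows.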

Axis

Rows

  • default
  • axis=0
  • axis='rows'

Columns

  • axis=1
  • axis='columns'
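
A short sketch of how the axis argument changes the shape of the result. The DataFrame is an assumed reconstruction of the sample data used on the later slides.

```python
import pandas as pd

# Assumed reconstruction of the sample prices shown on the slides.
df = pd.DataFrame(
    {
        "AAD": [300.22, 301.49, 300.00, 302.90],
        "GDDL": [75.32, 79.99, 80.00, 82.92],
        "IMA": [39.90, 44.99, 45.33, 49.00],
    },
    index=pd.to_datetime(["2020-10-03", "2020-10-04", "2020-10-05", "2020-10-07"]),
)

# Default (axis=0 / axis='rows'): aggregate down the rows, one value per column.
per_column = df.sum()

# axis=1 / axis='columns': aggregate across the columns, one value per row.
per_row = df.sum(axis="columns")

print(per_column.index.tolist())  # column labels
print(per_row.index.tolist())     # row labels (dates)
```

The axis names the direction that gets collapsed, so the result is indexed by whatever is left over.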

Count

               AAD   GDDL    IMA
2020-10-03  300.22  75.32  39.90
2020-10-04  301.49  79.99  44.99
2020-10-05  300.00  80.00  45.33
2020-10-07  302.90  82.92  49.00

df.count()
AAD     4
GDDL    4
IMA     4
dtype: int64
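
The sample data has no gaps, so every count is 4. A hedged sketch of what happens with a missing value: .count() only counts non-missing cells. The NaN below is introduced purely for illustration and is not part of the slide data.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the sample prices shown on the slides.
df = pd.DataFrame(
    {
        "AAD": [300.22, 301.49, 300.00, 302.90],
        "GDDL": [75.32, 79.99, 80.00, 82.92],
        "IMA": [39.90, 44.99, 45.33, 49.00],
    },
    index=pd.to_datetime(["2020-10-03", "2020-10-04", "2020-10-05", "2020-10-07"]),
)

# Hypothetical gap: pretend one IMA price was never recorded.
gappy = df.copy()
gappy.loc[pd.Timestamp("2020-10-04"), "IMA"] = np.nan

counts = gappy.count()                 # non-missing values per column
per_row_counts = gappy.count(axis=1)   # non-missing values per row

print(counts)
print(per_row_counts)
```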

Sum

               AAD   GDDL    IMA
2020-10-03  300.22  75.32  39.90
2020-10-04  301.49  79.99  44.99
2020-10-05  300.00  80.00  45.33
2020-10-07  302.90  82.92  49.00

df.sum(axis=1)
2020-10-03    415.44
2020-10-04    426.47
2020-10-05    425.33
2020-10-07    434.82
dtype: float64

Product

               AAD   GDDL    IMA
2020-10-03  300.22  75.32  39.90
2020-10-04  301.49  79.99  44.99
2020-10-05  300.00  80.00  45.33
2020-10-07  302.90  82.92  49.00

df.prod(axis='columns')
2020-10-03    9.022416e+05
2020-10-04    1.084987e+06
2020-10-05    1.087920e+06
2020-10-07    1.230707e+06
dtype: float64

Mean

               AAD   GDDL    IMA
2020-10-03  300.22  75.32  39.90
2020-10-04  301.49  79.99  44.99
2020-10-05  300.00  80.00  45.33
2020-10-07  302.90  82.92  49.00

df.mean()
AAD     301.1525
GDDL     79.5575
IMA      44.8050
dtype: float64

Median

               AAD   GDDL    IMA
2020-10-03  300.22  75.32  39.90
2020-10-04  301.49  79.99  44.99
2020-10-05  300.00  80.00  45.33
2020-10-07  302.90  82.92  49.00

df.median()
AAD     300.855
GDDL     79.995
IMA      45.160
dtype: float64
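
With an even number of rows there is no single middle value, so the median is the average of the two middle values. A rough sketch that reproduces the AAD result by hand, using an assumed reconstruction of the sample data:

```python
import pandas as pd

# Assumed reconstruction of the sample prices shown on the slides.
df = pd.DataFrame(
    {
        "AAD": [300.22, 301.49, 300.00, 302.90],
        "GDDL": [75.32, 79.99, 80.00, 82.92],
        "IMA": [39.90, 44.99, 45.33, 49.00],
    },
    index=pd.to_datetime(["2020-10-03", "2020-10-04", "2020-10-05", "2020-10-07"]),
)

aad_sorted = df["AAD"].sort_values()   # 300.00, 300.22, 301.49, 302.90
middle_two = aad_sorted.iloc[1:3]      # the two middle prices
by_hand = middle_two.mean()            # average of 300.22 and 301.49

print(by_hand)
print(df["AAD"].median())
```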

Standard deviation

               AAD   GDDL    IMA
2020-10-03  300.22  75.32  39.90
2020-10-04  301.49  79.99  44.99
2020-10-05  300.00  80.00  45.33
2020-10-07  302.90  82.92  49.00

df.std()
AAD     1.337345
GDDL    3.143548
IMA     3.740183
dtype: float64
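
The values above are sample standard deviations: pandas divides by n - 1 by default (ddof=1). A minimal sketch comparing that with the population version (ddof=0), on an assumed reconstruction of the sample data:

```python
import pandas as pd

# Assumed reconstruction of the sample prices shown on the slides.
df = pd.DataFrame(
    {
        "AAD": [300.22, 301.49, 300.00, 302.90],
        "GDDL": [75.32, 79.99, 80.00, 82.92],
        "IMA": [39.90, 44.99, 45.33, 49.00],
    },
    index=pd.to_datetime(["2020-10-03", "2020-10-04", "2020-10-05", "2020-10-07"]),
)

sample_std = df.std()            # default ddof=1: divide by n - 1
population_std = df.std(ddof=0)  # divide by n instead

print(sample_std)
print(population_std)
```

With only four rows the two differ noticeably, so it is worth knowing which one a calculation expects.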

Variance

               AAD   GDDL    IMA
2020-10-03  300.22  75.32  39.90
2020-10-04  301.49  79.99  44.99
2020-10-05  300.00  80.00  45.33
2020-10-07  302.90  82.92  49.00

df.var()
AAD      1.788492
GDDL     9.881892
IMA     13.988967
dtype: float64
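
Variance is the square of the standard deviation, which can be checked directly. A small sketch on an assumed reconstruction of the sample data:

```python
import pandas as pd

# Assumed reconstruction of the sample prices shown on the slides.
df = pd.DataFrame(
    {
        "AAD": [300.22, 301.49, 300.00, 302.90],
        "GDDL": [75.32, 79.99, 80.00, 82.92],
        "IMA": [39.90, 44.99, 45.33, 49.00],
    },
    index=pd.to_datetime(["2020-10-03", "2020-10-04", "2020-10-05", "2020-10-07"]),
)

variance = df.var()          # sample variance per column (ddof=1)
std_squared = df.std() ** 2  # squaring the standard deviation gives the same numbers

# Largest difference between the two; anything other than ~0 is floating-point noise.
max_gap = (variance - std_squared).abs().max()
print(max_gap)
```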

Columns and rows

               AAD   GDDL    IMA
2020-10-03  300.22  75.32  39.90
2020-10-04  301.49  79.99  44.99
2020-10-05  300.00  80.00  45.33
2020-10-07  302.90  82.92  49.00

df.loc[:, 'AAD'].max()
302.9

df.iloc[0].min()
39.9
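
Selecting a single column or row gives a Series, so the same methods return one number instead of one per column. A hedged sketch on an assumed reconstruction of the sample data; .idxmax() and .idxmin() are not covered on the slides and are added here only to show where each value comes from.

```python
import pandas as pd

# Assumed reconstruction of the sample prices shown on the slides.
df = pd.DataFrame(
    {
        "AAD": [300.22, 301.49, 300.00, 302.90],
        "GDDL": [75.32, 79.99, 80.00, 82.92],
        "IMA": [39.90, 44.99, 45.33, 49.00],
    },
    index=pd.to_datetime(["2020-10-03", "2020-10-04", "2020-10-05", "2020-10-07"]),
)

aad = df.loc[:, "AAD"]        # one column, as a Series indexed by date
aad_high = aad.max()          # highest AAD price
aad_high_date = aad.idxmax()  # date on which that price occurred

first_day = df.iloc[0]                # first row, as a Series indexed by ticker
first_day_low = first_day.min()       # lowest price on that day
first_day_low_ticker = first_day.idxmin()  # ticker with that price

print(aad_high, aad_high_date)
print(first_day_low, first_day_low_ticker)
```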

Let's practice!
