Counting, summing, and grouping data

Introduction to Databases in Python

Jason Myers

Co-Author of Essential SQLAlchemy and Software Engineer

SQL functions

  • E.g. COUNT, SUM
  • from sqlalchemy import func
  • More efficient than processing in Python
  • Aggregate data
Introduction to Databases in Python

Sum example

from sqlalchemy import func

stmt = select([func.sum(census.columns.pop2008)])
results = connection.execute(stmt).scalar()
print(results)
302876613
Introduction to Databases in Python

Group by

  • Allows us to group row by common values
Introduction to Databases in Python

Group by

stmt = select([census.columns.sex, 
  func.sum(census.columns.pop2008)])

stmt = stmt.group_by(census.columns.sex)
results = connection.execute(stmt).fetchall()
print(results)
[('F', 153959198), ('M', 148917415)]
Introduction to Databases in Python

Group by

  • Supports multiple columns to group by with a pattern similar to order_by()
  • Requires all selected columns to be grouped or aggregated by a function
Introduction to Databases in Python

Group by multiple

stmt = select([census.columns.sex,
        census.columns.age,
        func.sum(census.columns.pop2008)
    ])

stmt = stmt.group_by(census.columns.sex, census.columns.age)
results = connection.execute(stmt).fetchall() print(results)
[('F', 0, 2105442), ('F', 1, 2087705), ('F', 2, 2037280),
('F', 3, 2012742), ('F', 4, 2014825), ('F', 5, 1991082),
('F', 6, 1977923), ('F', 7, 2005470), ('F', 8, 1925725), ...
Introduction to Databases in Python

Handling ResultSets from functions

  • SQLAlchemy auto generates "column names" for functions in the ResultSet
  • The column names are often func_# such as count_1
  • Replace them with the label() method
Introduction to Databases in Python

Using label()

print(results[0].keys())
['sex', u'sum_1']
stmt = select([census.columns.sex,
        func.sum(census.columns.pop2008).label('pop2008_sum')
    ])

stmt = stmt.group_by(census.columns.sex)
results = connection.execute(stmt).fetchall() print(results[0].keys())
['sex', 'pop2008_sum']
Introduction to Databases in Python

Let's practice!

Introduction to Databases in Python

Preparing Video For Download...