Introduction

Estatísticas Resumo e Funções de Janela no PostgreSQL

Michel Semaan

Data Scientist

Motivation

USA total and running total of Summer Olympics gold medals since 2004

| Year | Medals | Medals_RT |
|------|--------|-----------|
| 2004 | 116    | 116       |
| 2008 | 125    | 241       |
| 2012 | 147    | 388       |

Discus throw reigning champion status

| Year | Champion | Last_Champion | Reigning_Champion |
|------|----------|---------------|-------------------|
| 1996 | GER      | null          | false             |
| 2000 | LTU      | GER           | false             |
| 2004 | LTU      | LTU           | true              |
| 2008 | EST      | LTU           | false             |
| 2012 | GER      | EST           | false             |
Estatísticas Resumo e Funções de Janela no PostgreSQL

Course outline

  1. Introduction to window functions
  2. Fetching, ranking, and paging
  3. Aggregate window functions and frames
  4. Beyond window functions
Estatísticas Resumo e Funções de Janela no PostgreSQL

Summer olympics dataset

  • Each row represents a medal awarded in the Summer Olympics games

Columns

  • Year, City
  • Sport, Discipline, Event
  • Athlete, Country, Gender
  • Medal
Estatísticas Resumo e Funções de Janela no PostgreSQL

Window functions

  • Perform an operation across a set of rows that are somehow related to the current row
  • Similar to GROUP BY aggregate functions, but all rows remain in the output

Uses

  • Fetching values from preceding or following rows (e.g. fetching the previous row's value)
    • Determining reigning champion status
    • Calculating growth over time
  • Assigning ordinal ranks (1st, 2nd, etc.) to rows based on their values' positions in a sorted list
  • Running totals, moving averages
Estatísticas Resumo e Funções de Janela no PostgreSQL

Row numbers

Query

SELECT
  Year, Event, Country
FROM Summer_Medals
WHERE
  Medal = 'Gold';

Result

| Year | Event                      | Country |
|------|----------------------------|---------|
| 1896 | 100M Freestyle             | HUN     |
| 1896 | 100M Freestyle For Sailors | GRE     |
| 1896 | 1200M Freestyle            | HUN     |
| ...  | ...                        | ...     |
Estatísticas Resumo e Funções de Janela no PostgreSQL

Enter ROW_NUMBER

Query

SELECT
  Year, Event, Country,
  ROW_NUMBER() OVER () AS Row_N
FROM Summer_Medals
WHERE
  Medal = 'Gold';

Result

| Year | Event                      | Country | Row_N |
|------|----------------------------|---------|-------|
| 1896 | 100M Freestyle             | HUN     | 1     |
| 1896 | 100M Freestyle For Sailors | GRE     | 2     |
| 1896 | 1200M Freestyle            | HUN     | 3     |
| ...  | ...                        | ...     | ...   |
Estatísticas Resumo e Funções de Janela no PostgreSQL

Anatomy of a window function

Query

SELECT
  Year, Event, Country,
  ROW_NUMBER() OVER () AS Row_N
FROM Summer_Medals
WHERE
  Medal = 'Gold';
  • FUNCTION_NAME() OVER (...)
    • ORDER BY
    • PARTITION BY
    • ROWS/RANGE PRECEDING/FOLLOWING/UNBOUNDED
Estatísticas Resumo e Funções de Janela no PostgreSQL

Let's practice!

Estatísticas Resumo e Funções de Janela no PostgreSQL

Preparing Video For Download...