PARTITION BY

PostgreSQL Samenvattingsstatistieken en vensterfuncties

Michel Semaan

Data Scientist

Motivatie

Query

WITH Discus_Gold AS (
  SELECT
    Year, Event, Country AS Champion
  FROM Summer_Medals
  WHERE
    Year IN (2004, 2008, 2012)
    AND Gender = 'Men' AND Medal = 'Gold'
    AND Event IN ('Discus Throw', 'Triple Jump')
    AND Gender = 'Men')

SELECT
  Year, Event, Champion,
  LAG(Champion) OVER
    (ORDER BY Event ASC, Year ASC) AS Last_Champion
FROM Discus_Gold
ORDER BY Event ASC, Year ASC;

Resultaat

| Year | Event        | Champion | Last_Champion |
|------|--------------|----------|---------------|
| 2004 | Discus Throw | LTU      | null          |
| 2008 | Discus Throw | EST      | LTU           |
| 2012 | Discus Throw | GER      | EST           |
| 2004 | Triple Jump  | SWE      | GER           |
| 2008 | Triple Jump  | POR      | SWE           |
| 2012 | Triple Jump  | USA      | POR           |
  • Wanneer Event verandert van Discus Throw naar Triple Jump, haalt LAG de laatste kampioen van Discus Throw op in plaats van null
PostgreSQL Samenvattingsstatistieken en vensterfuncties

Maak kennis met PARTITION BY

  • PARTITION BY splitst de tabel in partities op basis van unieke kolomwaarden
    • De resultaten worden niet samengevoegd tot één kolom
  • Wordt per partitie uitgevoerd door de windowfunctie
    • ROW_NUMBER reset voor elke partitie
    • LAG haalt alleen de vorige waarde op als de vorige rij in dezelfde partitie staat
PostgreSQL Samenvattingsstatistieken en vensterfuncties

Partitioneren op één kolom

Query

WITH Discus_Gold AS (...)

SELECT
  Year, Event, Champion,
  LAG(Champion) OVER
    (PARTITION BY Event
     ORDER BY Event ASC, Year ASC) AS Last_Champion
FROM Discus_Gold
ORDER BY Event ASC, Year ASC;

Resultaat

| Year | Event        | Champion | Last_Champion |
|------|--------------|----------|---------------|
| 2004 | Discus Throw | LTU      | null          |
| 2008 | Discus Throw | EST      | LTU           |
| 2012 | Discus Throw | GER      | EST           |
| 2004 | Triple Jump  | SWE      | null          |
| 2008 | Triple Jump  | POR      | SWE           |
| 2012 | Triple Jump  | USA      | POR           |
PostgreSQL Samenvattingsstatistieken en vensterfuncties

Complexer partitioneren

| Year | Country | Event                | Row_N |
|------|---------|----------------------|-------|
| 2008 | CHN     | + 78KG (Heavyweight) | 1     |
| 2008 | CHN     | - 49 KG              | 2     |
| ...  | ...     | ...                  | ...   |
| 2008 | JPN     | 48 - 55KG            | 27    |
| 2008 | JPN     | 48 - 55KG            | 28    |
| ...  | ...     | ...                  | ...   |
| 2012 | CHN     | +75KG                | 32    |
| 2012 | CHN     | - 49 KG              | 33    |
| ...  | ...     | ...                  | ...   |
| 2012 | JPN     | +75KG                | 51    |
| 2012 | JPN     | - 49 KG              | 52    |
| ...  | ...     | ...                  | ...   |
  • Rownummer moet resetten per Year en Country
PostgreSQL Samenvattingsstatistieken en vensterfuncties

Partitioneren op meerdere kolommen

Query

WITH Country_Gold AS (
  SELECT
    DISTINCT Year, Country, Event
  FROM Summer_Medals
  WHERE
    Year IN (2008, 2012)
    AND Country IN ('CHN', 'JPN')
    AND Gender = 'Women' AND Medal = 'Gold')

SELECT
  Year, Country, Event,
  ROW_NUMBER() OVER (PARTITION BY Year, Country)
FROM Country_Gold;

Resultaat

| Year | Country | Event                | Row_N |
|------|---------|----------------------|-------|
| 2008 | CHN     | + 78KG (Heavyweight) | 1     |
| 2008 | CHN     | - 49 KG              | 2     |
| ...  | ...     | ...                  | ...   |
| 2008 | JPN     | 48 - 55KG            | 1     |
| 2008 | JPN     | 48 - 55KG            | 2     |
| ...  | ...     | ...                  | ...   |
| 2012 | CHN     | +75KG                | 1     |
| 2012 | CHN     | - 49 KG              | 2     |
| ...  | ...     | ...                  | ...   |
| 2012 | JPN     | +75KG                | 1     |
| 2012 | JPN     | - 49 KG              | 2     |
| ...  | ...     | ...                  | ...   |
PostgreSQL Samenvattingsstatistieken en vensterfuncties

Laten we oefenen!

PostgreSQL Samenvattingsstatistieken en vensterfuncties

Preparing Video For Download...