PostgreSQL Summary Stats and Window Functions
Michel Semaan
Data Scientist
Chinese and Russian medals in the 2008 Summer Olympics per medal class
| Country | Medal | Awards |
|---------|--------|--------|
| CHN | Bronze | 57 |
| CHN | Gold | 74 |
| CHN | Silver | 53 |
| CHN | Total | 184 |
| RUS | Bronze | 56 |
| RUS | Gold | 43 |
| RUS | Silver | 44 |
| RUS | Total | 143 |
SELECT Country, Medal, COUNT(*) AS Awards FROM Summer_Medals WHERE Year = 2008 AND Country IN ('CHN', 'RUS') GROUP BY Country, Medal ORDER BY Country ASC, Medal ASCUNION ALL SELECT Country, 'Total', COUNT(*) AS Awards FROM Summer_Medals WHERE Year = 2008 AND Country IN ('CHN', 'RUS') GROUP BY Country, 2 ORDER BY Country ASC;
SELECT
Country, Medal, COUNT(*) AS Awards
FROM Summer_Medals
WHERE
Year = 2008 AND Country IN ('CHN', 'RUS')
GROUP BY Country, ROLLUP(Medal)
ORDER BY Country ASC, Medal ASC;
ROLLUP is a GROUP BY subclause that includes extra rows for group-level aggregationsGROUP BY Country, ROLLUP(Medal) will count all Country- and Medal-level totals, then count only Country-level totals and fill in Medal with nulls for these rowsSELECT
Country, Medal, COUNT(*) AS Awards
FROM summer_medals
WHERE
Year = 2008 AND Country IN ('CHN', 'RUS')
GROUP BY ROLLUP(Country, Medal)
ORDER BY Country ASC, Medal ASC;
ROLLUP is hierarchical, de-aggregating from the leftmost provided column to the right-mostROLLUP(Country, Medal) includes Country-level totalsROLLUP(Medal, Country) includes Medal-level totals| Country | Medal | Awards |
|---------|--------|--------|
| CHN | Bronze | 57 |
| CHN | Gold | 74 |
| CHN | Silver | 53 |
| CHN | null | 184 |
| RUS | Bronze | 56 |
| RUS | Gold | 43 |
| RUS | Silver | 44 |
| RUS | null | 143 |
| null | null | 327 |
nulls; the row with all nulls is the grand totalMedal-level totals, since it's ROLLUP(Country, Medal) and not ROLLUP(Medal, Country)SELECT
Country, Medal, COUNT(*) AS Awards
FROM summer_medals
WHERE
Year = 2008 AND Country IN ('CHN', 'RUS')
GROUP BY CUBE(Country, Medal)
ORDER BY Country ASC, Medal ASC;
CUBE is a non-hierarchical ROLLUPCUBE(Country, Medal) counts Country-level, Medal-level, and grand totals| Country | Medal | Awards |
|---------|--------|--------|
| CHN | Bronze | 57 |
| CHN | Gold | 74 |
| CHN | Silver | 53 |
| CHN | null | 184 |
| RUS | Bronze | 56 |
| RUS | Gold | 43 |
| RUS | Silver | 44 |
| RUS | null | 143 |
| null | Bronze | 113 |
| null | Gold | 117 |
| null | Silver | 97 |
| null | null | 327 |
Medal-level totals are includedSource
| Year | Quarter | Sales |
|------|---------|-------|
| 2008 | Q1 | 12 |
| 2008 | Q2 | 15 |
| 2009 | Q1 | 21 |
| 2009 | Q2 | 27 |
ROLLUP when you have hierarchical data (e.g., date parts) and don't want all possible group-level aggregationsCUBE when you want all possible group-level aggregationsROLLUP(Year, Quarter)
| Year | Quarter | Sales |
|------|---------|-------|
| 2008 | null | 27 |
| 2009 | null | 48 |
| null | null | 75 |
CUBE(Year, Quarter)
Above rows + the following
| Year | Quarter | Sales |
|------|---------|-------|
| null | Q1 | 33 |
| null | Q2 | 42 |
PostgreSQL Summary Stats and Window Functions