PostgreSQL Summary Stats and Window Functions
Michel Semaan
Data Scientist
Chinese and Russian medals in the 2008 Summer Olympics per medal class
| Country | Medal | Awards |
|---------|--------|--------|
| CHN | Bronze | 57 |
| CHN | Gold | 74 |
| CHN | Silver | 53 |
| CHN | Total | 184 |
| RUS | Bronze | 56 |
| RUS | Gold | 43 |
| RUS | Silver | 44 |
| RUS | Total | 143 |
SELECT Country, Medal, COUNT(*) AS Awards FROM Summer_Medals WHERE Year = 2008 AND Country IN ('CHN', 'RUS') GROUP BY Country, Medal ORDER BY Country ASC, Medal ASC
UNION ALL SELECT Country, 'Total', COUNT(*) AS Awards FROM Summer_Medals WHERE Year = 2008 AND Country IN ('CHN', 'RUS') GROUP BY Country, 2 ORDER BY Country ASC;
SELECT
Country, Medal, COUNT(*) AS Awards
FROM Summer_Medals
WHERE
Year = 2008 AND Country IN ('CHN', 'RUS')
GROUP BY Country, ROLLUP(Medal)
ORDER BY Country ASC, Medal ASC;
ROLLUP
is a GROUP BY
subclause that includes extra rows for group-level aggregationsGROUP BY Country, ROLLUP(Medal)
will count all Country
- and Medal
-level totals, then count only Country
-level totals and fill in Medal
with null
s for these rowsSELECT
Country, Medal, COUNT(*) AS Awards
FROM summer_medals
WHERE
Year = 2008 AND Country IN ('CHN', 'RUS')
GROUP BY ROLLUP(Country, Medal)
ORDER BY Country ASC, Medal ASC;
ROLLUP
is hierarchical, de-aggregating from the leftmost provided column to the right-mostROLLUP(Country, Medal)
includes Country
-level totalsROLLUP(Medal, Country)
includes Medal
-level totals| Country | Medal | Awards |
|---------|--------|--------|
| CHN | Bronze | 57 |
| CHN | Gold | 74 |
| CHN | Silver | 53 |
| CHN | null | 184 |
| RUS | Bronze | 56 |
| RUS | Gold | 43 |
| RUS | Silver | 44 |
| RUS | null | 143 |
| null | null | 327 |
nulls
; the row with all null
s is the grand totalMedal
-level totals, since it's ROLLUP(Country, Medal)
and not ROLLUP(Medal, Country)
SELECT
Country, Medal, COUNT(*) AS Awards
FROM summer_medals
WHERE
Year = 2008 AND Country IN ('CHN', 'RUS')
GROUP BY CUBE(Country, Medal)
ORDER BY Country ASC, Medal ASC;
CUBE
is a non-hierarchical ROLLUP
CUBE(Country, Medal)
counts Country
-level, Medal
-level, and grand totals| Country | Medal | Awards |
|---------|--------|--------|
| CHN | Bronze | 57 |
| CHN | Gold | 74 |
| CHN | Silver | 53 |
| CHN | null | 184 |
| RUS | Bronze | 56 |
| RUS | Gold | 43 |
| RUS | Silver | 44 |
| RUS | null | 143 |
| null | Bronze | 113 |
| null | Gold | 117 |
| null | Silver | 97 |
| null | null | 327 |
Medal
-level totals are includedSource
| Year | Quarter | Sales |
|------|---------|-------|
| 2008 | Q1 | 12 |
| 2008 | Q2 | 15 |
| 2009 | Q1 | 21 |
| 2009 | Q2 | 27 |
ROLLUP
when you have hierarchical data (e.g., date parts) and don't want all possible group-level aggregationsCUBE
when you want all possible group-level aggregationsROLLUP(Year, Quarter)
| Year | Quarter | Sales |
|------|---------|-------|
| 2008 | null | 27 |
| 2009 | null | 48 |
| null | null | 75 |
CUBE(Year, Quarter)
Above rows + the following
| Year | Quarter | Sales |
|------|---------|-------|
| null | Q1 | 33 |
| null | Q2 | 42 |
PostgreSQL Summary Stats and Window Functions