ROLLUP and CUBE

PostgreSQL Summary Stats and Window Functions

Michel Semaan

Data Scientist

Group-level totals

Chinese and Russian medals in the 2008 Summer Olympics per medal class

| Country | Medal  | Awards |
|---------|--------|--------|
| CHN     | Bronze | 57     |
| CHN     | Gold   | 74     |
| CHN     | Silver | 53     |
| CHN     | Total  | 184    |
| RUS     | Bronze | 56     |
| RUS     | Gold   | 43     |
| RUS     | Silver | 44     |
| RUS     | Total  | 143    |
PostgreSQL Summary Stats and Window Functions

The old way

SELECT
  Country, Medal, COUNT(*) AS Awards
FROM Summer_Medals
WHERE
  Year = 2008 AND Country IN ('CHN', 'RUS')
GROUP BY Country, Medal
ORDER BY Country ASC, Medal ASC

UNION ALL SELECT Country, 'Total', COUNT(*) AS Awards FROM Summer_Medals WHERE Year = 2008 AND Country IN ('CHN', 'RUS') GROUP BY Country, 2 ORDER BY Country ASC;
PostgreSQL Summary Stats and Window Functions

Enter ROLLUP

SELECT
  Country, Medal, COUNT(*) AS Awards
FROM Summer_Medals
WHERE
  Year = 2008 AND Country IN ('CHN', 'RUS')
GROUP BY Country, ROLLUP(Medal)
ORDER BY Country ASC, Medal ASC;
  • ROLLUP is a GROUP BY subclause that includes extra rows for group-level aggregations
  • GROUP BY Country, ROLLUP(Medal) will count all Country- and Medal-level totals, then count only Country-level totals and fill in Medal with nulls for these rows
PostgreSQL Summary Stats and Window Functions

ROLLUP - Query

SELECT
  Country, Medal, COUNT(*) AS Awards
FROM summer_medals
WHERE
  Year = 2008 AND Country IN ('CHN', 'RUS')
GROUP BY ROLLUP(Country, Medal)
ORDER BY Country ASC, Medal ASC;
  • ROLLUP is hierarchical, de-aggregating from the leftmost provided column to the right-most
    • ROLLUP(Country, Medal) includes Country-level totals
    • ROLLUP(Medal, Country) includes Medal-level totals
    • Both include grand totals
PostgreSQL Summary Stats and Window Functions

ROLLUP - Result

| Country | Medal  | Awards |
|---------|--------|--------|
| CHN     | Bronze | 57     |
| CHN     | Gold   | 74     |
| CHN     | Silver | 53     |
| CHN     | null   | 184    |
| RUS     | Bronze | 56     |
| RUS     | Gold   | 43     |
| RUS     | Silver | 44     |
| RUS     | null   | 143    |
| null    | null   | 327    |
  • Group-level totals contain nulls; the row with all nulls is the grand total
  • Notice that it didn't include Medal-level totals, since it's ROLLUP(Country, Medal) and not ROLLUP(Medal, Country)
PostgreSQL Summary Stats and Window Functions

Enter CUBE

SELECT
  Country, Medal, COUNT(*) AS Awards
FROM summer_medals
WHERE
  Year = 2008 AND Country IN ('CHN', 'RUS')
GROUP BY CUBE(Country, Medal)
ORDER BY Country ASC, Medal ASC;
  • CUBE is a non-hierarchical ROLLUP
  • It generates all possible group-level aggregations
    • CUBE(Country, Medal) counts Country-level, Medal-level, and grand totals
PostgreSQL Summary Stats and Window Functions

CUBE - Result

| Country | Medal  | Awards |
|---------|--------|--------|
| CHN     | Bronze | 57     |
| CHN     | Gold   | 74     |
| CHN     | Silver | 53     |
| CHN     | null   | 184    |
| RUS     | Bronze | 56     |
| RUS     | Gold   | 43     |
| RUS     | Silver | 44     |
| RUS     | null   | 143    |
| null    | Bronze | 113    |
| null    | Gold   | 117    |
| null    | Silver | 97     |
| null    | null   | 327    |
  • Notice that Medal-level totals are included
PostgreSQL Summary Stats and Window Functions

ROLLUP vs CUBE

Source

| Year | Quarter | Sales |
|------|---------|-------|
| 2008 | Q1      | 12    |
| 2008 | Q2      | 15    |
| 2009 | Q1      | 21    |
| 2009 | Q2      | 27    |
  • Use ROLLUP when you have hierarchical data (e.g., date parts) and don't want all possible group-level aggregations
  • Use CUBE when you want all possible group-level aggregations

ROLLUP(Year, Quarter)

| Year | Quarter | Sales |
|------|---------|-------|
| 2008 | null    | 27    |
| 2009 | null    | 48    |
| null | null    | 75    |

CUBE(Year, Quarter)

Above rows + the following

| Year | Quarter | Sales |
|------|---------|-------|
| null | Q1      | 33    |
| null | Q2      | 42    |
PostgreSQL Summary Stats and Window Functions

Let's practice!

PostgreSQL Summary Stats and Window Functions

Preparing Video For Download...