ROLLUP

Data-Driven Decision Making in SQL

Bart Baesens

Professor of Data Science and Analytics

Table renting_extended

The first few rows of the table renting_extended:

| renting_id | country  | genre  | rating |
|------------|----------|--------|--------|
| 2          | Belgium  | Drama  | 10     |
| 32         | Belgium  | Drama  | 10     |
| 203        | Austria  | Drama  | 6      |
| 292        | Austria  | Comedy | 8      |
| 363        | Belgium  | Drama  | 7      |
| .......... | ........ | ...... | ...... |

Query with ROLLUP

SELECT country, 
       genre, 
       COUNT(*)
FROM renting_extended
GROUP BY ROLLUP (country, genre);
  • Levels of aggregation
    • Aggregation of each combination of country and genre
    • Aggregation of country alone
    • Total aggregation
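
The three levels listed above can be written out explicitly with GROUPING SETS. This is a sketch, assuming a database that supports GROUPING SETS (PostgreSQL does); it should return the same rows as the ROLLUP query:

```sql
-- Expanded form of GROUP BY ROLLUP (country, genre)
SELECT country,
       genre,
       COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS (
    (country, genre),  -- each combination of country and genre
    (country),         -- each country alone (genre is returned as null)
    ()                 -- total aggregation (both columns are null)
);
```

Writing the sets out this way shows why ROLLUP over two columns produces exactly three levels of aggregation.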

Result of the ROLLUP (country, genre) query:
| country | genre  | count |
|---------|--------|-------|
| null    | null   | 22    |
| Austria | Comedy | 2     |
| Belgium | Drama  | 15    |
| Austria | Drama  | 4     |
| Belgium | Comedy | 1     |
| Belgium | null   | 16    |
| Austria | null   | 6     |

Order in ROLLUP

SELECT country, 
       genre, 
       COUNT(*)
FROM renting_extended
GROUP BY ROLLUP (genre, country);
| country | genre  | count |
|---------|--------|-------|
| null    | null   | 22    |
| Austria | Comedy | 2     |
| Belgium | Drama  | 15    |
| Austria | Drama  | 4     |
| Belgium | Comedy | 1     |
| null    | Comedy | 3     |
| null    | Drama  | 19    |
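
With the order (genre, country), the subtotal level keeps genre and drops country, which is why the null values appear in the country column. One way to see this is to emulate the ROLLUP with UNION ALL. This is only an illustrative sketch, not how the database necessarily executes ROLLUP:

```sql
-- Emulation of GROUP BY ROLLUP (genre, country)
-- Level 1: each combination of genre and country
SELECT country, genre, COUNT(*)
FROM renting_extended
GROUP BY genre, country
UNION ALL
-- Level 2: each genre alone (country is dropped)
SELECT NULL, genre, COUNT(*)
FROM renting_extended
GROUP BY genre
UNION ALL
-- Level 3: total over all rows
SELECT NULL, NULL, COUNT(*)
FROM renting_extended;
```

There is no branch grouped by country alone, so no per-country subtotals are returned for this column order.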

Summary ROLLUP

  • Returns aggregates for a hierarchy of values, e.g. ROLLUP (country, genre)
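    • Number of movie rentals for each combination of country and genre
    • Number of movie rentals for each country
    • Total number of movie rentals
  • In each step, one level of detail is dropped, starting with the rightmost column
  • Order of column names is important: ROLLUP (country, genre) adds subtotals per country, ROLLUP (genre, country) adds subtotals per genre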

Number of rentals and ratings

SELECT country, 
       genre, 
       COUNT(*) AS n_rentals,
       COUNT(rating) AS n_ratings
FROM renting_extended
GROUP BY ROLLUP (genre, country);
| country  | genre  | n_rentals | n_ratings |
|----------|--------|-----------|-----------|
| null     | null   | 22        | 9         |
| Belgium  | Drama  | 15        | 6         |
| Austria  | Comedy | 2         | 1         |
| Belgium  | Comedy | 1         | 0         |
| Austria  | Drama  | 4         | 2         |
| null     | Comedy | 3         | 1         |
| null     | Drama  | 19        | 8         |
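
n_ratings is smaller than n_rentals because COUNT(*) counts every row in a group, while COUNT(rating) counts only the rows where rating is not null. A minimal sketch with three made-up values (hypothetical, not rows from renting_extended), using PostgreSQL's VALUES syntax:

```sql
-- Three hypothetical rentals, one of them without a rating
SELECT COUNT(*)      AS n_rentals,  -- counts all rows
       COUNT(rating) AS n_ratings   -- skips rows where rating is null
FROM (VALUES (10), (NULL), (7)) AS t(rating);
-- n_rentals = 3, n_ratings = 2
```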

Let's practice!
