Data-Driven Decision Making in SQL
Irene Ortner
Data Scientist at Applied Statistics
Extensions in SQL to facilitate OLAP operations
GROUP BY CUBEGROUP BY ROLLUPGROUP BY GROUPING SETSThe first few rows of the table renting_extended:
| renting_id | country | genre | rating |
|------------|----------|--------|--------|
| 2 | Belgium | Drama | 10 |
| 32 | Belgium | Drama | 10 |
| 203 | Austria | Drama | 6 |
| 292 | Austria | Comedy | 8 |
| 363 | Belgium | Drama | 7 |
| .......... | ........ | ...... | ...... |
Example of a query with GROUPING SETS operator:
SELECT country,
genre,
COUNT(*)
FROM rentings_extended
GROUP BY GROUPING SETS ((country, genre), (country), (genre), ());
GROUP BY GROUPING SETS returns a UNION over several GROUP BY queries.SELECT country,
genre,
COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS (country, genre);
GROUPING SETS: (country, genre)SELECT country,
genre,
COUNT(*)
FROM renting_extended
GROUP BY country, genre;
| country | genre | count |
|---------|--------|--------|
| Austria | Comedy | 2 |
| Belgium | Drama | 15 |
| Austria | Drama | 4 |
| Belgium | Comedy | 1 |
SELECT country, COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS (country);
GROUPING SETS: (country)SELECT country, COUNT(*)
FROM renting_extended
GROUP BY country;
| country | count |
|---------|-------|
| Austria | 16 |
| Belgium | 6 |
SELECT genre, COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS (genre);
GROUPING SETS: (genre)SELECT genre, COUNT(*)
FROM renting_extended
GROUP BY genre;
| country | count |
|---------|-------|
| Comedy | 3 |
| Drama | 19 |
SELECT COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS ();
GROUPING SETS: ()SELECT COUNT(*)
FROM renting_extended;
| count |
|-------|
| 22 |
GROUP BY GROUPING SETS (...)SELECT country, genre, COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS ((country, genre), (country), (genre), ());
UNION over 4 previous queries.GROUP BY CUBE (country, genre).SELECT country, genre, COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS ((country, genre), (country), (genre), ());
| country | genre |count|
|---------|--------|-----|
| NULL | NULL | 22 |
| Austria | Comedy | 2 |
| Belgium | Drama | 15 |
| Austria | Drama | 4 |
| Belgium | Comedy | 1 |
| Belgium | NULL | 16 |
| Austria | NULL | 6 |
| NULL | Comedy | 3 |
| NULL | Drama | 19 |
SELECT country,
genre,
COUNT(*),
AVG(rating) AS avg_rating
FROM renting_extended
GROUP BY GROUPING SETS ((country, genre), (genre));
SELECT country, genre, COUNT(*), AVG(rating) AS avg_rating
FROM renting_extended
GROUP BY GROUPING SETS ((country, genre), (genre));
| country | genre | count| avg_rating |
|---------|--------|------|------------|
| Austria | Comedy | 2 | 8.00 |
| Belgium | Drama | 15 | 9.17 |
| Austria | Drama | 4 | 6.00 |
| Belgium | Comedy | 1 | NULL |
| NULL | Comedy | 3 | 8.00 |
| NULL | Drama | 19 | 8.38 |
Data-Driven Decision Making in SQL