Data-Driven Decision Making in SQL
Irene Ortner
Data Scientist at Applied Statistics
Extensions in SQL to facilitate OLAP operations
GROUP BY CUBE
GROUP BY ROLLUP
GROUP BY GROUPING SETS
The first few rows of the table renting_extended
:
| renting_id | country | genre | rating |
|------------|----------|--------|--------|
| 2 | Belgium | Drama | 10 |
| 32 | Belgium | Drama | 10 |
| 203 | Austria | Drama | 6 |
| 292 | Austria | Comedy | 8 |
| 363 | Belgium | Drama | 7 |
| .......... | ........ | ...... | ...... |
Example of a query with GROUPING SETS
operator:
SELECT country,
genre,
COUNT(*)
FROM rentings_extended
GROUP BY GROUPING SETS ((country, genre), (country), (genre), ());
GROUP BY GROUPING SETS
returns a UNION
over several GROUP BY
queries.SELECT country,
genre,
COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS (country, genre);
GROUPING SETS
: (country, genre)
SELECT country,
genre,
COUNT(*)
FROM renting_extended
GROUP BY country, genre;
| country | genre | count |
|---------|--------|--------|
| Austria | Comedy | 2 |
| Belgium | Drama | 15 |
| Austria | Drama | 4 |
| Belgium | Comedy | 1 |
SELECT country, COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS (country);
GROUPING SETS
: (country)
SELECT country, COUNT(*)
FROM renting_extended
GROUP BY country;
| country | count |
|---------|-------|
| Austria | 16 |
| Belgium | 6 |
SELECT genre, COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS (genre);
GROUPING SETS
: (genre)
SELECT genre, COUNT(*)
FROM renting_extended
GROUP BY genre;
| country | count |
|---------|-------|
| Comedy | 3 |
| Drama | 19 |
SELECT COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS ();
GROUPING SETS
: ()
SELECT COUNT(*)
FROM renting_extended;
| count |
|-------|
| 22 |
GROUP BY GROUPING SETS (...)
SELECT country, genre, COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS ((country, genre), (country), (genre), ());
UNION
over 4 previous queries.GROUP BY CUBE (country, genre)
.SELECT country, genre, COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS ((country, genre), (country), (genre), ());
| country | genre |count|
|---------|--------|-----|
| NULL | NULL | 22 |
| Austria | Comedy | 2 |
| Belgium | Drama | 15 |
| Austria | Drama | 4 |
| Belgium | Comedy | 1 |
| Belgium | NULL | 16 |
| Austria | NULL | 6 |
| NULL | Comedy | 3 |
| NULL | Drama | 19 |
SELECT country,
genre,
COUNT(*),
AVG(rating) AS avg_rating
FROM renting_extended
GROUP BY GROUPING SETS ((country, genre), (genre));
SELECT country, genre, COUNT(*), AVG(rating) AS avg_rating
FROM renting_extended
GROUP BY GROUPING SETS ((country, genre), (genre));
| country | genre | count| avg_rating |
|---------|--------|------|------------|
| Austria | Comedy | 2 | 8.00 |
| Belgium | Drama | 15 | 9.17 |
| Austria | Drama | 4 | 6.00 |
| Belgium | Comedy | 1 | NULL |
| NULL | Comedy | 3 | 8.00 |
| NULL | Drama | 19 | 8.38 |
Data-Driven Decision Making in SQL