Tomada de decisão orientada por dados em SQL
Irene Ortner
Data Scientist at Applied Statistics
Extensões em SQL para facilitar operações OLAP
GROUP BY CUBEGROUP BY ROLLUPGROUP BY GROUPING SETSAs primeiras linhas da tabela renting_extended:
| renting_id | country | genre | rating |
|------------|----------|--------|--------|
| 2 | Belgium | Drama | 10 |
| 32 | Belgium | Drama | 10 |
| 203 | Austria | Drama | 6 |
| 292 | Austria | Comedy | 8 |
| 363 | Belgium | Drama | 7 |
| .......... | ........ | ...... | ...... |
Exemplo de consulta com o operador GROUPING SETS:
SELECT country,
genre,
COUNT(*)
FROM rentings_extended
GROUP BY GROUPING SETS ((country, genre), (country), (genre), ());
GROUP BY GROUPING SETS retorna um UNION de várias consultas GROUP BY.SELECT country,
genre,
COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS (country, genre);
GROUPING SETS: (country, genre)SELECT country,
genre,
COUNT(*)
FROM renting_extended
GROUP BY country, genre;
| country | genre | count |
|---------|--------|--------|
| Austria | Comedy | 2 |
| Belgium | Drama | 15 |
| Austria | Drama | 4 |
| Belgium | Comedy | 1 |
SELECT country, COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS (country);
GROUPING SETS: (country)SELECT country, COUNT(*)
FROM renting_extended
GROUP BY country;
| country | count |
|---------|-------|
| Austria | 16 |
| Belgium | 6 |
SELECT genre, COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS (genre);
GROUPING SETS: (genre)SELECT genre, COUNT(*)
FROM renting_extended
GROUP BY genre;
| country | count |
|---------|-------|
| Comedy | 3 |
| Drama | 19 |
SELECT COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS ();
GROUPING SETS: ()SELECT COUNT(*)
FROM renting_extended;
| count |
|-------|
| 22 |
GROUP BY GROUPING SETS (...)SELECT country, genre, COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS ((country, genre), (country), (genre), ());
UNION das 4 consultas anteriores.GROUP BY CUBE (country, genre).SELECT country, genre, COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS ((country, genre), (country), (genre), ());
| country | genre |count|
|---------|--------|-----|
| NULL | NULL | 22 |
| Austria | Comedy | 2 |
| Belgium | Drama | 15 |
| Austria | Drama | 4 |
| Belgium | Comedy | 1 |
| Belgium | NULL | 16 |
| Austria | NULL | 6 |
| NULL | Comedy | 3 |
| NULL | Drama | 19 |
SELECT country,
genre,
COUNT(*),
AVG(rating) AS avg_rating
FROM renting_extended
GROUP BY GROUPING SETS ((country, genre), (genre));
SELECT country, genre, COUNT(*), AVG(rating) AS avg_rating
FROM renting_extended
GROUP BY GROUPING SETS ((country, genre), (genre));
| country | genre | count| avg_rating |
|---------|--------|------|------------|
| Austria | Comedy | 2 | 8.00 |
| Belgium | Drama | 15 | 9.17 |
| Austria | Drama | 4 | 6.00 |
| Belgium | Comedy | 1 | NULL |
| NULL | Comedy | 3 | 8.00 |
| NULL | Drama | 19 | 8.38 |
Tomada de decisão orientada por dados em SQL