Prise de décision basée sur des données en SQL
Irene Ortner
Data Scientist at Applied Statistics
Extensions SQL pour faciliter les opérations OLAP
GROUP BY CUBEGROUP BY ROLLUPGROUP BY GROUPING SETSLes premières lignes de la table renting_extended :
| renting_id | country | genre | rating |
|------------|----------|--------|--------|
| 2 | Belgium | Drama | 10 |
| 32 | Belgium | Drama | 10 |
| 203 | Austria | Drama | 6 |
| 292 | Austria | Comedy | 8 |
| 363 | Belgium | Drama | 7 |
| .......... | ........ | ...... | ...... |
Exemple de requête avec l'opérateur GROUPING SETS :
SELECT country,
genre,
COUNT(*)
FROM rentings_extended
GROUP BY GROUPING SETS ((country, genre), (country), (genre), ());
GROUP BY GROUPING SETS renvoie une requête UNION sur plusieurs requêtes GROUP BY.SELECT country,
genre,
COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS (country, genre);
GROUPING SETS : (country, genre)SELECT country, genre, COUNT(*) FROM renting_extended GROUP BY country, genre;| country | genre | count | |---------|--------|--------| | Austria | Comedy | 2 | | Belgium | Drama | 15 | | Austria | Drama | 4 | | Belgium | Comedy | 1 |
SELECT country, COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS (country);
GROUPING SETS : (country)SELECT country, COUNT(*)
FROM renting_extended
GROUP BY country;
| country | count |
|---------|-------|
| Austria | 16 |
| Belgium | 6 |
SELECT genre, COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS (genre);
GROUPING SETS : (genre)SELECT genre, COUNT(*)
FROM renting_extended
GROUP BY genre;
| country | count |
|---------|-------|
| Comedy | 3 |
| Drama | 19 |
SELECT COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS ();
GROUPING SETS : ()SELECT COUNT(*)
FROM renting_extended;
| count |
|-------|
| 22 |
GROUP BY GROUPING SETS (...)SELECT country, genre, COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS ((country, genre), (country), (genre), ());
UNION sur les 4 requêtes précédentes.GROUP BY CUBE (country, genre).SELECT country, genre, COUNT(*)
FROM renting_extended
GROUP BY GROUPING SETS ((country, genre), (country), (genre), ());
| country | genre |count|
|---------|--------|-----|
| NULL | NULL | 22 |
| Austria | Comedy | 2 |
| Belgium | Drama | 15 |
| Austria | Drama | 4 |
| Belgium | Comedy | 1 |
| Belgium | NULL | 16 |
| Austria | NULL | 6 |
| NULL | Comedy | 3 |
| NULL | Drama | 19 |
SELECT country,
genre,
COUNT(*),
AVG(rating) AS avg_rating
FROM renting_extended
GROUP BY GROUPING SETS ((country, genre), (genre));
SELECT country, genre, COUNT(*), AVG(rating) AS avg_rating
FROM renting_extended
GROUP BY GROUPING SETS ((country, genre), (genre));
| country | genre | count| avg_rating |
|---------|--------|------|------------|
| Austria | Comedy | 2 | 8.00 |
| Belgium | Drama | 15 | 9.17 |
| Austria | Drama | 4 | 6.00 |
| Belgium | Comedy | 1 | NULL |
| NULL | Comedy | 3 | 8.00 |
| NULL | Drama | 19 | 8.38 |
Prise de décision basée sur des données en SQL