Data-Driven Decision Making in SQL
Tim Verdonck
Professor Statistics and Data Science
renting
records of movie rentals with ratingscustomers
information about country of the customermovies
year of release of the movieSELECT *
FROM renting AS r
LEFT JOIN customers AS c
ON c.customer_id = r.customer_id
LEFT JOIN movies AS m
ON m.movie_id = r.movie_id;
SELECT *
FROM renting AS r
LEFT JOIN customers AS c
ON c.customer_id = r.customer_id
LEFT JOIN movies AS m
ON m.movie_id = r.movie_id
WHERE r.movie_id IN (
SELECT movie_id
FROM renting
GROUP BY movie_id
HAVING COUNT(rating) >= 4)
AND r.date_renting >= '2018-04-01';
Type of aggregation:
Levels of aggregation:
SELECT c.country,
m.year_of_release,
COUNT(*) AS n_rentals,
COUNT(DISTINCT r.movie_id) AS n_movies,
AVG(rating) AS avg_rating
FROM renting AS r
LEFT JOIN customers AS c
ON c.customer_id = r.customer_id
LEFT JOIN movies AS m
ON m.movie_id = r.movie_id
WHERE r.movie_id IN (
SELECT movie_id
FROM renting
GROUP BY movie_id
HAVING COUNT(rating) >= 4)
AND r.date_renting >= '2018-04-01'
GROUP BY ROLLUP (m.year_of_release, c.country)
ORDER BY c.country, m.year_of_release;
| year_of_release | country | n_rentals | n_movies | avg_rating |
|-----------------|---------|-----------|----------|--------------------|
| 2009 | null | 10 | 1 | 8.7500000000000000 |
| 2010 | null | 41 | 5 | 7.9629629629629630 |
| 2011 | null | 14 | 2 | 8.2222222222222222 |
| 2012 | null | 28 | 5 | 8.1111111111111111 |
| 2013 | null | 10 | 2 | 7.6000000000000000 |
| 2014 | null | 5 | 1 | 8.0000000000000000 |
| null | null | 333 | 50 | 7.9024390243902439 |
Data-Driven Decision Making in SQL