Identifying favorite actors per customer group

Data-Driven Decision Making in SQL

Irene Ortner

Data Scientist at Applied Statistics

Combining SQL statements in one query

  • LEFT JOIN
  • WHERE
  • GROUP BY
  • HAVING
  • ORDER BY

From rental records to customer and actor information

Our question: who is the favorite actor for a given customer group?

Join the table renting with


  • customers
  • actsin
  • actors
SELECT *
FROM renting AS r
LEFT JOIN customers AS c
ON r.customer_id = c.customer_id
LEFT JOIN actsin AS ai
ON r.movie_id = ai.movie_id
LEFT JOIN actors AS a
ON ai.actor_id = a.actor_id;
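
The join chain above turns each rental into one row per actor in the rented movie. The following is a minimal sketch of that behavior using Python's built-in sqlite3 module. The table and column names come from the query above; the rows, the `renting_id` column, and the actor names are invented for illustration and are not the course data.

```python
import sqlite3

# Toy schema and rows invented for illustration; only the table and
# column names used in the join come from the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER, gender TEXT);
CREATE TABLE actors    (actor_id INTEGER, name TEXT);
CREATE TABLE actsin    (movie_id INTEGER, actor_id INTEGER);
CREATE TABLE renting   (renting_id INTEGER, customer_id INTEGER,
                        movie_id INTEGER, rating REAL);
INSERT INTO customers VALUES (1, 'male'), (2, 'female');
INSERT INTO actors    VALUES (10, 'Actor A'), (11, 'Actor B');
-- Movie 100 has two actors, movie 101 has one, movie 102 has none listed.
INSERT INTO actsin    VALUES (100, 10), (100, 11), (101, 10);
INSERT INTO renting   VALUES (1, 1, 100, 8), (2, 2, 101, NULL), (3, 1, 102, 6);
""")

rows = conn.execute("""
SELECT r.renting_id, c.gender, a.name
FROM renting AS r
LEFT JOIN customers AS c ON r.customer_id = c.customer_id
LEFT JOIN actsin AS ai ON r.movie_id = ai.movie_id
LEFT JOIN actors AS a ON ai.actor_id = a.actor_id
ORDER BY r.renting_id, a.name
""").fetchall()

# Rental 1 appears twice (one row per actor); rental 3 is kept by the
# LEFT JOIN even though no actor matches, with a NULL actor name.
for row in rows:
    print(row)
```

In this toy data, three rentals become four joined rows, which is why the counts in the later queries are views per actor rather than rentals.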

Male customers

  • Actors who appear most often in movies watched by male customers.
SELECT a.name, 
       COUNT(*)
FROM renting AS r
LEFT JOIN customers AS c
ON r.customer_id = c.customer_id
LEFT JOIN actsin AS ai
ON r.movie_id = ai.movie_id
LEFT JOIN actors AS a
ON ai.actor_id = a.actor_id

WHERE c.gender = 'male'
GROUP BY a.name;

Who is the favorite actor?

  • The actor watched most often.
  • The actor with the best average rating when watched.
SELECT a.name, 
       COUNT(*) AS number_views, 
       AVG(r.rating) AS avg_rating
FROM renting AS r
LEFT JOIN customers AS c
ON r.customer_id = c.customer_id
LEFT JOIN actsin AS ai
ON r.movie_id = ai.movie_id
LEFT JOIN actors AS a
ON ai.actor_id = a.actor_id

WHERE c.gender = 'male'
GROUP BY a.name;
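
The two measures in the query above treat missing ratings differently: COUNT(*) counts every joined row, while AVG skips NULL ratings and returns NULL when an actor has no rated views at all. The sketch below illustrates this with sqlite3 on invented rows; the extra `COUNT(r.rating) AS rated_views` column is an addition for illustration and is not part of the slide's query.

```python
import sqlite3

# Toy rows invented for illustration; table and column names follow the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER, gender TEXT);
CREATE TABLE actors    (actor_id INTEGER, name TEXT);
CREATE TABLE actsin    (movie_id INTEGER, actor_id INTEGER);
CREATE TABLE renting   (renting_id INTEGER, customer_id INTEGER,
                        movie_id INTEGER, rating REAL);
INSERT INTO customers VALUES (1, 'male'), (2, 'female');
INSERT INTO actors    VALUES (10, 'Actor A'), (11, 'Actor B');
INSERT INTO actsin    VALUES (100, 10), (101, 11);
-- Rentals 2 and 3 have no rating; rental 4 belongs to a female customer.
INSERT INTO renting   VALUES (1, 1, 100, 8), (2, 1, 100, NULL),
                             (3, 1, 101, NULL), (4, 2, 100, 10);
""")

result = conn.execute("""
SELECT a.name,
       COUNT(*) AS number_views,        -- all views, rated or not
       COUNT(r.rating) AS rated_views,  -- illustrative addition: rated views only
       AVG(r.rating) AS avg_rating      -- NULL ratings are ignored
FROM renting AS r
LEFT JOIN customers AS c ON r.customer_id = c.customer_id
LEFT JOIN actsin AS ai ON r.movie_id = ai.movie_id
LEFT JOIN actors AS a ON ai.actor_id = a.actor_id
WHERE c.gender = 'male'
GROUP BY a.name
ORDER BY a.name
""").fetchall()

for row in result:
    print(row)
```

Here Actor B has one view but no rating, so the average comes back as NULL (None in Python). Groups like this are the ones a HAVING filter on the average would remove.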

Add HAVING and ORDER BY

SELECT a.name, 
       COUNT(*) AS number_views, 
       AVG(r.rating) AS avg_rating
FROM renting AS r
LEFT JOIN customers AS c
ON r.customer_id = c.customer_id
LEFT JOIN actsin AS ai
ON r.movie_id = ai.movie_id
LEFT JOIN actors AS a
ON ai.actor_id = a.actor_id

WHERE c.gender = 'male'
GROUP BY a.name
HAVING AVG(r.rating) IS NOT NULL
ORDER BY avg_rating DESC, number_views DESC;
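
To see what the two added clauses do, the full query can be run on a small invented dataset. This is a sketch using sqlite3, and the rows and actor names are hypothetical. HAVING drops the actor whose views are all unrated, and ORDER BY ranks by average rating first, using the number of views to break ties.

```python
import sqlite3

# Toy rows invented for illustration; table and column names follow the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER, gender TEXT);
CREATE TABLE actors    (actor_id INTEGER, name TEXT);
CREATE TABLE actsin    (movie_id INTEGER, actor_id INTEGER);
CREATE TABLE renting   (renting_id INTEGER, customer_id INTEGER,
                        movie_id INTEGER, rating REAL);
INSERT INTO customers VALUES (1, 'male');
INSERT INTO actors    VALUES (10, 'Actor A'), (11, 'Actor B'),
                             (12, 'Actor C'), (13, 'Actor D');
INSERT INTO actsin    VALUES (100, 10), (101, 11), (102, 12), (103, 13);
INSERT INTO renting   VALUES (1, 1, 100, 10), (2, 1, 100, 10),
                             (3, 1, 101, 10),
                             (4, 1, 102, 9),  (5, 1, 102, NULL),
                             (6, 1, 103, NULL);
""")

ranking = conn.execute("""
SELECT a.name,
       COUNT(*) AS number_views,
       AVG(r.rating) AS avg_rating
FROM renting AS r
LEFT JOIN customers AS c ON r.customer_id = c.customer_id
LEFT JOIN actsin AS ai ON r.movie_id = ai.movie_id
LEFT JOIN actors AS a ON ai.actor_id = a.actor_id
WHERE c.gender = 'male'
GROUP BY a.name
HAVING AVG(r.rating) IS NOT NULL              -- drops Actor D (no rated views)
ORDER BY avg_rating DESC, number_views DESC   -- ties on rating: more views first
""").fetchall()

for row in ranking:
    print(row)
```

Actor A and Actor B share the top average in this toy data, and Actor A is listed first because of the higher view count, the same tie-breaking pattern the ORDER BY clause applies to the real result.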

Add HAVING and ORDER BY

| name               | number_views | avg_rating |
|--------------------|--------------|------------|
| Ray Romano         | 3            | 10.00      |
| Sean Bean          | 2            | 10.00      |
| Leonardo DiCaprio  | 3            | 9.33       |
| Christoph Waltz    | 3            | 9.33       |

Let's practice!
