Identify favorite actors of customer groups

Data-Driven Decision Making in SQL

Irene Ortner

Data Scientist at Applied Statistics

Combining SQL statements in one query

  • LEFT JOIN
  • WHERE
  • GROUP BY
  • HAVING
  • ORDER BY
Data-Driven Decision Making in SQL

From renting records to customer and actor information

Our question: Who is the favorite actor for a certain customer group?

Join table renting with tables

  • customers
  • actsin
  • actors
SELECT *
FROM renting as r
LEFT JOIN customers AS c
ON r.customer_id = c.customer_id
LEFT JOIN actsin as ai
ON r.movie_id = ai.movie_id
LEFT JOIN actors as a
ON ai.actor_id = a.actor_id;
Data-Driven Decision Making in SQL

Male customers

  • Actors which play most often in movies watched by male customers.
SELECT a.name, 
       COUNT(*)
FROM renting as r
LEFT JOIN customers AS c
ON r.customer_id = c.customer_id
LEFT JOIN actsin as ai
ON r.movie_id = ai.movie_id
LEFT JOIN actors as a
ON ai.actor_id = a.actor_id

WHERE c.gender = 'male'
GROUP BY a.name;
Data-Driven Decision Making in SQL

Who is the favorite actor?

  • Actor being watched most often.
  • Best average rating when being watched.
SELECT a.name, 
       COUNT(*) AS number_views, 
       AVG(r.rating) AS avg_rating
FROM renting as r
LEFT JOIN customers AS c
ON r.customer_id = c.customer_id
LEFT JOIN actsin as ai
ON r.movie_id = ai.movie_id
LEFT JOIN actors as a
ON ai.actor_id = a.actor_id

WHERE c.gender = 'male'
GROUP BY a.name;
Data-Driven Decision Making in SQL

Add HAVING and ORDER BY

SELECT a.name, 
       COUNT(*) AS number_views, 
       AVG(r.rating) AS avg_rating
FROM renting as r
LEFT JOIN customers AS c
ON r.customer_id = c.customer_id
LEFT JOIN actsin as ai
ON r.movie_id = ai.movie_id
LEFT JOIN actors as a
ON ai.actor_id = a.actor_id

WHERE c.gender = 'male'
GROUP BY a.name
HAVING AVG(r.rating) IS NOT NULL
ORDER BY avg_rating DESC, number_views DESC;
Data-Driven Decision Making in SQL

Add HAVING and ORDER BY

| name               | number_views | avg_rating |
|--------------------|--------------|------------|
| Ray Romano         | 3            | 10.00      |
| Sean Bean          | 2            | 10.00      |
| Leonardo DiCaprio  | 3            | 9.33       |
| Christoph Waltz    | 3            | 9.33       |
Data-Driven Decision Making in SQL

Let's practice!

Data-Driven Decision Making in SQL

Preparing Video For Download...