WHERE are the subqueries?

Data Manipulation in SQL

Mona Khalil

Data Scientist, Greenhouse Software

What is a subquery?

  • A query nested inside another query
SELECT column
FROM (SELECT column
      FROM table) AS subquery;
  • Useful for intermediary transformations
Data Manipulation in SQL

What do you do with subqueries?

  • Can be in any part of a query

    • SELECT, FROM, WHERE, GROUP BY
  • Can return a variety of information

    • Scalar quantities (3.14159, -2, 0.001)
    • A list (id = (12, 25, 392, 401, 939))
    • A table
Data Manipulation in SQL

Why subqueries?

  • Comparing groups to summarized values
    • How did Liverpool compare to the English Premier League's average performance for that year?
  • Reshaping data
    • What is the highest monthly average of goals scored in the Bundesliga?
  • Combining data that cannot be joined
    • How do you get both the home and away team names into a table of match results?
Data Manipulation in SQL

Simple subqueries

  • Can be evaluated independently from the outer query
SELECT home_goal
FROM match
WHERE home_goal > (
    SELECT AVG(home_goal) 
    FROM match);

SELECT AVG(home_goal) FROM match;
1.56091291478423
Data Manipulation in SQL

Simple subqueries

  • Is only processed once in the entire statement
SELECT home_goal
FROM match
WHERE home_goal > (
    SELECT AVG(home_goal) 
    FROM match);
Data Manipulation in SQL

Subqueries in the WHERE clause

  • Which matches in the 2012/2013 season scored home goals higher than overall average?
SELECT AVG(home_goal) FROM match;
1.56091291478423
SELECT date, hometeam_id, awayteam_id, home_goal, away_goal
FROM match
WHERE season = '2012/2013'
      AND home_goal > 1.56091291478423;
Data Manipulation in SQL

Subqueries in the WHERE clause

  • Which matches in the 2012/2013 season scored home goals higher than overall average?
SELECT date, hometeam_id, awayteam_id, home_goal, away_goal
FROM match
WHERE season = '2012/2013'
      AND home_goal > (SELECT AVG(home_goal)
                     FROM match);
| date       | hometeam_id | awayteam_id | home_goal | away_goal |
|------------|-------------|-------------|-----------|-----------|
| 2012-07-28 | 9998        | 1773        | 5         | 2         |
| 2012-07-29 | 9987        | 9984        | 3         | 3         |
| 2012-10-05 | 9993        | 9991        | 2         | 2         |
Data Manipulation in SQL

Subquery filtering list with IN

  • Which teams are part of Poland's league?
SELECT
  team_long_name,
  team_short_name AS abbr
FROM team
WHERE 
  team_api_id IN 
  (SELECT hometeam_id
   FROM match
   WHERE country_id = 15722);
| team_long_name | abbr |
|----------------|------|
| Ruch Chorzów   | CHO  |
| Jagiellonia    | BIA  |
| Lech Pozna?    | POZ  |
| P. Warszawa    | PWA  |
| Cracovia       | CKR  |
| Górnik ??czna  | LEC  |
| Polonia Bytom  | GOR  |
| Zag??bie Lubin | ZAG  |
| Pogo? Szczecin | POG  |
| Widzew ?ód?    | WID  |
| ?l?sk Wroc?aw  | SLA  |
Data Manipulation in SQL

Practice time!

Data Manipulation in SQL

Preparing Video For Download...