Nested subqueries

Data Manipulation in SQL

Mona Khalil

Data Scientist, Greenhouse Software

Nested subqueries?

  • Subquery inside another subquery
  • Perform multiple layers of transformation
Data Manipulation in SQL

A subquery...

  • How much did each country's average differ from the overall average?
SELECT
    c.name AS country,
    AVG(m.home_goal + m.away_goal) AS avg_goals,
    AVG(m.home_goal + m.away_goal) - 
        (SELECT AVG(home_goal + away_goal) 
         FROM match) AS avg_diff
FROM country AS c
LEFT JOIN match AS m
ON c.id = m.country_id
GROUP BY country;
Data Manipulation in SQL

A subquery...

| country     | avg_goals | avg_diff |
|-------------|-----------|----------|
| Belgium     | 2.8015    | 0.096    |
| England     | 2.7105    | 0.005    |
| France      | 2.4431    | -0.2624  |
| Germany     | 2.9016    | 0.196    |
| Italy       | 2.6168    | -0.0887  |
| Netherlands | 3.0809    | 0.3754   |
| Poland      | 2.425     | -0.2805  |
| Portugal    | 2.5346    | -0.1709  |
| Scotland    | 2.6338    | -0.0718  |
| Spain       | 2.7671    | 0.0616   |
| Switzerland | 2.9297    | 0.2241   |
Data Manipulation in SQL

...inside a subquery!

  • How does each month's total goals differ from the average monthly total of goals scored?
SELECT
  EXTRACT(MONTH FROM date) AS month,
  SUM(m.home_goal + m.away_goal) AS total_goals,
  SUM(m.home_goal + m.away_goal) - 
  (SELECT AVG(goals) 
   FROM (SELECT
           EXTRACT(MONTH FROM date) AS month,
           SUM(home_goal + away_goal) AS goals
         FROM match
         GROUP BY month)) AS avg_diff
FROM match AS m
GROUP BY month;
Data Manipulation in SQL

Inner subquery

SELECT
  EXTRACT(MONTH from date) AS month,
  SUM(home_goal + away_goal) AS goals
FROM match
GROUP BY month;
| month | goals |
|-------|-------|
| 01    | 2988  |
| 02    | 3768  |
| 03    | 3936  |
| 04    | 4055  |
| 05    | 2719  |
| 06    | 84    |
| 07    | 366   |
Data Manipulation in SQL

Outer subquery

SELECT AVG(goals) 
FROM (SELECT
        EXTRACT(MONTH from date) AS month,
        AVG(home_goal + away_goal) AS goals
FROM match
GROUP BY month) AS s;
2944.75
Data Manipulation in SQL

Final query

SELECT
  EXTRACT(MONTH FROM date) AS month,
  SUM(m.home_goal + m.away_goal) AS total_goals,
  SUM(m.home_goal + m.away_goal) - 
  (SELECT AVG(goals) 
   FROM (SELECT
           EXTRACT(MONTH FROM date) AS month,
           SUM(home_goal + away_goal) AS goals
         FROM match
         GROUP BY month) AS s) AS diff
FROM match AS m
GROUP BY month;
| month | goals | diff     |
|-------|-------|----------|
| 01    | 5821  | -36.25   |
| 02    | 7448  | 1590.75  |
| 03    | 7298  | 1440.75  |
| 04    | 8145  | 2287.75  |
Data Manipulation in SQL

Correlated nested subqueries

  • Nested subqueries can be correlated or uncorrelated
    • Or...a combination of the two
    • Can reference information from the outer subquery or main query
Data Manipulation in SQL

Correlated nested subqueries

  • What is the each country's average goals scored in the 2011/2012 season?
SELECT
  c.name AS country,
  (SELECT AVG(home_goal + away_goal)
   FROM match AS m
   WHERE m.country_id = c.id
         AND id IN (
             SELECT id
             FROM match
             WHERE season = '2011/2012')) AS avg_goals 
FROM country AS c
GROUP BY country;
Data Manipulation in SQL

Correlated nested subqueries

  • What is the each country's average goals scored in the 2011/2012 season?
SELECT
  c.name AS country,
  (SELECT AVG(home_goal + away_goal)
   FROM match AS m
   WHERE m.country_id = c.id
         AND id IN (
             SELECT id -- Begin inner subquery
             FROM match
             WHERE season = '2011/2012')) AS avg_goals 
FROM country AS c
GROUP BY country;
Data Manipulation in SQL

Correlated nested subquery

  • What is the each country's average goals scored in the 2011/2012 season?
SELECT
  c.name AS country,
  (SELECT AVG(home_goal + away_goal)
   FROM match AS m
   WHERE m.country_id = c.id -- Correlates with main query
         AND id IN (
             SELECT id -- Begin inner subquery
             FROM match
             WHERE season = '2011/2012')) AS avg_goals 
FROM country AS c
GROUP BY country;
Data Manipulation in SQL

Correlated nested subqueries

| country     | avg_goals        |
|-------------|------------------|
| Belgium     | 2.87916666666667 |
| England     | 2.80526315789474 |
| France      | 2.51578947368421 |
| Germany     | 2.85947712418301 |
| Italy       | 2.58379888268156 |
| Netherlands | 3.25816993464052 |
| Poland      | 2.19583333333333 |
| Portugal    | 2.64166666666667 |
| Scotland    | 2.6359649122807  |
| Spain       | 2.76315789473684 |
| Switzerland | 2.62345679012346 |
Data Manipulation in SQL

Let's practice!

Data Manipulation in SQL

Preparing Video For Download...