Common Table Expressions

Data Manipulation in SQL

Mona Khalil

Data Scientist, Greenhouse Software

When adding subqueries...

  • Query complexity increases quickly!
    • Information can be difficult to keep track of

Solution: Common Table Expressions!

Data Manipulation in SQL

Common Table Expressions

Common Table Expressions (CTEs)

  • Table declared before the main query
  • Named and referenced later in FROM statement

Setting up CTEs

WITH cte AS (
    SELECT col1, col2
    FROM table)

SELECT AVG(col1) AS avg_col FROM cte;
Data Manipulation in SQL

Take a subquery in FROM

SELECT
  c.name AS country,
  COUNT(s.id) AS matches
FROM country AS c
INNER JOIN (
  SELECT country_id, id 
  FROM match
  WHERE (home_goal + away_goal) >= 10) AS s
ON c.id = s.country_id
GROUP BY country;
| country     | matches |
|-------------|---------|
| England     | 3       |
| Germany     | 1       |
| Netherlands | 1       |
| Spain       | 4       |
Data Manipulation in SQL

Place it at the beginning

(
  SELECT country_id, id 
  FROM match
  WHERE (home_goal + away_goal) >= 10
)
Data Manipulation in SQL

Place it at the beginning

WITH s AS (
  SELECT country_id, id 
  FROM match
  WHERE (home_goal + away_goal) >= 10
)
Data Manipulation in SQL

Show me the CTE

WITH s AS (
  SELECT country_id, id 
  FROM match
  WHERE (home_goal + away_goal) >= 10
)
SELECT
  c.name AS country,
  COUNT(s.id) AS matches
FROM country AS c
INNER JOIN s
ON c.id = s.country_id
GROUP BY country;
| country     | matches |
|-------------|---------|
| England     | 3       |
| Germany     | 1       |
| Netherlands | 1       |
| Spain       | 4       |
Data Manipulation in SQL

Show me all the CTEs

WITH s1 AS (
  SELECT country_id, id 
  FROM match
  WHERE (home_goal + away_goal) >= 10),
s2 AS (                              -- New subquery
  SELECT country_id, id 
  FROM match
  WHERE (home_goal + away_goal) <= 1
)
SELECT
  c.name AS country,
  COUNT(s1.id) AS high_scores,
  COUNT(s2.id) AS low_scores         -- New column
FROM country AS c
INNER JOIN s1
ON c.id = s1.country_id
INNER JOIN s2                        -- New join
ON c.id = s2.country_id
GROUP BY country;
Data Manipulation in SQL

Why use CTEs?

  • Executed once

    • CTE is then stored in memory
    • Improves query performance
  • Improving organization of queries

  • Referencing other CTEs

  • Referencing itself (SELF JOIN)
Data Manipulation in SQL

Let's practice!

Data Manipulation in SQL

Preparing Video For Download...