Common Table Expressions

Data Manipulation in Snowflake

Jake Roach

Field Data Engineer

Common Table Expressions

Common Table Expressions (CTE's) temporarily store the results of a query to eventually be used by another query

$$

  • CTE's are defined at the beginning of a query
  • WITH <cte-name> AS ( <query> )
  • Similar to subqueries
-- Pass the name of the CTE, followed
-- by the query in parenthesis
WITH <cte-name> AS (
    <query>
)


SELECT ... FROM <cte-name> ... ;
Data Manipulation in Snowflake

Reporting on at-risk students

WITH at_risk AS (
    SELECT
        student_id
        course_name,
        teacher_name,
        grade
    FROM student_courses
    WHERE grade < 70 AND is_required
)

SELECT students.student_name, at_risk.* FROM at_risk JOIN students ON at_risk.student_id = students.id;

$$

  • Temporary results stored in at_risk
  • Select a subset of records
  • Query at_risk to generate a report

$$

$$

$$

Data Manipulation in Snowflake

Reporting on at-risk students

Showing the results of a CTE with the subset of results as well as the final report

Data Manipulation in Snowflake

Comparing Subqueries and CTE's

SELECT
    month_num,
    AVG(differential) AS avg_differential
    MIN(differential) AS most_differential
FROM (

    SELECT
        month_num,
        windchill - temperature AS differential
    FROM weather
    WHERE 
        season = 'Winter' AND
        temperature < 32

)
GROUP BY month_num;
WITH daily_temperature_differential AS (
    SELECT
        month_num,
        windchill - temperature AS differential
    FROM weather
    WHERE 
        season = 'Winter' AND 
        temperature < 32
)

SELECT
    month_num,
    AVG(differential) AS avg_differential
    MIN(differential) AS most_differential
FROM daily_temperature_differential
GROUP BY month_num;
Data Manipulation in Snowflake

Finding temperature differential

WITH daily_temperature_differential AS (
    SELECT
        month_num,
        windchill - temperature AS differential
    FROM weather
    WHERE 
        season = 'Winter' AND 
        temperature < 32
)

SELECT month_num, AVG(differential) AS avg_differential MIN(differential) AS most_differential FROM daily_temperature_differential GROUP BY month_num;
          | month_num | differential |
          | --------- | ------------ |
          |     12    |      -12     |
          |     1     |      -3      |
          |     1     |       0      |
          |     2     |      -7      |

The progression is more natural than subqueries.

| month_num | avg_differential | most_differential |
| --------- | ---------------- | ----------------- |
|     12    |       -5.77      |        -14        |
|     1     |       -1.91      |        -8         |
|     2     |       -8.13      |        -22        |
Data Manipulation in Snowflake

Let's practice!

Data Manipulation in Snowflake

Preparing Video For Download...