Using CTEs with Redshift

Introduction to Redshift

Jason Myers

Principal Architect

Common table expressions (CTEs)

  • Temporary result set
  • Simplify queries
  • Alternative to subqueries
Introduction to Redshift

Subquery and CTE structures

SELECT division_name,
       revenue_total
       -- Subquery for top 10 divions by revenue
  FROM (SELECT division_id,
               SUM(revenue) AS revenue_total
          FROM orders
         GROUP BY division_id
         ORDER BY revenue_total DESC
         LIMIT 10) top_ten_divisions_by_rev
 WHERE revenue_total > 100000; 
-- CTE for top 10 divions by revenue
WITH top_ten_divisions_by_rev AS (
  SELECT division_id,
         SUM(revenue) AS revenue_total
    FROM orders
   GROUP BY division_id
   ORDER BY revenue_total DESC
   LIMIT 10)

-- Main query SELECT division_id, revenue_total -- FROM our CTE FROM top_ten_divisions_by_rev WHERE revenue_total > 100000;
Introduction to Redshift

Multiple CTEs

-- Top 10 divisions by revenue CTE
WITH top_ten_divisions_by_rev AS(
  SELECT division_id,
         SUM(revenue) AS revenue_total
    FROM orders
   GROUP BY division_id
   ORDER BY revenue_total DESC
   LIMIT 10),

-- Division ID and Name CTE division_names AS( SELECT id AS division_id, name AS division_name FROM divisions)
-- Main query
SELECT division_name,
       revenue_total
       -- FROM Top 10 CTE
  FROM top_ten_divisions_by_rev
  -- Joining so we can use the division name
  JOIN division_names USING (DIVISION_ID)
 WHERE revenue_total > 100000;
Introduction to Redshift

Multiple CTEs, CTE joins

-- Selecting the division names
WITH division_names AS(
  SELECT id AS division_id,
         name AS division_name
    FROM divisions),
-- Selecting the top ten divisions by revenue
top_ten_divisions_by_rev AS(
  SELECT division_name,
         SUM(revenue) AS revenue_total
    FROM orders
    -- Joining in the division_names CTE
    JOIN divisions USING (division_id)
   GROUP BY division_name
   ORDER BY revenue_total DESC
   LIMIT 10);

Kernighan's Law

Everyone knows that debugging is twice as hard as writing a program in the first place. So if you are as clever as you can be when you write it, how will you ever debug it?

Introduction to Redshift

CTE performance

  • Same as subqueries
  • Better than subqueries if reused

SQL Code vs Execution order

Introduction to Redshift

Let's practice!

Introduction to Redshift

Preparing Video For Download...