Introduction to Redshift
Jason Myers
Principal Engineer
Avoid SELECT *
Use DISTKEY and SORTKEY in the following clauses whenever possible
JOIN
WHERE
GROUP BY
Use SORTKEYs in order in ORDER BY
Good
sortkey_1, sortkey_2, sortkey_3
Bad
sortkey_1, sortkey_3
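A minimal sketch of how these tips fit together, assuming a hypothetical receipts table. The column types and the choice of keys below are illustrative assumptions, not taken from the slides.

```sql
-- Hypothetical table definition: column types and key choices are assumptions
CREATE TABLE receipts (
    receipt_id  BIGINT,
    cookie_id   INTEGER,
    order_time  TIMESTAMP,
    total       DECIMAL(10, 2)
)
DISTKEY (cookie_id)                  -- rows with the same cookie_id land on the same slice
SORTKEY (order_time, cookie_id);     -- compound sort key, order_time first

-- Name only the needed columns instead of SELECT *,
-- filter on the leading sort key, and group on the distribution key
SELECT cookie_id,
       SUM(total) AS total_sales
FROM receipts
WHERE order_time > '2023-11-13'
GROUP BY cookie_id;
```

Filtering on the leading sort key column lets Redshift skip blocks outside the date range, and grouping on the distribution key keeps each group's rows on one slice.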
SELECT receipts.cookie_id,
sum(receipts.total)
FROM receipts
JOIN cookies ON receipts.cookie_id = cookies.cookie_id
-- Keep cookies predicates in the join to push down to nodes holding the records for cookies
AND cookies.available_on < '2023-11-14'
AND cookies.end_of_sale IS NULL
-- Predicates that are not part of the join or on the joined table stay in the WHERE clause
WHERE receipts.order_time > '2023-11-13'
GROUP BY 1 ORDER BY 1;
When using GROUP BY and ORDER BY together, list the columns in the same order in both
Bad
GROUP BY col_one, col_two, col_three
ORDER BY col_two, col_three, col_one
Good
GROUP BY col_two, col_three, col_one
ORDER BY col_two, col_three, col_one
Use EXISTS in your predicates when just checking whether a subquery returns any rows
SELECT column_name
FROM table_name
WHERE EXISTS
    (SELECT column_name
     FROM table_name
     WHERE active IS TRUE);
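The query above uses an uncorrelated subquery. In practice, EXISTS is often correlated with the outer row. A minimal sketch, assuming the cookies and receipts tables and columns from the earlier join example:

```sql
-- Cookies with at least one receipt after the given date.
-- Only the existence of a matching row matters, so the subquery selects a constant.
SELECT cookies.cookie_id
FROM cookies
WHERE EXISTS (
    SELECT 1
    FROM receipts
    WHERE receipts.cookie_id = cookies.cookie_id
      AND receipts.order_time > '2023-11-13'
);
```

Because only existence is tested, the subquery does not need to return or count every matching receipt.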