Introduction to BigQuery
Matthew Forrest
Field CTO
SELECT customer_id, order_date, order_total, ROW_NUMBER() OVER(
PARTITION BY customer_id
ORDER BY order_date
) AS order_sequence FROM orders;
ROW_NUMBER()
: The window function that returns the row numberOVER()
: Defines the window framePARTITION BY customer_id
: Groups the data by customerORDER BY order_date
: Orders data within each partitionorder_sequence
: Result of the window function operationSELECT
product_id,
product_photos_qty,
-- Ordinal rank for each row
RANK() OVER(
ORDER BY product_photos_qty DESC
) as rank,
-- Percentile rank for each row
PERCENT_RANK() OVER(
ORDER BY product_photos_qty
) as percent
FROM ecommerce.ecomm_products
ORDER BY product_photos_qty DESC;
SELECT
product_id,
-- Returns value from previous row
LAG(product_photos_qty) OVER(
ORDER BY product_photos_qty
) as lag,
product_photos_qty,
-- Returns value from next row
LEAD(product_photos_qty) OVER(
ORDER BY product_photos_qty
) as lead
FROM ecommerce.ecomm_products
ORDER BY product_photos_qty DESC;
SELECT
order_id,
order_timestamp,
SUM(cost) OVER(
ORDER BY order_timestamp
ROWS BETWEEN 2 PRECEDING
AND CURRENT ROW) as rolling_avg
FROM sales_data
ORDER BY order_timestamp
Row-based bounding options:
UNBOUNDED PRECEDING
: All rows beforeUNBOUNDED FOLLOWING
: All rows after[INT] ROWS PRECEDING
: Specific number of rows before[INT] ROWS FOLLOWING
: Specific number of rows afterSELECT
product_id,
product_photos_qty,
RANK() OVER(
ORDER BY product_photos_qty DESC
) as rank
FROM ecommerce.ecomm_products
-- Filter using QUALIFY
QUALIFY rank < 4
ORDER BY product_photos_qty DESC;
HAVING
as we are not aggregatingIntroduction to BigQuery