Writing queries in BigQuery

Introduction to BigQuery

Matt Forrest

Field CTO

Writing simple queries

A simple query in BigQuery

-- Note the table name structure

SELECT
    * 
FROM 
    `project.ecommerce.order_items`
Introduction to BigQuery

Running queries in BigQuery

We can run queries in BigQuery via:

  • BigQuery Studio
  • Client libraries (e.g., Python)
  • Google Cloud command line tool
  • Pandas

BigQuery Console

Introduction to BigQuery

Using correct table names

-- Using the full table name structure


SELECT
    * 
FROM 
    `project.ecommerce.ecomm_order_details`
/* Using the shorthand 
table name structure */

SELECT
    * 
FROM 
    ecommerce.ecomm_order_details
Introduction to BigQuery

GoogleSQL

Data type reference for GoogleSQL

Introduction to BigQuery

Our datasets: Olist E-Commerce

  • Orders: Order number and order item information
  • Order details: Customer id, order and shipping dates
  • Payments: Payment type, split payments, amounts
  • Products: Product category, description, dimensions

Person with packages

1 https://www.kaggle.com/datasets/olistbr/brazilian-ecommerce
Introduction to BigQuery

Products

  • product_id - unique product ID
  • product_photos_qty - number of product photos
  • product_weight_g - weight of the product
  • product_category_name_english - product category name

Person with packages

Introduction to BigQuery

Orders

  • order_id - Order unique ID
  • order_items - STRUCT containing information about the order items
    • order_item_id - Item number in the order
    • product_id - Unique product ID
    • seller_id - Unique seller ID
    • price - Price of the order item

Person with packages

Introduction to BigQuery

Order details

  • order_id - unique order ID
  • customer_id - unique customer ID
  • order_status - current order status
  • order_purchase_timestamp - Timestamp when order was purchased
  • order_approved_at - Timestamp when order was approved
  • order_delivered_carrier_date - Timestamp when order was accepted by the carrier
  • order_delivered_customer_date - Timestamp when order delivered
  • order_estimated_delivery_date - Timestamp of estimated delivery date
Introduction to BigQuery

Payments

  • order_id - unique order ID
  • payment_type - type of payment
  • payment_sequential - payment number
  • payment_installments - number of payment installments
  • payment_value - value of that payment

Person with packages

Introduction to BigQuery

Review of aggregations and joins

-- Count of orders per customers

SELECT
    d.customer_id,
    COUNT(o.order_id)
FROM 
    ecommerce.ecomm_order_details d
JOIN 
    ecommerce.ecomm_orders o
USING (order_id)
GROUP BY
    d.customer_id

Five key components:

  1. Aggregate function
  2. The left dataset
  3. The right dataset
  4. Join condition
  5. Grouping condition
Introduction to BigQuery

Let's practice!

Introduction to BigQuery

Preparing Video For Download...