Joins in BigQuery

Introduction to BigQuery

Matthew Forrest

Field CTO

Joining data in BigQuery

  • INNER JOIN: Values exist in both tables.

  • LEFT JOIN: All rows in left table, matches right table.

  • RIGHT JOIN: All rows in right table, matches left table.

  • FULL JOIN: All rows from both tables, matches and non matches.

  • CROSS JOIN: Every row matched to every row from both tables.

Image showing SQL joins visually

Introduction to BigQuery

Real life examples of joins

Customers: left table

Orders: right table

  • INNER JOIN: Matching customers and their orders

  • LEFT JOIN: Showing all customers, even if they haven't placed any orders

  • RIGHT JOIN: Showing all orders, even if there is a missing customer ID

  • FULL JOIN: Showing all customers and all orders, even if some haven't interacted

  • CROSS JOIN: Match every order to every customer with no conditions

Introduction to BigQuery

INNER JOIN

  • Only returns matching results from both datasets
SELECT
  c.customer_id, s.product_name
FROM customers c
-- The INNER keyword is optional
JOIN sales_data s
ON c.customer_id = s.customer_id;
| customer_id | product_name         |
|-------------|----------------------|
| 1           | Bluetooth Headphones |
| 2           | Running Shoes        |
Introduction to BigQuery

LEFT JOIN

  • Returns all rows from the LEFT dataset
SELECT
  c.customer_id, s.product_name
FROM customers c
LEFT JOIN sales_data s
ON c.customer_id = s.customer_id;
| customer_id | product_name         |
|-------------|----------------------|
| 1           | Bluetooth Headphones |
| 2           | Running Shoes        |
| 3           | null                 |
Introduction to BigQuery

RIGHT JOIN

  • Returns all rows from the RIGHT dataset
SELECT
  c.customer_id, s.product_name
FROM customers c
RIGHT JOIN sales_data s
ON c.customer_id = s.customer_id;
| customer_id | product_name         |
|-------------|----------------------|
| 1           | Bluetooth Headphones |
| 2           | Running Shoes        |
| null        | External Microphone  |
Introduction to BigQuery

OUTER JOIN

  • A "RIGHT-LEFT" join: all rows from both RIGHT a and LEFT datasets
SELECT
  c.customer_id, s.product_name
FROM customers c
OUTER JOIN sales_data s
ON c.customer_id = s.customer_id;
| customer_id | product_name         |
|-------------|----------------------|
| 1           | Bluetooth Headphones |
| 2           | Running Shoes        |
| 3           | null                 |
| null        | External Microphone  |
Introduction to BigQuery

SELF or CROSS JOIN

  • A cartesian join - every row with every row
SELECT
  c.customer_id,
  s.product_name,

-- Adding table names separated 
-- by a comma is a CROSS JOIN
-- Order is determined by the
-- left table, here "customers"
FROM customers c, sales_data s;
| customer_id | product_name         |
|-------------|----------------------|
| 1           | Bluetooth Headphones |
| 1           | null                 |
| 2           | Bluetooth Headphones |
| 2           | null                 |
| 3           | null                 |
| 3           | Bluetooth Headphones |
Introduction to BigQuery

Joins and UNNEST

  • Also used to join unnested data
SELECT
  c.customer_id,
  payments.method
FROM customers c, 
UNNEST(
  customers.payment_methods
) payments;
| customer_id | product_name |
|-------------|--------------|
| 1           | Visa         |
| 1           | Mastercard   |
| 1           | Venmo        |
| 1           | Paypal       |
| 2           | Amex         |
| 2           | Visa         |
Introduction to BigQuery

Let's practice!

Introduction to BigQuery

Preparing Video For Download...