Defining relationships

Introduction to Databricks Genie

Gang Wang

Senior Data Scientist

Why relationships matter

[Image: three isolated database tables (Transactions, Customers, Franchises) with no connections between them]

When inference fails

Inference fails when:

  • Ambiguous names: creator_id, updater_id, owner_id - which connects to Users?
  • Naming discrepancies: client_no vs customer_id - Genie sees unrelated islands

Explicit relationships = a GPS map, not guessing
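To make the two failure cases concrete, here is a minimal sketch of name-based inference. It is not Genie's actual algorithm: the schema, the `<table>_id` key convention, and the `infer_joins` helper are all assumptions made for illustration.

```python
# Hypothetical schema: table name -> column names.
SCHEMA = {
    "users": ["user_id", "name"],
    "customers": ["customer_id", "region"],
    "documents": ["doc_id", "creator_id", "updater_id", "owner_id"],
    "transactions": ["txn_id", "client_no", "amount"],
}


def primary_key(table):
    """Assume each key is named '<singular table name>_id' (illustrative convention)."""
    return table.rstrip("s") + "_id"


def infer_joins(schema):
    """Link a column to a table only when it is named exactly like that table's key."""
    keys = {primary_key(table): table for table in schema}
    links = []
    for table, columns in schema.items():
        for col in columns:
            target = keys.get(col)
            if target and target != table:
                links.append((table, col, target))
    return links


# Relationships declared by hand: (table, column, referenced table).
EXPLICIT = [
    ("documents", "owner_id", "users"),         # picks one of three candidate columns
    ("transactions", "client_no", "customers"), # bridges the naming discrepancy
]

inferred = infer_joins(SCHEMA)
print(inferred)
```

Under these assumptions the matcher finds no links at all: none of `creator_id`, `updater_id`, or `owner_id` matches `user_id`, and `client_no` never matches `customer_id`. The `EXPLICIT` list supplies exactly the information that naming alone cannot.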

How Genie uses joins

Example question: "Show me transaction totals by customer region"

[Diagram: three-step flow: 1. Check relationships, 2. Identify join keys, 3. Generate SQL with JOINs]
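The three-step flow can be sketched in a few lines. This is an illustration only, not Genie's internals: the `RELATIONSHIPS` list, the helper names, and the table and column names are hypothetical, and SQLite stands in for a Databricks SQL warehouse.

```python
import sqlite3

# Hypothetical declared relationship:
# (child table, child column, parent table, parent column).
RELATIONSHIPS = [("transactions", "customer_id", "customers", "customer_id")]


def find_join(left, right):
    """Steps 1 and 2: check declared relationships and return the join keys."""
    for child, child_col, parent, parent_col in RELATIONSHIPS:
        if {child, parent} == {left, right}:
            return f"{child}.{child_col} = {parent}.{parent_col}"
    raise LookupError(f"no relationship declared between {left} and {right}")


def build_sql():
    """Step 3: generate SQL with the JOIN for 'transaction totals by customer region'."""
    on_clause = find_join("transactions", "customers")
    return (
        "SELECT customers.region, SUM(transactions.amount) AS total "
        "FROM transactions JOIN customers ON " + on_clause + " "
        "GROUP BY customers.region ORDER BY customers.region"
    )


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE transactions (txn_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'East'), (2, 'West'), (3, 'East');
INSERT INTO transactions VALUES (10, 1, 20.0), (11, 2, 5.0), (12, 3, 7.5), (13, 1, 2.5);
""")

totals = conn.execute(build_sql()).fetchall()
print(totals)  # [('East', 30.0), ('West', 5.0)]
```

If no relationship is declared, `find_join` raises instead of guessing, which mirrors the point of the slide: the join keys come from the declared relationship, not from the question text.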

[Animation: cross-table query in Genie returning the email addresses of the top 5 spenders, with the generated SQL JOIN]

Common relationship patterns

Which customers in Chicago haven't visited us in 10 days?
  • Requires multiple joins
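A rough sketch of the SQL such a question needs, run here against SQLite with made-up data. The table and column names are assumptions, "in Chicago" is read as "at a Chicago franchise", and a fixed `AS_OF` date replaces today's date so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE franchises (franchise_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE transactions (
    txn_id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    franchise_id INTEGER,
    txn_date TEXT  -- ISO dates, so string comparison matches date order
);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
INSERT INTO franchises VALUES (10, 'Chicago'), (20, 'Boston');
INSERT INTO transactions VALUES
    (1, 1, 10, '2024-01-02'),
    (2, 2, 10, '2024-01-28'),
    (3, 3, 20, '2024-01-03'),
    (4, 1, 20, '2024-01-29');
""")

AS_OF = "2024-01-30"  # stand-in for the current date

# Two joins: customers -> transactions -> franchises.
lapsed = conn.execute(
    """
    SELECT c.name
    FROM customers c
    JOIN transactions t ON t.customer_id = c.customer_id
    JOIN franchises f ON f.franchise_id = t.franchise_id
    WHERE f.city = 'Chicago'
    GROUP BY c.customer_id, c.name
    HAVING MAX(t.txn_date) < date(?, '-10 days')
    ORDER BY c.name
    """,
    (AS_OF,),
).fetchall()
print(lapsed)  # [('Ana',)]
```

Ana's last Chicago visit is older than the cutoff, Ben visited two days before `AS_OF`, and Cy has only visited Boston. Neither join can be written without knowing which keys connect the three tables.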

Watch out for:

  • Join direction and cardinality: unless a relationship is declared as One-to-Many, Genie may repeat rows from the "one" side and inflate totals (fan-out)
  • Missing bridge tables: a Many-to-Many relationship needs a middle table (e.g., Line Items)
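Fan-out is easy to reproduce. The sketch below uses SQLite and a hypothetical `line_items` bridge table between `transactions` and `products`; all names and values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (txn_id INTEGER PRIMARY KEY, total REAL);
CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT);
-- Bridge table: one row per (transaction, product) pair.
CREATE TABLE line_items (txn_id INTEGER, product_id INTEGER, quantity INTEGER);
INSERT INTO transactions VALUES (1, 30.0), (2, 12.0);
INSERT INTO products VALUES (100, 'cookie'), (101, 'muffin'), (102, 'bagel');
INSERT INTO line_items VALUES (1, 100, 2), (1, 101, 1), (1, 102, 3), (2, 100, 4);
""")

# Correct revenue: each transaction counted once.
true_revenue = conn.execute("SELECT SUM(total) FROM transactions").fetchone()[0]

# Fan-out: transaction 1 has three line items, so its total is counted three times.
fanned_out = conn.execute("""
    SELECT SUM(t.total)
    FROM transactions t
    JOIN line_items li ON li.txn_id = t.txn_id
""").fetchone()[0]

# One way to avoid it: collapse the "many" side to one row per transaction first.
safe = conn.execute("""
    SELECT SUM(t.total)
    FROM transactions t
    JOIN (
        SELECT txn_id, SUM(quantity) AS items
        FROM line_items
        GROUP BY txn_id
    ) li ON li.txn_id = t.txn_id
""").fetchone()[0]

print(true_revenue, fanned_out, safe)  # 42.0 102.0 42.0
```

Knowing that `transactions` to `line_items` is One-to-Many is what tells a query writer, human or Genie, that totals from the "one" side must not be summed after the join.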

[Diagram: wanderbricks star schema with relationships]

Let's practice!
