Synonyms and descriptions

Introduction to Databricks Genie

Gang Wang

Senior Data Scientist

The language gap problem

Two speech bubbles showing different terminology: the business user says "clients" and "revenue" while the database shows customer_id and total_amount (communication gap visualization)

Common terminology traps:

  • "Customer" - Users say "clients," but the table is sales_customers
  • "Money" - Users say "revenue," but the column is totalPrice
  • "Location" - Users say "stores," but the table is sales_franchises
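
One way to picture the traps above is as a lookup from business terms to schema names. The following is a minimal Python sketch of that idea only; it is not Genie's API, and `SYNONYMS` and `resolve_terms` are hypothetical names introduced for illustration. In Genie itself, synonyms are curated in the space rather than written as code.

```python
# Hypothetical synonym map: business term -> schema name from the slide.
SYNONYMS = {
    "clients": "sales_customers",   # table
    "revenue": "totalPrice",        # column
    "stores": "sales_franchises",   # table
}


def resolve_terms(question: str) -> dict[str, str]:
    """Return the business terms found in a question and the schema names they map to."""
    # Normalize each word: drop surrounding punctuation and lowercase it.
    words = {w.strip("?.,!\"'").lower() for w in question.split()}
    return {term: target for term, target in SYNONYMS.items() if term in words}


matches = resolve_terms("Which clients generated the most revenue?")
for term, target in matches.items():
    print(f"{term} -> {target}")
```

A question that uses none of the mapped terms returns an empty dictionary, which mirrors the "before curation" situation where no mapping exists.
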

Before and after curation

Before: "What's our EBITDA?"

Chat interface showing failed EBITDA query

Genie can't map "EBITDA" to any column

After: synonyms added

Chat interface showing successful EBITDA query with chart

Same question, accurate results

Column descriptions

Hiding irrelevant columns

Column visibility panel showing sales_transactions columns with customerID hidden

Column visibility rules:

  • Hide technical metadata (new_id, chunk_id)
  • Hide redundant IDs when a name column exists
  • Expose human-readable columns (product, quantity, review_date)
  • Fewer columns = less confusion = better accuracy
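
The visibility rules above can be sketched as a simple filter. This is an illustrative Python heuristic, not how Genie decides visibility (in Genie, columns are hidden through the column visibility panel). The function name `visible_columns`, the `TECHNICAL_METADATA` set, and the "ID plus matching name column" naming pattern are all assumptions made for the example.

```python
# Assumed list of technical metadata columns, taken from the slide's examples.
TECHNICAL_METADATA = {"new_id", "chunk_id"}


def visible_columns(columns: list[str]) -> list[str]:
    """Return the columns to expose, applying the slide's rules as rough heuristics."""
    lowered = {c.lower() for c in columns}
    keep = []
    for col in columns:
        name = col.lower()
        # Rule 1: hide technical metadata.
        if name in TECHNICAL_METADATA:
            continue
        # Rule 2: hide an ID column when a matching name column exists
        # (e.g. customerID is hidden if customerName is present).
        if name.endswith("id"):
            stem = name[:-2].rstrip("_")
            if stem and (stem + "name" in lowered or stem + "_name" in lowered):
                continue
        # Rule 3: everything else is treated as human-readable and kept.
        keep.append(col)
    return keep


cols = ["customerID", "customerName", "product", "quantity", "review_date", "new_id", "chunk_id"]
print(visible_columns(cols))
```

An ID column with no matching name column is kept, since hiding it would remove the only way to refer to that entity.
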

Let's practice!
