Databricks SQL key assets

Introduction to Databricks SQL

Kevin Barlow

Data Manager

Helpful analogy

A tree consists of many different components, which together make up the whole

Tree System GIF

In Databricks SQL, different components combine into a data warehouse solution

Databricks SQL Assets

Query

  • The base "unit" of analysis in Databricks SQL
  • Runs SQL code against compute
  • Uses ANSI SQL standard
  • Processes data from:
    • Unity Catalog
    • Delta tables
    • Data lake files
    • Data streams
SELECT
    orderdate AS Date,
    orderpriority AS Priority,
    sum(totalprice) AS TotalPrice
FROM sfdc.sales.orders
GROUP BY
    1, 2
ORDER BY
    1, 2
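
The query above reads from a table in Unity Catalog. As a rough sketch of the "data lake files" case, the same kind of aggregation can be run directly over files with the `read_files` table-valued function. The volume path and the assumption that the files are CSV with a header row are hypothetical and not taken from the slides; the column names are borrowed from the query above.

```sql
-- Hypothetical path: replace with a location that exists in your workspace.
-- Assumes CSV files with a header row containing orderpriority and totalprice.
SELECT
    orderpriority AS Priority,
    sum(totalprice) AS TotalPrice
FROM read_files(
    '/Volumes/sfdc/sales/raw_orders/',  -- assumed volume path
    format => 'csv',
    header => true
)
GROUP BY
    orderpriority
ORDER BY
    orderpriority
```

Grouping by the column name rather than by position is a stylistic choice here; either form works.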

SQL Warehouse

  • Compute cluster dedicated to SQL workloads
  • Built-in optimizations (e.g. Photon)
  • Simpler administration
  • Easy scaling
  • Serves queries and BI tools

SQL Warehouse GIF

Tables versus views

Tables

  • Physical manifestations of datasets
  • Written in Delta format
  • Readable and accessible outside of the data pipeline
  • Can optimize data layout (partitioning, etc.)
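
A minimal sketch of the points above: creating a Delta table with a partitioned layout. The table name `orders_partitioned` and the column types are assumptions made for illustration; only the `sfdc.sales` schema and the column names come from the earlier query.

```sql
-- Hypothetical table; column types are assumed, not taken from the slides.
CREATE TABLE IF NOT EXISTS sfdc.sales.orders_partitioned (
    orderdate     DATE,
    orderpriority STRING,
    totalprice    DECIMAL(18, 2)
)
USING DELTA                   -- stored in Delta format
PARTITIONED BY (orderdate);   -- one way to control physical data layout

-- Compact small files in the table to improve read performance.
OPTIMIZE sfdc.sales.orders_partitioned;
```

Partitioning tends to help only on large tables with a low-cardinality partition column, so treat this layout as an example rather than a recommendation.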

Table in Object Storage

Tables versus views

Views

  • Virtual representations of query results, registered in Unity Catalog
  • Fast reads when materialized, since results are precomputed
  • Great for simplifying downstream queries
    • e.g. when the source query has many joins, filters, etc.
  • Incremental data processing available with materialized views
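
As an illustration of how a view simplifies downstream queries, the sketch below wraps the earlier aggregation in a view and then queries it. The view name `daily_priority_totals` and the output column names are hypothetical.

```sql
-- Hypothetical view name; the source table comes from the earlier query.
CREATE OR REPLACE VIEW sfdc.sales.daily_priority_totals AS
SELECT
    orderdate       AS order_date,
    orderpriority   AS priority,
    sum(totalprice) AS total_price
FROM sfdc.sales.orders
GROUP BY
    orderdate, orderpriority;

-- Downstream users query the view without repeating the aggregation logic.
SELECT
    order_date,
    priority,
    total_price
FROM sfdc.sales.daily_priority_totals
ORDER BY
    order_date, priority;
```

A standard view like this one stores only the query definition, so the aggregation runs each time the view is read.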

View Diagram

Visualizations and dashboards

Visualizations

  • Visual representations of a query result
  • Created relative to a single query

Configuring Visualizations

Dashboards

  • Collection of several visualizations
  • Across multiple datasets / query results

Sample dashboard

Let's practice!
