Notebook fundamentals

Introduction to Databricks Lakehouse

Gang Wang

Senior Data Scientist

What is a Databricks notebook?

  • An interactive document of runnable code cells
  • Attached to a cluster for execution
  • Mix code, results, and documentation in one place
  • Supports Python, SQL, Scala, and R

Magic commands

# Default language: Python
df = spark.table("silver_taxi_trips")
display(df)

%sql
SELECT COUNT(*) AS total_trips
FROM silver_taxi_trips

%md
## Analysis notes
Revenue is **highest** in the Northeast region.

Available magic commands

Command   Purpose
%python   Run Python code
%sql      Run SQL queries
%scala    Run Scala code
%r        Run R code
%md       Render Markdown
%sh       Run shell commands
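
The rule behind the table can be sketched in plain Python: a cell whose first line starts with a language magic command runs in that language, and any other cell runs in the notebook's default language. The names `MAGIC_LANGUAGES` and `cell_language` are hypothetical and only illustrate the idea; they are not part of any Databricks API.

```python
# Toy illustration only: this is not how Databricks itself is implemented.
MAGIC_LANGUAGES = {
    "%python": "python",
    "%sql": "sql",
    "%scala": "scala",
    "%r": "r",
    "%md": "markdown",
    "%sh": "shell",
}


def cell_language(cell_source: str, default: str = "python") -> str:
    """Return the language a cell would run in, based on its first token."""
    stripped = cell_source.lstrip()
    if not stripped:
        # An empty cell has no magic command, so it uses the default language.
        return default
    first_token = stripped.split(maxsplit=1)[0]
    # Unknown tokens (ordinary code) fall back to the default language.
    return MAGIC_LANGUAGES.get(first_token, default)


print(cell_language("%sql\nSELECT COUNT(*) AS total_trips\nFROM silver_taxi_trips"))  # sql
print(cell_language('df = spark.table("silver_taxi_trips")'))  # python
```

The lookup falls back to the default language, which mirrors how a cell without a magic command behaves in a Python notebook.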

Running another notebook with %run

  • %run executes another notebook in the same context as the calling notebook
  • Functions and variables defined there become available to the caller
  • Great for reusable utilities and shared config


Cell 1 loads the shared helper functions (%run must be the only command in its cell):
%run /Shared/utils/data_helpers

Cell 2 uses a function defined in data_helpers:
clean_df = clean_nulls(raw_df)
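
As a rough analogy for "same context", the plain-Python sketch below executes a helper's source code inside the caller's namespace, after which the helper's function can be called directly. This is an illustration of the idea and not Databricks' actual mechanism. The body of `clean_nulls` is an assumption, and it works on a list of dicts instead of a Spark DataFrame so the example runs anywhere.

```python
# Stand-in for the contents of the helper notebook. The body of clean_nulls
# is assumed for illustration; the real data_helpers notebook is not shown.
HELPER_NOTEBOOK_SOURCE = '''
def clean_nulls(rows):
    """Drop rows (dicts) that contain any None value."""
    return [row for row in rows if None not in row.values()]
'''

# The calling notebook's namespace. Executing the helper's code here makes
# its names available to the caller, loosely like %run does.
session = {}
exec(HELPER_NOTEBOOK_SOURCE, session)

raw_rows = [
    {"trip_id": 1, "fare": 12.5},
    {"trip_id": 2, "fare": None},
]
clean_rows = session["clean_nulls"](raw_rows)
print(clean_rows)  # [{'trip_id': 1, 'fare': 12.5}]
```

Because the helper's names land in the caller's namespace, a helper that reuses a name already defined in the caller would overwrite it, so distinct names for shared utilities are worth the effort.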

Interpreting results

  • Code cells show output directly below
  • SQL queries render as interactive tables
  • DataFrames display with built-in visualization options
  • Errors show stack traces with line numbers
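
To show what a stack trace with line numbers looks like, here is a minimal sketch in plain Python that triggers an error and captures the trace text. The function `average_fare` is a hypothetical example, and the exact layout of an error shown in a notebook cell may differ from this raw text.

```python
import traceback


def average_fare(total_fare, trip_count):
    # Raises ZeroDivisionError when trip_count is 0.
    return total_fare / trip_count


try:
    average_fare(100.0, 0)
except ZeroDivisionError:
    # format_exc() returns the stack trace as a string, including the
    # file name, line number, and function name of each frame.
    trace_text = traceback.format_exc()
    print(trace_text)
```

Reading a trace from the bottom up usually helps: the last line names the error, and the frame just above it points to the line that raised it.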

Summary

  • Notebooks are interactive documents attached to clusters
  • Magic commands let you mix Python, SQL, R, Scala, and Markdown
  • %run loads functions from other notebooks into your session
  • Results render inline with built-in tables and charts

Let's practice!