Overview of Lakehouse AI

Databricks Concepts

Kevin Barlow

Data Practitioner

Lakehouse AI

Lakehouse High Level Diagram

Why the Lakehouse for AI / ML?

  1. Reliable data and files in Delta Lake
  2. Highly scalable compute
  3. Open standards, libraries, frameworks
  4. Unification with other data teams
1 https://www.databricks.com/blog/2020/01/30/what-is-a-data-lakehouse.html

MLOps Lifecycle

Machine Learning Lifecycle Diagram

MLOps in the Lakehouse

DataOps

  • Integrating data across different sources (Auto Loader)
  • Transforming data into a usable, clean format (Delta Live Tables)
  • Creating useful features for models (Feature Store)
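
The three DataOps steps above can be sketched in plain Python. This is only a conceptual illustration under stated assumptions, not the Auto Loader, Delta Live Tables, or Feature Store API: the function names, the CSV layout (customer, amount), and the in-memory `seen` set standing in for an ingestion checkpoint are all hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def ingest_new_files(source_dir, seen):
    """Return rows from CSV files that have not been processed yet.

    `seen` is a hypothetical stand-in for an ingestion checkpoint.
    """
    rows = []
    for path in sorted(Path(source_dir).glob("*.csv")):
        if path.name in seen:
            continue  # already ingested on an earlier pass
        with path.open(newline="") as handle:
            rows.extend(csv.DictReader(handle))
        seen.add(path.name)
    return rows


def clean(rows):
    """Drop rows with a missing amount and cast the amount to float."""
    return [
        {"customer": r["customer"].strip(), "amount": float(r["amount"])}
        for r in rows
        if (r.get("amount") or "").strip()
    ]


def build_features(rows):
    """Aggregate per-customer features: total spend and order count."""
    features = {}
    for r in rows:
        f = features.setdefault(r["customer"], {"total_spend": 0.0, "order_count": 0})
        f["total_spend"] += r["amount"]
        f["order_count"] += 1
    return features


def demo():
    """Run the pipeline once, then ingest again to show nothing is re-read."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "day1.csv").write_text("customer,amount\nann,10.0\nbob,\nann,5.5\n")
        seen = set()
        first = build_features(clean(ingest_new_files(tmp, seen)))
        second = ingest_new_files(tmp, seen)
        return first, second
```

Calling `demo()` runs ingest, clean, and feature building once, then ingests a second time to show that already-seen files are skipped.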

ModelOps

  • Develop and train different models (Notebooks)
  • Machine learning templates and automation (AutoML)
  • Track parameters, metrics, and trials (MLflow)
  • Centralize and consume models (Model Registry)
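
As a rough illustration of the ModelOps loop above (train several candidates, track each trial's parameters and metrics, then register the best one), here is a minimal plain-Python sketch. It does not use the MLflow or Model Registry API; the one-parameter model, the `alpha` regularization values, the `demand_model` name, and the list and dict that stand in for a tracking server and a registry are all assumptions made for the example.

```python
def fit_slope(xs, ys, alpha):
    """Fit y = slope * x by least squares with an L2 penalty of strength alpha."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)


def mse(slope, xs, ys):
    """Mean squared error of the fitted slope on the given data."""
    return sum((slope * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def run_trials(train, valid, alphas):
    """Train one model per alpha and record its parameters and metrics."""
    runs = []
    for alpha in alphas:
        slope = fit_slope(train[0], train[1], alpha)
        runs.append({
            "params": {"alpha": alpha},
            "metrics": {"val_mse": mse(slope, valid[0], valid[1])},
            "model": slope,
        })
    return runs


def register_best(registry, name, runs):
    """Append the run with the lowest validation error as a new model version."""
    best = min(runs, key=lambda r: r["metrics"]["val_mse"])
    versions = registry.setdefault(name, [])
    versions.append({"version": len(versions) + 1, **best})
    return versions[-1]


def demo():
    """Track three trials and register the best as version 1."""
    train = ([1, 2, 3], [2, 4, 6])
    valid = ([4, 5], [8, 10])
    registry = {}
    runs = run_trials(train, valid, alphas=[0.0, 1.0, 10.0])
    return runs, register_best(registry, "demand_model", runs)
```

Keeping every trial's parameters next to its metrics is what makes the later choice of a version to register reproducible.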

DevOps

  • Govern access to different models (Unity Catalog)
  • Continuous Integration and Continuous Deployment (CI/CD) for model versions (Model Registry)
  • Deploy models for consumption (Serving Endpoints)
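
The DevOps responsibilities above (access control, gated promotion of model versions, and serving) can be sketched as a toy in-memory registry. This is an assumption-laden illustration only and not the Unity Catalog, Model Registry, or Serving Endpoints API: the `ModelRegistry` class, the "champion" notion, the `readers` set, and the promotion rule (lower validation error wins) are all hypothetical.

```python
class ModelRegistry:
    """Toy registry: versions, a promotion gate, and an access check."""

    def __init__(self):
        self.versions = {}    # version number -> (predict function, validation error)
        self.champion = None  # version currently being served
        self.readers = set()  # principals allowed to query the model

    def register(self, predict, val_error):
        """Store a new model version and return its version number."""
        version = len(self.versions) + 1
        self.versions[version] = (predict, val_error)
        return version

    def promote_if_better(self, version):
        """CI/CD-style gate: promote only if the candidate beats the champion."""
        candidate_error = self.versions[version][1]
        if self.champion is None or candidate_error < self.versions[self.champion][1]:
            self.champion = version
            return True
        return False

    def serve(self, principal, value):
        """Answer a prediction request with the champion, after an access check."""
        if principal not in self.readers:
            raise PermissionError(f"{principal} may not query this model")
        if self.champion is None:
            raise LookupError("no model version has been promoted yet")
        predict = self.versions[self.champion][0]
        return predict(value)


def demo():
    """Promote a first version, reject a worse second one, then serve."""
    registry = ModelRegistry()
    registry.readers.add("analyst")
    v1 = registry.register(lambda x: 2 * x, val_error=0.30)
    promoted_v1 = registry.promote_if_better(v1)
    v2 = registry.register(lambda x: 3 * x, val_error=0.45)
    promoted_v2 = registry.promote_if_better(v2)
    return registry, promoted_v1, promoted_v2
```

In this sketch the second version is rejected because its validation error is higher, so requests from an allowed principal keep being answered by version 1, while any other principal gets a `PermissionError`.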

Let's review!
