Introduction to Databricks Lakehouse

Databricks Concepts

Kevin Barlow

Data Analytics Practitioner

The Data Warehouse

Data Warehouse

Pros

  • Great for structured data
  • Highly performant
  • Easy to keep data clean

Cons

  • Very expensive
  • Cannot support modern applications
  • Not built for Machine Learning

1 https://www.databricks.com/blog/2021/05/19/evolution-to-the-data-lakehouse.html

The Data Lake

Data Lake

Pros

  • Support for all use cases
  • Very flexible
  • Cost effective

Cons

  • Data can become messy
  • Historically not very performant
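
The "easy to keep data clean" versus "data can become messy" trade-off can be sketched in a few lines of plain Python. This is an illustration only, not how any real warehouse or lake engine works: the schema `ORDER_SCHEMA` and the functions `warehouse_write`, `lake_write`, and `lake_read` are hypothetical names made up for this example. The warehouse-style write validates each record before storing it, while the lake-style write stores raw text and defers all checking to read time.

```python
import json

# Hypothetical schema for illustration: column name -> required Python type.
ORDER_SCHEMA = {"order_id": int, "amount": float}


def warehouse_write(table, record):
    """Warehouse-style write: reject any record that does not match the schema."""
    if set(record) != set(ORDER_SCHEMA):
        raise ValueError(f"unexpected columns: {sorted(record)}")
    for column, expected_type in ORDER_SCHEMA.items():
        if not isinstance(record[column], expected_type):
            raise ValueError(f"{column} must be {expected_type.__name__}")
    table.append(record)


def lake_write(files, raw_text):
    """Lake-style write: store whatever arrives, with no validation."""
    files.append(raw_text)


def lake_read(files):
    """Parsing happens only at read time, so bad data surfaces late."""
    good, bad = [], []
    for raw_text in files:
        try:
            good.append(json.loads(raw_text))
        except json.JSONDecodeError:
            bad.append(raw_text)
    return good, bad
```

In this sketch the lake accepts any input cheaply and flexibly, but a malformed record is only discovered when someone calls `lake_read`; the warehouse refuses it up front.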

Birth of the Lakehouse

[Figure: Data Warehouse and Data Lake]

[Figure: Birth of the Lakehouse Architecture]

The Databricks Lakehouse

The Databricks Lakehouse Platform

  • Single platform for all data workloads
  • Built on open source technology
  • Collaborative environment
  • Simplified architecture

[Figure: Databricks Lakehouse High-Level Diagram]

Databricks Architecture Benefits

Unification

  • Every use case from AI to BI
  • Benefits of data warehouse and data lake

Multi-Cloud

  • Bring a powerful platform to your data
  • No lock-in to a specific cloud platform

Databricks Development Benefits

Collaborative

  • Every data persona
  • Ability to work in the same platform in real time

Open-Source

  • Underpinned by Apache Spark
  • Support for the most popular languages (Python, R, Scala, SQL)

Let's practice!
