What's the difference between data warehouses and data lakes?

Data Warehousing Concepts

Aaren Stubberfield

Data Scientist

Database

  • Structured data in rows and columns
  • Transactional databases store transactions

Three database tables

Data warehouse

  • Gather data, integrate, and make available for analysis
  • Many input data sources
  • Stores structured data
  • Complex to change
    • Upstream and downstream effects must be considered
  • Typically >100 GB in size

Three database tables feeding into a data warehouse
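The "gather, integrate, and make available for analysis" idea can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the source names (web_store, retail_pos), the orders table, and the fact_orders table are hypothetical, and a real warehouse would use a dedicated platform rather than in-memory SQLite.

```python
import sqlite3


def make_source(rows):
    """Create a hypothetical transactional source database with an orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


# Assumed input sources; the names and rows are made up for illustration.
sources = {
    "web_store": make_source([(1, 20.0), (2, 35.5)]),
    "retail_pos": make_source([(1, 12.0)]),
}

# The warehouse is a separate database holding one integrated, structured table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE fact_orders (source TEXT, order_id INTEGER, amount REAL)"
)

# Gather and integrate: copy rows from every source, tagging where they came from.
for name, conn in sources.items():
    rows = conn.execute("SELECT order_id, amount FROM orders").fetchall()
    warehouse.executemany(
        "INSERT INTO fact_orders VALUES (?, ?, ?)",
        [(name, order_id, amount) for order_id, amount in rows],
    )

# Analysis queries run against the warehouse, not the transactional sources.
total_by_source = dict(
    warehouse.execute(
        "SELECT source, SUM(amount) FROM fact_orders GROUP BY source"
    ).fetchall()
)
print(total_by_source)
```

Because the analytical query touches only the warehouse copy, the source databases are free to keep serving transactions.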

Why the data warehouse?

  • Queries run quickly, even on large amounts of data
  • Avoids slowing down the transactional database

Person frustrated by slow data

Data marts

  • A relational database for analysis
  • Data is focused on one subject area
  • Few input data sources
  • Typically <100 GB in size

Data warehouse feeding a data mart
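A data mart can be pictured as a subject-focused slice of the warehouse. The sketch below again uses Python's sqlite3 module; the fact_events table, its columns, and the mart_sales name are assumptions chosen for illustration. In practice a data mart is often a separate database rather than a table next to the warehouse data.

```python
import sqlite3

# A hypothetical warehouse table covering several departments.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE fact_events (department TEXT, metric TEXT, value REAL)"
)
warehouse.executemany(
    "INSERT INTO fact_events VALUES (?, ?, ?)",
    [
        ("sales", "revenue", 100.0),
        ("sales", "revenue", 250.0),
        ("hr", "headcount", 42.0),
        ("finance", "budget", 900.0),
    ],
)

# The data mart keeps only the rows for one subject area (here: sales).
warehouse.execute(
    "CREATE TABLE mart_sales AS "
    "SELECT metric, value FROM fact_events WHERE department = 'sales'"
)

mart_rows = warehouse.execute(
    "SELECT metric, value FROM mart_sales ORDER BY value"
).fetchall()
print(mart_rows)
```

The mart is smaller than the warehouse and draws on a single source here, which mirrors the "few input data sources" and "one subject area" points above.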

Data lake

  • Organization-wide store of data
    • Contains data from many departments
    • Many data input sources
    • Typically >100 GB in size
  • Stores structured and unstructured data
    • Examples: video, audio, and documents

An audio and video file along with a database feeding a data lake
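As a rough sketch of how a data lake accepts structured and unstructured data side by side, the example below lands raw files in per-source folders without enforcing any schema. The land_in_lake helper, the folder layout, and the file names are hypothetical, and the audio and video payloads are placeholder bytes rather than real media. Production data lakes typically sit on object storage, not a local directory.

```python
import tempfile
from pathlib import Path


def land_in_lake(lake_root: Path, source: str, filename: str, payload: bytes) -> Path:
    """Store raw bytes as-is under a per-source folder; no schema is enforced."""
    target_dir = lake_root / source
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename
    target.write_bytes(payload)
    return target


with tempfile.TemporaryDirectory() as tmp:
    lake = Path(tmp)

    # Structured data: a CSV export from a hypothetical CRM database.
    land_in_lake(lake, "crm", "customers.csv", b"id,name\n1,Ada\n")

    # Unstructured data: placeholder bytes standing in for audio and video files.
    land_in_lake(lake, "support", "call_0001.wav", b"placeholder audio bytes")
    land_in_lake(lake, "marketing", "promo.mp4", b"placeholder video bytes")

    # List everything in the lake, relative to its root.
    stored = sorted(
        p.relative_to(lake).as_posix() for p in lake.rglob("*") if p.is_file()
    )

print(stored)
```

Nothing in this layout says what the files will be used for, which matches the point that a data lake can hold data whose purpose is not yet known.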

Data lake

  • Less complex to make changes
    • Fewer upstream and downstream effects to consider
  • Purpose of the stored data may not be known
    • Less organized

An audio and video file along with a database feeding a data lake

Summary

Feature                  Data Warehouse       Data Mart         Data Lake
Data structure           Structured           Structured        Structured & unstructured
Complexity to change     Complex              Complex           Less complex
Purpose of data          Known                Known             May not be known
Coverage of departments  Covers many          Covers only one   Covers many
Data sources             Many source systems  Few sources       Many source systems
Typical size             >100 GB              <100 GB           >100 GB

Let's practice!
