What's the difference between data warehouses and data lakes?

Data Warehousing Concepts

Aaren Stubberfield

Data Scientist

Database

  • Structured data in rows and columns
  • Transactional databases store transactions

Three database tables

Data warehouse

  • Gather data, integrate, and make available for analysis
  • Many input data sources
  • Stores structured data
  • Complex to change
    • Upstream and downstream effects must be considered
  • Typically >100 GB in size

Three database tables feeding into a data warehouse
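
The gather-and-integrate step above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not a production design: the two source systems, the `fact_orders` table, and all column names are hypothetical.

```python
import sqlite3

# Hypothetical rows from two separate source systems (assumed layouts).
sales_rows = [(1, "2024-01-05", 120.0), (2, "2024-01-06", 80.0)]
crm_rows = [(1, "Alice"), (2, "Bob")]


def build_warehouse(sales, customers):
    """Integrate both sources into one structured table for analysis."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fact_orders ("
        "customer_id INTEGER, order_date TEXT, amount REAL, customer_name TEXT)"
    )
    # Look up customer names by id so each order row carries both sources.
    names = dict(customers)
    conn.executemany(
        "INSERT INTO fact_orders VALUES (?, ?, ?, ?)",
        [(cid, day, amount, names.get(cid)) for cid, day, amount in sales],
    )
    conn.commit()
    return conn


warehouse = build_warehouse(sales_rows, crm_rows)
order_count = warehouse.execute("SELECT COUNT(*) FROM fact_orders").fetchone()[0]
```

Because the integrated table has a fixed schema, a change in either source system has to be reflected here as well, which is one reason a warehouse is complex to change.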

Why the data warehouse?

  • Faster queries over large amounts of data
  • Avoids slowing down the transactional database

Person frustrated by slow data

Data marts

  • A relational database for analysis
  • Data is focused on one subject area
  • Few input data sources
  • Typically <100 GB in size

Data warehouse feeding a data mart
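
As a rough sketch of a warehouse feeding a data mart, the snippet below derives a table focused on a single subject area from a broader one. The `warehouse_orders` and `sales_mart` tables and their columns are hypothetical names chosen for illustration.

```python
import sqlite3


def build_sales_mart(warehouse):
    """Derive a sales-only table from the (assumed) warehouse_orders table."""
    warehouse.execute(
        "CREATE TABLE sales_mart AS "
        "SELECT region, SUM(amount) AS total_amount "
        "FROM warehouse_orders WHERE department = 'sales' "
        "GROUP BY region"
    )
    return warehouse.execute(
        "SELECT region, total_amount FROM sales_mart ORDER BY region"
    ).fetchall()


# A tiny stand-in for a warehouse that covers several departments.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE warehouse_orders (department TEXT, region TEXT, amount REAL)"
)
warehouse.executemany(
    "INSERT INTO warehouse_orders VALUES (?, ?, ?)",
    [
        ("sales", "EU", 100.0),
        ("sales", "US", 50.0),
        ("hr", "EU", 10.0),
        ("sales", "EU", 25.0),
    ],
)
mart_rows = build_sales_mart(warehouse)
```

The mart keeps only the sales rows, so it is smaller than the warehouse and covers one subject area.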

Data lake

  • Organization-wide store of data
    • Contains data from many departments
    • Many data input sources
    • Typically >100 GB in size
  • Stores structured and unstructured data
    • Examples: video, audio, and documents

An audio and video file along with database feeding a data lake
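
One way to picture a data lake is as storage that accepts files as they arrive, with no schema applied up front. The sketch below uses a temporary local folder as a stand-in for the lake; the `land_raw` helper, the folder layout, and the file names are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def land_raw(lake_root, source, filename, payload):
    """Store the payload unchanged under <lake_root>/<source>/<filename>."""
    target_dir = Path(lake_root) / source
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename
    target.write_bytes(payload)
    return target


# Placeholder bytes stand in for real video, audio, and tabular files.
lake_root = tempfile.mkdtemp()
land_raw(lake_root, "marketing", "promo.mp4", b"\x00\x01fake-video-bytes")
land_raw(lake_root, "support", "call.wav", b"fake-audio-bytes")
land_raw(lake_root, "sales", "orders.csv", b"id,amount\n1,120.0\n")

stored = sorted(
    p.relative_to(lake_root).as_posix()
    for p in Path(lake_root).rglob("*")
    if p.is_file()
)
```

Structured and unstructured files sit side by side, and nothing about their contents has to be declared before they are stored.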

Data lake

  • Less complex to make changes
    • Fewer upstream and downstream effects to consider
  • Purpose of the stored data may not be known
    • Less organized

Summary

Feature                  Data Warehouse       Data Mart          Data Lake
Data structure           Structured           Structured         Structured & unstructured
Complexity to change     Complex              Complex            Less complex
Purpose of data          Known                Known              May not be known
Coverage of departments  Covers many          Covers only one    Covers many
Data sources             Many source systems  Few sources        Many source systems
Typical size             >100 GB              <100 GB            >100 GB
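
The summary can also be read as a rough decision rule. The helper below is a deliberate simplification: the `suggest_store` function and its parameters are hypothetical, and the 100 GB threshold is the typical size from the table, not a hard limit.

```python
def suggest_store(has_unstructured, departments, size_gb):
    """Map three features from the summary table to a storage type."""
    # Only the data lake is listed as holding unstructured data.
    if has_unstructured:
        return "data lake"
    # A data mart covers one department and is typically under 100 GB.
    if departments == 1 and size_gb < 100:
        return "data mart"
    # Structured data spanning many departments points to a warehouse.
    return "data warehouse"


video_archive = suggest_store(has_unstructured=True, departments=5, size_gb=500)
finance_reports = suggest_store(has_unstructured=False, departments=1, size_gb=20)
company_wide = suggest_store(has_unstructured=False, departments=8, size_gb=900)
```

Real choices also depend on the other rows, such as whether the purpose of the data is known and how costly changes are.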

Let's practice!
