Data Ingestion

Understanding Modern Data Architecture

Miller Trujillo

Senior Software Engineer

What is data ingestion?

  • Functional requirements
  • Functional workloads can be impacted by analytics

Generic Big data architecture including data sources, ingestion, storage, processing, orchestration, governance, serving, and analytics storage and reporting


Batch ingestion

  • Scheduled to ingest data periodically
  • Copy data into our platform for analytics
  • Reading all data vs reading only what is new to us
  • Big datasets require reading partially
  • Smaller datasets could be overwritten
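
As a minimal sketch of the two options above, the example below overwrites a small dataset in full and, for a big one, reads only the rows that are new to the platform. The row layout and the function names `full_load` and `load_new_rows` are hypothetical, and the sketch assumes that `id` values only ever increase in the source.

```python
def full_load(source):
    """Small dataset: overwrite the platform copy with everything in the source."""
    return [dict(row) for row in source]


def load_new_rows(source, target):
    """Big dataset: read only rows with an id above the highest id already ingested."""
    last_id = max((row["id"] for row in target), default=0)
    new_rows = [dict(row) for row in source if row["id"] > last_id]
    return target + new_rows


# Hypothetical source table and the copy already held in the platform.
source = [
    {"id": 1, "item": "book"},
    {"id": 2, "item": "pen"},
    {"id": 3, "item": "lamp"},
]
target = [{"id": 1, "item": "book"}]

overwritten = full_load(source)
appended = load_new_rows(source, target)
print([row["id"] for row in appended])
```

Reading by highest `id` only catches inserted rows; rows that were updated or deleted in the source need a change marker.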

Batch ingestion: Bring only what changed

  • Infinite resources are impossible
  • Ingest only what has changed
  • Use an updated-at timestamp or a change flag
  • Latest state of data
  • Deletion will require a flag or consolidation
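
The bullets above can be sketched as a watermark-based load. This is an illustration under stated assumptions, not a definitive implementation: the column names `updated_at` and `is_deleted`, the function `ingest_changes`, and the in-memory dictionaries standing in for tables are all hypothetical.

```python
from datetime import datetime


def ingest_changes(source, target, watermark):
    """Apply rows changed after `watermark` to `target`; return the new watermark."""
    new_watermark = watermark
    for row in source:
        if row["updated_at"] <= watermark:
            continue  # already ingested in a previous run
        if row["is_deleted"]:
            # The source marks deletions with a flag, so we can remove our copy.
            target.pop(row["id"], None)
        else:
            # Insert or overwrite, so the target keeps the latest state.
            target[row["id"]] = {"id": row["id"], "name": row["name"]}
        new_watermark = max(new_watermark, row["updated_at"])
    return new_watermark


# Hypothetical source rows and the state left by the previous run.
source = [
    {"id": 1, "name": "Ana", "updated_at": datetime(2024, 1, 1), "is_deleted": False},
    {"id": 2, "name": "Bo", "updated_at": datetime(2024, 1, 3), "is_deleted": False},
    {"id": 3, "name": "Cy", "updated_at": datetime(2024, 1, 4), "is_deleted": True},
]
target = {1: {"id": 1, "name": "Ana"}, 3: {"id": 3, "name": "Cy"}}

watermark = ingest_changes(source, target, datetime(2024, 1, 2))
print(sorted(target), watermark)
```

Storing the returned watermark between runs lets the next scheduled batch read only rows changed since this one.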

Streaming ingestion

  • Push model
  • Event queues
  • 24/7 compute
  • Landing zone
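
A minimal sketch of the push model described above, using only the Python standard library: a source pushes events onto a queue, and a consumer that is always listening writes each event to a landing zone. The in-process `queue.Queue` stands in for a real event queue service, and the temporary directory, file naming, `consume` function, and `STOP` sentinel are assumptions made for illustration.

```python
import json
import queue
import tempfile
import threading
from pathlib import Path

STOP = object()  # sentinel so the sketch can end; a real consumer runs 24/7


def consume(events, landing_zone):
    """Write each pushed event, unchanged, to its own file in the landing zone."""
    count = 0
    while True:
        event = events.get()  # blocks until the source pushes an event
        if event is STOP:
            break
        count += 1
        path = landing_zone / f"event_{count:06d}.json"
        path.write_text(json.dumps(event))


events = queue.Queue()
landing_zone = Path(tempfile.mkdtemp())  # hypothetical landing zone location

consumer = threading.Thread(target=consume, args=(events, landing_zone))
consumer.start()

# The source pushes events as they happen; nothing is pulled on a schedule.
for order_id in (101, 102, 103):
    events.put({"order_id": order_id, "status": "created"})
events.put(STOP)
consumer.join()

landed = sorted(p.name for p in landing_zone.iterdir())
print(landed)
```

The consumer does no transformation here; it only lands raw events so that later processing steps can pick them up.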

Streaming ingestion workflow


Let's practice!
