Data engineering and big data

Understanding Data Engineering

Hadrien Lacroix

Content Developer at DataCamp

About the course

  • Conceptual course
  • No coding involved
  • Objectives
    • Be able to communicate with data engineers
    • Build a solid foundation for further learning

Chapter 1

What is data engineering?

  1. Data engineering and big data
  2. Data engineers vs. data scientists
  3. Data pipelines

Chapter 2

How data storage works

  1. Structured vs. unstructured data
  2. SQL
  3. Data warehouses and data lakes

Chapter 3

How to move and process data

  1. Processing data
  2. Scheduling data
  3. Parallel computing
  4. Cloud computing

spotflix logo

Data workflow

  1. Data collection and storage
  2. Data preparation
  3. Exploration and visualization
  4. Experimentation and prediction

Data engineers

Data workflow diagram with the first step, data collection and storage, circled

Data engineers deliver:

  • the correct data
  • in the right form
  • to the right people
  • as efficiently as possible

A data engineer's responsibilities

  • Ingest data from different sources
  • Optimize databases for analysis
  • Remove corrupted data
  • Develop, construct, test and maintain data architectures

Data engineers and big data

  • Big data is becoming the norm, so data engineers are increasingly in demand
  • Big data:
    • Its size has to be taken into account when deciding how to handle it
    • It is so large that traditional methods no longer work

Big data growth

  • Sensors and devices
  • Social media
  • Enterprise data
  • VoIP (voice communication, multimedia sessions)

graph showing big data growth

Source: Data Age 2025, Seagate, November 2018

The five Vs

  • Volume (how much?)
  • Variety (what kind?)
  • Velocity (how frequent?)
  • Veracity (how accurate?)
  • Value (how useful?)

Summary

  • What's waiting for you
  • How data flows through an organization
  • When a data engineer intervenes
  • What their responsibilities are
  • How data engineering relates to big data

Let's practice!
