SQL databases

Understanding Data Engineering

Hadrien Lacroix

Content Developer at DataCamp

SQL

  • Structured Query Language
  • Industry standard for Relational Database Management System (RDBMS)
  • Allows you to access many records at once, and group, filter or aggregate them
  • Close to written English, easy to write and understand
  • Data engineers use SQL to create and maintain databases
  • Data scientists use SQL to query (request information from) databases
Understanding Data Engineering

Remember the employees table

index last_name first_name role team full_time office
0 Thien Vivian Data Engineer Data Science 1 Belgium
1 Huong Julian Data Scientist Data Science 1 Belgium
2 Duplantier Norbert Software Developer Infrastructure 1 United Kingdom
3 McColgan Jeff Business Developer Sales 1 United States
4 Sanchez Rick Support Agent Customer Service 0 United States
Understanding Data Engineering

SQL for data engineers

  • Data engineers use SQL to create, maintain and update tables.
CREATE TABLE employees (

employee_id INT,
first_name VARCHAR(255),
last_name VARCHAR(255),
role VARCHAR(255),
team VARCHAR(255),
full_time BOOLEAN,
office VARCHAR(255)
);
Understanding Data Engineering

SQL for data scientists

  • Data scientist use SQL to query, filter, group and aggregate data in tables.
SELECT first_name, last_name

FROM employees
WHERE role LIKE '%Data%'
Understanding Data Engineering

Database schema

  • Databases are made of tables
  • The database schema governs how tables are related
Understanding Data Engineering

albums table

Understanding Data Engineering

artists table

Understanding Data Engineering

artists table is connected to albums table through the artist_id unique key

Understanding Data Engineering

songs table

Understanding Data Engineering

albums table is connected to songs table through the album unique key

Understanding Data Engineering

playlists table

Understanding Data Engineering

playlists table is connected to songs table through the song unique key

Understanding Data Engineering

Several implementations

  • SQLite
  • MySQL
  • PostgreSQL
  • Oracle SQL
  • SQL Server
Understanding Data Engineering

Summary

  • SQL = industry standard
  • Explain how Data engineers and Data scientists use it differently
  • Database schema
  • SQL implementations
Understanding Data Engineering

Let's practice!

Understanding Data Engineering

Preparing Video For Download...