Databases

Introduction to Data Engineering

Vincent Vankrunkelsven

Data Engineer @ DataCamp

What are databases?

 

Image of a database

A usually large collection of data organized especially for rapid search and retrieval.

 

  • Holds data
  • Organizes data
  • Retrieve/Search data through DBMS
Introduction to Data Engineering

Databases and file storage

Databases

 

Image of a database

  • Very organized
  • Functionality like search, replication, ...

File systems

 

Image of file

  • Less organized
  • Simple, less added functionality
Introduction to Data Engineering

Structured and unstructured data

Structured: database schema

  • Relational database

 

Semi-structured

  • JSON

 

Unstructured: schemaless, more like files

  • Videos, photos

Image of a database

 

{ "key": "value"}

 

Image of file

Introduction to Data Engineering

SQL and NoSQL

SQL

  • Tables
  • Database schema
  • Relational databases

 

MySQL Image

PostgreSQL Image

NoSQL

  • Non-relational databases
  • Structured or unstructured
  • Key-value stores (e.g. caching)
  • Document DB (e.g. JSON objects)

 

Redis/MongoDB Image

Introduction to Data Engineering

SQL: The database schema

-- Create Customer Table 
CREATE TABLE "Customer" (
  "id" SERIAL NOT NULL,
  "first_name" varchar,
  "last_name" varchar,
  PRIMARY KEY ("id")
);

-- Create Order Table 
CREATE TABLE "Order" (
  "id" SERIAL NOT NULL,
  "customer_id" integer REFERENCES "Customer",
  "product_name" varchar,
  "product_price" integer,
  PRIMARY KEY ("id")
);

Database Schema for Customer and Order

-- Join both tables on foreign key
SELECT * FROM "Customer"
INNER JOIN "Order"
ON "customer_id" = "Customer"."id";
 id | first_name | ... | product_price
  1 | Vincent    | ... |            10
Introduction to Data Engineering

SQL: Star schema

The star schema consists of one or more fact tables referencing any number of dimension tables.

Star Schema Diagram of Customers and Orders

  • Facts: things that happened (eg. Product Orders)
  • Dimensions: information on the world (eg. Customer Information)
1 Wikipedia: https://en.wikipedia.org/wiki/Star_schema
Introduction to Data Engineering

Let's practice!

Introduction to Data Engineering

Preparing Video For Download...