Choosing the Right Approach

Introduction to Data Modeling in Snowflake

Nuno Rocha

Director of Engineering

Use cases for each modeling technique

Use cases for Entity-Relationship (ER) Models

Introduction to Data Modeling in Snowflake

Use cases for each modeling technique (1)

Use cases for Dimensional Models

Introduction to Data Modeling in Snowflake

Use cases for each modeling technique (2)

Use cases for Data Vault Modeling

Introduction to Data Modeling in Snowflake

Technical considerations

Plus and minus of ER model

Introduction to Data Modeling in Snowflake

Technical considerations (1)

Plus and minus of dimensional model

Introduction to Data Modeling in Snowflake

Technical considerations (2)

Plus and minus of data vault

Introduction to Data Modeling in Snowflake

Data models in action

Models in Venn diagram

Introduction to Data Modeling in Snowflake

Data models in action (1)

College data into models in Venn diagram

Introduction to Data Modeling in Snowflake

Retrieving data from the models

  • Retrieve detailed data from students by referencing the hub entity and its satellite:
SELECT
    hs.student_key,
    ss.student_name
FROM hub_students AS hs
    JOIN sat_student AS ss ON hs.student_key = ss.student_key;
Introduction to Data Modeling in Snowflake

Retrieving data from the models (1)

  • LEFT JOIN ON: SQL clause that combines all rows from the left entity with the matching rows from the right table, based ON a key
SELECT
    hs.student_key,
    ss.student_name
FROM hub_students AS hs
    JOIN sat_student AS ss ON hs.student_key = ss.student_key
    LEFT JOIN link_enrollment AS le ON hs.student_key = le.student_key
Introduction to Data Modeling in Snowflake

Retrieving data from the models (2)

  • COUNT: SQL aggregate function that returns the number of items in a group.
  • GROUP BY: SQL clause to aggregate data that have the same values.
SELECT
    hs.student_key,
    ss.student_name,
    COUNT(le.class_key) AS NumberOfEnrollments
FROM hub_students AS hs
    JOIN sat_student AS ss ON hs.student_key = ss.student_key
    LEFT JOIN link_enrollment AS le ON hs.student_key = le.student_key
GROUP BY hs.student_key, 
    ss.student_name
Introduction to Data Modeling in Snowflake

Retrieving data from the models (3)

  • MAX: SQL aggregate function that finds the highest value in a set of values for an attribute.
SELECT
    hs.student_key,
    ss.student_name,
    COUNT(le.class_key) AS NumberOfEnrollments
    MAX(sc.load_date) AS MostRecentEnrollmentDate
FROM hub_students hs
    JOIN sat_student ss ON hs.student_key = ss.student_key
    LEFT JOIN link_enrollment le ON hs.student_key = le.student_key
    LEFT JOIN sat_class sc ON le.class_key = sc.class_key
GROUP BY hs.student_key, 
    ss.student_name;
Introduction to Data Modeling in Snowflake

Functions overview

  • SELECT FROM: SQL command to fetch columns from an entity
  • JOIN ON: SQL clause combining rows from entities based ON a related attribute
  • LEFT JOIN ON: SQL clause that combines all rows from the left entity with the matching rows from the right table, based ON a key. If there's no match, the result will still show the left entity rows with empty values for the right attributes
  • COUNT: SQL aggregate function that returns the number of items in a group
  • MAX: SQL aggregate function that finds the highest value in a set of values for an attribute
  • GROUP BY: SQL clause to aggregate data that have the same values
Introduction to Data Modeling in Snowflake

Functions overview

SELECT column_name, 
    COUNT(another_column) AS alias_name,
    MAX(other_column) AS alias_name
FROM table_name table_alias
    -- Merge entities based on their keys
    JOIN other_table AS other_alias 
    ON table_alias.FK = other_alias.PK
    LEFT JOIN another_table AS another_alias 
    ON table_alias.FK = other_alias.PK
-- Aggregate data by specific columns
GROUP BY column_name;
Introduction to Data Modeling in Snowflake

Let's practice!

Introduction to Data Modeling in Snowflake

Preparing Video For Download...