Dimensional modeling

Introduction to Data Modeling in Snowflake

Nuno Rocha

Director of Engineering

Introduction to the dimensional data model

  • Dimensional modeling: A data structuring technique that separates measurements (facts) from descriptive details (dimensions) optimized for reporting and analysis.

Dimensional model

Introduction to Data Modeling in Snowflake

Introduction to the dimensional data model (1)

  • Dimensions: Entities with categorical data in a dimensional model.
  • Facts: Entities that capture and quantify activities within the categories in the dimensions.

Dimensional college model

Introduction to Data Modeling in Snowflake

Star dimensional model schema

Dimensional star model

Introduction to Data Modeling in Snowflake

Star dimensional model schema

  • Snowflake Data Warehouse: A cloud-based storage and analytics service.
  • Snowflake Schema: A method of organizing data in a dimensional model which includes sub-dimensions.

Snowflake, data warehouse vs dimensional model

Introduction to Data Modeling in Snowflake

Snowflake dimensional model schema

Dimensional snowflake model

Introduction to Data Modeling in Snowflake

Defining dimensions

  • Rename entities to dim_EntityName for clarity, following up dimensions in the model:
ALTER TABLE students RENAME TO dim_students;
ALTER TABLE classes RENAME TO dim_classes;
ALTER TABLE schools RENAME TO dim_schools;
Introduction to Data Modeling in Snowflake

Defining date dimension

  • Creating the dim_date table to store key fixed dates related to student enrollments in school:
CREATE OR REPLACE TABLE dim_date (
    date_id NUMBER(10,0) PRIMARY KEY,
    year NUMBER(4,0),
    semester VARCHAR(255)
);
Introduction to Data Modeling in Snowflake

Defining enrollments fact

  • Create a fact entity containing references to all the dimensions:
CREATE OR REPLACE TABLE fact_enrollments (
    enrollment_id NUMBER(10,0) PRIMARY KEY,
    student_id NUMBER(10,0),
    class_id NUMBER(10,0),
    date_id NUMBER(10,0),
    FOREIGN KEY (student_id) REFERENCES dim_students(student_id),
    FOREIGN KEY (class_id) REFERENCES dim_classes(class_id),
    FOREIGN KEY (date_id) REFERENCES dim_date(date_id)
);
Introduction to Data Modeling in Snowflake

Retrieving data from the dimensions

College dimensional data model

Introduction to Data Modeling in Snowflake

Retrieving data from the dimensions (1)

SELECT name, 
    class_name
FROM fact_enrollments
    JOIN dim_students -- Joining to get student names
    ON fact_enrollments.student_id = dim_students.student_id
    JOIN dim_classes -- Joining to get class names
    ON fact_enrollments.class_id = dim_classes.class_id
    JOIN dim_schools -- Joining to filter for the 'Science' school
    ON dim_classes.school_id = dim_schools.school_id
    JOIN dim_date -- Joining to restrict data to the year 2023
    ON fact_enrollments.date_id = dim_date.date_id
WHERE dim_schools.school_name = 'Science' 
    AND dim_date.year = 2023;
Introduction to Data Modeling in Snowflake

Terminology and functions overview

  • Dimensional modeling: A data structuring technique that separates measurements (facts) from descriptive details (dimensions) optimized for reporting and analysis
  • Dimensions: Entities with categorical data in a dimensional model
  • Facts: Entities that capture and quantify activities within the categories in the dimensions
  • ALTER TABLE: SQL command used to modify the structure of an existing entity
  • RENAME TO: SQL command, used with ALTER TABLE, to rename an entity
  • JOIN ON: SQL clause to combine rows from tables, based ON a related column
  • WHERE: SQL clause to filter records based on a set condition
  • AND: Logical operator used with WHERE clause to combine multiple conditions
Introduction to Data Modeling in Snowflake

Functions overview

-- Modifying a entity
ALTER TABLE table_name 
RENAME TO new_name;
-- Querying data from merged entities filtered by specific conditions
SELECT 
    table_name.column_name,
    other_name.*
FROM table_name
    JOIN other_table 
    ON table_name.FK = other_table.PK
WHERE column_name  condition  value
    AND column_name  condition  value;
Introduction to Data Modeling in Snowflake

Let's practice!

Introduction to Data Modeling in Snowflake

Preparing Video For Download...