Implementing Star Schemas

Transform and Analyze Data with Microsoft Fabric

Luis Silva

Solution Architect - Data & AI

Medallion Architecture

Diagram of the medallion data architecture, illustrating data ingested from data sources into the bronze layer, transformed into cleansed data in the silver layer and further transformed into a star schema in the gold layer suitable for analytics

  • Design pattern for organization of data
  • Three distinct layers that indicate progressive refinement of the data
Transform and Analyze Data with Microsoft Fabric

Medallion Architecture

Diagram of the medallion data architecture, highlighting the Bronze layer

  • Bronze layer: Data ingested from the source in its original raw format.
Transform and Analyze Data with Microsoft Fabric

Medallion Architecture

Diagram of the medallion data architecture, highlighting the Silver layer

  • Silver layer: Cleansed and standardized data.
Transform and Analyze Data with Microsoft Fabric

Medallion Architecture

Diagram of the medallion data architecture, highlighting the Gold layer

  • Gold layer: Data in a format suitable for analytics, usually star schema.
Transform and Analyze Data with Microsoft Fabric

Medallion Architecture in Fabric

Diagram of the medallion data architecture in Fabric, illustrating data pipelines to ingest data from sources into a lakehouse in the bronze layer, spark notebooks and dataflows to transform data into a lakehousein the silver layer, and further transformed into a star schema in a warehouse in the gold layer.

  • Bronze layer: Lakehouse
  • Silver layer: Lakehouse
  • Gold layer: Lakehouse or Warehouse
Transform and Analyze Data with Microsoft Fabric

Implementing a star schema

  • Use T-SQL CREATE TABLE statement to define the tables.
  • Fabric supports the most commonly used T-SQL data types.

Screenshot of the layout of a warehouse with a dw-sales schema expanded to show it contains a fact table and several dimension tables

Diagram of a Star schema

Transform and Analyze Data with Microsoft Fabric

Dimension table structure

  • Describe business entities (e.g., products).
  • Usually have the following columns:

Sample dimension table called Dim Product with columns Product ID, Product Key, Product Name and Product Brand

Transform and Analyze Data with Microsoft Fabric

Dimension table structure

  • Describe business entities (e.g., products).
  • Usually have the following columns:
    • Surrogate Key. Single-column unique identifier for dimension items.

Sample dimension table called Dim Product with columns Product ID, Product Key, Product Name and Product Brand. Product ID is highlighted as the surrogate key

Transform and Analyze Data with Microsoft Fabric

Dimension table structure

  • Describe business entities (e.g., products).
  • Usually have the following columns:
    • Surrogate Key. Single-column unique identifier for dimension items.
    • Natural Key (Business Key). Single-column unique identifier that comes from source systems.

Sample dimension table called Dim Product with columns Product ID, Product Key, Product Name and Product Brand. Product Key is highlighted as the natural key

Transform and Analyze Data with Microsoft Fabric

Dimension table structure

  • Describe business entities (e.g., products).
  • Usually have the following columns:
    • Surrogate Key. Single-column unique identifier for dimension items.
    • Natural Key (Business Key). Single-column unique identifier that comes from source systems.
    • Dimension Attributes. One or more columns that are used to filter data and provide context to the data stored in the fact table.

Sample dimension table called Dim Product with columns Product ID, Product Key, Product Name and Product Brand. Product Name and Product Brand are highlighted as the dimension attributes

Transform and Analyze Data with Microsoft Fabric

Building dimension tables

T-SQL script using the CREATE TABLE statement to create the sample dimension table dim_Product

Sample dimension table called Dim Product with columns Product ID, Product Key, Product Name and Product Brand. Product ID is the surrogate key, Product Key is the natural key, and Product Name and Product Brand are the dimension attributes

Transform and Analyze Data with Microsoft Fabric

Fact table structure

  • Describe business events (e.g., sale of a product).
  • Columns:

Sample fact table called Fact Sales with columns Date ID, Store ID, Product ID, Customer ID, Payment ID, Packaging Notes, Sale Order Number, Sale Units and Sale Price

Transform and Analyze Data with Microsoft Fabric

Fact table structure

  • Describe business events (e.g., sale of a product).
  • Columns:
    • Dimension Keys. References to the surrogate keys of the dimension tables.

Sample fact table called Fact Sales with columns Date ID, Store ID, Product ID, Customer ID, Payment ID, Packaging Notes, Sale Order Number, Sale Units and Sale Price. Date ID, Store ID, Product ID, Customer ID, and Payment ID are highlighted as the dimension keys

Transform and Analyze Data with Microsoft Fabric

Fact table structure

  • Describe business events (e.g., sale of a product).
  • Columns:
    • Dimension Keys. References to the surrogate keys of the dimension tables.
    • Attributes. Provide additional information about the fact but are neither a dimension or a numerical measure.

Sample fact table called Fact Sales with columns Date ID, Store ID, Product ID, Customer ID, Payment ID, Packaging Notes, Sale Order Number, Sale Units and Sale Price. Packaging Notes and Sale Order Number are highlighted as the attributes

Transform and Analyze Data with Microsoft Fabric

Fact table structure

  • Describe business events (e.g., sale of a product).
  • Columns:
    • Dimension Keys. References to the surrogate keys of the dimension tables.
    • Attributes. Provide additional information about the fact but are neither a dimension or a numerical measure.
    • Measures. Quantify something about the fact; they are numeric and commonly additive, e.g., number of units, dollar amount, etc.

Sample fact table called Fact Sales with columns Date ID, Store ID, Product ID, Customer ID, Payment ID, Packaging Notes, Sale Order Number, Sale Units and Sale Price. Sale Units and Sale Price are highlighted as the measures

Transform and Analyze Data with Microsoft Fabric

Building fact tables

T-SQL script using the CREATE TABLE statement to create the sample fact table fact_Sales

Sample fact table called Fact Sales with columns Date ID, Store ID, Product ID, Customer ID, Payment ID, Packaging Notes, Sale Order Number, Sale Units and Sale Price

Transform and Analyze Data with Microsoft Fabric

Let's practice!

Transform and Analyze Data with Microsoft Fabric

Preparing Video For Download...