2NF and 3NF

Introduction to Data Modeling in Snowflake

Nuno Rocha

Director of Engineering

Introduction to 2NF

  • Second normal form (2NF): Eliminates partial dependencies; every non-key attribute must functionally depend on the primary key

Steps from UNF to 3NF

Introduction to Data Modeling in Snowflake

Introduction to 2NF (1)

  • Second normal form (2NF): Eliminates partial dependencies; every non-key attribute must functionally depend on the primary key
  • Functional Dependency: The primary key explicitly identifies an attribute

Employee ID cards showing functional dependency

Introduction to Data Modeling in Snowflake

Introduction to 2NF (2)

  • Second normal form (2NF): Eliminates partial dependencies; every non-key attribute must functionally depend on the primary key.
  • Functional Dependency: The primary key explicitly identifies an attribute.
  • Partial Dependency: Only part of the primary key is needed to identify an attribute.

Employee ID cards showing partial dependency

Introduction to Data Modeling in Snowflake

The second normal form

Allproducts entity, product complaint with 2NF

Introduction to Data Modeling in Snowflake

The second normal form

Allproducts entity, detail issue with 2NF

Introduction to Data Modeling in Snowflake

The second normal form

Allproducts entity, detail issue with 2NF in manufacturer

Introduction to Data Modeling in Snowflake

Transitioning to 2NF

  • Step 1: Create new entities to allocate the attributes that had a partial dependency.
CREATE OR REPLACE TABLE manufacturers (
    manufacturer_id NUMBER(10,0) PRIMARY KEY,
    manufacturer VARCHAR(255),
    location VARCHAR(255)
);
CREATE OR REPLACE TABLE details (
    detail_id NUMBER(10,0) PRIMARY KEY,
    detail VARCHAR(255)
);
Introduction to Data Modeling in Snowflake

Transitioning to 2NF

  • Step 2: Fill entities with data from the initially unnormalized entity.
INSERT INTO manufacturers (manufacturer_id, manufacturer, location)
SELECT DISTINCT manufacturer_id, 
    manufacturer_name,
    location
FROM allproducts;
INSERT INTO details (detail_id, detail)
SELECT DISTINCT detail_id, 
    detail_description
FROM allproducts;
Introduction to Data Modeling in Snowflake

Transitioning to 2NF

All products data, from UNF to 2NF

Introduction to Data Modeling in Snowflake

Introduction to 3NF

  • Third normal form (3NF): Eliminates transitive dependencies; non-key attributes must directly depend on the primary key.

Steps from UNF to 3NF

Introduction to Data Modeling in Snowflake

Introduction to 3NF

  • Third normal form (3NF): Eliminates transitive dependencies; non-key attributes must directly depend on the primary key.
  • Transitive Dependency: An attribute depends on another attribute, which is not the primary key.

Employee ID cards showing transitive dependency

Introduction to Data Modeling in Snowflake

The third normal form

Location not compliant with 3NF

Introduction to Data Modeling in Snowflake

Transitioning to 3NF

  • Step 1: Create new entity to allocate the attributes that had a transitive dependency.
CREATE TABLE locations (
    location_id NUMBER(10,0) PRIMARY KEY,
    location VARCHAR(255)
);
Introduction to Data Modeling in Snowflake

Transitioning to 3NF

  • Step 2: Fill entities with data from the initially unnormalized entity.
INSERT INTO locations (location_id, location)
SELECT ROW_NUMBER() OVER (ORDER BY location), 
    location
FROM manufacturers
GROUP BY location;
ALTER TABLE manufacturers
DROP COLUMN location;
Introduction to Data Modeling in Snowflake

Transitioning to 3NF

  • Step 3: Create a new entity to extract from the unnormalized entity the remaining attributes.
CREATE OR REPLACE TABLE products (
    product_id NUMBER(10,0) PRIMARY KEY,
    name VARCHAR(255)
);
  • Step 4: Fill the entity with the unique values left in the unnormalized entity.
INSERT INTO products (product_id, name)
SELECT DISTINCT product_id,
    product_name
FROM allproducts;
Introduction to Data Modeling in Snowflake

Finalizing the model

Steps from UNF to 3NF

Introduction to Data Modeling in Snowflake

Finalizing the model

All products data, from UNF to 1NF

Introduction to Data Modeling in Snowflake

Finalizing the model

All products data, from UNF to 2NF

Introduction to Data Modeling in Snowflake

Finalizing the model

All products data, from UNF to 3NF

Introduction to Data Modeling in Snowflake

Finalizing the model

All products data, from UNF to 3NF with relationships

Introduction to Data Modeling in Snowflake

Terminology and functions overview

  • Second normal form (2NF): Eliminates partial dependencies; every non-key attribute must functionally depend on the primary key.
  • Third normal form (3NF): Eliminates transitive dependencies; non-key attributes must directly depend on the primary key.
  • Functional Dependency: The primary key explicitly identifies an attribute.
  • Partial Dependency: Only part of the primary key is needed to identify an attribute.
  • Transitive Dependency: An attribute depends on another attribute, which is not the primary key.
  • ROW_NUMBER() OVER (ORDER BY): SQL function to generate a sequential number.
  • DISTINCT: SQL clause to return unique (different) values from an attribute.
  • DROP: SQL command, used with ALTER TABLE, to remove elements to the entity.
Introduction to Data Modeling in Snowflake

Functions overview

INSERT INTO table_name (column_name, other_columns)
SELECT DISTINCT column_name,
    other_columns
FROM another_table;
INSERT INTO table_name (column_name)
SELECT ROW_NUMBER() OVER (ORDER BY column_name)
FROM another_table
GROUP BY TRIM(column_name);
ALTER TABLE table_name
DROP COLUMN column_name;
Introduction to Data Modeling in Snowflake

Let's practice!

Introduction to Data Modeling in Snowflake

Preparing Video For Download...