Micro-Partitions and Data Clustering

Snowflake Architecture

Emily Melhuish

Technical Curriculum Developer, Snowflake

Querying a Large Table

Example: Subscription

USER_ID  REGISTRATION_DATE  USER_NAME
912930   2026-04-01         JANEDOE
912931   2026-04-06         JOHNDOE
912932   2026-04-12         JIMMYDOE
...      ...                ...

How much data does the query need to read from this table?

Micro-Partitions

Example showing micro-partitions and how data is stored with columns together

  • Within each micro-partition, columns are stored and compressed separately
  • Snowflake only reads columns the query needs
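The two points above can be illustrated with a toy model in Python. This is only a sketch of the idea of columnar storage, not how Snowflake implements it: real micro-partitions are compressed files managed by Snowflake, and the helper names `to_columnar` and `read_columns` are hypothetical.

```python
# Toy model only: plain Python lists stand in for compressed column data.
rows = [
    (912930, "2026-04-01", "JANEDOE"),
    (912931, "2026-04-06", "JOHNDOE"),
    (912932, "2026-04-12", "JIMMYDOE"),
]
column_names = ["USER_ID", "REGISTRATION_DATE", "USER_NAME"]


def to_columnar(rows, column_names):
    """Store each column's values together instead of row by row."""
    return {name: [row[i] for row in rows] for i, name in enumerate(column_names)}


def read_columns(partition, wanted):
    """Return only the requested columns; the other columns are never touched."""
    return {name: partition[name] for name in wanted}


partition = to_columnar(rows, column_names)

# A query that selects only USER_NAME reads one column out of three.
result = read_columns(partition, ["USER_NAME"])
print(result)
```

Because each column lives in its own list, a query that needs one column does not have to walk through the values of the other two.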

Metadata

Example of micro partitions with metadata

  • Snowflake stores metadata for each micro-partition (for example, the range of values in each column)
  • This metadata is stored in the cloud services layer
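As a rough illustration of per-partition metadata, the sketch below computes a row count and a minimum and maximum value for each column of a toy micro-partition. The exact set of statistics and the `build_metadata` helper are assumptions made for this example; they are not Snowflake's internal format.

```python
# Toy micro-partition stored column by column (values are illustrative).
partition = {
    "USER_ID": [912930, 912931, 912932],
    "REGISTRATION_DATE": ["2026-04-01", "2026-04-06", "2026-04-12"],
    "USER_NAME": ["JANEDOE", "JOHNDOE", "JIMMYDOE"],
}


def build_metadata(partition):
    """Summarize a partition: row count plus min/max per column (assumed fields)."""
    first_column = next(iter(partition.values()))
    return {
        "row_count": len(first_column),
        "columns": {
            name: {"min": min(values), "max": max(values)}
            for name, values in partition.items()
        },
    }


metadata = build_metadata(partition)

# ISO-formatted date strings sort in calendar order, so min/max are meaningful here.
print(metadata["columns"]["REGISTRATION_DATE"])
```

The summary is small compared with the data it describes, which is what makes it cheap to consult before reading any micro-partition.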

Partition Pruning

Example of pruning

Example of pruning with option selected

  • Snowflake skips micro-partitions that can't possibly contain the data your query needs
  • It reads only the relevant micro-partitions; this is called partition pruning
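A minimal sketch of the pruning decision, assuming each micro-partition's metadata records the minimum and maximum REGISTRATION_DATE it contains. The partition ids, date ranges, and the `prune` function are hypothetical and serve only to show the range-overlap check.

```python
from datetime import date

# Hypothetical metadata: one entry per micro-partition, one month of data each.
partitions = [
    {"id": 1, "min_date": date(2026, 1, 1), "max_date": date(2026, 1, 31)},
    {"id": 2, "min_date": date(2026, 2, 1), "max_date": date(2026, 2, 28)},
    {"id": 3, "min_date": date(2026, 3, 1), "max_date": date(2026, 3, 31)},
    {"id": 4, "min_date": date(2026, 4, 1), "max_date": date(2026, 4, 30)},
]


def prune(partitions, lo, hi):
    """Keep only partitions whose [min, max] range overlaps the filter [lo, hi]."""
    return [
        p["id"]
        for p in partitions
        if p["max_date"] >= lo and p["min_date"] <= hi
    ]


# Filter: registrations between 2026-04-01 and 2026-04-12.
to_scan = prune(partitions, date(2026, 4, 1), date(2026, 4, 12))
print(to_scan)  # [4]
```

Only the metadata is inspected to decide what to skip; in this example three of the four partitions are never read.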

Clustering

Example showing micro-partitions and how data is stored with columns together

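  • Queries that filter on a date often prune well (for example, time-series data), because rows typically arrive in date order
  • Pruning does little for columns whose values are scattered across micro-partitions (for example, region)
  • Clustering key: a column you designate so that Snowflake co-locates rows with similar values in the same micro-partitions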
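The effect of clustering on pruning can be sketched with a small simulation. It assumes pruning relies on each partition's minimum and maximum region value, uses sorting as a stand-in for clustering, and invents the data and the helper names `split` and `partitions_to_scan` for illustration.

```python
# 16 toy rows whose regions repeat in a cycle, so every partition mixes all regions.
regions = ["APAC", "EMEA", "LATAM", "NA"]
rows = [{"user_id": i, "region": regions[i % 4]} for i in range(16)]


def split(rows, size):
    """Cut the rows into fixed-size chunks that stand in for micro-partitions."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def partitions_to_scan(partitions, region):
    """Count partitions whose min/max region range could contain the target value."""
    count = 0
    for part in partitions:
        values = [r["region"] for r in part]
        if min(values) <= region <= max(values):
            count += 1
    return count


scattered = split(rows, 4)                                      # arrival order
clustered = split(sorted(rows, key=lambda r: r["region"]), 4)   # grouped by region

print(partitions_to_scan(scattered, "EMEA"))  # 4: no partition can be skipped
print(partitions_to_scan(clustered, "EMEA"))  # 1: the other three are pruned
```

The rows are identical in both layouts; only their arrangement across partitions changes, and that alone determines how much a filter on region can skip.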

Clustering Key: When to Use

Checklist

  • The table is large (100 GB or more)
  • Queries consistently filter on the same column
  • The table is poorly clustered on that column

Clustering Key: Cardinality and Cost

Diagram of clustering key cardinality and cost

Let's practice!
