Semi-Structured Data Querying

Data Pipeline Automation in Snowflake

Emily Melhuish

Technical Curriculum Developer, Snowflake

Semi-structured Data Loading

data loading.png

1 * Snowflake Learning Material
Data Pipeline Automation in Snowflake

Managing Different Data Shapes

Carrier A Carrier A JSON

Carrier B

Carrier B JSON

Data Pipeline Automation in Snowflake

Transformation Example

Screenshot 2026-05-11 at 5.56.34 pm.png

Data Pipeline Automation in Snowflake

Snowflake Data Types

Category Types Example
Numeric NUMBER, INT, FLOAT credits_used NUMBER(10,2)
Text VARCHAR, STRING, TEXT carrier_name VARCHAR
Date / Time DATE, TIME, TIMESTAMP dispatch_date DATE
Boolean BOOLEAN is_active BOOLEAN
Semi-structured VARIANT, OBJECT, ARRAY event_payload VARIANT
Data Pipeline Automation in Snowflake

The VARIANT type

  • VARIANT stores semi-structured data as-is: no schema required
  • Load JSON, Parquet, Avro, ORC or XML directly
  • Data structure is resolved at query time, not load time (schema-on-read)
COPY INTO shipment_events
FROM @harbr_stage/events/
FILE_FORMAT = (TYPE = 'JSON');

VARIANT word cloud showing all semi-structured data types VARIANT holds: JSON, ARRAY, OBJECT, AVRO, PARQUET, SEMI-STRUCTURED

Data Pipeline Automation in Snowflake

Navigating JSON

Example JSON

{
  "shipment_id": "SHP-001",
  "carrier": { "name": "FastFreight" },
  "stops": [
    { "location": "Rotterdam Port" },
    { "location": "Amsterdam Hub" }
  ]
}

Navigating JSON

Traversing structure:

<column>:<level1_element>.<level2_element>

SELECT 
event:id::STRING AS shipment_id
, event:carrier.name::STRING AS carrier
, event:stops[0]:location::STRING 
AS first_stop 
FROM shipment_events;
Data Pipeline Automation in Snowflake

Turning Arrays into Rows wih FLATTEN

Flatten.png

Data Pipeline Automation in Snowflake

Using LATERAL FLATTEN

SELECT 
  s.event_data:shipment_id::STRING AS shipment_id,
  f.value:location::STRING AS stop,
  f.value:arrived_at::TIMESTAMP AS arrived_at 
FROM shipment_events s,
  LATERAL FLATTEN(input => s.event_data:stops) f;

Flatten.png

Data Pipeline Automation in Snowflake

Let's practice!

Data Pipeline Automation in Snowflake

Preparing Video For Download...