Data Pipeline Automation in Snowflake
Emily Melhuish
Technical Curriculum Developer, Snowflake

Carrier A

Carrier B


| Category | Types | Example |
|---|---|---|
| Numeric | NUMBER, INT, FLOAT | credits_used NUMBER(10,2) |
| Text | VARCHAR, STRING, TEXT | carrier_name VARCHAR |
| Date / Time | DATE, TIME, TIMESTAMP | dispatch_date DATE |
| Boolean | BOOLEAN | is_active BOOLEAN |
| Semi-structured | VARIANT, OBJECT, ARRAY | event_payload VARIANT |
VARIANT stores semi-structured data as-is: no schema requiredCOPY INTO shipment_events
FROM @harbr_stage/events/
FILE_FORMAT = (TYPE = 'JSON');

Example JSON
{
"shipment_id": "SHP-001",
"carrier": { "name": "FastFreight" },
"stops": [
{ "location": "Rotterdam Port" },
{ "location": "Amsterdam Hub" }
]
}
Navigating JSON
Traversing structure:
<column>:<level1_element>.<level2_element>
SELECT
event:id::STRING AS shipment_id
, event:carrier.name::STRING AS carrier
, event:stops[0]:location::STRING
AS first_stop
FROM shipment_events;

SELECT
s.event_data:shipment_id::STRING AS shipment_id,
f.value:location::STRING AS stop,
f.value:arrived_at::TIMESTAMP AS arrived_at
FROM shipment_events s,
LATERAL FLATTEN(input => s.event_data:stops) f;

Data Pipeline Automation in Snowflake