Data Pipeline Automation in Snowflake
Emily Melhuish
Technical Curriculum Developer, Snowflake
Use Case:
Solution: Snowpipe

COPY INTO runs at midnight — 24-hour lag-- Nightly batch: runs at 00:00, data arrives all day
COPY INTO logistics.delivery_events
FROM @harbr_s3_stage/events/
FILE_FORMAT = (FORMAT_NAME = 'harbr_json_format');
-- A 9am exception won't appear until tomorrow
COPY INTO statement — same syntax, same file formatsCREATE PIPE harbr_events_pipe AS
COPY INTO logistics.delivery_events
FROM @harbr_s3_stage/events/
FILE_FORMAT = (FORMAT_NAME = 'harbr_json_format');

AUTO_INGEST — event-driven; cloud storage publishes a notificationinsertFiles or insertReport endpoints directly from orchestration code
| Snowpipe | Snowpipe Streaming | |
|---|---|---|
| Trigger | File lands in stage | Row written by application |
| Latency | Minutes | Seconds |
| Use case | File-based event feeds | GPS, IoT, real-time app data |
Removes the file boundary entirely
# Snowpipe Streaming: application writes rows directly
channel = client.openChannel('GPS_CHANNEL', 'LOGISTICS', 'GPS_EVENTS')
channel.insertRows(rows=[
{'vehicle_id': 'V001', 'lat': 51.5, 'lng': -0.12, 'ts': now()}
])

| Method | When to use |
|---|---|
| COPY INTO | Scheduled batch loads - nightly files, weekly exports, hours of latency acceptable |
| Snowpipe | Continuous file arrivals, loads needed within minutes of arrival |
| Snowpipe Streaming | Application-generated data - GPS, IoT, financial markets - data available in seconds |
Data Pipeline Automation in Snowflake