Data Pipeline Automation in Snowflake
Emily Melhuish
Technical Curriculum Developer, Snowflake
![]()
User-Defined Functions
Recommended starting point
CREATE FUNCTION delivery_tier(days INT)
RETURNS VARCHAR AS $$
CASE WHEN days <= 2 THEN 'Express'
WHEN days <= 5 THEN 'Standard'
ELSE 'Delayed' END $$;
When SQL cannot do the job
CREATE FUNCTION parse_weight(label STRING)
RETURNS FLOAT LANGUAGE PYTHON
RUNTIME_VERSION = '3.9' HANDLER = 'parse'
AS $$ def parse(s): return float(s.split('kg')[0]) $$;

UDFs sit in the transformation layer - between raw storage and reporting
SELECT; return one value per rowSECURE to hide the definition from other usersCALL
CREATE OR REPLACE PROCEDURE archive_old_shipments(cutoff INT)
RETURNS STRING LANGUAGE SQL AS $$
DECLARE -- 1. Declare variables
rows_moved INT DEFAULT 0;
BEGIN -- 2. Begin the procedure
INSERT INTO shipments_archive -- 3. Execute SQL commands
SELECT * FROM shipments WHERE delivery_days > :cutoff;
rows_moved := SQLROWCOUNT;
DELETE FROM shipments WHERE delivery_days > :cutoff;
RETURN 'Archived ' || rows_moved -- 4. Return the result
|| ' shipments.';
END; -- 5. End the procedure
$$;
| UDF | Stored Procedure | |
|---|---|---|
| Called in | SELECT |
CALL |
| Returns | One value per row | Single value |
| Can Execute DML | No | Yes |
| Languages | SQL, Python, Java, JS | SQL (Snowflake Scripting), Python, Java, JS, Scala |
| Use in Pipeline | Transform Data | Orchestrate Logic |
Data Pipeline Automation in Snowflake