User-Defined Functions and Stored Procedures

Data Pipeline Automation in Snowflake

Emily Melhuish

Technical Curriculum Developer, Snowflake

What is a UDF?

nanobanana: half: Two abstract padlock icons side by side, one open and one closed, flat minimalist design, DataCamp navy blue and green colors, white background, no text

User-Defined Functions

Custom operations beyond built-in SQL functions
Written in SQL, Python, Java, Scala or JavaScript
No DDL or DML — returns values only
Standard or SECURE (hides logic from callers)
Returns a scalar value per row, or a table

SQL UDFs

Recommended starting point

CREATE FUNCTION delivery_tier(days INT)
RETURNS VARCHAR AS $$
  CASE WHEN days <= 2 THEN 'Express'
       WHEN days <= 5 THEN 'Standard'
       ELSE 'Delayed' END $$;

Python & Other Programming Language UDFs

When SQL cannot do the job

CREATE FUNCTION parse_weight(label STRING)
RETURNS FLOAT LANGUAGE PYTHON
RUNTIME_VERSION = '3.9' HANDLER = 'parse'
AS $$ def parse(s): return float(s.split('kg')[0]) $$;

UDFs in the Pipeline

mermaid: UDF transformation layer pipeline

UDFs sit in the transformation layer - between raw storage and reporting

Called inside SELECT; return one value per row
Centralize business logic used across multiple queries
Add SECURE to hide the definition from other users

What is a Stored Procedure?

Declare variables, loop over results, handle errors
Execute Data Manipulation Language (DML) across multiple tables
Called standalone with CALL

mermaid: datacamp-purple: stored procedure orchestration flow

Stored Procedure: Snowflake Scripting

CREATE OR REPLACE PROCEDURE archive_old_shipments(cutoff INT)
RETURNS STRING LANGUAGE SQL AS $$
DECLARE                                    -- 1. Declare variables
    rows_moved INT DEFAULT 0;
BEGIN                                      -- 2. Begin the procedure
    INSERT INTO shipments_archive          -- 3. Execute SQL commands
        SELECT * FROM shipments WHERE delivery_days > :cutoff;
    rows_moved := SQLROWCOUNT;
    DELETE FROM shipments WHERE delivery_days > :cutoff;
    RETURN 'Archived ' || rows_moved       -- 4. Return the result
        || ' shipments.';
END;                                       -- 5. End the procedure
$$;

UDFs vs Stored Procedures

	UDF	Stored Procedure
Called in	`SELECT`	`CALL`
Returns	One value per row	Single value
Can Execute DML	No	Yes
Languages	SQL, Python, Java, JS	SQL (Snowflake Scripting), Python, Java, JS, Scala
Use in Pipeline	Transform Data	Orchestrate Logic

Let's Practice!

Data Pipeline Automation in Snowflake