Menjalankan pipeline data di produksi

ETL dan ELT di Python

Jake Roach

Data Engineer

Pola arsitektur pipeline data

# Define ETL function
...
def load(clean_data):
...

# Run the data pipeline
raw_stock_data = extract("raw_stock_data.csv")
clean_stock_data = transform(raw_stock_data)
load(clean_stock_data)

> ls
 etl_pipeline.py
# Import extract, transform, and load functions
from pipeline_utils import extract, transform, load

# Run the data pipeline
raw_stock_data = extract("raw_stock_data.csv")
clean_stock_data = transform(raw_stock_data)
load(clean_stock_data)

> ls
 etl_pipeline.py
 pipeline_utils.py
ETL dan ELT di Python

Menjalankan pipeline data end-to-end

import logging
from pipeline_utils import extract, transform, load

logging.basicConfig(format='%(levelname)s: %(message)s', level=logging.DEBUG)
try:
    # Extract, transform, and load data
    raw_stock_data = extract("raw_stock_data.csv")
    clean_stock_data = transform(raw_stock_data)
    load(clean_stock_data)

    logging.info("Successfully extracted, transformed and loaded data.")  # Log success message

# Handle exceptions, log messages
except Exception as e:
    logging.error(f"Pipeline failed with error: {e}")
ETL dan ELT di Python

Mengorkestrasi pipeline data di produksi

Alat orkestrasi berdasarkan pangsa pasar.

1 https://open.substack.com/pub/seattledataguy/p/the-state-of-data-engineering-part?r=1po78c&utm_campaign=post&utm_medium=web
ETL dan ELT di Python

Ayo berlatih!

ETL dan ELT di Python

Preparing Video For Download...