ETL and ELT in Python
Jake Roach
Data Engineer
Data pipelines should be monitored for changes in the data and for failures during execution

import logging
logging.basicConfig(format='%(levelname)s: %(message)s', level=logging.DEBUG)
path = "raw_file.csv"  # value taken from the output shown for this example
# Create different types of logs
logging.debug(f"Variable has value {path}")
logging.info("Data has been transformed and will now be loaded.")
DEBUG: Variable has value raw_file.csv
INFO: Data has been transformed and will now be loaded.
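The level passed to the logger acts as a threshold: messages below it are dropped. The sketch below illustrates this under some assumptions: the logger name `pipeline_demo` and the in-memory stream are hypothetical and exist only so the output can be captured and inspected.

```python
import io
import logging

# Hypothetical in-memory stream so the log output can be inspected afterwards
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))

# Hypothetical logger name, used only for this illustration
logger = logging.getLogger("pipeline_demo")
logger.setLevel(logging.INFO)   # DEBUG messages fall below this threshold
logger.addHandler(handler)
logger.propagate = False        # keep the demo output out of the root logger

path = "raw_file.csv"
logger.debug(f"Variable has value {path}")   # suppressed at INFO level
logger.info("Data has been transformed and will now be loaded.")

captured = stream.getvalue()
print(captured)
```

With the threshold at `logging.INFO`, only the info message should reach the stream. Setting the level to `logging.DEBUG`, as in the slide, would let both messages through.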
import logging
logging.basicConfig(format='%(levelname)s: %(message)s', level=logging.DEBUG)
# Create different types of logs
logging.warning("Unexpected number of rows detected.")
# ke is assumed to hold the name of the exception that was caught
logging.error(f"{ke} arose in execution.")
WARNING: Unexpected number of rows detected.
ERROR: KeyError arose in execution.
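One possible way to produce an error line like the one above is to log the exception's class name. Formatting a `KeyError` directly with `str()` yields the missing key rather than the text "KeyError". This is a sketch, not necessarily how the original example was written; the `record` dictionary, the logger name `error_demo`, and the in-memory stream are made up for illustration.

```python
import io
import logging

# Hypothetical in-memory stream and logger name, used only to capture the output
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))

logger = logging.getLogger("error_demo")
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
logger.propagate = False

# Made-up record with no "price_change" key
record = {"open": 10.0, "close": 12.5}

try:
    record["price_change"]
except KeyError as ke:
    # type(ke).__name__ is "KeyError"; str(ke) would be the missing key instead
    logger.error(f"{type(ke).__name__} arose in execution.")

error_output = stream.getvalue()
print(error_output)
```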
try:
    # Execute some code here
    ...
except:
    # Log the failure that occurred
    # Logic to run when an exception is raised
    ...
Name the specific exception in the except clause
try:
    # Try to filter by price_change
    clean_stock_data = transform(raw_stock_data)
    logging.info("Successfully filtered DataFrame by 'price_change'")
except KeyError as ke:
    # Handle the error: create the missing column, then transform again
    logging.warning(f"{ke}: Cannot filter DataFrame by 'price_change'")
    raw_stock_data["price_change"] = raw_stock_data["close"] - raw_stock_data["open"]
    clean_stock_data = transform(raw_stock_data)
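The same recover-and-retry pattern can be tried end to end with a small stand-in. This is a minimal sketch under stated assumptions: a list of dictionaries replaces the DataFrame so it runs with the standard library alone, the `transform` body is hypothetical (the original does not show it), and the stock values are made up.

```python
import logging

logging.basicConfig(format="%(levelname)s: %(message)s", level=logging.DEBUG)

def transform(rows):
    """Hypothetical transform: keep rows whose price_change is positive."""
    # Raises KeyError if a row has no "price_change" entry
    return [row for row in rows if row["price_change"] > 0]

# Made-up data standing in for raw_stock_data; "price_change" is missing
raw_stock_data = [
    {"open": 10.0, "close": 12.5},
    {"open": 20.0, "close": 19.0},
]

try:
    # Try to filter by price_change
    clean_stock_data = transform(raw_stock_data)
    logging.info("Successfully filtered rows by 'price_change'")
except KeyError as ke:
    # Handle the error: derive the missing field, then transform again
    logging.warning(f"{ke}: Cannot filter rows by 'price_change'")
    for row in raw_stock_data:
        row["price_change"] = row["close"] - row["open"]
    clean_stock_data = transform(raw_stock_data)

print(clean_stock_data)
```

Catching only `KeyError` means any other failure inside `transform` still surfaces, instead of being hidden by a bare `except`.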