Introduction to Apache Airflow in Python
Mike Metzger
Data Engineer
Data engineering is:
A workflow is:
Airflow is a platform to program workflows, including creating, scheduling, and monitoring them
Other tools:
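DAG stands for Directed Acyclic Graph: tasks are the nodes, dependencies are the directed edges, and no path loops back to a task that has already run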
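To make "directed" and "acyclic" concrete, here is a minimal sketch that uses only Python's standard-library graphlib module (Python 3.9+), not Airflow itself. The task names are hypothetical and chosen only for illustration.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical task names; each key maps to the set of tasks it depends on.
dependencies = {
    "download_file": set(),
    "clean_data": {"download_file"},
    "load_to_db": {"clean_data"},
}

# Directed: the edges give an order in which the tasks can run.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)

# Acyclic: a dependency that loops back leaves no valid run order.
cyclic = dict(dependencies, download_file={"load_to_db"})
try:
    list(TopologicalSorter(cyclic).static_order())
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)
```

The first graph resolves to a single run order, while the second is rejected because of the loop. This is the same property a workflow needs before its tasks can be scheduled.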
Simple DAG definition:
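from datetime import datetime
from airflow import DAG

# start_date is passed as a datetime object rather than a string
etl_dag = DAG(
    dag_id='etl_pipeline',
    default_args={"start_date": datetime(2024, 1, 8)}
)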
Running a simple Airflow task
airflow tasks test <dag_id> <task_id> [execution_date]
To test the task download-file in the DAG example-etl for the date 2024-01-10:
airflow tasks test example-etl download-file 2024-01-10
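If the same test needs to be launched from a Python script, the command can be assembled as an argument list. This is only a sketch: build_task_test_command is a hypothetical helper, not part of Airflow, and it simply mirrors the command shape shown above.

```python
import shlex


def build_task_test_command(dag_id, task_id, execution_date=None):
    """Return the argument list for `airflow tasks test` (hypothetical helper)."""
    command = ["airflow", "tasks", "test", dag_id, task_id]
    # The execution date is optional, matching [execution_date] in the usage line.
    if execution_date is not None:
        command.append(execution_date)
    return command


command = build_task_test_command("example-etl", "download-file", "2024-01-10")
print(shlex.join(command))
```

On a machine where the airflow CLI is installed, the list could then be passed to subprocess.run(command, check=True). Building a list rather than a single string avoids shell quoting problems.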