Airflow Dags

Introduction to Apache Airflow in Python

Mike Metzger

Data Engineer

What is a Dag?

Dag, or Directed Acyclic Graph:

  • Directed - flow between components represents dependencies
  • Acyclic - does not loop or repeat
  • Graph - the set of components

Directed acyclic graph with nodes connected by one-way arrows and no loops
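
The three properties above can be checked with a small sketch that uses Python's standard-library graphlib module rather than Airflow itself. The component names (extract, transform, load) are hypothetical and chosen only for illustration.

```python
from graphlib import CycleError, TopologicalSorter

# Graph: map each component to the set of components it depends on.
# The names are hypothetical, for illustration only.
dependencies = {
    "transform": {"extract"},
    "load": {"transform"},
}

# Directed and acyclic: a valid run order exists and respects every dependency.
order = list(TopologicalSorter(dependencies).static_order())
print(order)

# An edge from the last component back to the first creates a loop,
# so the graph is no longer acyclic and no run order can be produced.
cyclic = {**dependencies, "extract": {"load"}}
try:
    list(TopologicalSorter(cyclic).static_order())
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)
```

The first print shows an order in which each component appears after the ones it depends on. The second shows that the looped graph is rejected.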

Dag in Airflow

  • Written in Python (but can use components written in other languages)
  • Made up of Tasks to be executed, such as operators or sensors
  • Contains dependencies defined explicitly or implicitly
    • e.g., copy the file to the server before trying to import it to the database service.

  Airflow Dag of connected tasks showing dependency order between them

Define a Dag

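Example Dag (TaskFlow API):

from airflow.sdk import dag
from pendulum import datetime

# email is a task-level argument, so it is supplied to every task through default_args
@dag(
    dag_id='etl_workflow',
    default_args={'email': '[email protected]'},
    start_date=datetime(2026, 3, 15, tz="UTC"),
)
def etl_workflow(): ...

# Calling the decorated function creates the Dag
etl_workflow()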

Dags on the command line

  • The airflow command line tool contains many subcommands
  • airflow -h - help and subcommand descriptions

  • Dag and task subcommands

    • airflow dags list - show all recognized Dags
    • airflow dags reserialize - force Airflow to reload Dag files
    • airflow tasks test <dag_id> <task_id> - run a specific task

Command line vs Python

Use the command line tool to:

  • Start Airflow processes
  • Manually run Dags / Tasks
  • Get logging information from Airflow

Terminal running the airflow command line tool

Use Python to:

  • Create a Dag
  • Edit the individual properties of a Dag

Illustration representing Python code

Let's practice!
