Branching

Introduction to Apache Airflow in Python

Mike Metzger

Data Engineer

Branching

  • Provides conditional logic (i.e., if/then for tasks)
  • Uses the @task.branch decorator
  • Runs a Python function that returns the task id (or list of task ids) to follow next; tasks on the other paths are skipped
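
The "id or list of ids" return value can be sketched in plain Python. This is only an illustration: the @task.branch decorator is left out so the function runs without Airflow, and choose_next and end_of_year_task are hypothetical names that do not appear in the course example.

```python
from datetime import date


def choose_next(logical_date):
    """Return the task id, or list of task ids, that should run next."""
    if logical_date.month == 12:
        # Hypothetical case: December follows two downstream tasks at once
        return ['end_of_quarter_task', 'end_of_year_task']
    if logical_date.month % 3 == 0:
        # A single task id is returned as a plain string
        return 'end_of_quarter_task'
    return 'regular_monthly_task'


december = choose_next(date(2024, 12, 1))
june = choose_next(date(2024, 6, 1))
july = choose_next(date(2024, 7, 1))
```

In a real Dag the same body would sit under @task.branch, and every returned id would need to match a task directly downstream of the branch task.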


Branching example

@task.branch
def branch_task(logical_date):
  if int(logical_date.month) % 3 == 0:
    # Months 3 - March, 6 - June, 9 - September, 12 - December
    return 'end_of_quarter_task'
  else:
    # All other months
    return 'regular_monthly_task'

# Call the decorated function to create the task, then set dependencies
branch = branch_task()
start_task >> branch >> end_of_quarter_task >> end_of_quarter_task2
branch >> regular_monthly_task >> regular_monthly_task2
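
To see which months take which path, the branch condition can be checked outside Airflow. The sketch below copies the condition into an undecorated helper (pick_branch is a hypothetical name) and uses an arbitrary year; it does not run any tasks.

```python
from datetime import date


def pick_branch(logical_date):
    # Same condition as branch_task, without the Airflow decorator
    if logical_date.month % 3 == 0:
        return 'end_of_quarter_task'
    return 'regular_monthly_task'


# Collect the months of an arbitrary year that select the quarterly path
quarter_months = [
    month for month in range(1, 13)
    if pick_branch(date(2024, month, 1)) == 'end_of_quarter_task'
]
print(quarter_months)  # [3, 6, 9, 12]
```

Every other month returns 'regular_monthly_task', so the end-of-quarter path would be skipped for those runs.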

Branching graph view

Airflow graph view of a branching Dag with two downstream task paths


Branching: end-of-quarter months

Airflow graph view with the end-of-quarter tasks run and monthly tasks skipped


Branching: regular months

Airflow graph view with the regular monthly tasks run and end-of-quarter tasks skipped


Date variables

  • ds - Logical date with dashes YYYY-MM-DD
  • ds_nodash - Logical date without dashes YYYYMMDD
  • prev_data_interval_start_success - Start of the data interval of the previous successful Dag run
  • Many others - see the Airflow templates reference
1 https://airflow.apache.org/docs/apache-airflow/stable/templates-ref.html#templates-variables
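
The two date formats can be reproduced with plain Python to show what the strings look like. This is a sketch with an arbitrary example date: in Airflow the values come from the run's context (for example as {{ ds }} in a templated field) and are not computed by hand.

```python
from datetime import datetime, timezone

# Arbitrary example standing in for a Dag run's logical date
logical_date = datetime(2024, 3, 15, tzinfo=timezone.utc)

# ds: logical date with dashes, YYYY-MM-DD
ds = logical_date.strftime('%Y-%m-%d')

# ds_nodash: logical date without dashes, YYYYMMDD
ds_nodash = logical_date.strftime('%Y%m%d')

print(ds, ds_nodash)  # 2024-03-15 20240315
```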

Let's practice!
