Introduction to Apache Airflow in Python
Mike Metzger
Data Engineer
schedule_interval, running, failed, success

When scheduling a DAG, there are several attributes of note:
start_date - The date / time to initially schedule the DAG run
end_date - Optional attribute for when to stop running new DAG instances
max_tries - Optional attribute for how many attempts to make
schedule_interval - How often to run

schedule_interval represents:
How often to schedule the DAG between start_date and end_date
Can be defined via cron style syntax or via built-in presets.
* represents running for every interval (i.e., every minute, every day, etc.)

0 12 * * * # Run daily at noon
* * 25 2 * # Run once per minute on February 25
0,15,30,45 * * * * # Run every 15 minutes
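To make the cron examples above concrete, here is a minimal sketch in plain Python that checks whether a given datetime matches a five-field cron expression. The helper names `field_matches` and `cron_matches` are hypothetical and are not part of Airflow. The sketch assumes only `*`, single numbers, and comma-separated lists, and it requires every field to match, which is simpler than what a real cron implementation does.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Return True if one cron field ('*', '5', or '0,15,30,45') matches value."""
    if field == "*":
        return True
    return value in {int(part) for part in field.split(",")}


def cron_matches(expression: str, moment: datetime) -> bool:
    """Check a five-field cron expression against a datetime (minute resolution).

    Simplified on purpose: no ranges (1-5), no steps (*/15), no names (MON).
    """
    minute, hour, day, month, weekday = expression.split()
    # cron counts Sunday as 0; Python's isoweekday() counts Sunday as 7.
    cron_weekday = moment.isoweekday() % 7
    return (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(day, moment.day)
        and field_matches(month, moment.month)
        and field_matches(weekday, cron_weekday)
    )


# Noon on February 25 matches "run daily at noon".
print(cron_matches("0 12 * * *", datetime(2020, 2, 25, 12, 0)))  # True
# 9:50 is not on a quarter-hour boundary.
print(cron_matches("0,15,30,45 * * * *", datetime(2020, 1, 1, 9, 50)))  # False
```

Reading the fields left to right as minute, hour, day of month, month, and day of week is usually enough to decode any of the expressions in this section.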
Preset      cron equivalent
@hourly     0 * * * *
@daily      0 0 * * *
@weekly     0 0 * * 0
@monthly    0 0 1 * *
@yearly     0 0 1 1 *

Airflow has two special schedule_interval presets:
None - Don't schedule ever, used for manually triggered DAGs
@once - Schedule only once

When scheduling a DAG, Airflow will:
Use the start_date as the earliest possible value
Schedule the task at start_date + schedule_interval

'start_date': datetime(2020, 2, 25),
'schedule_interval': '@daily'

This means the earliest starting time to run the DAG is February 26th, 2020.
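The start_date + schedule_interval arithmetic can be checked with a short standard-library sketch. The names `PRESET_INTERVALS` and `earliest_run` are hypothetical helpers for illustration, not Airflow functions, and only the fixed-length presets are modeled.

```python
from datetime import datetime, timedelta

# Fixed-length presets only; '@monthly' and '@yearly' vary in length
# and are left out of this sketch.
PRESET_INTERVALS = {
    "@hourly": timedelta(hours=1),
    "@daily": timedelta(days=1),
    "@weekly": timedelta(weeks=1),
}


def earliest_run(start_date: datetime, schedule_interval: str) -> datetime:
    """Return start_date + schedule_interval for a fixed-length preset."""
    return start_date + PRESET_INTERVALS[schedule_interval]


first = earliest_run(datetime(2020, 2, 25), "@daily")
print(first)  # 2020-02-26 00:00:00
```

The result lines up with the example above: one full interval has to pass after start_date before the first run can begin.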