Key concepts
For context around the terms used in this blog post, here are a few key concepts for Airflow:
- DAG (Directed Acyclic Graph): a workflow that ties together a set of tasks and the dependencies between them.
- Operator: a template for a specific type of work to be executed. For example, BashOperator represents how to execute a bash script, while PythonOperator represents how to execute a Python function.
- Sensor: a special type of operator that waits until a certain condition is met, so that downstream tasks run only once that condition holds.
- Task: a parameterized instance of an operator/sensor which represents a unit of actual work to be executed.
- Plugin: a mechanism that lets users extend Airflow with custom hooks, operators, sensors, macros, and web views.
- Pools: a configurable concurrency limit on an arbitrary set of Airflow tasks.
- Connection: the stored connection details and credentials for an external system, such as a database or an FTP server.
- Variable: a simple key-value store for arbitrary content or settings.
- XCom: a mechanism for sharing key-value pairs between otherwise independent tasks.
- Hook: an interface for reaching external platforms and databases.
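
To make a few of these terms more concrete, here is a minimal toy model in plain Python. It is not Airflow's API: the `PythonOperator` class, the `run_dag` helper, and the `xcom` dictionary are hypothetical stand-ins written only for illustration. The sketch shows an operator as a template, tasks as parameterized instances of it, a DAG as tasks plus their dependencies, and an XCom-like store for passing values between tasks.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class PythonOperator:
    """Toy stand-in for an operator: a template for 'call a Python function'."""

    def __init__(self, task_id, python_callable):
        self.task_id = task_id
        self.python_callable = python_callable

    def execute(self, xcom):
        # Store the return value under the task id, loosely mimicking XCom.
        xcom[self.task_id] = self.python_callable(xcom)


def run_dag(tasks, upstream):
    """Run tasks in dependency order.

    `upstream` maps each task id to the set of task ids it depends on.
    """
    xcom = {}
    order = list(TopologicalSorter(upstream).static_order())
    for task_id in order:
        tasks[task_id].execute(xcom)
    return order, xcom


# Three tasks: parameterized instances of the same operator template.
tasks = {
    t.task_id: t
    for t in [
        PythonOperator("extract", lambda xcom: [1, 2, 3]),
        PythonOperator("transform", lambda xcom: [x * 10 for x in xcom["extract"]]),
        PythonOperator("load", lambda xcom: sum(xcom["transform"])),
    ]
}

# The DAG itself: extract -> transform -> load.
upstream = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}

order, xcom = run_dag(tasks, upstream)
print(order)         # ['extract', 'transform', 'load']
print(xcom["load"])  # 60
```

In real Airflow the scheduler and executor decide when and where each task runs, and XCom values are persisted in the metadata database rather than held in an in-memory dictionary. The toy version only captures the shape of the relationships between the concepts.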