Apache Airflow Docker Provider
The Apache Airflow Docker Provider package lets Airflow interact with Docker so that tasks can run inside Docker containers. The provider is actively maintained; the current version is 4.5.4. All of its functionality lives in the `airflow.providers.docker` Python package.
Warnings
- breaking Provider version 3.0.0 removed the `xcom_push` parameter from the `DockerOperator`. Use the standard `do_xcom_push` argument inherited from `BaseOperator` instead; the operator pushes the container's stdout (the last line by default, or every line with `xcom_all=True`).
- breaking Provider version 2.0.0 replaced the `volumes` parameter of `DockerOperator` with `mounts`. Existing `host:container` volume strings must be rewritten as a list of `docker.types.Mount` objects, which follow the newer mount syntax.
- breaking Each `apache-airflow-providers-docker` version requires a specific minimum version of `apache-airflow`. For example, provider 3.0.0 requires Airflow >=2.2.0, while 4.5.4 requires Airflow >=2.11.0. Installing a mismatched provider version can lead to automatic Airflow upgrades or import errors.
- gotcha By default, `DockerOperator` bind-mounts a temporary directory from the host into the container. With a remote Docker Engine or a Docker-in-Docker setup, the daemon may not see the worker's filesystem, which causes mount errors or warnings. Set `mount_tmp_dir=False` in those setups.
- breaking Provider version 4.4.1 dropped support for Python 3.9. Users on older Python versions will need to upgrade their Python environment. The current version requires Python >=3.10.
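The `volumes` to `mounts` migration noted above can be sketched as a small conversion helper. This is an illustration, not part of the provider: the function name `legacy_volume_to_mount_kwargs` is hypothetical, and it assumes POSIX paths in the legacy `host_path:container_path[:ro|rw]` form. The returned keys match keyword arguments accepted by `docker.types.Mount`.

```python
from typing import Dict, Union


def legacy_volume_to_mount_kwargs(spec: str) -> Dict[str, Union[str, bool]]:
    """Convert 'host_path:container_path[:ro|rw]' into Mount keyword arguments.

    Hypothetical helper; assumes POSIX paths (no drive-letter colons).
    """
    parts = spec.split(":")
    if len(parts) == 2:
        source, target = parts
        mode = "rw"  # bind mounts are read-write unless stated otherwise
    elif len(parts) == 3:
        source, target, mode = parts
    else:
        raise ValueError(f"Unsupported volume spec: {spec!r}")
    if mode not in ("ro", "rw"):
        raise ValueError(f"Unsupported mode {mode!r} in volume spec: {spec!r}")
    return {
        "source": source,        # path on the Docker host
        "target": target,        # path inside the container
        "type": "bind",          # legacy host:container strings were bind mounts
        "read_only": mode == "ro",
    }


if __name__ == "__main__":
    print(legacy_volume_to_mount_kwargs("/data/in:/app/in:ro"))
```

With the `docker` package installed, the result can be passed on as `mounts=[Mount(**legacy_volume_to_mount_kwargs("/data/in:/app/in:ro"))]`, where `Mount` is imported from `docker.types`.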
Install
pip install apache-airflow-providers-docker
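Because each provider release requires a minimum Airflow version, it can help to confirm what is actually installed after running the command above. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `report_versions` are illustrative and not part of Airflow.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional

# Distribution names as published on PyPI.
PACKAGES = ("apache-airflow", "apache-airflow-providers-docker")


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def report_versions() -> Dict[str, Optional[str]]:
    """Map each package of interest to its installed version (or None)."""
    return {name: installed_version(name) for name in PACKAGES}


if __name__ == "__main__":
    for name, found in report_versions().items():
        print(f"{name}: {found or 'not installed'}")
```

Compare the reported versions with the minimum Airflow version listed for the provider release before deploying.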
Imports
- DockerOperator
from airflow.providers.docker.operators.docker import DockerOperator
Quickstart
import os
from datetime import datetime

from airflow.models.dag import DAG
from airflow.providers.docker.operators.docker import DockerOperator

with DAG(
    dag_id='docker_operator_quickstart',
    start_date=datetime(2023, 1, 1),
    schedule=None,  # Manual triggers only; `schedule` supersedes the deprecated `schedule_interval`
    catchup=False,
    tags=['docker', 'example'],
) as dag:
    run_docker_task = DockerOperator(
        task_id='run_hello_world_in_docker',
        image='python:3.10-slim-buster',
        command='python -c "print(\'Hello from Docker container in Airflow!\')"',
        auto_remove='force',  # Always remove the container once it exits
        mount_tmp_dir=False,  # Skip the host temp-dir bind mount; safer for remote or Docker-in-Docker daemons
        docker_url=os.environ.get('DOCKER_HOST', 'unix://var/run/docker.sock'),  # Use DOCKER_HOST env var or default
    )
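The `docker_url` and `mount_tmp_dir` choices in the quickstart can be derived from the environment rather than hard-coded. The sketch below is one possible approach, not provider behavior: the helper `docker_operator_kwargs` is hypothetical, and it assumes that any non-`unix://` URL (for example `tcp://` or `ssh://`) points at a remote daemon that cannot see the worker's temporary directory.

```python
import os
from typing import Mapping, Optional

# Same fallback value as the quickstart.
DEFAULT_DOCKER_URL = "unix://var/run/docker.sock"


def docker_operator_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build `docker_url` and `mount_tmp_dir` keyword arguments from DOCKER_HOST.

    Hypothetical helper. Assumption: a non-unix URL scheme means a remote
    daemon, so the host temp-dir bind mount is disabled for it.
    """
    env = os.environ if env is None else env
    docker_url = env.get("DOCKER_HOST", DEFAULT_DOCKER_URL)
    is_remote = not docker_url.startswith("unix://")
    return {"docker_url": docker_url, "mount_tmp_dir": not is_remote}


if __name__ == "__main__":
    print(docker_operator_kwargs({"DOCKER_HOST": "tcp://10.0.0.5:2375"}))
```

The result can be unpacked into the operator, for example `DockerOperator(task_id=..., image=..., **docker_operator_kwargs())`. This heuristic does not detect Docker-in-Docker setups that share a local socket; pass `mount_tmp_dir=False` explicitly in that case.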