Apache Airflow Apache Flink Provider

1.8.4 · active · verified Thu Apr 16

The Apache Airflow Apache Flink Provider package extends Apache Airflow's capabilities by integrating with Apache Flink. It enables users to programmatically author, schedule, and monitor workflows that involve submitting Flink jobs and interacting with Flink clusters directly from Airflow DAGs. This provider is actively maintained, with version 1.8.4 released on March 28, 2026, and follows a regular release cadence aligned with the broader Airflow ecosystem.


Install
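The provider is published on PyPI as `apache-airflow-providers-apache-flink`. A typical install into the same environment that runs Airflow might look like this (the exact version pin is illustrative; match it to your Airflow constraints):

```shell
# Install the Flink provider alongside an existing Airflow installation
pip install "apache-airflow-providers-apache-flink==1.8.4"
```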

Imports
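The quickstart below relies on the following imports; they are shown here in isolation for reference (requires `apache-airflow` and this provider to be installed):

```python
# DAG definition from Airflow core, and the Flink operator from this provider
from airflow.models.dag import DAG
from airflow.providers.apache.flink.operators.flink import FlinkOperator
```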

Quickstart

This quickstart demonstrates a basic Airflow DAG that uses the `FlinkOperator` to submit a Flink job. Ensure that a Flink connection (e.g., one with the ID 'flink_default') is configured in your Airflow environment, and replace `/path/to/my-flink-job.jar` with the actual path to your Flink job JAR file. For Kubernetes deployments, the Flink container image can be configured via the `FLINK_K8S_IMAGE` environment variable.

from __future__ import annotations

import os
import pendulum

from airflow.models.dag import DAG
from airflow.providers.apache.flink.operators.flink import FlinkOperator

with DAG(
    dag_id="flink_example_dag",
    start_date=pendulum.datetime(2023, 1, 1, tz="UTC"),
    schedule=None,
    catchup=False,
    tags=["flink", "example"],
) as dag:
    submit_flink_job = FlinkOperator(
        task_id="submit_example_flink_job",
        job_name="my_example_flink_job",
        main_class="com.example.flink.MyJob",
        jar="/path/to/my-flink-job.jar",
        flink_configuration={
            "taskmanager.memory.process.size": "2g",
            "kubernetes.container.image": os.environ.get("FLINK_K8S_IMAGE", "flink:latest"),
        },
        # Ensure a Flink connection with ID 'flink_default' is configured in the Airflow UI
        # flink_conn_id="flink_default",
    )
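With the file saved in your DAGs folder, you can exercise a single run locally using Airflow's built-in test command (assuming a working Airflow installation and the Flink connection configured):

```shell
# Run the DAG once for the given logical date, without the scheduler
airflow dags test flink_example_dag 2023-01-01
```

This executes the tasks in-process, which is a convenient way to validate the DAG definition before deploying it.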