{"id":5849,"library":"apache-airflow-providers-apache-livy","title":"Apache Airflow Livy Provider","description":"The Apache Airflow Livy Provider package enables Apache Airflow to interact with Apache Livy, an open-source REST service for submitting and managing Spark jobs on a cluster over HTTP. It includes operators and hooks to facilitate the orchestration of Spark applications within Airflow DAGs. The current version is 4.5.5, with provider packages often updated independently of, but in alignment with, core Airflow releases.","status":"active","version":"4.5.5","language":"en","source_language":"en","source_url":"https://github.com/apache/airflow/tree/main/airflow/providers/apache/livy","tags":["airflow","provider","livy","spark","etl","big-data"],"install":[{"cmd":"pip install apache-airflow-providers-apache-livy","lang":"bash","label":"Install latest version"}],"dependencies":[{"reason":"Core Airflow installation is required to use providers.","package":"apache-airflow","version":">=2.11.0"},{"reason":"Required for HTTP connection management.","package":"apache-airflow-providers-http","version":">=6.0.1","optional":true},{"reason":"Required for compatibility features within the Airflow ecosystem.","package":"apache-airflow-providers-common-compat","version":">=1.12.0","optional":true},{"reason":"Asynchronous HTTP client library, used by underlying HTTP operations.","package":"aiohttp","version":">=3.9.2","optional":true}],"imports":[{"symbol":"LivyOperator","correct":"from airflow.providers.apache.livy.operators.livy import LivyOperator"},{"symbol":"LivyHook","correct":"from airflow.providers.apache.livy.hooks.livy import LivyHook"}],"quickstart":{"code":"from __future__ import annotations\n\nimport os\nfrom datetime import datetime\n\nfrom airflow.models.dag import DAG\nfrom airflow.providers.apache.livy.operators.livy import LivyOperator\n\n\nwith DAG(\n    dag_id='livy_spark_pi_example',\n    schedule=None,\n    start_date=datetime(2023, 1, 1),\n    
catchup=False,\n    tags=['livy', 'spark', 'example'],\n) as dag:\n    # Ensure a 'livy_default' connection is configured in the Airflow UI (Admin -> Connections)\n    # with the host and port of your Livy server.\n    # Example: livy_default, Host: localhost, Port: 8998\n\n    submit_spark_pi_job = LivyOperator(\n        task_id='submit_spark_pi_job',\n        file=os.getenv('LIVY_SPARK_PI_JAR', '/opt/spark/examples/jars/spark-examples_2.12-3.2.1.jar'),\n        class_name='org.apache.spark.examples.SparkPi',\n        args=['10'],  # SparkPi argument: number of slices (partitions) to sample over\n        livy_conn_id='livy_default',\n        driver_memory='1g',\n        executor_memory='1g',\n        num_executors=1,\n    )","lang":"python","description":"This quickstart defines a basic Airflow DAG that uses the `LivyOperator` to submit the Spark Pi example job to a Livy server. It assumes a 'livy_default' connection in Airflow that points to your Livy instance. The `file` parameter must point to a Spark examples JAR at a path the cluster can read; override the default with the `LIVY_SPARK_PI_JAR` environment variable."},"warnings":[{"fix":"Always check the provider's documentation for the minimum supported Airflow version. Upgrade your Airflow installation if it doesn't meet the provider's requirements. For current version 4.5.5, ensure Airflow >= 2.11.0.","message":"Each provider version requires a minimum Airflow core version. For `apache-airflow-providers-apache-livy` version 4.5.x, Airflow 2.11.0 or higher is required. Using older Airflow versions with newer providers can lead to incompatibility errors or unexpected behavior.","severity":"breaking","affected_versions":"<4.5.x (for older Airflow versions)"},{"fix":"When upgrading providers on an older Airflow installation, be aware that pip may also upgrade core Airflow to satisfy the provider's minimum version. After any Airflow upgrade, run `airflow db upgrade` (`airflow db migrate` on Airflow 2.7 and later) to bring the metadata database schema up to date. 
It's recommended to upgrade Airflow to a modern version (>=2.11.0) before upgrading to recent Livy provider versions.","message":"Older provider versions (e.g., 3.0.0 and earlier) introduced breaking changes due to the removal of the `apply_default` decorator. If you upgrade the Livy provider on an Airflow instance older than 2.1.0, the Airflow package may be upgraded automatically, after which you must run `airflow db upgrade` manually to complete the migration.","severity":"breaking","affected_versions":"<3.0.0 (provider versions)"},{"fix":"Before running a DAG with `LivyOperator`, navigate to Airflow UI -> Admin -> Connections. Create or verify a connection named 'livy_default' (or your specified `livy_conn_id`) with the correct host and port for your Apache Livy server.","message":"The `LivyOperator` requires a Livy connection to be configured in Airflow (for example via the UI under Admin -> Connections). The default `livy_conn_id` is 'livy_default'. If the connection is missing or misconfigured, the task fails.","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-04-14T00:00:00.000Z","next_check":"2026-07-13T00:00:00.000Z"}