{"id":3873,"library":"apache-airflow-providers-apache-impala","title":"Apache Airflow Impala Provider","description":"The Apache Airflow Impala Provider integrates Apache Airflow with Apache Impala, allowing users to programmatically author, schedule, and monitor workflows that interact with Impala databases. It provides hooks and operators to execute SQL queries and manage Impala connections within Airflow DAGs. The current version is 1.9.1, released on 2026-03-28; the provider follows the release cadence of other Airflow providers and is typically updated every few months.","status":"active","version":"1.9.1","language":"en","source_language":"en","source_url":"https://github.com/apache/airflow/tree/main/airflow/providers/apache/impala","tags":["airflow-provider","apache-impala","etl","data-pipeline","sql"],"install":[{"cmd":"pip install apache-airflow-providers-apache-impala","lang":"bash","label":"Install core provider"},{"cmd":"pip install apache-airflow-providers-apache-impala[sqlalchemy]","lang":"bash","label":"Install with SQLAlchemy support"}],"dependencies":[{"reason":"Core Airflow dependency; provider version 1.9.x requires Airflow >=2.11.0.","package":"apache-airflow","optional":false},{"reason":"Python client for Impala interaction.","package":"impyla","optional":false},{"reason":"Common compatibility utilities for providers.","package":"apache-airflow-providers-common-compat","optional":true},{"reason":"Common SQL utilities, used by SQLExecuteQueryOperator.","package":"apache-airflow-providers-common-sql","optional":true},{"reason":"Required for the `ImpalaHook.sqlalchemy_url` property.","package":"sqlalchemy","optional":true},{"reason":"Required for Kerberos authentication.","package":"kerberos","optional":true}],"imports":[{"symbol":"ImpalaHook","correct":"from airflow.providers.apache.impala.hooks.impala import ImpalaHook"},{"note":"The dedicated ImpalaOperator is deprecated; use the generic SQLExecuteQueryOperator instead for executing SQL queries.","wrong":"from airflow.providers.apache.impala.operators.impala import ImpalaOperator","symbol":"SQLExecuteQueryOperator","correct":"from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator"}],"quickstart":{"code":"from __future__ import annotations\nimport datetime\nfrom airflow import DAG\nfrom airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator\n\n# Ensure you have an Airflow connection named 'my_impala_conn'\n# with appropriate Impala host, port (default 21050), and credentials.\n# Example Extra JSON: {\"auth_mechanism\": \"NOSASL\"}\n\nwith DAG(\n    dag_id=\"example_impala_dag\",\n    start_date=datetime.datetime(2023, 1, 1),\n    default_args={\n        \"conn_id\": \"my_impala_conn\", # Airflow connection ID for Impala\n        \"owner\": \"airflow\"\n    },\n    schedule=\"@once\",\n    catchup=False,\n    tags=[\"impala\", \"sql\"],\n) as dag:\n    create_table_task = SQLExecuteQueryOperator(\n        task_id=\"create_impala_table\",\n        sql=\"\"\"CREATE TABLE IF NOT EXISTS my_impala_table (id INT, name STRING)\"\"\"\n    )\n\n    insert_data_task = SQLExecuteQueryOperator(\n        task_id=\"insert_impala_data\",\n        sql=\"\"\"INSERT INTO my_impala_table VALUES (1, 'Alice'), (2, 'Bob')\"\"\"\n    )\n\n    select_data_task = SQLExecuteQueryOperator(\n        task_id=\"select_impala_data\",\n        sql=\"\"\"SELECT COUNT(*) FROM my_impala_table\"\"\",\n        # The handler receives the DB-API cursor, not the fetched rows\n        handler=lambda cursor: print(f\"Row count: {cursor.fetchone()[0]}\")\n    )\n\n    drop_table_task = SQLExecuteQueryOperator(\n        task_id=\"drop_impala_table\",\n        sql=\"\"\"DROP TABLE IF EXISTS my_impala_table\"\"\"\n    )\n\n    (create_table_task >> insert_data_task >> select_data_task >> drop_table_task)","lang":"python","description":"This quickstart demonstrates a simple Airflow DAG that uses the `SQLExecuteQueryOperator` to interact with an Apache Impala database. It includes tasks for creating a table, inserting data, selecting data, and dropping the table. Before running, ensure you have configured an Impala connection in Airflow with the ID `my_impala_conn`."},"warnings":[{"fix":"Ensure your Airflow core installation meets the minimum version requirement for the Impala provider you intend to use. Upgrade Airflow if necessary.","message":"Provider versions have minimum Apache Airflow core version requirements. For provider version 1.9.x, you must be running Airflow 2.11.0 or newer. Installing an incompatible provider version may lead to dependency conflicts or runtime errors.","severity":"breaking","affected_versions":"1.8.0+, 1.7.0+"},{"fix":"Replace `ImpalaOperator` imports and usages with `SQLExecuteQueryOperator`.","message":"The dedicated `ImpalaOperator` is deprecated. Users should migrate to the more generic and flexible `SQLExecuteQueryOperator` from `airflow.providers.common.sql.operators.sql` for executing SQL queries against Impala.","severity":"deprecated","affected_versions":"All versions where `SQLExecuteQueryOperator` is available and preferred."},{"fix":"Install the provider with the `sqlalchemy` extra: `pip install apache-airflow-providers-apache-impala[sqlalchemy]`.","message":"The `ImpalaHook.sqlalchemy_url` property requires the `sqlalchemy` library to be installed. It is an optional dependency for the provider and needs to be installed explicitly via an extra.","severity":"gotcha","affected_versions":"All versions using `sqlalchemy_url`."},{"fix":"Ensure your Python environment is version 3.10 or newer.","message":"Support for older Python versions has been dropped. Provider 1.9.1 requires Python >=3.10. Earlier releases had already dropped Python 3.9 (e.g., 1.7.1) and Python 3.7.","severity":"breaking","affected_versions":"1.7.1+, 1.9.x"}],"env_vars":null,"last_verified":"2026-04-11T00:00:00.000Z","next_check":"2026-07-10T00:00:00.000Z"}