{"id":6973,"library":"airflow-clickhouse-plugin","title":"Airflow ClickHouse Plugin","description":"The `airflow-clickhouse-plugin` provides Apache Airflow operators, hooks, and sensors for interacting with ClickHouse databases. It supports executing DDL/DML commands and queries. The current version is `1.6.0`, and releases are typically made to align with new Apache Airflow major and minor versions.","status":"active","version":"1.6.0","language":"en","source_language":"en","source_url":"https://github.com/bryzgaloff/airflow-clickhouse-plugin","tags":["airflow","clickhouse","database","plugin","operator","hook","sensor"],"install":[{"cmd":"pip install airflow-clickhouse-plugin","lang":"bash","label":"Install latest version"}],"dependencies":[{"reason":"Peer dependency, required for the plugin to function within an Airflow environment.","package":"apache-airflow","optional":false},{"reason":"Underlying Python client for ClickHouse interaction used by `ClickHouseOperator` and `ClickHouseHook`.","package":"clickhouse-driver","optional":false}],"imports":[{"note":"This is a community plugin, not an official Airflow provider, so imports are directly from the plugin's package name.","wrong":"from airflow.providers.clickhouse.hooks.clickhouse import ClickHouseHook","symbol":"ClickHouseHook","correct":"from airflow_clickhouse_plugin.hooks.clickhouse import ClickHouseHook"},{"note":"This is a community plugin, not an official Airflow provider, so imports are directly from the plugin's package name. 
This operator uses `clickhouse_driver.Client.execute`.","wrong":"from airflow.providers.clickhouse.operators.clickhouse import ClickHouseOperator","symbol":"ClickHouseOperator","correct":"from airflow_clickhouse_plugin.operators.clickhouse import ClickHouseOperator"},{"note":"This operator is based on `airflow.providers.common.sql.operators.sql.SQLExecuteQueryOperator`.","symbol":"ClickHouseSqlOperator","correct":"from airflow_clickhouse_plugin.operators.clickhouse_sql import ClickHouseSqlOperator"},{"note":"This is a community plugin, not an official Airflow provider, so imports are directly from the plugin's package name.","wrong":"from airflow.providers.clickhouse.sensors.clickhouse import ClickHouseSensor","symbol":"ClickHouseSensor","correct":"from airflow_clickhouse_plugin.sensors.clickhouse import ClickHouseSensor"}],"quickstart":{"code":"from datetime import datetime\n\nfrom airflow.models.dag import DAG\nfrom airflow_clickhouse_plugin.operators.clickhouse import ClickHouseOperator\n\n# Ensure a ClickHouse connection named 'clickhouse_default' is configured in Airflow.\n# ClickHouseOperator uses clickhouse-driver, which speaks ClickHouse's native TCP\n# protocol (default port 9000), not the HTTP interface (8123).\n# Example connection fields, assuming a local server with default credentials:\n#   host=localhost, port=9000, login=default, password=<empty>, schema=default\n\nwith DAG(\n    dag_id='clickhouse_quickstart_dag',\n    start_date=datetime(2024, 1, 1),  # any fixed past date works with catchup=False\n    schedule_interval=None,  # use schedule=None on Airflow >= 2.4 (schedule_interval was removed in Airflow 3)\n    tags=['clickhouse', 'example'],\n    catchup=False\n) as dag:\n    create_table = ClickHouseOperator(\n        task_id='create_example_table',\n        database='default',  # Or specify a different database\n        sql=\"\"\"\n        CREATE TABLE IF NOT EXISTS my_test_table (\n            id UInt64,\n            name String\n        ) ENGINE = MergeTree()\n        ORDER BY id;\n        \"\"\",\n        clickhouse_conn_id='clickhouse_default',\n    )\n\n    insert_data = ClickHouseOperator(\n        task_id='insert_example_data',\n        database='default',\n        
sql=\"INSERT INTO my_test_table VALUES (1, 'Alice'), (2, 'Bob');\",\n        clickhouse_conn_id='clickhouse_default',\n    )\n\n    query_data = ClickHouseOperator(\n        task_id='select_example_data',\n        database='default',\n        sql=\"SELECT * FROM my_test_table;\",\n        clickhouse_conn_id='clickhouse_default',\n        # Note: the operator returns the result of the last query, which Airflow\n        # pushes to XCom by default. For custom result handling, call\n        # ClickHouseHook from a Python task instead.\n    )\n\n    create_table >> insert_data >> query_data","lang":"python","description":"This quickstart demonstrates a basic Airflow DAG that uses the `ClickHouseOperator` to create a table, insert data, and query data in a ClickHouse database. Before running, configure an Airflow connection of type 'ClickHouse' (often named `clickhouse_default`) with appropriate host, port, user, password, and database details for your ClickHouse instance."},"warnings":[{"fix":"Review the changelog for v1.0.0 and update import paths, operator names, and parameters to align with the new structure. Decide which operator (`ClickHouseOperator` or `ClickHouseSqlOperator`) is best suited for your use case.","message":"Version 1.0.0 introduced significant refactoring and two distinct operator families (`ClickHouseOperator` and `ClickHouseSqlOperator`). Code written for pre-1.0.0 versions will likely require updates to import paths and operator parameters.","severity":"breaking","affected_versions":"<1.0.0"},{"fix":"Always check the plugin's GitHub releases or PyPI page for the list of supported Apache Airflow versions. Ensure your Airflow installation matches one of the supported versions or upgrade the plugin if a newer Airflow version is supported.","message":"The plugin has explicit compatibility with specific Airflow versions. 
Using it with an unsupported or significantly different Airflow version (e.g., a newer major version not yet listed as supported) can lead to runtime errors or unexpected behavior.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Understand the distinction between the two operators. `ClickHouseOperator` offers more direct integration with `clickhouse-driver`, while `ClickHouseSqlOperator` leverages Airflow's common SQL provider, which might be more familiar to users of other SQL database providers. Choose the one that best fits your specific requirements or existing patterns.","message":"There are two main operator families: `ClickHouseOperator` (based on `clickhouse_driver.Client.execute`) and `ClickHouseSqlOperator` (based on `airflow.providers.common.sql.operators.sql.SQLExecuteQueryOperator`). They might have subtle differences in behavior or supported SQL features.","severity":"gotcha","affected_versions":">=1.0.0"},{"fix":"Verify your Airflow connection details for the ClickHouse connection ID used by your operators (e.g., `clickhouse_default`). Ensure the connection type is 'ClickHouse' and all parameters (host, port, user, password, database, security settings) are correct and accessible from the Airflow worker.","message":"Incorrect Airflow connection configuration (wrong host, port, credentials, or database) will prevent the operators from connecting to ClickHouse, leading to task failures.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Install the plugin using `pip install airflow-clickhouse-plugin` in the same Python environment where Airflow is running. 
Restart Airflow components (scheduler, webserver, workers) to ensure the plugin is loaded.","cause":"The `airflow-clickhouse-plugin` Python package is not installed in the Airflow environment, or the Airflow scheduler/worker restarted without the plugin being properly loaded.","error":"ModuleNotFoundError: No module named 'airflow_clickhouse_plugin'"},{"fix":"Ensure `airflow-clickhouse-plugin` is correctly installed and that Airflow components have been restarted after installation. Double-check that you are using 'ClickHouse' as the connection type in the Airflow UI, not 'clickhouse' (case might matter depending on Airflow version).","cause":"Airflow failed to register the ClickHouse connection type, usually because the plugin was not loaded correctly or there's a typo in the connection type being referenced.","error":"airflow.exceptions.AirflowException: The hook for connection type 'clickhouse' is not available."},{"fix":"Verify that your ClickHouse server is running and accessible from the machine where your Airflow worker is executing tasks. Check the `clickhouse_conn_id` configuration in Airflow for correct host, port, and security settings.","cause":"The ClickHouse server is not running, is not accessible from the Airflow worker, or the host/port in the Airflow connection are incorrect.","error":"Code: 210, e.displayText() = DB::Exception: Connection refused"},{"fix":"Ensure that `clickhouse_conn_id='your_connection_id'` is explicitly passed to the operator, and that 'your_connection_id' corresponds to a valid ClickHouse connection configured in Airflow.","cause":"The `clickhouse_conn_id` parameter was not provided to the `ClickHouseOperator` or `ClickHouseSqlOperator`, or its value is empty.","error":"airflow.exceptions.AirflowException: Missing connection id hook_type: clickhouse_conn_id, task_id: my_task"}]}