dbt-clickhouse
dbt-clickhouse is the ClickHouse plugin for dbt (data build tool), enabling data professionals to transform data in ClickHouse using SQL. It leverages dbt-core's capabilities, providing native support for ClickHouse-specific features like materialized views and database engines. Currently at version 1.10.0, the library maintains an active release cadence, frequently aligning with new dbt-core versions and introducing ClickHouse-specific enhancements.
Warnings
- gotcha In v1.10.0, the `catchup: False` configuration flag for materialized views (`materialized='materialized_view'`) is now strictly respected during full refresh operations, preventing backfilling of historical data. Previously, this might have been ignored in some scenarios.
- breaking A regression in dbt-clickhouse v1.9.4 caused `dbt test` commands to fail with `macro 'dbt_macro__get_expected_sql' takes not more than 2 argument(s)` errors.
- gotcha dbt-clickhouse versions have specific compatibility requirements with dbt-core versions. For instance, v1.9.2 limited dbt-core to `<1.10.X`, while v1.9.7 added official support for dbt-core 1.10. Using incompatible versions can lead to unexpected errors or features not working.
- gotcha dbt-clickhouse v1.9.8 updated its minimum requirement for `dbt-adapters` to `1.16.7` to resolve compatibility issues. Older versions of `dbt-adapters` might cause problems when used with `dbt-clickhouse 1.9.8` or newer.
Install
-
pip install dbt-clickhouse
Quickstart
import os
import shutil
from dbt.cli.main import dbtRunner, dbtUsageException
# Define paths for the quickstart project
project_dir = "dbt_clickhouse_quickstart_project"
profiles_dir = ".dbt" # Default location for profiles.yml
# Clean up previous runs to ensure a fresh start
if os.path.exists(project_dir):
shutil.rmtree(project_dir)
if os.path.exists(profiles_dir):
shutil.rmtree(profiles_dir)
# Create necessary directories
os.makedirs(project_dir)
os.makedirs(os.path.join(project_dir, "models"))
os.makedirs(profiles_dir)
# --- 1. Create profiles.yml for ClickHouse connection ---
# Using environment variables for connection details, falling back to localhost defaults
profiles_content = f"""
dbt_clickhouse_quickstart:
target: dev
outputs:
dev:
type: clickhouse
host: {os.environ.get('DBT_CLICKHOUSE_HOST', 'localhost')}
port: {os.environ.get('DBT_CLICKHOUSE_PORT', '8123')}
user: {os.environ.get('DBT_CLICKHOUSE_USER', 'default')}
password: {os.environ.get('DBT_CLICKHOUSE_PASSWORD', '')}
database: {os.environ.get('DBT_CLICKHOUSE_DATABASE', 'default')}
schema: default # In ClickHouse, database acts as schema for dbt
interface: http # or native
secure: false
verify: false
"""
with open(os.path.join(profiles_dir, 'profiles.yml'), 'w') as f:
f.write(profiles_content)
# --- 2. Create dbt_project.yml to link to the profile ---
dbt_project_content = f"""
name: 'dbt_clickhouse_quickstart'
version: '1.0.0'
config-version: 2
profile: 'dbt_clickhouse_quickstart'
model-paths: ["models"]
"""
with open(os.path.join(project_dir, 'dbt_project.yml'), 'w') as f:
f.write(dbt_project_content)
# --- 3. Create a simple SQL model file ---
model_content = """
{{ config(materialized='table') }}
SELECT
1 as id,
'hello dbt clickhouse' as value
"""
with open(os.path.join(project_dir, 'models', 'my_test_model.sql'), 'w') as f:
f.write(model_content)
# --- 4. Run dbt commands programmatically ---
dbt = dbtRunner()
print("\n--- Running dbt debug to test connection ---")
# The debug command does not require an active ClickHouse server to run, but will report connection errors
debug_res = dbt.invoke(["debug", "--project-dir", project_dir, "--profiles-dir", profiles_dir])
if debug_res.success:
print("dbt debug successful (connection details printed, may show connectivity errors if ClickHouse is not running).")
else:
print(f"dbt debug failed: {debug_res.exception}")
print("\n--- Attempting to run dbt run (requires a running ClickHouse instance) ---")
try:
run_res = dbt.invoke(["run", "--project-dir", project_dir, "--profiles-dir", profiles_dir])
if run_res.success:
print("dbt run successful. Model 'my_test_model' should be created as a table in ClickHouse.")
else:
print(f"dbt run failed: {run_res.exception}. Ensure ClickHouse is running and accessible.")
except dbtUsageException as e:
print(f"dbt CLI usage error: {e}")
except Exception as e:
print(f"An unexpected error occurred during dbt run: {e}")
# Clean up the created files and directories
shutil.rmtree(project_dir)
shutil.rmtree(profiles_dir)
print("\nCleaned up quickstart files.")